* [PATCH 00/12] PPGTT with 48b addressing
@ 2015-02-20 17:45 Michel Thierry
  2015-02-20 17:45 ` [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic Michel Thierry
                   ` (13 more replies)
  0 siblings, 14 replies; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:45 UTC (permalink / raw)
  To: intel-gfx

These patches rely on "PPGTT dynamic page allocations", currently under review,
to provide GEN8 dynamic page table support with 64b addresses. As the review
progresses, these patches may be combined.

In order to expand the GPU address space, a 4th translation level is added: the
Page Map Level 4 (PML4). The PML4 has 512 PML4 Entries (PML4E), PML4[0-511],
each pointing to a PDP.

For now, this feature will only be available in BDW, in LRC submission mode
(execlists) and when i915.enable_ppgtt=3 is set.
Also note that this expanded address space is only available for full PPGTT;
aliasing PPGTT remains 32b.

Ben Widawsky (9):
  drm/i915/bdw: Make pdp allocation more dynamic
  drm/i915/bdw: Abstract PDP usage
  drm/i915/bdw: Add dynamic page trace events
  drm/i915/bdw: Add ppgtt info for dynamic pages
  drm/i915/bdw: implement alloc/free for 4lvl
  drm/i915/bdw: Add 4 level switching infrastructure
  drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT
  drm/i915: Plumb sg_iter through va allocation ->maps
  drm/i915: Expand error state's address width to 64b

Michel Thierry (3):
  drm/i915/bdw: Support 64 bit PPGTT in lrc mode
  drm/i915/bdw: Add 4 level support in insert_entries and clear_range
  drm/i915/bdw: Flip the 48b switch

 drivers/gpu/drm/i915/i915_debugfs.c   |  19 +-
 drivers/gpu/drm/i915/i915_drv.h       |  11 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 624 ++++++++++++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  77 ++++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  17 +-
 drivers/gpu/drm/i915/i915_params.c    |   2 +-
 drivers/gpu/drm/i915/i915_reg.h       |   1 +
 drivers/gpu/drm/i915/i915_trace.h     |  16 +
 drivers/gpu/drm/i915/intel_lrc.c      | 167 ++++++---
 9 files changed, 746 insertions(+), 188 deletions(-)

-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
@ 2015-02-20 17:45 ` Michel Thierry
  2015-03-03 11:48   ` akash goel
  2015-02-20 17:45 ` [PATCH 02/12] drm/i915/bdw: Abstract PDP usage Michel Thierry
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:45 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

This transitional patch doesn't do much for the existing code. However,
it should make the upcoming patches that use the full 48b address space
a bit easier to swallow. The patch also introduces the PML4, i.e. the
new top level structure of the page tables.

v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp).
To facilitate testing, 48b mode will be available on Broadwell when
i915.enable_ppgtt=3 is set.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
---
 drivers/gpu/drm/i915/i915_drv.h     |   7 ++-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 108 +++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_gtt.h |  41 +++++++++++---
 3 files changed, 126 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2dedd43..af0d149 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2432,7 +2432,12 @@ struct drm_i915_cmd_table {
 #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
 #define HAS_LOGICAL_RING_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 8)
 #define USES_PPGTT(dev)		(i915.enable_ppgtt)
-#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt == 2)
+#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt >= 2)
+#ifdef CONFIG_64BIT
+# define USES_FULL_48BIT_PPGTT(dev)	(i915.enable_ppgtt == 3)
+#else
+# define USES_FULL_48BIT_PPGTT(dev)	false
+#endif
 
 #define HAS_OVERLAY(dev)		(INTEL_INFO(dev)->has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev)	(INTEL_INFO(dev)->overlay_needs_physical)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ff86501..489f8db 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -100,10 +100,17 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 {
 	bool has_aliasing_ppgtt;
 	bool has_full_ppgtt;
+	bool has_full_64bit_ppgtt;
 
 	has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
 	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
 
+#ifdef CONFIG_64BIT
+	has_full_64bit_ppgtt = IS_BROADWELL(dev) && false; /* FIXME: 64b */
+#else
+	has_full_64bit_ppgtt = false;
+#endif
+
 	if (intel_vgpu_active(dev))
 		has_full_ppgtt = false; /* emulation is too hard */
 
@@ -121,6 +128,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	if (enable_ppgtt == 2 && has_full_ppgtt)
 		return 2;
 
+	if (enable_ppgtt == 3 && has_full_64bit_ppgtt)
+		return 3;
+
 #ifdef CONFIG_INTEL_IOMMU
 	/* Disable ppgtt on SNB if VT-d is on. */
 	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
@@ -462,6 +472,45 @@ free_pd:
 	return ERR_PTR(ret);
 }
 
+static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp)
+{
+	kfree(pdp->used_pdpes);
+	kfree(pdp->page_directory);
+	/* HACK */
+	pdp->page_directory = NULL;
+}
+
+static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp,
+			    struct drm_device *dev)
+{
+	__pdp_fini(pdp);
+	if (USES_FULL_48BIT_PPGTT(dev))
+		kfree(pdp);
+}
+
+static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
+		      struct drm_device *dev)
+{
+	size_t pdpes = I915_PDPES_PER_PDP(dev);
+
+	pdp->used_pdpes = kcalloc(BITS_TO_LONGS(pdpes),
+				  sizeof(unsigned long),
+				  GFP_KERNEL);
+	if (!pdp->used_pdpes)
+		return -ENOMEM;
+
+	pdp->page_directory = kcalloc(pdpes, sizeof(*pdp->page_directory), GFP_KERNEL);
+	if (!pdp->page_directory) {
+		kfree(pdp->used_pdpes);
+		/* the PDP might be the statically allocated top level. Keep it
+		 * as clean as possible */
+		pdp->used_pdpes = NULL;
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
 /* Broadwell Page Directory Pointer Descriptors */
 static int gen8_write_pdp(struct intel_engine_cs *ring,
 			  unsigned entry,
@@ -491,7 +540,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 {
 	int i, ret;
 
-	for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) {
+	for (i = 3; i >= 0; i--) {
 		struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i];
 		dma_addr_t pd_daddr = pd ? pd->daddr : ppgtt->scratch_pd->daddr;
 		/* The page directory might be NULL, but we need to clear out
@@ -580,9 +629,6 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 	pt_vaddr = NULL;
 
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
-		if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES))
-			break;
-
 		if (pt_vaddr == NULL) {
 			struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe];
 			struct i915_page_table_entry *pt = pd->page_tables[pde];
@@ -664,7 +710,8 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 	struct pci_dev *hwdev = ppgtt->base.dev->pdev;
 	int i, j;
 
-	for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
+	for_each_set_bit(i, ppgtt->pdp.used_pdpes,
+			I915_PDPES_PER_PDP(ppgtt->base.dev)) {
 		struct i915_page_directory_entry *pd;
 
 		if (WARN_ON(!ppgtt->pdp.page_directory[i]))
@@ -696,13 +743,15 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
 {
 	int i;
 
-	for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
+	for_each_set_bit(i, ppgtt->pdp.used_pdpes,
+				I915_PDPES_PER_PDP(ppgtt->base.dev)) {
 		if (WARN_ON(!ppgtt->pdp.page_directory[i]))
 			continue;
 
 		gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
 		unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
 	}
+	unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
 }
 
 static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
@@ -799,8 +848,9 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 	struct i915_page_directory_entry *pd;
 	uint64_t temp;
 	uint32_t pdpe;
+	size_t pdpes = I915_PDPES_PER_PDP(ppgtt->base.dev);
 
-	BUG_ON(!bitmap_empty(new_pds, GEN8_LEGACY_PDPES));
+	BUG_ON(!bitmap_empty(new_pds, pdpes));
 
 	/* FIXME: PPGTT container_of won't work for 64b */
 	BUG_ON((start + length) > 0x800000000ULL);
@@ -820,18 +870,19 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 	return 0;
 
 unwind_out:
-	for_each_set_bit(pdpe, new_pds, GEN8_LEGACY_PDPES)
+	for_each_set_bit(pdpe, new_pds, pdpes)
 		unmap_and_free_pd(pdp->page_directory[pdpe], ppgtt->base.dev);
 
 	return -ENOMEM;
 }
 
 static inline void
-free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
+free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts,
+		       size_t pdpes)
 {
 	int i;
 
-	for (i = 0; i < GEN8_LEGACY_PDPES; i++)
+	for (i = 0; i < pdpes; i++)
 		kfree(new_pts[i]);
 	kfree(new_pts);
 	kfree(new_pds);
@@ -841,13 +892,14 @@ free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
  * of these are based on the number of PDPEs in the system.
  */
 int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
-					 unsigned long ***new_pts)
+					 unsigned long ***new_pts,
+					 size_t pdpes)
 {
 	int i;
 	unsigned long *pds;
 	unsigned long **pts;
 
-	pds = kcalloc(BITS_TO_LONGS(GEN8_LEGACY_PDPES), sizeof(unsigned long), GFP_KERNEL);
+	pds = kcalloc(BITS_TO_LONGS(pdpes), sizeof(unsigned long), GFP_KERNEL);
 	if (!pds)
 		return -ENOMEM;
 
@@ -857,7 +909,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
 		return -ENOMEM;
 	}
 
-	for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
+	for (i = 0; i < pdpes; i++) {
 		pts[i] = kcalloc(BITS_TO_LONGS(GEN8_PDES_PER_PAGE),
 				 sizeof(unsigned long), GFP_KERNEL);
 		if (!pts[i])
@@ -870,7 +922,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
 	return 0;
 
 err_out:
-	free_gen8_temp_bitmaps(pds, pts);
+	free_gen8_temp_bitmaps(pds, pts, pdpes);
 	return -ENOMEM;
 }
 
@@ -886,6 +938,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	const uint64_t orig_length = length;
 	uint64_t temp;
 	uint32_t pdpe;
+	size_t pdpes = I915_PDPES_PER_PDP(dev);
 	int ret;
 
 #ifndef CONFIG_64BIT
@@ -903,7 +956,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	if (WARN_ON(start + length < start))
 		return -ERANGE;
 
-	ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables);
+	ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
 	if (ret)
 		return ret;
 
@@ -911,7 +964,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
 					new_page_dirs);
 	if (ret) {
-		free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+		free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 		return ret;
 	}
 
@@ -968,7 +1021,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 		set_bit(pdpe, ppgtt->pdp.used_pdpes);
 	}
 
-	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 	return 0;
 
 err_out:
@@ -977,13 +1030,19 @@ err_out:
 			unmap_and_free_pt(pd->page_tables[temp], vm->dev);
 	}
 
-	for_each_set_bit(pdpe, new_page_dirs, GEN8_LEGACY_PDPES)
+	for_each_set_bit(pdpe, new_page_dirs, pdpes)
 		unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
 
-	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 	return ret;
 }
 
+static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
+{
+	unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
+	unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
+}
+
 /**
  * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
  * with a net effect resembling a 2-level page table in normal x86 terms. Each
@@ -1004,6 +1063,15 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 
 	ppgtt->switch_mm = gen8_mm_switch;
 
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		int ret = __pdp_init(&ppgtt->pdp, ppgtt->base.dev);
+		if (ret) {
+			unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
+			return ret;
+		}
+	} else
+		return -EPERM; /* Not yet implemented */
+
 	return 0;
 }
 
@@ -1025,7 +1093,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	 * eventually. */
 	ret = gen8_alloc_va_range(&ppgtt->base, start, size);
 	if (ret) {
-		unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
+		gen8_ppgtt_fini_common(ppgtt);
 		return ret;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index c68ec3a..a33c6e9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -85,8 +85,12 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
  * The difference as compared to normal x86 3 level page table is the PDPEs are
  * programmed via register.
  */
+#define GEN8_PML4ES_PER_PML4		512
+#define GEN8_PML4E_SHIFT		39
 #define GEN8_PDPE_SHIFT			30
-#define GEN8_PDPE_MASK			0x3
+/* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
+ * tables */
+#define GEN8_PDPE_MASK			0x1ff
 #define GEN8_PDE_SHIFT			21
 #define GEN8_PDE_MASK			0x1ff
 #define GEN8_PTE_SHIFT			12
@@ -95,6 +99,13 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
 #define GEN8_PTES_PER_PAGE		(PAGE_SIZE / sizeof(gen8_gtt_pte_t))
 #define GEN8_PDES_PER_PAGE		(PAGE_SIZE / sizeof(gen8_ppgtt_pde_t))
 
+#ifdef CONFIG_64BIT
+# define I915_PDPES_PER_PDP(dev) (USES_FULL_48BIT_PPGTT(dev) ?\
+		GEN8_PML4ES_PER_PML4 : GEN8_LEGACY_PDPES)
+#else
+# define I915_PDPES_PER_PDP		GEN8_LEGACY_PDPES
+#endif
+
 #define PPAT_UNCACHED_INDEX		(_PAGE_PWT | _PAGE_PCD)
 #define PPAT_CACHED_PDE_INDEX		0 /* WB LLC */
 #define PPAT_CACHED_INDEX		_PAGE_PAT /* WB LLCeLLC */
@@ -210,9 +221,17 @@ struct i915_page_directory_entry {
 };
 
 struct i915_page_directory_pointer_entry {
-	/* struct page *page; */
-	DECLARE_BITMAP(used_pdpes, GEN8_LEGACY_PDPES);
-	struct i915_page_directory_entry *page_directory[GEN8_LEGACY_PDPES];
+	struct page *page;
+	dma_addr_t daddr;
+	unsigned long *used_pdpes;
+	struct i915_page_directory_entry **page_directory;
+};
+
+struct i915_pml4 {
+	struct page *page;
+	dma_addr_t daddr;
+	DECLARE_BITMAP(used_pml4es, GEN8_PML4ES_PER_PML4);
+	struct i915_page_directory_pointer_entry *pdps[GEN8_PML4ES_PER_PML4];
 };
 
 struct i915_address_space {
@@ -302,8 +321,9 @@ struct i915_hw_ppgtt {
 	struct drm_mm_node node;
 	unsigned long pd_dirty_rings;
 	union {
-		struct i915_page_directory_pointer_entry pdp;
-		struct i915_page_directory_entry pd;
+		struct i915_pml4 pml4;		/* GEN8+ & 64b PPGTT */
+		struct i915_page_directory_pointer_entry pdp;	/* GEN8+ */
+		struct i915_page_directory_entry pd;		/* GEN6-7 */
 	};
 
 	union {
@@ -399,14 +419,17 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 	     temp = min(temp, length),					\
 	     start += temp, length -= temp)
 
-#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)		\
-	for (iter = gen8_pdpe_index(start), pd = (pdp)->page_directory[iter];	\
-	     length > 0 && iter < GEN8_LEGACY_PDPES;			\
+#define gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, b)	\
+	for (iter = gen8_pdpe_index(start), pd = (pdp)->page_directory[iter]; \
+	     length > 0 && (iter < b);					\
 	     pd = (pdp)->page_directory[++iter],				\
 	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
 	     temp = min(temp, length),					\
 	     start += temp, length -= temp)
 
+#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)		\
+	gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
+
 /* Clamp length to the next page_directory boundary */
 static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length)
 {
-- 
2.1.1



* [PATCH 02/12] drm/i915/bdw: Abstract PDP usage
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
  2015-02-20 17:45 ` [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic Michel Thierry
@ 2015-02-20 17:45 ` Michel Thierry
  2015-03-03 12:16   ` akash goel
  2015-03-04  3:07   ` akash goel
  2015-02-20 17:45 ` [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events Michel Thierry
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:45 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

Up until now, ppgtt->pdp has always been the root of our page tables.
Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.

In preparation for 4 level page tables, we need to stop using ppgtt->pdp
directly unless we know it's what we want. The future structure will use
ppgtt->pml4 for the top level, and the pdp will be just one of the
entries pointed to by a pml4e.

v2: Updated after dynamic page allocation changes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 123 ++++++++++++++++++++----------------
 1 file changed, 70 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 489f8db..d3ad517 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -560,6 +560,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
+	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
 	gen8_gtt_pte_t *pt_vaddr, scratch_pte;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -575,10 +576,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 		struct i915_page_table_entry *pt;
 		struct page *page_table;
 
-		if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
+		if (WARN_ON(!pdp->page_directory[pdpe]))
 			continue;
 
-		pd = ppgtt->pdp.page_directory[pdpe];
+		pd = pdp->page_directory[pdpe];
 
 		if (WARN_ON(!pd->page_tables[pde]))
 			continue;
@@ -620,6 +621,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
+	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
 	gen8_gtt_pte_t *pt_vaddr;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -630,7 +632,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
 		if (pt_vaddr == NULL) {
-			struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe];
+			struct i915_page_directory_entry *pd = pdp->page_directory[pdpe];
 			struct i915_page_table_entry *pt = pd->page_tables[pde];
 			struct page *page_table = pt->page;
 
@@ -708,16 +710,17 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d
 static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 {
 	struct pci_dev *hwdev = ppgtt->base.dev->pdev;
+	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
 	int i, j;
 
-	for_each_set_bit(i, ppgtt->pdp.used_pdpes,
+	for_each_set_bit(i, pdp->used_pdpes,
 			I915_PDPES_PER_PDP(ppgtt->base.dev)) {
 		struct i915_page_directory_entry *pd;
 
-		if (WARN_ON(!ppgtt->pdp.page_directory[i]))
+		if (WARN_ON(!pdp->page_directory[i]))
 			continue;
 
-		pd = ppgtt->pdp.page_directory[i];
+		pd = pdp->page_directory[i];
 		if (!pd->daddr)
 			pci_unmap_page(hwdev, pd->daddr, PAGE_SIZE,
 					PCI_DMA_BIDIRECTIONAL);
@@ -743,15 +746,21 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
 {
 	int i;
 
-	for_each_set_bit(i, ppgtt->pdp.used_pdpes,
-				I915_PDPES_PER_PDP(ppgtt->base.dev)) {
-		if (WARN_ON(!ppgtt->pdp.page_directory[i]))
-			continue;
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		for_each_set_bit(i, ppgtt->pdp.used_pdpes,
+				 I915_PDPES_PER_PDP(ppgtt->base.dev)) {
+			if (WARN_ON(!ppgtt->pdp.page_directory[i]))
+				continue;
 
-		gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
-		unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
+			gen8_free_page_tables(ppgtt->pdp.page_directory[i],
+					      ppgtt->base.dev);
+			unmap_and_free_pd(ppgtt->pdp.page_directory[i],
+					  ppgtt->base.dev);
+		}
+		unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
+	} else {
+		BUG(); /* to be implemented later */
 	}
-	unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
 }
 
 static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
@@ -765,7 +774,7 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 
 /**
  * gen8_ppgtt_alloc_pagetabs() - Allocate page tables for VA range.
- * @ppgtt:	Master ppgtt structure.
+ * @vm:		Master vm structure.
  * @pd:		Page directory for this address range.
  * @start:	Starting virtual address to begin allocations.
  * @length	Size of the allocations.
@@ -781,12 +790,13 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
  *
  * Return: 0 if success; negative error code otherwise.
  */
-static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
+static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
 				     struct i915_page_directory_entry *pd,
 				     uint64_t start,
 				     uint64_t length,
 				     unsigned long *new_pts)
 {
+	struct drm_device *dev = vm->dev;
 	struct i915_page_table_entry *pt;
 	uint64_t temp;
 	uint32_t pde;
@@ -799,7 +809,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 			continue;
 		}
 
-		pt = alloc_pt_single(ppgtt->base.dev);
+		pt = alloc_pt_single(dev);
 		if (IS_ERR(pt))
 			goto unwind_out;
 
@@ -811,14 +821,14 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 
 unwind_out:
 	for_each_set_bit(pde, new_pts, GEN8_PDES_PER_PAGE)
-		unmap_and_free_pt(pd->page_tables[pde], ppgtt->base.dev);
+		unmap_and_free_pt(pd->page_tables[pde], dev);
 
 	return -ENOMEM;
 }
 
 /**
  * gen8_ppgtt_alloc_page_directories() - Allocate page directories for VA range.
- * @ppgtt:	Master ppgtt structure.
+ * @vm:		Master vm structure.
  * @pdp:	Page directory pointer for this address range.
  * @start:	Starting virtual address to begin allocations.
  * @length	Size of the allocations.
@@ -839,16 +849,17 @@ unwind_out:
  *
  * Return: 0 if success; negative error code otherwise.
  */
-static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
+static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
 				     struct i915_page_directory_pointer_entry *pdp,
 				     uint64_t start,
 				     uint64_t length,
 				     unsigned long *new_pds)
 {
+	struct drm_device *dev = vm->dev;
 	struct i915_page_directory_entry *pd;
 	uint64_t temp;
 	uint32_t pdpe;
-	size_t pdpes =  I915_PDPES_PER_PDP(ppgtt->base.dev);
+	size_t pdpes = I915_PDPES_PER_PDP(vm->dev);
 
 	BUG_ON(!bitmap_empty(new_pds, pdpes));
 
@@ -859,7 +870,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 		if (pd)
 			continue;
 
-		pd = alloc_pd_single(ppgtt->base.dev);
+		pd = alloc_pd_single(dev);
 		if (IS_ERR(pd))
 			goto unwind_out;
 
@@ -871,7 +882,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 
 unwind_out:
 	for_each_set_bit(pdpe, new_pds, pdpes)
-		unmap_and_free_pd(pdp->page_directory[pdpe], ppgtt->base.dev);
+		unmap_and_free_pd(pdp->page_directory[pdpe], dev);
 
 	return -ENOMEM;
 }
@@ -926,13 +937,13 @@ err_out:
 	return -ENOMEM;
 }
 
-static int gen8_alloc_va_range(struct i915_address_space *vm,
-			       uint64_t start,
-			       uint64_t length)
+static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
+				    struct i915_page_directory_pointer_entry *pdp,
+				    uint64_t start,
+				    uint64_t length)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
 	unsigned long *new_page_dirs, **new_page_tables;
+	struct drm_device *dev = vm->dev;
 	struct i915_page_directory_entry *pd;
 	const uint64_t orig_start = start;
 	const uint64_t orig_length = length;
@@ -961,17 +972,15 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 		return ret;
 
 	/* Do the allocations first so we can easily bail out */
-	ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
-					new_page_dirs);
+	ret = gen8_ppgtt_alloc_page_directories(vm, pdp, start, length, new_page_dirs);
 	if (ret) {
 		free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 		return ret;
 	}
 
-	/* For every page directory referenced, allocate page tables */
-	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
 		bitmap_zero(new_page_tables[pdpe], GEN8_PDES_PER_PAGE);
-		ret = gen8_ppgtt_alloc_pagetabs(ppgtt, pd, start, length,
+		ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
 						new_page_tables[pdpe]);
 		if (ret)
 			goto err_out;
@@ -980,10 +989,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	start = orig_start;
 	length = orig_length;
 
-	/* Allocations have completed successfully, so set the bitmaps, and do
-	 * the mappings. */
-	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
-		gen8_ppgtt_pde_t *const page_directory = kmap_atomic(pd->page);
+	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
 		struct i915_page_table_entry *pt;
 		uint64_t pd_len = gen8_clamp_pd(start, length);
 		uint64_t pd_start = start;
@@ -1005,20 +1011,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 
 			/* Our pde is now pointing to the pagetable, pt */
 			set_bit(pde, pd->used_pdes);
-
-			/* Map the PDE to the page table */
-			__gen8_do_map_pt(page_directory + pde, pt, vm->dev);
-
-			/* NB: We haven't yet mapped ptes to pages. At this
-			 * point we're still relying on insert_entries() */
 		}
 
-		if (!HAS_LLC(vm->dev))
-			drm_clflush_virt_range(page_directory, PAGE_SIZE);
-
-		kunmap_atomic(page_directory);
-
-		set_bit(pdpe, ppgtt->pdp.used_pdpes);
+		set_bit(pdpe, pdp->used_pdpes);
+		gen8_map_pagetable_range(pd, start, length, dev);
 	}
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
@@ -1027,16 +1023,36 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 err_out:
 	while (pdpe--) {
 		for_each_set_bit(temp, new_page_tables[pdpe], GEN8_PDES_PER_PAGE)
-			unmap_and_free_pt(pd->page_tables[temp], vm->dev);
+			unmap_and_free_pt(pd->page_tables[temp], dev);
 	}
 
 	for_each_set_bit(pdpe, new_page_dirs, pdpes)
-		unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
+		unmap_and_free_pd(pdp->page_directory[pdpe], dev);
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 	return ret;
 }
 
+static int __noreturn gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
+					       struct i915_pml4 *pml4,
+					       uint64_t start,
+					       uint64_t length)
+{
+	BUG(); /* to be implemented later */
+}
+
+static int gen8_alloc_va_range(struct i915_address_space *vm,
+			       uint64_t start, uint64_t length)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+
+	if (!USES_FULL_48BIT_PPGTT(vm->dev))
+		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
+	else
+		return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
+}
+
 static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
 {
 	unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
@@ -1079,12 +1095,13 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
 	struct i915_page_directory_entry *pd;
 	uint64_t temp, start = 0, size = dev_priv->gtt.base.total;
 	uint32_t pdpe;
 	int ret;
 
-	ret = gen8_ppgtt_init_common(ppgtt, dev_priv->gtt.base.total);
+	ret = gen8_ppgtt_init_common(ppgtt, size);
 	if (ret)
 		return ret;
 
@@ -1097,8 +1114,8 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 		return ret;
 	}
 
-	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, size, temp, pdpe)
-		gen8_map_pagetable_range(pd, start, size, ppgtt->base.dev);
+	gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
+		gen8_map_pagetable_range(pd, start, size, dev);
 
 	ppgtt->base.allocate_va_range = NULL;
 	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
-- 
2.1.1



* [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
  2015-02-20 17:45 ` [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic Michel Thierry
  2015-02-20 17:45 ` [PATCH 02/12] drm/i915/bdw: Abstract PDP usage Michel Thierry
@ 2015-02-20 17:45 ` Michel Thierry
  2015-02-24 10:56   ` Daniel Vetter
  2015-02-24 10:59   ` Daniel Vetter
  2015-02-20 17:45 ` [PATCH 04/12] drm/i915/bdw: Add ppgtt info for dynamic pages Michel Thierry
                   ` (10 subsequent siblings)
  13 siblings, 2 replies; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:45 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

The dynamic page allocation patch series added these trace events for
GEN6; this patch adds them for GEN8.

v2: Consolidate pagetable/page_directory events
v3: Multiple rebases.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 23 +++++++++++++++--------
 drivers/gpu/drm/i915/i915_trace.h   | 16 ++++++++++++++++
 2 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d3ad517..ecfb62a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -673,19 +673,24 @@ static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde,
 /* It's likely we'll map more than one pagetable at a time. This function will
  * save us unnecessary kmap calls, but do no more functionally than multiple
  * calls to map_pt. */
-static void gen8_map_pagetable_range(struct i915_page_directory_entry *pd,
+static void gen8_map_pagetable_range(struct i915_address_space *vm,
+				     struct i915_page_directory_entry *pd,
 				     uint64_t start,
-				     uint64_t length,
-				     struct drm_device *dev)
+				     uint64_t length)
 {
 	gen8_ppgtt_pde_t * const page_directory = kmap_atomic(pd->page);
 	struct i915_page_table_entry *pt;
 	uint64_t temp, pde;
 
-	gen8_for_each_pde(pt, pd, start, length, temp, pde)
-		__gen8_do_map_pt(page_directory + pde, pt, dev);
+	gen8_for_each_pde(pt, pd, start, length, temp, pde) {
+		__gen8_do_map_pt(page_directory + pde, pt, vm->dev);
+		trace_i915_page_table_entry_map(vm, pde, pt,
+					 gen8_pte_index(start),
+					 gen8_pte_count(start, length),
+					 GEN8_PTES_PER_PAGE);
+	}
 
-	if (!HAS_LLC(dev))
+	if (!HAS_LLC(vm->dev))
 		drm_clflush_virt_range(page_directory, PAGE_SIZE);
 
 	kunmap_atomic(page_directory);
@@ -815,6 +820,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
 
 		pd->page_tables[pde] = pt;
 		set_bit(pde, new_pts);
+		trace_i915_page_table_entry_alloc(vm, pde, start, GEN8_PDE_SHIFT);
 	}
 
 	return 0;
@@ -876,6 +882,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
 
 		pdp->page_directory[pdpe] = pd;
 		set_bit(pdpe, new_pds);
+		trace_i915_page_directory_entry_alloc(vm, pdpe, start, GEN8_PDPE_SHIFT);
 	}
 
 	return 0;
@@ -1014,7 +1021,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 		}
 
 		set_bit(pdpe, pdp->used_pdpes);
-		gen8_map_pagetable_range(pd, start, length, dev);
+		gen8_map_pagetable_range(vm, pd, start, length);
 	}
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
@@ -1115,7 +1122,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	}
 
 	gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
-		gen8_map_pagetable_range(pd, start, size, dev);
+		gen8_map_pagetable_range(&ppgtt->base, pd, start, size);
 
 	ppgtt->base.allocate_va_range = NULL;
 	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 3a657e4..6c20f76 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -214,6 +214,22 @@ DEFINE_EVENT(i915_page_table_entry, i915_page_table_entry_alloc,
 	     TP_ARGS(vm, pde, start, pde_shift)
 );
 
+DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_entry_alloc,
+		   TP_PROTO(struct i915_address_space *vm, u32 pdpe, u64 start, u64 pdpe_shift),
+		   TP_ARGS(vm, pdpe, start, pdpe_shift),
+
+		   TP_printk("vm=%p, pdpe=%d (0x%llx-0x%llx)",
+			     __entry->vm, __entry->pde, __entry->start, __entry->end)
+);
+
+DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_pointer_entry_alloc,
+		   TP_PROTO(struct i915_address_space *vm, u32 pml4e, u64 start, u64 pml4e_shift),
+		   TP_ARGS(vm, pml4e, start, pml4e_shift),
+
+		   TP_printk("vm=%p, pml4e=%d (0x%llx-0x%llx)",
+			     __entry->vm, __entry->pde, __entry->start, __entry->end)
+);
+
 /* Avoid extra math because we only support two sizes. The format is defined by
  * bitmap_scnprintf. Each 32 bits is 8 HEX digits followed by comma */
 #define TRACE_PT_SIZE(bits) \
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 04/12] drm/i915/bdw: Add ppgtt info for dynamic pages
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (2 preceding siblings ...)
  2015-02-20 17:45 ` [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events Michel Thierry
@ 2015-02-20 17:45 ` Michel Thierry
  2015-03-03 12:23   ` akash goel
  2015-02-20 17:45 ` [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl Michel Thierry
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:45 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

Note that there is no gen8 ppgtt debug_dump function yet.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 19 ++++++++++---------
 drivers/gpu/drm/i915/i915_gem_gtt.c | 32 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  9 +++++++++
 3 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 40630bd..93c34ab 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2165,7 +2165,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
-	struct drm_file *file;
 	int i;
 
 	if (INTEL_INFO(dev)->gen == 6)
@@ -2189,14 +2188,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 
 		ppgtt->debug_dump(ppgtt, m);
 	}
-
-	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
-		struct drm_i915_file_private *file_priv = file->driver_priv;
-
-		seq_printf(m, "proc: %s\n",
-			   get_pid_task(file->pid, PIDTYPE_PID)->comm);
-		idr_for_each(&file_priv->context_idr, per_file_ctx, m);
-	}
 }
 
 static int i915_ppgtt_info(struct seq_file *m, void *data)
@@ -2204,6 +2195,7 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_file *file;
 
 	int ret = mutex_lock_interruptible(&dev->struct_mutex);
 	if (ret)
@@ -2215,6 +2207,15 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
 	else if (INTEL_INFO(dev)->gen >= 6)
 		gen6_ppgtt_info(m, dev);
 
+	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
+		struct drm_i915_file_private *file_priv = file->driver_priv;
+
+		seq_printf(m, "\nproc: %s\n",
+			   get_pid_task(file->pid, PIDTYPE_PID)->comm);
+		idr_for_each(&file_priv->context_idr, per_file_ctx,
+			     (void *)(unsigned long)m);
+	}
+
 	intel_runtime_pm_put(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ecfb62a..1edcc17 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2125,6 +2125,38 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 	readl(gtt_base);
 }
 
+void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt,
+			     void (*callback)(struct i915_page_directory_pointer_entry *pdp,
+					      struct i915_page_directory_entry *pd,
+					      struct i915_page_table_entry *pt,
+					      unsigned pdpe,
+					      unsigned pde,
+					      void *data),
+			     void *data)
+{
+	uint64_t start = ppgtt->base.start;
+	uint64_t length = ppgtt->base.total;
+	uint64_t pdpe, pde, temp;
+
+	struct i915_page_directory_entry *pd;
+	struct i915_page_table_entry *pt;
+
+	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
+		uint64_t pd_start = start, pd_length = length;
+		int i;
+
+		if (pd == NULL) {
+			for (i = 0; i < GEN8_PDES_PER_PAGE; i++)
+				callback(&ppgtt->pdp, NULL, NULL, pdpe, i, data);
+			continue;
+		}
+
+		gen8_for_each_pde(pt, pd, pd_start, pd_length, temp, pde) {
+			callback(&ppgtt->pdp, pd, pt, pdpe, pde, data);
+		}
+	}
+}
+
 static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 				  uint64_t start,
 				  uint64_t length,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index a33c6e9..144858e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -483,6 +483,15 @@ static inline size_t gen8_pde_count(uint64_t addr, uint64_t length)
 	return i915_pde_index(end, GEN8_PDE_SHIFT) - i915_pde_index(addr, GEN8_PDE_SHIFT);
 }
 
+void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt,
+			     void (*callback)(struct i915_page_directory_pointer_entry *pdp,
+					      struct i915_page_directory_entry *pd,
+					      struct i915_page_table_entry *pt,
+					      unsigned pdpe,
+					      unsigned pde,
+					      void *data),
+			     void *data);
+
 int i915_gem_gtt_init(struct drm_device *dev);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_global_gtt_cleanup(struct drm_device *dev);
-- 
2.1.1


* [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (3 preceding siblings ...)
  2015-02-20 17:45 ` [PATCH 04/12] drm/i915/bdw: Add ppgtt info for dynamic pages Michel Thierry
@ 2015-02-20 17:45 ` Michel Thierry
  2015-03-03 12:55   ` akash goel
  2015-03-04  2:48   ` akash goel
  2015-02-20 17:46 ` [PATCH 06/12] drm/i915/bdw: Add 4 level switching infrastructure Michel Thierry
                   ` (8 subsequent siblings)
  13 siblings, 2 replies; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:45 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

The code for 4lvl works just as one would expect, and it conveniently
calls into the existing 3lvl page table code to handle all of the
lower levels.

The PML4 has no special attributes, and there will always be one, so
simply initialize it at creation and destroy it at the end.

v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
compiler happy. And define ret only in one place.
Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 240 +++++++++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_gtt.h |  11 +-
 2 files changed, 217 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1edcc17..edada33 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -483,9 +483,12 @@ static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp)
 static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp,
 			    struct drm_device *dev)
 {
-	__pdp_fini(pdp);
-	if (USES_FULL_48BIT_PPGTT(dev))
+	if (USES_FULL_48BIT_PPGTT(dev)) {
+		__pdp_fini(pdp);
+		i915_dma_unmap_single(pdp, dev);
+		__free_page(pdp->page);
 		kfree(pdp);
+	}
 }
 
 static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
@@ -511,6 +514,60 @@ static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
 	return 0;
 }
 
+static struct i915_page_directory_pointer_entry *alloc_pdp_single(struct i915_hw_ppgtt *ppgtt,
+					       struct i915_pml4 *pml4)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	struct i915_page_directory_pointer_entry *pdp;
+	int ret;
+
+	BUG_ON(!USES_FULL_48BIT_PPGTT(dev));
+
+	pdp = kmalloc(sizeof(*pdp), GFP_KERNEL);
+	if (!pdp)
+		return ERR_PTR(-ENOMEM);
+
+	pdp->page = alloc_page(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO);
+	if (!pdp->page) {
+		kfree(pdp);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	ret = __pdp_init(pdp, dev);
+	if (ret) {
+		__free_page(pdp->page);
+		kfree(pdp);
+		return ERR_PTR(ret);
+	}
+
+	i915_dma_map_px_single(pdp, dev);
+
+	return pdp;
+}
+
+static void pml4_fini(struct i915_pml4 *pml4)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(pml4, struct i915_hw_ppgtt, pml4);
+	i915_dma_unmap_single(pml4, ppgtt->base.dev);
+	__free_page(pml4->page);
+	/* HACK */
+	pml4->page = NULL;
+}
+
+static int pml4_init(struct i915_hw_ppgtt *ppgtt)
+{
+	struct i915_pml4 *pml4 = &ppgtt->pml4;
+
+	pml4->page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!pml4->page)
+		return -ENOMEM;
+
+	i915_dma_map_px_single(pml4, ppgtt->base.dev);
+
+	return 0;
+}
+
 /* Broadwell Page Directory Pointer Descriptors */
 static int gen8_write_pdp(struct intel_engine_cs *ring,
 			  unsigned entry,
@@ -712,14 +769,13 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d
 	}
 }
 
-static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
+static void gen8_ppgtt_unmap_pages_3lvl(struct i915_page_directory_pointer_entry *pdp,
+					struct drm_device *dev)
 {
-	struct pci_dev *hwdev = ppgtt->base.dev->pdev;
-	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
+	struct pci_dev *hwdev = dev->pdev;
 	int i, j;
 
-	for_each_set_bit(i, pdp->used_pdpes,
-			I915_PDPES_PER_PDP(ppgtt->base.dev)) {
+	for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
 		struct i915_page_directory_entry *pd;
 
 		if (WARN_ON(!pdp->page_directory[i]))
@@ -747,27 +803,73 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 	}
 }
 
-static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
+static void gen8_ppgtt_unmap_pages_4lvl(struct i915_hw_ppgtt *ppgtt)
 {
+	struct pci_dev *hwdev = ppgtt->base.dev->pdev;
 	int i;
 
-	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
-		for_each_set_bit(i, ppgtt->pdp.used_pdpes,
-				 I915_PDPES_PER_PDP(ppgtt->base.dev)) {
-			if (WARN_ON(!ppgtt->pdp.page_directory[i]))
-				continue;
+	for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
+		struct i915_page_directory_pointer_entry *pdp;
 
-			gen8_free_page_tables(ppgtt->pdp.page_directory[i],
-					      ppgtt->base.dev);
-			unmap_and_free_pd(ppgtt->pdp.page_directory[i],
-					  ppgtt->base.dev);
-		}
-		unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
-	} else {
-		BUG(); /* to be implemented later */
+		if (WARN_ON(!ppgtt->pml4.pdps[i]))
+			continue;
+
+		pdp = ppgtt->pml4.pdps[i];
+		if (pdp->daddr)
+			pci_unmap_page(hwdev, pdp->daddr, PAGE_SIZE,
+				       PCI_DMA_BIDIRECTIONAL);
+
+		gen8_ppgtt_unmap_pages_3lvl(ppgtt->pml4.pdps[i],
+					    ppgtt->base.dev);
 	}
 }
 
+static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
+{
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
+		gen8_ppgtt_unmap_pages_3lvl(&ppgtt->pdp, ppgtt->base.dev);
+	else
+		gen8_ppgtt_unmap_pages_4lvl(ppgtt);
+}
+
+static void gen8_ppgtt_free_3lvl(struct i915_page_directory_pointer_entry *pdp,
+				 struct drm_device *dev)
+{
+	int i;
+
+	for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
+		if (WARN_ON(!pdp->page_directory[i]))
+			continue;
+
+		gen8_free_page_tables(pdp->page_directory[i], dev);
+		unmap_and_free_pd(pdp->page_directory[i], dev);
+	}
+
+	unmap_and_free_pdp(pdp, dev);
+}
+
+static void gen8_ppgtt_free_4lvl(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
+	for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
+		if (WARN_ON(!ppgtt->pml4.pdps[i]))
+			continue;
+
+		gen8_ppgtt_free_3lvl(ppgtt->pml4.pdps[i], ppgtt->base.dev);
+	}
+
+	pml4_fini(&ppgtt->pml4);
+}
+
+static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
+{
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
+		gen8_ppgtt_free_3lvl(&ppgtt->pdp, ppgtt->base.dev);
+	else
+		gen8_ppgtt_free_4lvl(ppgtt);
+}
+
 static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 {
 	struct i915_hw_ppgtt *ppgtt =
@@ -1040,12 +1142,74 @@ err_out:
 	return ret;
 }
 
-static int __noreturn gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
-					       struct i915_pml4 *pml4,
-					       uint64_t start,
-					       uint64_t length)
+static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
+				    struct i915_pml4 *pml4,
+				    uint64_t start,
+				    uint64_t length)
 {
-	BUG(); /* to be implemented later */
+	DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+	struct i915_page_directory_pointer_entry *pdp;
+	const uint64_t orig_start = start;
+	const uint64_t orig_length = length;
+	uint64_t temp, pml4e;
+	int ret = 0;
+
+	/* Do the pml4 allocations first, so we don't need to track the newly
+	 * allocated tables below the pdp */
+	bitmap_zero(new_pdps, GEN8_PML4ES_PER_PML4);
+
+	/* The page directory and page table allocations are done in the shared 3
+	 * and 4 level code. Just allocate the pdps.
+	 */
+	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+		if (!pdp) {
+			WARN_ON(test_bit(pml4e, pml4->used_pml4es));
+			pdp = alloc_pdp_single(ppgtt, pml4);
+			if (IS_ERR(pdp))
+				goto err_alloc;
+
+			pml4->pdps[pml4e] = pdp;
+			set_bit(pml4e, new_pdps);
+			trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, pml4e,
+						   pml4e << GEN8_PML4E_SHIFT,
+						   GEN8_PML4E_SHIFT);
+
+		}
+	}
+
+	WARN(bitmap_weight(new_pdps, GEN8_PML4ES_PER_PML4) > 2,
+	     "The allocation has spanned more than 512GB. "
+	     "It is highly likely this is incorrect.");
+
+	start = orig_start;
+	length = orig_length;
+
+	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+		BUG_ON(!pdp);
+
+		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
+		if (ret)
+			goto err_out;
+	}
+
+	bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
+		  GEN8_PML4ES_PER_PML4);
+
+	return 0;
+
+err_out:
+	start = orig_start;
+	length = orig_length;
+	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e)
+		gen8_ppgtt_free_3lvl(pdp, vm->dev);
+
+err_alloc:
+	for_each_set_bit(pml4e, new_pdps, GEN8_PML4ES_PER_PML4)
+		unmap_and_free_pdp(pml4->pdps[pml4e], vm->dev);
+
+	return ret;
 }
 
 static int gen8_alloc_va_range(struct i915_address_space *vm,
@@ -1054,16 +1218,19 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 
-	if (!USES_FULL_48BIT_PPGTT(vm->dev))
-		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
-	else
+	if (USES_FULL_48BIT_PPGTT(vm->dev))
 		return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
+	else
+		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
 }
 
 static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
 {
 	unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
-	unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
+	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
+		pml4_fini(&ppgtt->pml4);
+	else
+		unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
 }
 
 /**
@@ -1086,14 +1253,21 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 
 	ppgtt->switch_mm = gen8_mm_switch;
 
-	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		int ret = pml4_init(ppgtt);
+		if (ret) {
+			unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
+			return ret;
+		}
+	} else {
 		int ret = __pdp_init(&ppgtt->pdp, false);
 		if (ret) {
 			unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
 			return ret;
 		}
-	} else
-		return -EPERM; /* Not yet implemented */
+
+		trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, 0, 0, GEN8_PML4E_SHIFT);
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 144858e..1477f54 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -87,6 +87,7 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
  */
 #define GEN8_PML4ES_PER_PML4		512
 #define GEN8_PML4E_SHIFT		39
+#define GEN8_PML4E_MASK			(GEN8_PML4ES_PER_PML4 - 1)
 #define GEN8_PDPE_SHIFT			30
 /* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
  * tables */
@@ -427,6 +428,14 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 	     temp = min(temp, length),					\
 	     start += temp, length -= temp)
 
+#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
+	for (iter = gen8_pml4e_index(start), pdp = (pml4)->pdps[iter];	\
+	     length > 0 && iter < GEN8_PML4ES_PER_PML4;			\
+	     pdp = (pml4)->pdps[++iter],				\
+	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
+	     temp = min(temp, length),					\
+	     start += temp, length -= temp)
+
 #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)		\
 	gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
 
@@ -458,7 +467,7 @@ static inline uint32_t gen8_pdpe_index(uint64_t address)
 
 static inline uint32_t gen8_pml4e_index(uint64_t address)
 {
-	BUG(); /* For 64B */
+	return (address >> GEN8_PML4E_SHIFT) & GEN8_PML4E_MASK;
 }
 
 static inline size_t gen8_pte_count(uint64_t addr, uint64_t length)
-- 
2.1.1


* [PATCH 06/12] drm/i915/bdw: Add 4 level switching infrastructure
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (4 preceding siblings ...)
  2015-02-20 17:45 ` [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl Michel Thierry
@ 2015-02-20 17:46 ` Michel Thierry
  2015-03-03 13:01   ` akash goel
  2015-02-20 17:46 ` [PATCH 07/12] drm/i915/bdw: Support 64 bit PPGTT in lrc mode Michel Thierry
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:46 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

Mapping is easy: it uses the same register as PDP descriptor 0, but only
one entry is needed.

v2: The PML4 update in the legacy context switch is left in for historical
reasons; the preferred mode of operation is LRC context-based submission.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 56 +++++++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_gtt.h |  4 ++-
 drivers/gpu/drm/i915/i915_reg.h     |  1 +
 3 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index edada33..fb06f67 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -192,6 +192,9 @@ static inline gen8_ppgtt_pde_t gen8_pde_encode(struct drm_device *dev,
 	return pde;
 }
 
+#define gen8_pdpe_encode gen8_pde_encode
+#define gen8_pml4e_encode gen8_pde_encode
+
 static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level,
 				     bool valid, u32 unused)
@@ -592,8 +595,8 @@ static int gen8_write_pdp(struct intel_engine_cs *ring,
 	return 0;
 }
 
-static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct intel_engine_cs *ring)
+static int gen8_legacy_mm_switch(struct i915_hw_ppgtt *ppgtt,
+				 struct intel_engine_cs *ring)
 {
 	int i, ret;
 
@@ -610,6 +613,12 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	return 0;
 }
 
+static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
+			      struct intel_engine_cs *ring)
+{
+	return gen8_write_pdp(ring, 0, ppgtt->pml4.daddr);
+}
+
 static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 				   uint64_t start,
 				   uint64_t length,
@@ -753,6 +762,37 @@ static void gen8_map_pagetable_range(struct i915_address_space *vm,
 	kunmap_atomic(page_directory);
 }
 
+static void gen8_map_page_directory(struct i915_page_directory_pointer_entry *pdp,
+				    struct i915_page_directory_entry *pd,
+				    int index,
+				    struct drm_device *dev)
+{
+	gen8_ppgtt_pdpe_t *pdpo;
+	gen8_ppgtt_pdpe_t pdpe;
+
+	/* We do not need to clflush because no platform requiring flush
+	 * supports 64b pagetables. */
+	if (!USES_FULL_48BIT_PPGTT(dev))
+		return;
+
+	pdpo = kmap_atomic(pdp->page);
+	pdpe = gen8_pdpe_encode(dev, pd->daddr, I915_CACHE_LLC);
+	pdpo[index] = pdpe;
+	kunmap_atomic(pdpo);
+}
+
+static void gen8_map_page_directory_pointer(struct i915_pml4 *pml4,
+					    struct i915_page_directory_pointer_entry *pdp,
+					    int index,
+					    struct drm_device *dev)
+{
+	gen8_ppgtt_pml4e_t *pagemap = kmap_atomic(pml4->page);
+	gen8_ppgtt_pml4e_t pml4e = gen8_pml4e_encode(dev, pdp->daddr, I915_CACHE_LLC);
+	BUG_ON(!USES_FULL_48BIT_PPGTT(dev));
+	pagemap[index] = pml4e;
+	kunmap_atomic(pagemap);
+}
+
 static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct drm_device *dev)
 {
 	int i;
@@ -1124,6 +1164,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 
 		set_bit(pdpe, pdp->used_pdpes);
 		gen8_map_pagetable_range(vm, pd, start, length);
+		gen8_map_page_directory(pdp, pd, pdpe, dev);
 	}
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
@@ -1192,6 +1233,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
 		if (ret)
 			goto err_out;
+
+		gen8_map_page_directory_pointer(pml4, pdp, pml4e, vm->dev);
 	}
 
 	bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
@@ -1251,14 +1294,14 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
 	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
 
-	ppgtt->switch_mm = gen8_mm_switch;
-
 	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
 		int ret = pml4_init(ppgtt);
 		if (ret) {
 			unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
 			return ret;
 		}
+
+		ppgtt->switch_mm = gen8_48b_mm_switch;
 	} else {
 		int ret = __pdp_init(&ppgtt->pdp, false);
 		if (ret) {
@@ -1266,6 +1309,7 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 			return ret;
 		}
 
+		ppgtt->switch_mm = gen8_legacy_mm_switch;
 		trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, 0, 0, GEN8_PML4E_SHIFT);
 	}
 
@@ -1295,6 +1339,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 		return ret;
 	}
 
+	/* FIXME: PML4 */
 	gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
 		gen8_map_pagetable_range(&ppgtt->base, pd, start, size);
 
@@ -1500,8 +1545,9 @@ static void gen8_ppgtt_enable(struct drm_device *dev)
 	int j;
 
 	for_each_ring(ring, dev_priv, j) {
+		u32 four_level = USES_FULL_48BIT_PPGTT(dev) ? GEN8_GFX_PPGTT_64B : 0;
 		I915_WRITE(RING_MODE_GEN7(ring),
-			   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
+			   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE | four_level));
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 1477f54..1f4cdb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -38,7 +38,9 @@ struct drm_i915_file_private;
 
 typedef uint32_t gen6_gtt_pte_t;
 typedef uint64_t gen8_gtt_pte_t;
-typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
+typedef gen8_gtt_pte_t		gen8_ppgtt_pde_t;
+typedef gen8_ppgtt_pde_t	gen8_ppgtt_pdpe_t;
+typedef gen8_ppgtt_pdpe_t	gen8_ppgtt_pml4e_t;
 
 #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1dc91de..305e5b7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1338,6 +1338,7 @@ enum skl_disp_power_wells {
 #define   GFX_REPLAY_MODE		(1<<11)
 #define   GFX_PSMI_GRANULARITY		(1<<10)
 #define   GFX_PPGTT_ENABLE		(1<<9)
+#define   GEN8_GFX_PPGTT_64B		(1<<7)
 
 #define VLV_DISPLAY_BASE 0x180000
 #define VLV_MIPI_BASE VLV_DISPLAY_BASE
-- 
2.1.1


* [PATCH 07/12] drm/i915/bdw: Support 64 bit PPGTT in lrc mode
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (5 preceding siblings ...)
  2015-02-20 17:46 ` [PATCH 06/12] drm/i915/bdw: Add 4 level switching infrastructure Michel Thierry
@ 2015-02-20 17:46 ` Michel Thierry
  2015-03-03 13:08   ` akash goel
  2015-02-20 17:46 ` [PATCH 08/12] drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT Michel Thierry
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:46 UTC (permalink / raw)
  To: intel-gfx

In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
the base address of the PML4, while the other PDP registers are ignored.

Also, the addressing mode must be specified in every context descriptor.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 167 ++++++++++++++++++++++++++-------------
 1 file changed, 114 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f461631..2b6d262 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -255,7 +255,8 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
 }
 
 static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
-					 struct drm_i915_gem_object *ctx_obj)
+					 struct drm_i915_gem_object *ctx_obj,
+					 bool legacy_64bit_ctx)
 {
 	struct drm_device *dev = ring->dev;
 	uint64_t desc;
@@ -264,7 +265,10 @@ static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
 	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
 
 	desc = GEN8_CTX_VALID;
-	desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
+	if (legacy_64bit_ctx)
+		desc |= LEGACY_64B_CONTEXT << GEN8_CTX_MODE_SHIFT;
+	else
+		desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
 	desc |= GEN8_CTX_L3LLC_COHERENT;
 	desc |= GEN8_CTX_PRIVILEGE;
 	desc |= lrca;
@@ -292,16 +296,17 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint64_t temp = 0;
 	uint32_t desc[4];
+	bool legacy_64bit_ctx = USES_FULL_48BIT_PPGTT(dev);
 
 	/* XXX: You must always write both descriptors in the order below. */
 	if (ctx_obj1)
-		temp = execlists_ctx_descriptor(ring, ctx_obj1);
+		temp = execlists_ctx_descriptor(ring, ctx_obj1, legacy_64bit_ctx);
 	else
 		temp = 0;
 	desc[1] = (u32)(temp >> 32);
 	desc[0] = (u32)temp;
 
-	temp = execlists_ctx_descriptor(ring, ctx_obj0);
+	temp = execlists_ctx_descriptor(ring, ctx_obj0, legacy_64bit_ctx);
 	desc[3] = (u32)(temp >> 32);
 	desc[2] = (u32)temp;
 
@@ -332,37 +337,60 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
 	reg_state[CTX_RING_TAIL+1] = tail;
 	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
 
-	/* True PPGTT with dynamic page allocation: update PDP registers and
-	 * point the unallocated PDPs to the scratch page
-	 */
-	if (ppgtt) {
+	if (ppgtt && USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		/* True 64b PPGTT (48bit canonical)
+		 * PDP0_DESCRIPTOR contains the base address to PML4 and
+		 * other PDP Descriptors are ignored
+		 */
+		reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pml4.daddr);
+		reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pml4.daddr);
+	} else if (ppgtt) {
+		/* True 32b PPGTT with dynamic page allocation: update PDP
+		 * registers and point the unallocated PDPs to the scratch page
+		 */
 		if (test_bit(3, ppgtt->pdp.used_pdpes)) {
-			reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
-			reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
+			reg_state[CTX_PDP3_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
+			reg_state[CTX_PDP3_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
 		} else {
-			reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-			reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP3_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP3_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
 		}
 		if (test_bit(2, ppgtt->pdp.used_pdpes)) {
-			reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
-			reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
+			reg_state[CTX_PDP2_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
+			reg_state[CTX_PDP2_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
 		} else {
-			reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-			reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP2_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP2_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
 		}
 		if (test_bit(1, ppgtt->pdp.used_pdpes)) {
-			reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
-			reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
+			reg_state[CTX_PDP1_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
+			reg_state[CTX_PDP1_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
 		} else {
-			reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-			reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP1_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP1_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
 		}
 		if (test_bit(0, ppgtt->pdp.used_pdpes)) {
-			reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
-			reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
+			reg_state[CTX_PDP0_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
+			reg_state[CTX_PDP0_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
 		} else {
-			reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-			reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP0_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP0_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
 		}
 	}
 
@@ -1771,36 +1799,69 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
 	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
 	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
 
-	/* With dynamic page allocation, PDPs may not be allocated at this point,
-	 * Point the unallocated PDPs to the scratch page
-	 */
-	if (test_bit(3, ppgtt->pdp.used_pdpes)) {
-		reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
-		reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
-	} else {
-		reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-		reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
-	}
-	if (test_bit(2, ppgtt->pdp.used_pdpes)) {
-		reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
-		reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
-	} else {
-		reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-		reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
-	}
-	if (test_bit(1, ppgtt->pdp.used_pdpes)) {
-		reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
-		reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
-	} else {
-		reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-		reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
-	}
-	if (test_bit(0, ppgtt->pdp.used_pdpes)) {
-		reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
-		reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
+	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		/* 64b PPGTT (48bit canonical)
+		 * PDP0_DESCRIPTOR contains the base address to PML4 and
+		 * other PDP Descriptors are ignored
+		 */
+		reg_state[CTX_PDP3_UDW+1] = 0;
+		reg_state[CTX_PDP3_LDW+1] = 0;
+		reg_state[CTX_PDP2_UDW+1] = 0;
+		reg_state[CTX_PDP2_LDW+1] = 0;
+		reg_state[CTX_PDP1_UDW+1] = 0;
+		reg_state[CTX_PDP1_LDW+1] = 0;
+		reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pml4.daddr);
+		reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pml4.daddr);
 	} else {
-		reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
-		reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
+		/* 32b PPGTT
+		 * PDP*_DESCRIPTOR contains the base address of space supported.
+		 * With dynamic page allocation, PDPs may not be allocated at
+		 * this point. Point the unallocated PDPs to the scratch page
+		 */
+		if (test_bit(3, ppgtt->pdp.used_pdpes)) {
+			reg_state[CTX_PDP3_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
+			reg_state[CTX_PDP3_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
+		} else {
+			reg_state[CTX_PDP3_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP3_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
+		}
+		if (test_bit(2, ppgtt->pdp.used_pdpes)) {
+			reg_state[CTX_PDP2_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
+			reg_state[CTX_PDP2_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
+		} else {
+			reg_state[CTX_PDP2_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP2_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
+		}
+		if (test_bit(1, ppgtt->pdp.used_pdpes)) {
+			reg_state[CTX_PDP1_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
+			reg_state[CTX_PDP1_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
+		} else {
+			reg_state[CTX_PDP1_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP1_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
+		}
+		if (test_bit(0, ppgtt->pdp.used_pdpes)) {
+			reg_state[CTX_PDP0_UDW+1] =
+					upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
+			reg_state[CTX_PDP0_LDW+1] =
+					lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
+		} else {
+			reg_state[CTX_PDP0_UDW+1] =
+					upper_32_bits(ppgtt->scratch_pd->daddr);
+			reg_state[CTX_PDP0_LDW+1] =
+					lower_32_bits(ppgtt->scratch_pd->daddr);
+		}
 	}
 
 	if (ring->id == RCS) {
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 08/12] drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (6 preceding siblings ...)
  2015-02-20 17:46 ` [PATCH 07/12] drm/i915/bdw: Support 64 bit PPGTT in lrc mode Michel Thierry
@ 2015-02-20 17:46 ` Michel Thierry
  2015-02-20 17:46 ` [PATCH 09/12] drm/i915: Plumb sg_iter through va allocation ->maps Michel Thierry
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:46 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

The insert_entries function was used to write PTEs. For the PPGTT it was
"hardcoded" to only understand two-level page tables, which was the case
for GEN7. We can reuse it for 4-level page tables, and remove the concept
of insert_entries, which was never viable past two-level page tables
anyway, but this requires a bit of rework to make the function more
generic.

This patch begins the generalization work, and it will be heavily built
upon once the 48b code is complete. The patch series attempts to make
each function touch only the part of the code specific to its page table
level, and this is no exception. Having extra variables (such as the
PPGTT) distracts and provides room for bugs, since the function
shouldn't be touching anything in the higher-order page tables.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 55 +++++++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fb06f67..fcfcb00 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -619,23 +619,19 @@ static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	return gen8_write_pdp(ring, 0, ppgtt->pml4.daddr);
 }
 
-static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
-				   uint64_t start,
-				   uint64_t length,
-				   bool use_scratch)
+static void gen8_ppgtt_clear_pte_range(struct i915_page_directory_pointer_entry *pdp,
+				       uint64_t start,
+				       uint64_t length,
+				       gen8_gtt_pte_t scratch_pte,
+				       const bool flush)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
-	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
-	gen8_gtt_pte_t *pt_vaddr, scratch_pte;
+	gen8_gtt_pte_t *pt_vaddr;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
 	unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
 	unsigned num_entries = length >> PAGE_SHIFT;
 	unsigned last_pte, i;
 
-	scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
-				      I915_CACHE_LLC, use_scratch);
 
 	while (num_entries) {
 		struct i915_page_directory_entry *pd;
@@ -668,7 +664,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 			num_entries--;
 		}
 
-		if (!HAS_LLC(ppgtt->base.dev))
+		if (flush)
 			drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
 		kunmap_atomic(pt_vaddr);
 
@@ -680,14 +676,27 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 	}
 }
 
-static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
-				      struct sg_table *pages,
-				      uint64_t start,
-				      enum i915_cache_level cache_level, u32 unused)
+static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
+				   uint64_t start,
+				   uint64_t length,
+				   bool use_scratch)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
+
+	gen8_gtt_pte_t scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
+						     I915_CACHE_LLC, use_scratch);
+
+	gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte, !HAS_LLC(vm->dev));
+}
+
+static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_entry *pdp,
+					  struct sg_table *pages,
+					  uint64_t start,
+					  enum i915_cache_level cache_level,
+					  const bool flush)
+{
 	gen8_gtt_pte_t *pt_vaddr;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -709,7 +718,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 			gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
 					cache_level, true);
 		if (++pte == GEN8_PTES_PER_PAGE) {
-			if (!HAS_LLC(ppgtt->base.dev))
+			if (flush)
 				drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
 			kunmap_atomic(pt_vaddr);
 			pt_vaddr = NULL;
@@ -721,12 +730,24 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 	}
 	if (pt_vaddr) {
-		if (!HAS_LLC(ppgtt->base.dev))
+		if (flush)
 			drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
 		kunmap_atomic(pt_vaddr);
 	}
 }
 
+static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
+				      struct sg_table *pages,
+				      uint64_t start,
+				      enum i915_cache_level cache_level,
+				      u32 unused)
+{
+	struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base);
+	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
+
+	gen8_ppgtt_insert_pte_entries(pdp, pages, start, cache_level, !HAS_LLC(vm->dev));
+}
+
 static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde,
 			     struct i915_page_table_entry *pt,
 			     struct drm_device *dev)
-- 
2.1.1
* [PATCH 09/12] drm/i915: Plumb sg_iter through va allocation ->maps
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (7 preceding siblings ...)
  2015-02-20 17:46 ` [PATCH 08/12] drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT Michel Thierry
@ 2015-02-20 17:46 ` Michel Thierry
  2015-02-20 17:46 ` [PATCH 10/12] drm/i915/bdw: Add 4 level support in insert_entries and clear_range Michel Thierry
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:46 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

As a step towards implementing 4 levels, while not discarding the
existing pte map functions, we need to pass the sg_iter through. The
current function only understands things down to the page directory
granularity. An object's pages may span a page directory boundary, so
using the iter directly as we write the PTEs allows the iterator to stay
coherent through a VMA mapping operation spanning multiple page table
levels.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 46 +++++++++++++++++++++++--------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fcfcb00..a1396cb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -692,7 +692,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 }
 
 static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_entry *pdp,
-					  struct sg_table *pages,
+					  struct sg_page_iter *sg_iter,
 					  uint64_t start,
 					  enum i915_cache_level cache_level,
 					  const bool flush)
@@ -701,11 +701,10 @@ static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_ent
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
 	unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
-	struct sg_page_iter sg_iter;
 
 	pt_vaddr = NULL;
 
-	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
+	while (__sg_page_iter_next(sg_iter)) {
 		if (pt_vaddr == NULL) {
 			struct i915_page_directory_entry *pd = pdp->page_directory[pdpe];
 			struct i915_page_table_entry *pt = pd->page_tables[pde];
@@ -715,7 +714,7 @@ static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_ent
 		}
 
 		pt_vaddr[pte] =
-			gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
+			gen8_pte_encode(sg_page_iter_dma_address(sg_iter),
 					cache_level, true);
 		if (++pte == GEN8_PTES_PER_PAGE) {
 			if (flush)
@@ -744,8 +743,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 {
 	struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
+	struct sg_page_iter sg_iter;
 
-	gen8_ppgtt_insert_pte_entries(pdp, pages, start, cache_level, !HAS_LLC(vm->dev));
+	__sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0);
+	gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start, cache_level, !HAS_LLC(vm->dev));
 }
 
 static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde,
@@ -1107,10 +1108,12 @@ err_out:
 	return -ENOMEM;
 }
 
-static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
-				    struct i915_page_directory_pointer_entry *pdp,
-				    uint64_t start,
-				    uint64_t length)
+static int __gen8_alloc_vma_range_3lvl(struct i915_address_space *vm,
+				       struct i915_page_directory_pointer_entry *pdp,
+				       struct sg_page_iter *sg_iter,
+				       uint64_t start,
+				       uint64_t length,
+				       u32 flags)
 {
 	unsigned long *new_page_dirs, **new_page_tables;
 	struct drm_device *dev = vm->dev;
@@ -1179,7 +1182,11 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 				   gen8_pte_index(pd_start),
 				   gen8_pte_count(pd_start, pd_len));
 
-			/* Our pde is now pointing to the pagetable, pt */
+			if (sg_iter) {
+				BUG_ON(!sg_iter->__nents);
+				gen8_ppgtt_insert_pte_entries(pdp, sg_iter, pd_start,
+							      flags, !HAS_LLC(vm->dev));
+			}
 			set_bit(pde, pd->used_pdes);
 		}
 
@@ -1204,10 +1211,12 @@ err_out:
 	return ret;
 }
 
-static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
-				    struct i915_pml4 *pml4,
-				    uint64_t start,
-				    uint64_t length)
+static int __gen8_alloc_vma_range_4lvl(struct i915_address_space *vm,
+				       struct i915_pml4 *pml4,
+				       struct sg_page_iter *sg_iter,
+				       uint64_t start,
+				       uint64_t length,
+				       u32 flags)
 {
 	DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
 	struct i915_hw_ppgtt *ppgtt =
@@ -1251,7 +1260,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
 		BUG_ON(!pdp);
 
-		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
+		ret = __gen8_alloc_vma_range_3lvl(vm, pdp, sg_iter,
+						  start, length, flags);
 		if (ret)
 			goto err_out;
 
@@ -1283,9 +1293,11 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 		container_of(vm, struct i915_hw_ppgtt, base);
 
 	if (USES_FULL_48BIT_PPGTT(vm->dev))
-		return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
+		return __gen8_alloc_vma_range_4lvl(vm, &ppgtt->pml4, NULL,
+						   start, length, 0);
 	else
-		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
+		return __gen8_alloc_vma_range_3lvl(vm, &ppgtt->pdp, NULL,
+						   start, length, 0);
 }
 
 static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
-- 
2.1.1
* [PATCH 10/12] drm/i915/bdw: Add 4 level support in insert_entries and clear_range
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (8 preceding siblings ...)
  2015-02-20 17:46 ` [PATCH 09/12] drm/i915: Plumb sg_iter through va allocation ->maps Michel Thierry
@ 2015-02-20 17:46 ` Michel Thierry
  2015-03-03 16:39   ` akash goel
  2015-02-20 17:46 ` [PATCH 11/12] drm/i915: Expand error state's address width to 64b Michel Thierry
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:46 UTC (permalink / raw)
  To: intel-gfx

When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page
Map Level 4 (PML4) before it selects which Page Directory Pointer (PDP)
it will write to.

Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range.

Also add a scratch page for PML4.

This patch was inspired by Ben's "Depend exclusively on map and
unmap_vma".

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 66 ++++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_gtt.h | 12 +++++++
 2 files changed, 67 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a1396cb..0954827 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -676,24 +676,52 @@ static void gen8_ppgtt_clear_pte_range(struct i915_page_directory_pointer_entry
 	}
 }
 
+static void gen8_ppgtt_clear_range_4lvl(struct i915_hw_ppgtt *ppgtt,
+					gen8_gtt_pte_t scratch_pte,
+					uint64_t start,
+					uint64_t length)
+{
+	struct i915_page_directory_pointer_entry *pdp;
+	uint64_t templ4, templ3, pml4e, pdpe;
+
+	gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+		struct i915_page_directory_entry *pd;
+		uint64_t pdp_len = gen8_clamp_pdp(start, length);
+		uint64_t pdp_start = start;
+
+		gen8_for_each_pdpe(pd, pdp, pdp_start, pdp_len, templ3, pdpe) {
+			uint64_t pd_len = gen8_clamp_pd(pdp_start, pdp_len);
+			uint64_t pd_start = pdp_start;
+
+			gen8_ppgtt_clear_pte_range(pdp, pd_start, pd_len,
+						   scratch_pte, !HAS_LLC(ppgtt->base.dev));
+		}
+	}
+}
+
 static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
-				   uint64_t start,
-				   uint64_t length,
+				   uint64_t start, uint64_t length,
 				   bool use_scratch)
 {
 	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
-	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
-
+			container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_gtt_pte_t scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
 						     I915_CACHE_LLC, use_scratch);
 
-	gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte, !HAS_LLC(vm->dev));
+	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+		struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp;
+
+		gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte,
+					   !HAS_LLC(ppgtt->base.dev));
+	} else {
+		gen8_ppgtt_clear_range_4lvl(ppgtt, scratch_pte, start, length);
+	}
 }
 
 static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_entry *pdp,
 					  struct sg_page_iter *sg_iter,
 					  uint64_t start,
+					  size_t pages,
 					  enum i915_cache_level cache_level,
 					  const bool flush)
 {
@@ -704,7 +732,7 @@ static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_ent
 
 	pt_vaddr = NULL;
 
-	while (__sg_page_iter_next(sg_iter)) {
+	while (pages-- && __sg_page_iter_next(sg_iter)) {
 		if (pt_vaddr == NULL) {
 			struct i915_page_directory_entry *pd = pdp->page_directory[pdpe];
 			struct i915_page_table_entry *pt = pd->page_tables[pde];
@@ -742,11 +770,26 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 				      u32 unused)
 {
 	struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base);
-	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
+	struct i915_page_directory_pointer_entry *pdp;
 	struct sg_page_iter sg_iter;
 
 	__sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0);
-	gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start, cache_level, !HAS_LLC(vm->dev));
+
+	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+		pdp = &ppgtt->pdp;
+		gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start,
+				sg_nents(pages->sgl),
+				cache_level, !HAS_LLC(vm->dev));
+	} else {
+		struct i915_pml4 *pml4;
+		unsigned pml4e = gen8_pml4e_index(start);
+
+		pml4 = &ppgtt->pml4;
+		pdp = pml4->pdps[pml4e];
+		gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start,
+				sg_nents(pages->sgl),
+				cache_level, !HAS_LLC(vm->dev));
+	}
 }
 
 static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde,
@@ -1185,7 +1228,8 @@ static int __gen8_alloc_vma_range_3lvl(struct i915_address_space *vm,
 			if (sg_iter) {
 				BUG_ON(!sg_iter->__nents);
 				gen8_ppgtt_insert_pte_entries(pdp, sg_iter, pd_start,
-							      flags, !HAS_LLC(vm->dev));
+						gen8_pte_count(pd_start, pd_len),
+						flags, !HAS_LLC(vm->dev));
 			}
 			set_bit(pde, pd->used_pdes);
 		}
@@ -1330,7 +1374,7 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
 		int ret = pml4_init(ppgtt);
 		if (ret) {
-			unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
+			unmap_and_free_pt(ppgtt->scratch_pml4, ppgtt->base.dev);
 			return ret;
 		}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 1f4cdb1..602d446c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -332,6 +332,7 @@ struct i915_hw_ppgtt {
 	union {
 		struct i915_page_table_entry *scratch_pt;
 		struct i915_page_table_entry *scratch_pd; /* Just need the daddr */
+		struct i915_page_table_entry *scratch_pml4;
 	};
 
 	struct drm_i915_file_private *file_priv;
@@ -452,6 +453,17 @@ static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length)
 	return next_pd - start;
 }
 
+/* Clamp length to the next page_directory pointer boundary */
+static inline uint64_t gen8_clamp_pdp(uint64_t start, uint64_t length)
+{
+	uint64_t next_pdp = ALIGN(start + 1, 1ULL << GEN8_PML4E_SHIFT);
+
+	if (next_pdp > (start + length))
+		return length;
+
+	return next_pdp - start;
+}
+
 static inline uint32_t gen8_pte_index(uint64_t address)
 {
 	return i915_pte_index(address, GEN8_PDE_SHIFT);
-- 
2.1.1
* [PATCH 11/12] drm/i915: Expand error state's address width to 64b
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (9 preceding siblings ...)
  2015-02-20 17:46 ` [PATCH 10/12] drm/i915/bdw: Add 4 level support in insert_entries and clear_range Michel Thierry
@ 2015-02-20 17:46 ` Michel Thierry
  2015-03-03 16:42   ` akash goel
  2015-02-20 17:46 ` [PATCH 12/12] drm/i915/bdw: Flip the 48b switch Michel Thierry
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:46 UTC (permalink / raw)
  To: intel-gfx

From: Ben Widawsky <benjamin.widawsky@intel.com>

v2: Zero-pad the new 8B fields, or else intel_error_decode has a hard
time. Note that we need an igt update regardless.

v3: Make reloc_offset 64b also.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h       |  4 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c | 17 +++++++++--------
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index af0d149..056ced5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -459,7 +459,7 @@ struct drm_i915_error_state {
 
 		struct drm_i915_error_object {
 			int page_count;
-			u32 gtt_offset;
+			u64 gtt_offset;
 			u32 *pages[0];
 		} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
 
@@ -485,7 +485,7 @@ struct drm_i915_error_state {
 		u32 size;
 		u32 name;
 		u32 rseqno, wseqno;
-		u32 gtt_offset;
+		u64 gtt_offset;
 		u32 read_domains;
 		u32 write_domain;
 		s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a982849..bbf25d0 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -195,7 +195,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 	err_printf(m, "  %s [%d]:\n", name, count);
 
 	while (count--) {
-		err_printf(m, "    %08x %8u %02x %02x %x %x",
+		err_printf(m, "    %016llx %8u %02x %02x %x %x",
 			   err->gtt_offset,
 			   err->size,
 			   err->read_domains,
@@ -415,7 +415,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 				err_printf(m, " (submitted by %s [%d])",
 					   error->ring[i].comm,
 					   error->ring[i].pid);
-			err_printf(m, " --- gtt_offset = 0x%08x\n",
+			err_printf(m, " --- gtt_offset = 0x%016llx\n",
 				   obj->gtt_offset);
 			print_error_obj(m, obj);
 		}
@@ -423,7 +423,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		obj = error->ring[i].wa_batchbuffer;
 		if (obj) {
 			err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
-				   dev_priv->ring[i].name, obj->gtt_offset);
+				   dev_priv->ring[i].name,
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 
@@ -442,14 +443,14 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		if ((obj = error->ring[i].ringbuffer)) {
 			err_printf(m, "%s --- ringbuffer = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 
 		if ((obj = error->ring[i].hws_page)) {
 			err_printf(m, "%s --- HW Status = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			offset = 0;
 			for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
 				err_printf(m, "[%04x] %08x %08x %08x %08x\n",
@@ -465,13 +466,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		if ((obj = error->ring[i].ctx)) {
 			err_printf(m, "%s --- HW Context = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 	}
 
 	if ((obj = error->semaphore_obj)) {
-		err_printf(m, "Semaphore page = 0x%08x\n", obj->gtt_offset);
+		err_printf(m, "Semaphore page = 0x%016llx\n", obj->gtt_offset);
 		for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
 			err_printf(m, "[%04x] %08x %08x %08x %08x\n",
 				   elt * 4,
@@ -571,7 +572,7 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 	int num_pages;
 	bool use_ggtt;
 	int i = 0;
-	u32 reloc_offset;
+	u64 reloc_offset;
 
 	if (src == NULL || src->pages == NULL)
 		return NULL;
-- 
2.1.1
* [PATCH 12/12] drm/i915/bdw: Flip the 48b switch
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (10 preceding siblings ...)
  2015-02-20 17:46 ` [PATCH 11/12] drm/i915: Expand error state's address width to 64b Michel Thierry
@ 2015-02-20 17:46 ` Michel Thierry
  2015-02-24 10:54 ` [PATCH 00/12] PPGTT with 48b addressing Daniel Vetter
  2015-03-03 13:52 ` Damien Lespiau
  13 siblings, 0 replies; 32+ messages in thread
From: Michel Thierry @ 2015-02-20 17:46 UTC (permalink / raw)
  To: intel-gfx

Use 48b addresses if hw supports it and i915.enable_ppgtt=3.

Aliasing PPGTT remains 32b only.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 7 ++-----
 drivers/gpu/drm/i915/i915_params.c  | 2 +-
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0954827..c88cd81 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -106,7 +106,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
 
 #ifdef CONFIG_64BIT
-	has_full_64bit_ppgtt = IS_BROADWELL(dev) && false; /* FIXME: 64b */
+	has_full_64bit_ppgtt = IS_BROADWELL(dev);
 #else
 	has_full_64bit_ppgtt = false;
 #endif
@@ -1076,9 +1076,6 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
 
 	BUG_ON(!bitmap_empty(new_pds, pdpes));
 
-	/* FIXME: PPGTT container_of won't work for 64b */
-	BUG_ON((start + length) > 0x800000000ULL);
-
 	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
 		if (pd)
 			continue;
@@ -1397,7 +1394,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
+	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b? */
 	struct i915_page_directory_entry *pd;
 	uint64_t temp, start = 0, size = dev_priv->gtt.base.total;
 	uint32_t pdpe;
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 44f2262..1cd43b0 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -119,7 +119,7 @@ MODULE_PARM_DESC(enable_hangcheck,
 module_param_named_unsafe(enable_ppgtt, i915.enable_ppgtt, int, 0400);
 MODULE_PARM_DESC(enable_ppgtt,
 	"Override PPGTT usage. "
-	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
+	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full, 3=full_64b)");
 
 module_param_named(enable_execlists, i915.enable_execlists, int, 0400);
 MODULE_PARM_DESC(enable_execlists,
-- 
2.1.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 00/12] PPGTT with 48b addressing
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (11 preceding siblings ...)
  2015-02-20 17:46 ` [PATCH 12/12] drm/i915/bdw: Flip the 48b switch Michel Thierry
@ 2015-02-24 10:54 ` Daniel Vetter
  2015-03-03 13:52 ` Damien Lespiau
  13 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2015-02-24 10:54 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Fri, Feb 20, 2015 at 05:45:54PM +0000, Michel Thierry wrote:
> These patches rely on "PPGTT dynamic page allocations", currently under review,
> to provide GEN8 dynamic page table support with 64b addresses. As the review
> progresses, these patches may be combined.
> 
> In order to expand the GPU address space, a 4th level translation is added, the
> Page Map Level 4 (PML4). This PML4 has 256 PML4 Entries (PML4E), PML4[0-255],
> each pointing to a PDP.
> 
> For now, this feature will only be available in BDW, in LRC submission mode
> (execlists) and when i915.enable_ppgtt=3 is set.
> Also note that this expanded address space is only available for full PPGTT,
> aliasing PPGTT remains 32b.

Any reasons for restricting this to bdw and not just going with gen9+
right away? We could just merge it and then fix fallout or selectively
revert when it blows up on chv/skl - those platforms aren't shipping yet,
so regressions aren't too onerous.
-Daniel

> 
> Ben Widawsky (9):
>   drm/i915/bdw: Make pdp allocation more dynamic
>   drm/i915/bdw: Abstract PDP usage
>   drm/i915/bdw: Add dynamic page trace events
>   drm/i915/bdw: Add ppgtt info for dynamic pages
>   drm/i915/bdw: implement alloc/free for 4lvl
>   drm/i915/bdw: Add 4 level switching infrastructure
>   drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT
>   drm/i915: Plumb sg_iter through va allocation ->maps
>   drm/i915: Expand error state's address width to 64b
> 
> Michel Thierry (3):
>   drm/i915/bdw: Support 64 bit PPGTT in lrc mode
>   drm/i915/bdw: Add 4 level support in insert_entries and clear_range
>   drm/i915/bdw: Flip the 48b switch
> 
>  drivers/gpu/drm/i915/i915_debugfs.c   |  19 +-
>  drivers/gpu/drm/i915/i915_drv.h       |  11 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c   | 624 ++++++++++++++++++++++++++++------
>  drivers/gpu/drm/i915/i915_gem_gtt.h   |  77 ++++-
>  drivers/gpu/drm/i915/i915_gpu_error.c |  17 +-
>  drivers/gpu/drm/i915/i915_params.c    |   2 +-
>  drivers/gpu/drm/i915/i915_reg.h       |   1 +
>  drivers/gpu/drm/i915/i915_trace.h     |  16 +
>  drivers/gpu/drm/i915/intel_lrc.c      | 167 ++++++---
>  9 files changed, 746 insertions(+), 188 deletions(-)
> 
> -- 
> 2.1.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events
  2015-02-20 17:45 ` [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events Michel Thierry
@ 2015-02-24 10:56   ` Daniel Vetter
  2015-02-24 10:59   ` Daniel Vetter
  1 sibling, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2015-02-24 10:56 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Fri, Feb 20, 2015 at 05:45:57PM +0000, Michel Thierry wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> The dynamic page allocation patch series added these trace events for
> GEN6; this patch adds them for GEN8.

should this then be part of the dynamic page alloc series? Having
inconsistent tracepoints is kinda not cool.
-Daniel

> 
> v2: Consolidate pagetable/page_directory events
> v3: Multiple rebases.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3)
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 23 +++++++++++++++--------
>  drivers/gpu/drm/i915/i915_trace.h   | 16 ++++++++++++++++
>  2 files changed, 31 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index d3ad517..ecfb62a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -673,19 +673,24 @@ static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde,
>  /* It's likely we'll map more than one pagetable at a time. This function will
>   * save us unnecessary kmap calls, but do no more functionally than multiple
>   * calls to map_pt. */
> -static void gen8_map_pagetable_range(struct i915_page_directory_entry *pd,
> +static void gen8_map_pagetable_range(struct i915_address_space *vm,
> +				     struct i915_page_directory_entry *pd,
>  				     uint64_t start,
> -				     uint64_t length,
> -				     struct drm_device *dev)
> +				     uint64_t length)
>  {
>  	gen8_ppgtt_pde_t * const page_directory = kmap_atomic(pd->page);
>  	struct i915_page_table_entry *pt;
>  	uint64_t temp, pde;
>  
> -	gen8_for_each_pde(pt, pd, start, length, temp, pde)
> -		__gen8_do_map_pt(page_directory + pde, pt, dev);
> +	gen8_for_each_pde(pt, pd, start, length, temp, pde) {
> +		__gen8_do_map_pt(page_directory + pde, pt, vm->dev);
> +		trace_i915_page_table_entry_map(vm, pde, pt,
> +					 gen8_pte_index(start),
> +					 gen8_pte_count(start, length),
> +					 GEN8_PTES_PER_PAGE);
> +	}
>  
> -	if (!HAS_LLC(dev))
> +	if (!HAS_LLC(vm->dev))
>  		drm_clflush_virt_range(page_directory, PAGE_SIZE);
>  
>  	kunmap_atomic(page_directory);
> @@ -815,6 +820,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
>  
>  		pd->page_tables[pde] = pt;
>  		set_bit(pde, new_pts);
> +		trace_i915_page_table_entry_alloc(vm, pde, start, GEN8_PDE_SHIFT);
>  	}
>  
>  	return 0;
> @@ -876,6 +882,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
>  
>  		pdp->page_directory[pdpe] = pd;
>  		set_bit(pdpe, new_pds);
> +		trace_i915_page_directory_entry_alloc(vm, pdpe, start, GEN8_PDPE_SHIFT);
>  	}
>  
>  	return 0;
> @@ -1014,7 +1021,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>  		}
>  
>  		set_bit(pdpe, pdp->used_pdpes);
> -		gen8_map_pagetable_range(pd, start, length, dev);
> +		gen8_map_pagetable_range(vm, pd, start, length);
>  	}
>  
>  	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
> @@ -1115,7 +1122,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	}
>  
>  	gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
> -		gen8_map_pagetable_range(pd, start, size, dev);
> +		gen8_map_pagetable_range(&ppgtt->base, pd,start, size);
>  
>  	ppgtt->base.allocate_va_range = NULL;
>  	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 3a657e4..6c20f76 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -214,6 +214,22 @@ DEFINE_EVENT(i915_page_table_entry, i915_page_table_entry_alloc,
>  	     TP_ARGS(vm, pde, start, pde_shift)
>  );
>  
> +DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_entry_alloc,
> +		   TP_PROTO(struct i915_address_space *vm, u32 pdpe, u64 start, u64 pdpe_shift),
> +		   TP_ARGS(vm, pdpe, start, pdpe_shift),
> +
> +		   TP_printk("vm=%p, pdpe=%d (0x%llx-0x%llx)",
> +			     __entry->vm, __entry->pde, __entry->start, __entry->end)
> +);
> +
> +DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_pointer_entry_alloc,
> +		   TP_PROTO(struct i915_address_space *vm, u32 pml4e, u64 start, u64 pml4e_shift),
> +		   TP_ARGS(vm, pml4e, start, pml4e_shift),
> +
> +		   TP_printk("vm=%p, pml4e=%d (0x%llx-0x%llx)",
> +			     __entry->vm, __entry->pde, __entry->start, __entry->end)
> +);
> +
>  /* Avoid extra math because we only support two sizes. The format is defined by
>   * bitmap_scnprintf. Each 32 bits is 8 HEX digits followed by comma */
>  #define TRACE_PT_SIZE(bits) \
> -- 
> 2.1.1
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events
  2015-02-20 17:45 ` [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events Michel Thierry
  2015-02-24 10:56   ` Daniel Vetter
@ 2015-02-24 10:59   ` Daniel Vetter
  1 sibling, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2015-02-24 10:59 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Fri, Feb 20, 2015 at 05:45:57PM +0000, Michel Thierry wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> The dynamic page allocation patch series added these trace events for
> GEN6; this patch adds them for GEN8.
> 
> v2: Consolidate pagetable/page_directory events
> v3: Multiple rebases.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3)

On second thought I wonder how useful these tracepoints really are. We
already have tracepoints for binding/unbinding, that should tell us all
about vm usage. Actual pt allocation itself seems fairly boring tbh.

What's the practical application of these tracepoints? Same goes ofc for
the corresponding gen6 patch. Note that tracepoints are somewhat
considered abi (or at least can become abi), we need a solid justification
for them. "Seemed useful and cheap to add" isn't enough imo.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 23 +++++++++++++++--------
>  drivers/gpu/drm/i915/i915_trace.h   | 16 ++++++++++++++++
>  2 files changed, 31 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index d3ad517..ecfb62a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -673,19 +673,24 @@ static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde,
>  /* It's likely we'll map more than one pagetable at a time. This function will
>   * save us unnecessary kmap calls, but do no more functionally than multiple
>   * calls to map_pt. */
> -static void gen8_map_pagetable_range(struct i915_page_directory_entry *pd,
> +static void gen8_map_pagetable_range(struct i915_address_space *vm,
> +				     struct i915_page_directory_entry *pd,
>  				     uint64_t start,
> -				     uint64_t length,
> -				     struct drm_device *dev)
> +				     uint64_t length)
>  {
>  	gen8_ppgtt_pde_t * const page_directory = kmap_atomic(pd->page);
>  	struct i915_page_table_entry *pt;
>  	uint64_t temp, pde;
>  
> -	gen8_for_each_pde(pt, pd, start, length, temp, pde)
> -		__gen8_do_map_pt(page_directory + pde, pt, dev);
> +	gen8_for_each_pde(pt, pd, start, length, temp, pde) {
> +		__gen8_do_map_pt(page_directory + pde, pt, vm->dev);
> +		trace_i915_page_table_entry_map(vm, pde, pt,
> +					 gen8_pte_index(start),
> +					 gen8_pte_count(start, length),
> +					 GEN8_PTES_PER_PAGE);
> +	}
>  
> -	if (!HAS_LLC(dev))
> +	if (!HAS_LLC(vm->dev))
>  		drm_clflush_virt_range(page_directory, PAGE_SIZE);
>  
>  	kunmap_atomic(page_directory);
> @@ -815,6 +820,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
>  
>  		pd->page_tables[pde] = pt;
>  		set_bit(pde, new_pts);
> +		trace_i915_page_table_entry_alloc(vm, pde, start, GEN8_PDE_SHIFT);
>  	}
>  
>  	return 0;
> @@ -876,6 +882,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
>  
>  		pdp->page_directory[pdpe] = pd;
>  		set_bit(pdpe, new_pds);
> +		trace_i915_page_directory_entry_alloc(vm, pdpe, start, GEN8_PDPE_SHIFT);
>  	}
>  
>  	return 0;
> @@ -1014,7 +1021,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>  		}
>  
>  		set_bit(pdpe, pdp->used_pdpes);
> -		gen8_map_pagetable_range(pd, start, length, dev);
> +		gen8_map_pagetable_range(vm, pd, start, length);
>  	}
>  
>  	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
> @@ -1115,7 +1122,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	}
>  
>  	gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
> -		gen8_map_pagetable_range(pd, start, size, dev);
> +		gen8_map_pagetable_range(&ppgtt->base, pd,start, size);
>  
>  	ppgtt->base.allocate_va_range = NULL;
>  	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 3a657e4..6c20f76 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -214,6 +214,22 @@ DEFINE_EVENT(i915_page_table_entry, i915_page_table_entry_alloc,
>  	     TP_ARGS(vm, pde, start, pde_shift)
>  );
>  
> +DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_entry_alloc,
> +		   TP_PROTO(struct i915_address_space *vm, u32 pdpe, u64 start, u64 pdpe_shift),
> +		   TP_ARGS(vm, pdpe, start, pdpe_shift),
> +
> +		   TP_printk("vm=%p, pdpe=%d (0x%llx-0x%llx)",
> +			     __entry->vm, __entry->pde, __entry->start, __entry->end)
> +);
> +
> +DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_pointer_entry_alloc,
> +		   TP_PROTO(struct i915_address_space *vm, u32 pml4e, u64 start, u64 pml4e_shift),
> +		   TP_ARGS(vm, pml4e, start, pml4e_shift),
> +
> +		   TP_printk("vm=%p, pml4e=%d (0x%llx-0x%llx)",
> +			     __entry->vm, __entry->pde, __entry->start, __entry->end)
> +);
> +
>  /* Avoid extra math because we only support two sizes. The format is defined by
>   * bitmap_scnprintf. Each 32 bits is 8 HEX digits followed by comma */
>  #define TRACE_PT_SIZE(bits) \
> -- 
> 2.1.1
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic
  2015-02-20 17:45 ` [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic Michel Thierry
@ 2015-03-03 11:48   ` akash goel
  2015-03-18 10:15     ` Michel Thierry
  0 siblings, 1 reply; 32+ messages in thread
From: akash goel @ 2015-03-03 11:48 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> This transitional patch doesn't do much for the existing code. However,
> it should make the upcoming patches that use the full 48b address space a
> bit easier to swallow. The patch also introduces the PML4, i.e. the new
> top level structure of the page tables.
>
> v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp),
> To facilitate testing, 48b mode will be available on Broadwell, when
> i915.enable_ppgtt = 3.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
> ---
>  drivers/gpu/drm/i915/i915_drv.h     |   7 ++-
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 108 +++++++++++++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  41 +++++++++++---
>  3 files changed, 126 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2dedd43..af0d149 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2432,7 +2432,12 @@ struct drm_i915_cmd_table {
>  #define HAS_HW_CONTEXTS(dev)   (INTEL_INFO(dev)->gen >= 6)
>  #define HAS_LOGICAL_RING_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 8)
>  #define USES_PPGTT(dev)                (i915.enable_ppgtt)
> -#define USES_FULL_PPGTT(dev)   (i915.enable_ppgtt == 2)
> +#define USES_FULL_PPGTT(dev)   (i915.enable_ppgtt >= 2)
> +#ifdef CONFIG_64BIT
> +# define USES_FULL_48BIT_PPGTT(dev)    (i915.enable_ppgtt == 3)
> +#else
> +# define USES_FULL_48BIT_PPGTT(dev)    false
> +#endif
>
>  #define HAS_OVERLAY(dev)               (INTEL_INFO(dev)->has_overlay)
>  #define OVERLAY_NEEDS_PHYSICAL(dev)    (INTEL_INFO(dev)->overlay_needs_physical)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index ff86501..489f8db 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -100,10 +100,17 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>  {
>         bool has_aliasing_ppgtt;
>         bool has_full_ppgtt;
> +       bool has_full_64bit_ppgtt;
>
>         has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
>         has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
>
> +#ifdef CONFIG_64BIT
> +       has_full_64bit_ppgtt = IS_BROADWELL(dev) && false; /* FIXME: 64b */
> +#else
> +       has_full_64bit_ppgtt = false;
> +#endif
> +
>         if (intel_vgpu_active(dev))
>                 has_full_ppgtt = false; /* emulation is too hard */
>
> @@ -121,6 +128,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>         if (enable_ppgtt == 2 && has_full_ppgtt)
>                 return 2;
>
> +       if (enable_ppgtt == 3 && has_full_64bit_ppgtt)
> +               return 3;
> +
>  #ifdef CONFIG_INTEL_IOMMU
>         /* Disable ppgtt on SNB if VT-d is on. */
>         if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
> @@ -462,6 +472,45 @@ free_pd:
>         return ERR_PTR(ret);
>  }
>
> +static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp)
> +{
> +       kfree(pdp->used_pdpes);
> +       kfree(pdp->page_directory);
> +       /* HACK */
> +       pdp->page_directory = NULL;
> +}
> +
> +static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp,
> +                           struct drm_device *dev)
> +{
> +       __pdp_fini(pdp);
> +       if (USES_FULL_48BIT_PPGTT(dev))
> +               kfree(pdp);
> +}
> +
> +static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
> +                     struct drm_device *dev)
> +{
> +       size_t pdpes = I915_PDPES_PER_PDP(dev);
> +
> +       pdp->used_pdpes = kcalloc(BITS_TO_LONGS(pdpes),
> +                                 sizeof(unsigned long),
> +                                 GFP_KERNEL);
> +       if (!pdp->used_pdpes)
> +               return -ENOMEM;
> +
> +       pdp->page_directory = kcalloc(pdpes, sizeof(*pdp->page_directory), GFP_KERNEL);
> +       if (!pdp->page_directory) {
> +               kfree(pdp->used_pdpes);
> +               /* the PDP might be the statically allocated top level. Keep it
> +                * as clean as possible */
> +               pdp->used_pdpes = NULL;
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
> +}
> +
>  /* Broadwell Page Directory Pointer Descriptors */
>  static int gen8_write_pdp(struct intel_engine_cs *ring,
>                           unsigned entry,
> @@ -491,7 +540,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
>  {
>         int i, ret;
>
> -       for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) {
> +       for (i = 3; i >= 0; i--) {
>                 struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i];
>                 dma_addr_t pd_daddr = pd ? pd->daddr : ppgtt->scratch_pd->daddr;
>                 /* The page directory might be NULL, but we need to clear out
> @@ -580,9 +629,6 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>         pt_vaddr = NULL;
>
>         for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
> -               if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES))
> -                       break;
> -
>                 if (pt_vaddr == NULL) {
>                         struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe];
>                         struct i915_page_table_entry *pt = pd->page_tables[pde];
> @@ -664,7 +710,8 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
>         struct pci_dev *hwdev = ppgtt->base.dev->pdev;
>         int i, j;
>
> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
> +       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> +                       I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>                 struct i915_page_directory_entry *pd;
>
>                 if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> @@ -696,13 +743,15 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
>  {
>         int i;
>
> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
> +       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> +                               I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>                 if (WARN_ON(!ppgtt->pdp.page_directory[i]))
>                         continue;
>
>                 gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
>                 unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
>         }
> +       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);

'ppgtt->scratch_pd' is not being de-allocated.

It could be de-allocated explicitly here, after the call to
unmap_and_free_pdp, or from the gen8_ppgtt_cleanup function.

>  }
>
>  static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
> @@ -799,8 +848,9 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>         struct i915_page_directory_entry *pd;
>         uint64_t temp;
>         uint32_t pdpe;
> +       size_t pdpes =  I915_PDPES_PER_PDP(ppgtt->base.dev);
>
> -       BUG_ON(!bitmap_empty(new_pds, GEN8_LEGACY_PDPES));
> +       BUG_ON(!bitmap_empty(new_pds, pdpes));
>
>         /* FIXME: PPGTT container_of won't work for 64b */
>         BUG_ON((start + length) > 0x800000000ULL);
> @@ -820,18 +870,19 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>         return 0;
>
>  unwind_out:
> -       for_each_set_bit(pdpe, new_pds, GEN8_LEGACY_PDPES)
> +       for_each_set_bit(pdpe, new_pds, pdpes)
>                 unmap_and_free_pd(pdp->page_directory[pdpe], ppgtt->base.dev);
>
>         return -ENOMEM;
>  }
>
>  static inline void
> -free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
> +free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts,
> +                      size_t pdpes)
>  {
>         int i;
>
> -       for (i = 0; i < GEN8_LEGACY_PDPES; i++)
> +       for (i = 0; i < pdpes; i++)
>                 kfree(new_pts[i]);
>         kfree(new_pts);
>         kfree(new_pds);
> @@ -841,13 +892,14 @@ free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
>   * of these are based on the number of PDPEs in the system.
>   */
>  int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
> -                                        unsigned long ***new_pts)
> +                                        unsigned long ***new_pts,
> +                                        size_t pdpes)
>  {
>         int i;
>         unsigned long *pds;
>         unsigned long **pts;
>
> -       pds = kcalloc(BITS_TO_LONGS(GEN8_LEGACY_PDPES), sizeof(unsigned long), GFP_KERNEL);
> +       pds = kcalloc(BITS_TO_LONGS(pdpes), sizeof(unsigned long), GFP_KERNEL);
>         if (!pds)
>                 return -ENOMEM;
>
> @@ -857,7 +909,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
>                 return -ENOMEM;
>         }
>
> -       for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
> +       for (i = 0; i < pdpes; i++) {
>                 pts[i] = kcalloc(BITS_TO_LONGS(GEN8_PDES_PER_PAGE),
>                                  sizeof(unsigned long), GFP_KERNEL);
>                 if (!pts[i])
> @@ -870,7 +922,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
>         return 0;
>
>  err_out:
> -       free_gen8_temp_bitmaps(pds, pts);
> +       free_gen8_temp_bitmaps(pds, pts, pdpes);
>         return -ENOMEM;
>  }
>
> @@ -886,6 +938,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>         const uint64_t orig_length = length;
>         uint64_t temp;
>         uint32_t pdpe;
> +       size_t pdpes = I915_PDPES_PER_PDP(dev);
>         int ret;
>
>  #ifndef CONFIG_64BIT
> @@ -903,7 +956,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>         if (WARN_ON(start + length < start))
>                 return -ERANGE;
>
> -       ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables);
> +       ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
>         if (ret)
>                 return ret;
>
> @@ -911,7 +964,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>         ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
>                                         new_page_dirs);
>         if (ret) {
> -               free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
> +               free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>                 return ret;
>         }
>
> @@ -968,7 +1021,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>                 set_bit(pdpe, ppgtt->pdp.used_pdpes);
>         }
>
> -       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
> +       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>         return 0;
>
>  err_out:
> @@ -977,13 +1030,19 @@ err_out:
>                         unmap_and_free_pt(pd->page_tables[temp], vm->dev);
>         }
>
> -       for_each_set_bit(pdpe, new_page_dirs, GEN8_LEGACY_PDPES)
> +       for_each_set_bit(pdpe, new_page_dirs, pdpes)
>                 unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
>
> -       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
> +       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>         return ret;
>  }
>
> +static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
> +{
> +       unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> +       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> +}
> +
>  /**
>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
>   * with a net effect resembling a 2-level page table in normal x86 terms. Each
> @@ -1004,6 +1063,15 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>
>         ppgtt->switch_mm = gen8_mm_switch;
>
> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +               int ret = __pdp_init(&ppgtt->pdp, false);
> +               if (ret) {
> +                       unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> +                       return ret;
> +               }
> +       } else
> +               return -EPERM; /* Not yet implemented */
> +
>         return 0;
>  }
>
> @@ -1025,7 +1093,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>          * eventually. */
>         ret = gen8_alloc_va_range(&ppgtt->base, start, size);
>         if (ret) {
> -               unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> +               gen8_ppgtt_fini_common(ppgtt);
>                 return ret;
>         }
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index c68ec3a..a33c6e9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -85,8 +85,12 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>   * The difference as compared to normal x86 3 level page table is the PDPEs are
>   * programmed via register.
>   */
> +#define GEN8_PML4ES_PER_PML4           512
> +#define GEN8_PML4E_SHIFT               39
>  #define GEN8_PDPE_SHIFT                        30
> -#define GEN8_PDPE_MASK                 0x3
> +/* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
> + * tables */
> +#define GEN8_PDPE_MASK                 0x1ff
>  #define GEN8_PDE_SHIFT                 21
>  #define GEN8_PDE_MASK                  0x1ff
>  #define GEN8_PTE_SHIFT                 12
> @@ -95,6 +99,13 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>  #define GEN8_PTES_PER_PAGE             (PAGE_SIZE / sizeof(gen8_gtt_pte_t))
>  #define GEN8_PDES_PER_PAGE             (PAGE_SIZE / sizeof(gen8_ppgtt_pde_t))
>
> +#ifdef CONFIG_64BIT
> +# define I915_PDPES_PER_PDP(dev) (USES_FULL_48BIT_PPGTT(dev) ?\
> +               GEN8_PML4ES_PER_PML4 : GEN8_LEGACY_PDPES)
> +#else
> +# define I915_PDPES_PER_PDP            GEN8_LEGACY_PDPES
> +#endif
> +
>  #define PPAT_UNCACHED_INDEX            (_PAGE_PWT | _PAGE_PCD)
>  #define PPAT_CACHED_PDE_INDEX          0 /* WB LLC */
>  #define PPAT_CACHED_INDEX              _PAGE_PAT /* WB LLCeLLC */
> @@ -210,9 +221,17 @@ struct i915_page_directory_entry {
>  };
>
>  struct i915_page_directory_pointer_entry {
> -       /* struct page *page; */
> -       DECLARE_BITMAP(used_pdpes, GEN8_LEGACY_PDPES);
> -       struct i915_page_directory_entry *page_directory[GEN8_LEGACY_PDPES];
> +       struct page *page;
> +       dma_addr_t daddr;
> +       unsigned long *used_pdpes;
> +       struct i915_page_directory_entry **page_directory;
> +};
> +
> +struct i915_pml4 {
> +       struct page *page;
> +       dma_addr_t daddr;
> +       DECLARE_BITMAP(used_pml4es, GEN8_PML4ES_PER_PML4);
> +       struct i915_page_directory_pointer_entry *pdps[GEN8_PML4ES_PER_PML4];
>  };
>
>  struct i915_address_space {
> @@ -302,8 +321,9 @@ struct i915_hw_ppgtt {
>         struct drm_mm_node node;
>         unsigned long pd_dirty_rings;
>         union {
> -               struct i915_page_directory_pointer_entry pdp;
> -               struct i915_page_directory_entry pd;
> +               struct i915_pml4 pml4;          /* GEN8+ & 64b PPGTT */
> +               struct i915_page_directory_pointer_entry pdp;   /* GEN8+ */
> +               struct i915_page_directory_entry pd;            /* GEN6-7 */
>         };
>
>         union {
> @@ -399,14 +419,17 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>              temp = min(temp, length),                                  \
>              start += temp, length -= temp)
>
> -#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
> -       for (iter = gen8_pdpe_index(start), pd = (pdp)->page_directory[iter];   \
> -            length > 0 && iter < GEN8_LEGACY_PDPES;                    \
> +#define gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, b)    \
> +       for (iter = gen8_pdpe_index(start), pd = (pdp)->page_directory[iter]; \
> +            length > 0 && (iter < b);                                  \
>              pd = (pdp)->page_directory[++iter],                                \
>              temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,       \
>              temp = min(temp, length),                                  \
>              start += temp, length -= temp)
>
> +#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
> +       gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
> +
>  /* Clamp length to the next page_directory boundary */
>  static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length)
>  {
> --
> 2.1.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 02/12] drm/i915/bdw: Abstract PDP usage
  2015-02-20 17:45 ` [PATCH 02/12] drm/i915/bdw: Abstract PDP usage Michel Thierry
@ 2015-03-03 12:16   ` akash goel
  2015-03-18 10:16     ` Michel Thierry
  2015-03-04  3:07   ` akash goel
  1 sibling, 1 reply; 32+ messages in thread
From: akash goel @ 2015-03-03 12:16 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> Up until now, ppgtt->pdp has always been the root of our page tables.
> Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.
>
> In preparation for 4 level page tables, we need to stop using ppgtt->pdp
> directly unless we know it's what we want. The future structure will use
> ppgtt->pml4 for the top level, and the pdp is just one of the entries
> being pointed to by a pml4e.
>
> v2: Updated after dynamic page allocation changes.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 123 ++++++++++++++++++++----------------
>  1 file changed, 70 insertions(+), 53 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 489f8db..d3ad517 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -560,6 +560,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>  {
>         struct i915_hw_ppgtt *ppgtt =
>                 container_of(vm, struct i915_hw_ppgtt, base);
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         gen8_gtt_pte_t *pt_vaddr, scratch_pte;
>         unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
>         unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
> @@ -575,10 +576,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>                 struct i915_page_table_entry *pt;
>                 struct page *page_table;
>
> -               if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
> +               if (WARN_ON(!pdp->page_directory[pdpe]))
>                         continue;
>
> -               pd = ppgtt->pdp.page_directory[pdpe];
> +               pd = pdp->page_directory[pdpe];
>
>                 if (WARN_ON(!pd->page_tables[pde]))
>                         continue;
> @@ -620,6 +621,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>  {
>         struct i915_hw_ppgtt *ppgtt =
>                 container_of(vm, struct i915_hw_ppgtt, base);
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         gen8_gtt_pte_t *pt_vaddr;
>         unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
>         unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
> @@ -630,7 +632,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>
>         for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
>                 if (pt_vaddr == NULL) {
> -                       struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe];
> +                       struct i915_page_directory_entry *pd = pdp->page_directory[pdpe];
>                         struct i915_page_table_entry *pt = pd->page_tables[pde];
>                         struct page *page_table = pt->page;
>
> @@ -708,16 +710,17 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d
>  static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
>  {
>         struct pci_dev *hwdev = ppgtt->base.dev->pdev;
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         int i, j;
>
> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> +       for_each_set_bit(i, pdp->used_pdpes,
>                         I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>                 struct i915_page_directory_entry *pd;
>
> -               if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> +               if (WARN_ON(!pdp->page_directory[i]))
>                         continue;
>
> -               pd = ppgtt->pdp.page_directory[i];
> +               pd = pdp->page_directory[i];
>                 if (!pd->daddr)
>                         pci_unmap_page(hwdev, pd->daddr, PAGE_SIZE,
>                                         PCI_DMA_BIDIRECTIONAL);
> @@ -743,15 +746,21 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
>  {
>         int i;
>
> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> -                               I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> -               if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> -                       continue;
> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +               for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> +                                I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> +                       if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> +                               continue;
>
> -               gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
> -               unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
> +                       gen8_free_page_tables(ppgtt->pdp.page_directory[i],
> +                                             ppgtt->base.dev);
> +                       unmap_and_free_pd(ppgtt->pdp.page_directory[i],
> +                                         ppgtt->base.dev);
> +               }
> +               unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> +       } else {
> +               BUG(); /* to be implemented later */
>         }
> -       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
>  }
>
>  static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
> @@ -765,7 +774,7 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>
>  /**
>   * gen8_ppgtt_alloc_pagetabs() - Allocate page tables for VA range.
> - * @ppgtt:     Master ppgtt structure.
> + * @vm:                Master vm structure.
>   * @pd:                Page directory for this address range.
>   * @start:     Starting virtual address to begin allocations.
>   * @length     Size of the allocations.
> @@ -781,12 +790,13 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>   *
>   * Return: 0 if success; negative error code otherwise.
>   */
> -static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
> +static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
>                                      struct i915_page_directory_entry *pd,
>                                      uint64_t start,
>                                      uint64_t length,
>                                      unsigned long *new_pts)
>  {
> +       struct drm_device *dev = vm->dev;
>         struct i915_page_table_entry *pt;
>         uint64_t temp;
>         uint32_t pde;
> @@ -799,7 +809,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
>                         continue;
>                 }
>
> -               pt = alloc_pt_single(ppgtt->base.dev);
> +               pt = alloc_pt_single(dev);
>                 if (IS_ERR(pt))
>                         goto unwind_out;
>
> @@ -811,14 +821,14 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
>
>  unwind_out:
>         for_each_set_bit(pde, new_pts, GEN8_PDES_PER_PAGE)
> -               unmap_and_free_pt(pd->page_tables[pde], ppgtt->base.dev);
> +               unmap_and_free_pt(pd->page_tables[pde], dev);
>
>         return -ENOMEM;
>  }
>
>  /**
>   * gen8_ppgtt_alloc_page_directories() - Allocate page directories for VA range.
> - * @ppgtt:     Master ppgtt structure.
> + * @vm:                Master vm structure.
>   * @pdp:       Page directory pointer for this address range.
>   * @start:     Starting virtual address to begin allocations.
>   * @length     Size of the allocations.
> @@ -839,16 +849,17 @@ unwind_out:
>   *
>   * Return: 0 if success; negative error code otherwise.
>   */
> -static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
> +static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
>                                      struct i915_page_directory_pointer_entry *pdp,
>                                      uint64_t start,
>                                      uint64_t length,
>                                      unsigned long *new_pds)
>  {
> +       struct drm_device *dev = vm->dev;
>         struct i915_page_directory_entry *pd;
>         uint64_t temp;
>         uint32_t pdpe;
> -       size_t pdpes =  I915_PDPES_PER_PDP(ppgtt->base.dev);
> +       size_t pdpes =  I915_PDPES_PER_PDP(vm->dev);
>
>         BUG_ON(!bitmap_empty(new_pds, pdpes));
>
> @@ -859,7 +870,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>                 if (pd)
>                         continue;
>
> -               pd = alloc_pd_single(ppgtt->base.dev);
> +               pd = alloc_pd_single(dev);
>                 if (IS_ERR(pd))
>                         goto unwind_out;
>
> @@ -871,7 +882,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>
>  unwind_out:
>         for_each_set_bit(pdpe, new_pds, pdpes)
> -               unmap_and_free_pd(pdp->page_directory[pdpe], ppgtt->base.dev);
> +               unmap_and_free_pd(pdp->page_directory[pdpe], dev);
>
>         return -ENOMEM;
>  }
> @@ -926,13 +937,13 @@ err_out:
>         return -ENOMEM;
>  }
>
> -static int gen8_alloc_va_range(struct i915_address_space *vm,
> -                              uint64_t start,
> -                              uint64_t length)
> +static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
> +                                   struct i915_page_directory_pointer_entry *pdp,
> +                                   uint64_t start,
> +                                   uint64_t length)
>  {
> -       struct i915_hw_ppgtt *ppgtt =
> -               container_of(vm, struct i915_hw_ppgtt, base);
>         unsigned long *new_page_dirs, **new_page_tables;
> +       struct drm_device *dev = vm->dev;
>         struct i915_page_directory_entry *pd;
>         const uint64_t orig_start = start;
>         const uint64_t orig_length = length;
> @@ -961,17 +972,15 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>                 return ret;
>
>         /* Do the allocations first so we can easily bail out */
> -       ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
> -                                       new_page_dirs);
> +       ret = gen8_ppgtt_alloc_page_directories(vm, pdp, start, length, new_page_dirs);
>         if (ret) {
>                 free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>                 return ret;
>         }
>
> -       /* For every page directory referenced, allocate page tables */
> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
> +       gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
>                 bitmap_zero(new_page_tables[pdpe], GEN8_PDES_PER_PAGE);
> -               ret = gen8_ppgtt_alloc_pagetabs(ppgtt, pd, start, length,
> +               ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
>                                                 new_page_tables[pdpe]);
>                 if (ret)
>                         goto err_out;
> @@ -980,10 +989,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>         start = orig_start;
>         length = orig_length;
>
> -       /* Allocations have completed successfully, so set the bitmaps, and do
> -        * the mappings. */
> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
> -               gen8_ppgtt_pde_t *const page_directory = kmap_atomic(pd->page);
> +       gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
>                 struct i915_page_table_entry *pt;
>                 uint64_t pd_len = gen8_clamp_pd(start, length);
>                 uint64_t pd_start = start;
> @@ -1005,20 +1011,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>
>                         /* Our pde is now pointing to the pagetable, pt */
>                         set_bit(pde, pd->used_pdes);
> -
> -                       /* Map the PDE to the page table */
> -                       __gen8_do_map_pt(page_directory + pde, pt, vm->dev);
> -
> -                       /* NB: We haven't yet mapped ptes to pages. At this
> -                        * point we're still relying on insert_entries() */
>                 }
>
> -               if (!HAS_LLC(vm->dev))
> -                       drm_clflush_virt_range(page_directory, PAGE_SIZE);
> -
> -               kunmap_atomic(page_directory);
> -
> -               set_bit(pdpe, ppgtt->pdp.used_pdpes);
> +               set_bit(pdpe, pdp->used_pdpes);
> +               gen8_map_pagetable_range(pd, start, length, dev);
>         }
>
>         free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
> @@ -1027,16 +1023,36 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>  err_out:
>         while (pdpe--) {
>                 for_each_set_bit(temp, new_page_tables[pdpe], GEN8_PDES_PER_PAGE)
> -                       unmap_and_free_pt(pd->page_tables[temp], vm->dev);
> +                       unmap_and_free_pt(pd->page_tables[temp], dev);

Sorry, this review comment may not be completely pertinent to this very patch.
In the while loop, 'pd' is not updated when the 'pdpe' value changes, so it
still points at the directory from the last iteration of the allocation loop.
The above call to 'unmap_and_free_pt(pd->page_tables[temp], dev);'
should be replaced with
    'unmap_and_free_pt(pdp->page_directory[pdpe]->page_tables[temp], dev);'
so that the right page directory is freed.


>         }
>
>         for_each_set_bit(pdpe, new_page_dirs, pdpes)
> -               unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
> +               unmap_and_free_pd(pdp->page_directory[pdpe], dev);
>
>         free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>         return ret;
>  }
>
> +static int __noreturn gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> +                                              struct i915_pml4 *pml4,
> +                                              uint64_t start,
> +                                              uint64_t length)
> +{
> +       BUG(); /* to be implemented later */
> +}
> +
> +static int gen8_alloc_va_range(struct i915_address_space *vm,
> +                              uint64_t start, uint64_t length)
> +{
> +       struct i915_hw_ppgtt *ppgtt =
> +               container_of(vm, struct i915_hw_ppgtt, base);
> +
> +       if (!USES_FULL_48BIT_PPGTT(vm->dev))
> +               return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
> +       else
> +               return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
> +}
> +
>  static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
>  {
>         unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> @@ -1079,12 +1095,13 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  {
>         struct drm_device *dev = ppgtt->base.dev;
>         struct drm_i915_private *dev_priv = dev->dev_private;
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         struct i915_page_directory_entry *pd;
>         uint64_t temp, start = 0, size = dev_priv->gtt.base.total;
>         uint32_t pdpe;
>         int ret;
>
> -       ret = gen8_ppgtt_init_common(ppgtt, dev_priv->gtt.base.total);
> +       ret = gen8_ppgtt_init_common(ppgtt, size);
>         if (ret)
>                 return ret;
>
> @@ -1097,8 +1114,8 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>                 return ret;
>         }
>
> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, size, temp, pdpe)
> -               gen8_map_pagetable_range(pd, start, size, ppgtt->base.dev);
> +       gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
> +               gen8_map_pagetable_range(pd, start, size, dev);


Sorry, again this comment may not be relevant to this patch.
Is the explicit call to map the pages of the page tables really needed here?
Prior to this there is already a call to gen8_alloc_va_range, which maps the
page-table pages into the PDEs for the entire virtual range.

>
>         ppgtt->base.allocate_va_range = NULL;
>         ppgtt->base.clear_range = gen8_ppgtt_clear_range;
> --
> 2.1.1
>


* Re: [PATCH 04/12] drm/i915/bdw: Add ppgtt info for dynamic pages
  2015-02-20 17:45 ` [PATCH 04/12] drm/i915/bdw: Add ppgtt info for dynamic pages Michel Thierry
@ 2015-03-03 12:23   ` akash goel
  2015-03-18 10:17     ` Michel Thierry
  0 siblings, 1 reply; 32+ messages in thread
From: akash goel @ 2015-03-03 12:23 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> Note that there is no gen8 ppgtt debug_dump function yet.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 19 ++++++++++---------
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 32 ++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  9 +++++++++
>  3 files changed, 51 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 40630bd..93c34ab 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2165,7 +2165,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>  {
>         struct drm_i915_private *dev_priv = dev->dev_private;
>         struct intel_engine_cs *ring;
> -       struct drm_file *file;
>         int i;
>
>         if (INTEL_INFO(dev)->gen == 6)
> @@ -2189,14 +2188,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>
>                 ppgtt->debug_dump(ppgtt, m);
>         }
> -
> -       list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> -               struct drm_i915_file_private *file_priv = file->driver_priv;
> -
> -               seq_printf(m, "proc: %s\n",
> -                          get_pid_task(file->pid, PIDTYPE_PID)->comm);
> -               idr_for_each(&file_priv->context_idr, per_file_ctx, m);
> -       }
>  }
>
>  static int i915_ppgtt_info(struct seq_file *m, void *data)
> @@ -2204,6 +2195,7 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
>         struct drm_info_node *node = m->private;
>         struct drm_device *dev = node->minor->dev;
>         struct drm_i915_private *dev_priv = dev->dev_private;
> +       struct drm_file *file;
>
>         int ret = mutex_lock_interruptible(&dev->struct_mutex);
>         if (ret)
> @@ -2215,6 +2207,15 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
>         else if (INTEL_INFO(dev)->gen >= 6)
>                 gen6_ppgtt_info(m, dev);
>
> +       list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> +               struct drm_i915_file_private *file_priv = file->driver_priv;
> +
> +               seq_printf(m, "\nproc: %s\n",
> +                          get_pid_task(file->pid, PIDTYPE_PID)->comm);
> +               idr_for_each(&file_priv->context_idr, per_file_ctx,
> +                            (void *)(unsigned long)m);
> +       }
> +
>         intel_runtime_pm_put(dev_priv);
>         mutex_unlock(&dev->struct_mutex);
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index ecfb62a..1edcc17 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2125,6 +2125,38 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
>         readl(gtt_base);
>  }
>
> +void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt,
> +                            void (*callback)(struct i915_page_directory_pointer_entry *pdp,
> +                                             struct i915_page_directory_entry *pd,
> +                                             struct i915_page_table_entry *pt,
> +                                             unsigned pdpe,
> +                                             unsigned pde,
> +                                             void *data),
> +                            void *data)
> +{
> +       uint64_t start = ppgtt->base.start;
> +       uint64_t length = ppgtt->base.total;
> +       uint64_t pdpe, pde, temp;
> +
> +       struct i915_page_directory_entry *pd;
> +       struct i915_page_table_entry *pt;
> +
> +       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
> +               uint64_t pd_start = start, pd_length = length;
> +               int i;
> +
> +               if (pd == NULL) {
> +                       for (i = 0; i < GEN8_PDES_PER_PAGE; i++)
> +                               callback(&ppgtt->pdp, NULL, NULL, pdpe, i, data);
> +                       continue;
> +               }
> +
> +               gen8_for_each_pde(pt, pd, pd_start, pd_length, temp, pde) {
> +                       callback(&ppgtt->pdp, pd, pt, pdpe, pde, data);
> +               }
> +       }
> +}
> +
>  static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>                                   uint64_t start,
>                                   uint64_t length,
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index a33c6e9..144858e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -483,6 +483,15 @@ static inline size_t gen8_pde_count(uint64_t addr, uint64_t length)
>         return i915_pde_index(end, GEN8_PDE_SHIFT) - i915_pde_index(addr, GEN8_PDE_SHIFT);
>  }
>
> +void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt,
> +                            void (*callback)(struct i915_page_directory_pointer_entry *pdp,
> +                                             struct i915_page_directory_entry *pd,
> +                                             struct i915_page_table_entry *pt,
> +                                             unsigned pdpe,
> +                                             unsigned pde,
> +                                             void *data),
> +                            void *data);
> +

There is no caller of gen8_for_every_pdpe_pde yet.
What is the envisaged usage of this function?

>  int i915_gem_gtt_init(struct drm_device *dev);
>  void i915_gem_init_global_gtt(struct drm_device *dev);
>  void i915_global_gtt_cleanup(struct drm_device *dev);
> --
> 2.1.1
>


* Re: [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl
  2015-02-20 17:45 ` [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl Michel Thierry
@ 2015-03-03 12:55   ` akash goel
  2015-03-04 13:00     ` Daniel Vetter
  2015-03-04  2:48   ` akash goel
  1 sibling, 1 reply; 32+ messages in thread
From: akash goel @ 2015-03-03 12:55 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> The code for 4lvl works just as one would expect, and nicely it is able
> to call into the existing 3lvl page table code to handle all of the
> lower levels.
>
> PML4 has no special attributes, and there will always be a PML4.
> So simply initialize it at creation, and destroy it at the end.
>
> v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
> compiler happy. And define ret only in one place.
> Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 240 +++++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  11 +-
>  2 files changed, 217 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 1edcc17..edada33 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -483,9 +483,12 @@ static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp)
>  static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp,
>                             struct drm_device *dev)
>  {
> -       __pdp_fini(pdp);
> -       if (USES_FULL_48BIT_PPGTT(dev))
> +       if (USES_FULL_48BIT_PPGTT(dev)) {
> +               __pdp_fini(pdp);

The call to __pdp_fini should be made in the 32-bit case as well.
The 'used_pdpes' bitmap and the 'page_directory' double pointer also need to
be freed in the 32-bit case (they are allocated inside __pdp_init, called from
gen8_ppgtt_init_common).

> +               i915_dma_unmap_single(pdp, dev);
> +               __free_page(pdp->page);
>                 kfree(pdp);
> +       }
>  }
>
>  static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
> @@ -511,6 +514,60 @@ static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
>         return 0;
>  }
>
> +static struct i915_page_directory_pointer_entry *alloc_pdp_single(struct i915_hw_ppgtt *ppgtt,
> +                                              struct i915_pml4 *pml4)
> +{
> +       struct drm_device *dev = ppgtt->base.dev;
> +       struct i915_page_directory_pointer_entry *pdp;
> +       int ret;
> +
> +       BUG_ON(!USES_FULL_48BIT_PPGTT(dev));
> +
> +       pdp = kmalloc(sizeof(*pdp), GFP_KERNEL);
> +       if (!pdp)
> +               return ERR_PTR(-ENOMEM);
> +
> +       pdp->page = alloc_page(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO);
> +       if (!pdp->page) {
> +               kfree(pdp);
> +               return ERR_PTR(-ENOMEM);
> +       }
> +
> +       ret = __pdp_init(pdp, dev);
> +       if (ret) {
> +               __free_page(pdp->page);
> +               kfree(pdp);
> +               return ERR_PTR(ret);
> +       }
> +
> +       i915_dma_map_px_single(pdp, dev);
> +
> +       return pdp;
> +}
> +
> +static void pml4_fini(struct i915_pml4 *pml4)
> +{
> +       struct i915_hw_ppgtt *ppgtt =
> +               container_of(pml4, struct i915_hw_ppgtt, pml4);
> +       i915_dma_unmap_single(pml4, ppgtt->base.dev);
> +       __free_page(pml4->page);
> +       /* HACK */
> +       pml4->page = NULL;
> +}
> +
> +static int pml4_init(struct i915_hw_ppgtt *ppgtt)
> +{
> +       struct i915_pml4 *pml4 = &ppgtt->pml4;
> +
> +       pml4->page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +       if (!pml4->page)
> +               return -ENOMEM;
> +
> +       i915_dma_map_px_single(pml4, ppgtt->base.dev);
> +
> +       return 0;
> +}
> +
>  /* Broadwell Page Directory Pointer Descriptors */
>  static int gen8_write_pdp(struct intel_engine_cs *ring,
>                           unsigned entry,
> @@ -712,14 +769,13 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d
>         }
>  }
>
> -static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
> +static void gen8_ppgtt_unmap_pages_3lvl(struct i915_page_directory_pointer_entry *pdp,
> +                                       struct drm_device *dev)
>  {
> -       struct pci_dev *hwdev = ppgtt->base.dev->pdev;
> -       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
> +       struct pci_dev *hwdev = dev->pdev;
>         int i, j;
>
> -       for_each_set_bit(i, pdp->used_pdpes,
> -                       I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> +       for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
>                 struct i915_page_directory_entry *pd;
>
>                 if (WARN_ON(!pdp->page_directory[i]))
> @@ -747,27 +803,73 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
>         }
>  }
>
> -static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
> +static void gen8_ppgtt_unmap_pages_4lvl(struct i915_hw_ppgtt *ppgtt)
>  {
> +       struct pci_dev *hwdev = ppgtt->base.dev->pdev;
>         int i;
>
> -       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> -               for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> -                                I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> -                       if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> -                               continue;
> +       for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
> +               struct i915_page_directory_pointer_entry *pdp;
>
> -                       gen8_free_page_tables(ppgtt->pdp.page_directory[i],
> -                                             ppgtt->base.dev);
> -                       unmap_and_free_pd(ppgtt->pdp.page_directory[i],
> -                                         ppgtt->base.dev);
> -               }
> -               unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> -       } else {
> -               BUG(); /* to be implemented later */
> +               if (WARN_ON(!ppgtt->pml4.pdps[i]))
> +                       continue;
> +
> +               pdp = ppgtt->pml4.pdps[i];
> +               if (!pdp->daddr)
> +                       pci_unmap_page(hwdev, pdp->daddr, PAGE_SIZE,
> +                                      PCI_DMA_BIDIRECTIONAL);
> +

For consistency and cleanup, the call to pci_unmap_page can be replaced
with i915_dma_unmap_single.
The same can be done inside gen8_ppgtt_unmap_pages_3lvl.

> +               gen8_ppgtt_unmap_pages_3lvl(ppgtt->pml4.pdps[i],
> +                                           ppgtt->base.dev);
>         }
>  }
>
> +static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
> +{
> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> +               gen8_ppgtt_unmap_pages_3lvl(&ppgtt->pdp, ppgtt->base.dev);
> +       else
> +               gen8_ppgtt_unmap_pages_4lvl(ppgtt);
> +}
> +
> +static void gen8_ppgtt_free_3lvl(struct i915_page_directory_pointer_entry *pdp,
> +                                struct drm_device *dev)
> +{
> +       int i;
> +
> +       for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
> +               if (WARN_ON(!pdp->page_directory[i]))
> +                       continue;
> +
> +               gen8_free_page_tables(pdp->page_directory[i], dev);
> +               unmap_and_free_pd(pdp->page_directory[i], dev);
> +       }
> +
> +       unmap_and_free_pdp(pdp, dev);
> +}
> +
> +static void gen8_ppgtt_free_4lvl(struct i915_hw_ppgtt *ppgtt)
> +{
> +       int i;
> +
> +       for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
> +               if (WARN_ON(!ppgtt->pml4.pdps[i]))
> +                       continue;
> +
> +               gen8_ppgtt_free_3lvl(ppgtt->pml4.pdps[i], ppgtt->base.dev);
> +       }
> +
> +       pml4_fini(&ppgtt->pml4);
> +}
> +
> +static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
> +{
> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> +               gen8_ppgtt_free_3lvl(&ppgtt->pdp, ppgtt->base.dev);
> +       else
> +               gen8_ppgtt_free_4lvl(ppgtt);
> +}
> +
>  static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>  {
>         struct i915_hw_ppgtt *ppgtt =

Is the call to 'gen8_ppgtt_unmap_pages' really needed from the
'gen8_ppgtt_cleanup' function, considering that gen8_ppgtt_free will also
do the unmap operation for the pml4, pdp, pd & pt pages?

> @@ -1040,12 +1142,74 @@ err_out:
>         return ret;
>  }
>
> -static int __noreturn gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> -                                              struct i915_pml4 *pml4,
> -                                              uint64_t start,
> -                                              uint64_t length)
> +static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> +                                   struct i915_pml4 *pml4,
> +                                   uint64_t start,
> +                                   uint64_t length)
>  {
> -       BUG(); /* to be implemented later */
> +       DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
> +       struct i915_hw_ppgtt *ppgtt =
> +               container_of(vm, struct i915_hw_ppgtt, base);
> +       struct i915_page_directory_pointer_entry *pdp;
> +       const uint64_t orig_start = start;
> +       const uint64_t orig_length = length;
> +       uint64_t temp, pml4e;
> +       int ret = 0;
> +
> +       /* Do the pml4 allocations first, so we don't need to track the newly
> +        * allocated tables below the pdp */
> +       bitmap_zero(new_pdps, GEN8_PML4ES_PER_PML4);
> +
> +       /* The page_directoryectory and pagetable allocations are done in the shared 3
> +        * and 4 level code. Just allocate the pdps.
> +        */
> +       gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> +               if (!pdp) {
> +                       WARN_ON(test_bit(pml4e, pml4->used_pml4es));
> +                       pdp = alloc_pdp_single(ppgtt, pml4);
> +                       if (IS_ERR(pdp))
> +                               goto err_alloc;
> +
> +                       pml4->pdps[pml4e] = pdp;
> +                       set_bit(pml4e, new_pdps);
> +                       trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, pml4e,
> +                                                  pml4e << GEN8_PML4E_SHIFT,
> +                                                  GEN8_PML4E_SHIFT);
> +
> +               }
> +       }
> +
> +       WARN(bitmap_weight(new_pdps, GEN8_PML4ES_PER_PML4) > 2,
> +            "The allocation has spanned more than 512GB. "
> +            "It is highly likely this is incorrect.");
> +
> +       start = orig_start;
> +       length = orig_length;
> +
> +       gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> +               BUG_ON(!pdp);
> +
> +               ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
> +               if (ret)
> +                       goto err_out;
> +       }
> +
> +       bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
> +                 GEN8_PML4ES_PER_PML4);
> +
> +       return 0;
> +
> +err_out:
> +       start = orig_start;
> +       length = orig_length;
> +       gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e)
> +               gen8_ppgtt_free_3lvl(pdp, vm->dev);
> +
> +err_alloc:
> +       for_each_set_bit(pml4e, new_pdps, GEN8_PML4ES_PER_PML4)
> +               unmap_and_free_pdp(pdp, vm->dev);
> +

If gen8_alloc_va_range_3lvl returns an error, there can be two calls to
unmap_and_free_pdp for the same pdp value: gen8_ppgtt_free_3lvl will also
call unmap_and_free_pdp for the newly allocated pdp.
gen8_alloc_va_range_3lvl already seems to handle errors internally, i.e.
it de-allocates the newly allocated page directory and page table pages
if there is an allocation failure somewhere.
So, similarly, gen8_alloc_va_range_4lvl can just do the de-allocation for
the newly allocated pdps ('new_pdps').
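A minimal user-space model of the suggested error handling (all names here
are illustrative, not the driver's): on failure, free only the entries this
call allocated, tracked in a "new" bitmap, and leave pre-existing entries
untouched so each one is freed exactly once by its original owner.

```c
#include <stdlib.h>

#define NPDPS 8

/* Free only the slots flagged in new_mask; slots that existed before
 * this call are left alone, which avoids the double free of the same
 * pdp described in the review. */
static void rollback_new_only(void *pdps[], unsigned long new_mask, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (new_mask & (1UL << i)) {
			free(pdps[i]);
			pdps[i] = NULL;
		}
	}
}
```

In the 4lvl allocator this would mean the error path iterates only over
'new_pdps' rather than every pml4e touched by the range.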

> +       return ret;
>  }
>
>  static int gen8_alloc_va_range(struct i915_address_space *vm,
> @@ -1054,16 +1218,19 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>         struct i915_hw_ppgtt *ppgtt =
>                 container_of(vm, struct i915_hw_ppgtt, base);
>
> -       if (!USES_FULL_48BIT_PPGTT(vm->dev))
> -               return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
> -       else
> +       if (USES_FULL_48BIT_PPGTT(vm->dev))
>                 return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
> +       else
> +               return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
>  }
>
>  static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
>  {
>         unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> -       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> +       if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> +               pml4_fini(&ppgtt->pml4);
> +       else
> +               unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
>  }
>
>  /**
> @@ -1086,14 +1253,21 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>
>         ppgtt->switch_mm = gen8_mm_switch;
>
> -       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +       if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +               int ret = pml4_init(ppgtt);
> +               if (ret) {
> +                       unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> +                       return ret;
> +               }
> +       } else {
>                 int ret = __pdp_init(&ppgtt->pdp, false);
>                 if (ret) {
>                         unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
>                         return ret;
>                 }
> -       } else
> -               return -EPERM; /* Not yet implemented */
> +
> +               trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, 0, 0, GEN8_PML4E_SHIFT);
> +       }
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 144858e..1477f54 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -87,6 +87,7 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>   */
>  #define GEN8_PML4ES_PER_PML4           512
>  #define GEN8_PML4E_SHIFT               39
> +#define GEN8_PML4E_MASK                        (GEN8_PML4ES_PER_PML4 - 1)
>  #define GEN8_PDPE_SHIFT                        30
>  /* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
>   * tables */
> @@ -427,6 +428,14 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>              temp = min(temp, length),                                  \
>              start += temp, length -= temp)
>
> +#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
> +       for (iter = gen8_pml4e_index(start), pdp = (pml4)->pdps[iter];  \
> +            length > 0 && iter < GEN8_PML4ES_PER_PML4;                 \
> +            pdp = (pml4)->pdps[++iter],                                \
> +            temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,   \
> +            temp = min(temp, length),                                  \
> +            start += temp, length -= temp)
> +
>  #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
>         gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
>
> @@ -458,7 +467,7 @@ static inline uint32_t gen8_pdpe_index(uint64_t address)
>
>  static inline uint32_t gen8_pml4e_index(uint64_t address)
>  {
> -       BUG(); /* For 64B */
> +       return (address >> GEN8_PML4E_SHIFT) & GEN8_PML4E_MASK;
>  }
>
>  static inline size_t gen8_pte_count(uint64_t addr, uint64_t length)
> --
> 2.1.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 06/12] drm/i915/bdw: Add 4 level switching infrastructure
  2015-02-20 17:46 ` [PATCH 06/12] drm/i915/bdw: Add 4 level switching infrastructure Michel Thierry
@ 2015-03-03 13:01   ` akash goel
  2015-03-04 13:08     ` Daniel Vetter
  0 siblings, 1 reply; 32+ messages in thread
From: akash goel @ 2015-03-03 13:01 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:16 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> Map is easy, it's the same register as the PDP descriptor 0, but it only
> has one entry.
>
> v2: PML4 update in legacy context switch is left for historic reasons,
> the preferred mode of operation is with lrc context based submission.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 56 +++++++++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  4 ++-
>  drivers/gpu/drm/i915/i915_reg.h     |  1 +
>  3 files changed, 55 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index edada33..fb06f67 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -192,6 +192,9 @@ static inline gen8_ppgtt_pde_t gen8_pde_encode(struct drm_device *dev,
>         return pde;
>  }
>
> +#define gen8_pdpe_encode gen8_pde_encode
> +#define gen8_pml4e_encode gen8_pde_encode
> +
>  static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
>                                      enum i915_cache_level level,
>                                      bool valid, u32 unused)
> @@ -592,8 +595,8 @@ static int gen8_write_pdp(struct intel_engine_cs *ring,
>         return 0;
>  }
>
> -static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
> -                         struct intel_engine_cs *ring)
> +static int gen8_legacy_mm_switch(struct i915_hw_ppgtt *ppgtt,
> +                                struct intel_engine_cs *ring)
>  {
>         int i, ret;
>
> @@ -610,6 +613,12 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
>         return 0;
>  }
>
> +static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
> +                             struct intel_engine_cs *ring)
> +{
> +       return gen8_write_pdp(ring, 0, ppgtt->pml4.daddr);
> +}
> +
>  static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>                                    uint64_t start,
>                                    uint64_t length,
> @@ -753,6 +762,37 @@ static void gen8_map_pagetable_range(struct i915_address_space *vm,
>         kunmap_atomic(page_directory);
>  }
>
> +static void gen8_map_page_directory(struct i915_page_directory_pointer_entry *pdp,
> +                                   struct i915_page_directory_entry *pd,
> +                                   int index,
> +                                   struct drm_device *dev)
> +{
> +       gen8_ppgtt_pdpe_t *page_directorypo;
> +       gen8_ppgtt_pdpe_t pdpe;
> +
> +       /* We do not need to clflush because no platform requiring flush
> +        * supports 64b pagetables. */

It would be more appropriate to place this comment either after the 'if'
condition or at the end of the function (where the clflush would have
been, had LLC not been present on the platforms supporting 64 bit).
The same comment can probably also be added at the end of the
gen8_map_page_directory_pointer function.

> +       if (!USES_FULL_48BIT_PPGTT(dev))
> +               return;
> +
> +       page_directorypo = kmap_atomic(pdp->page);
> +       pdpe = gen8_pdpe_encode(dev, pd->daddr, I915_CACHE_LLC);
> +       page_directorypo[index] = pdpe;
> +       kunmap_atomic(page_directorypo);
> +}
> +
> +static void gen8_map_page_directory_pointer(struct i915_pml4 *pml4,
> +                                           struct i915_page_directory_pointer_entry *pdp,
> +                                           int index,
> +                                           struct drm_device *dev)
> +{
> +       gen8_ppgtt_pml4e_t *pagemap = kmap_atomic(pml4->page);
> +       gen8_ppgtt_pml4e_t pml4e = gen8_pml4e_encode(dev, pdp->daddr, I915_CACHE_LLC);
> +       BUG_ON(!USES_FULL_48BIT_PPGTT(dev));
> +       pagemap[index] = pml4e;
> +       kunmap_atomic(pagemap);
> +}
> +
>  static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct drm_device *dev)
>  {
>         int i;
> @@ -1124,6 +1164,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>
>                 set_bit(pdpe, pdp->used_pdpes);
>                 gen8_map_pagetable_range(vm, pd, start, length);
> +               gen8_map_page_directory(pdp, pd, pdpe, dev);
>         }
>
>         free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
> @@ -1192,6 +1233,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
>                 ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
>                 if (ret)
>                         goto err_out;
> +
> +               gen8_map_page_directory_pointer(pml4, pdp, pml4e, vm->dev);
>         }
>
>         bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
> @@ -1251,14 +1294,14 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>         ppgtt->base.cleanup = gen8_ppgtt_cleanup;
>         ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
>
> -       ppgtt->switch_mm = gen8_mm_switch;
> -
>         if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
>                 int ret = pml4_init(ppgtt);
>                 if (ret) {
>                         unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
>                         return ret;
>                 }
> +
> +               ppgtt->switch_mm = gen8_48b_mm_switch;
>         } else {
>                 int ret = __pdp_init(&ppgtt->pdp, false);
>                 if (ret) {
> @@ -1266,6 +1309,7 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>                         return ret;
>                 }
>
> +               ppgtt->switch_mm = gen8_legacy_mm_switch;
>                 trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, 0, 0, GEN8_PML4E_SHIFT);
>         }
>
> @@ -1295,6 +1339,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>                 return ret;
>         }
>
> +       /* FIXME: PML4 */
>         gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
>                 gen8_map_pagetable_range(&ppgtt->base, pd,start, size);
>
> @@ -1500,8 +1545,9 @@ static void gen8_ppgtt_enable(struct drm_device *dev)
>         int j;
>
>         for_each_ring(ring, dev_priv, j) {
> +               u32 four_level = USES_FULL_48BIT_PPGTT(dev) ? GEN8_GFX_PPGTT_64B : 0;
>                 I915_WRITE(RING_MODE_GEN7(ring),
> -                          _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
> +                          _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE | four_level));
>         }
>  }
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 1477f54..1f4cdb1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -38,7 +38,9 @@ struct drm_i915_file_private;
>
>  typedef uint32_t gen6_gtt_pte_t;
>  typedef uint64_t gen8_gtt_pte_t;
> -typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
> +typedef gen8_gtt_pte_t         gen8_ppgtt_pde_t;
> +typedef gen8_ppgtt_pde_t       gen8_ppgtt_pdpe_t;
> +typedef gen8_ppgtt_pdpe_t      gen8_ppgtt_pml4e_t;
>
>  #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 1dc91de..305e5b7 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1338,6 +1338,7 @@ enum skl_disp_power_wells {
>  #define   GFX_REPLAY_MODE              (1<<11)
>  #define   GFX_PSMI_GRANULARITY         (1<<10)
>  #define   GFX_PPGTT_ENABLE             (1<<9)
> +#define   GEN8_GFX_PPGTT_64B           (1<<7)
>
>  #define VLV_DISPLAY_BASE 0x180000
>  #define VLV_MIPI_BASE VLV_DISPLAY_BASE
> --
> 2.1.1
>

* Re: [PATCH 07/12] drm/i915/bdw: Support 64 bit PPGTT in lrc mode
  2015-02-20 17:46 ` [PATCH 07/12] drm/i915/bdw: Support 64 bit PPGTT in lrc mode Michel Thierry
@ 2015-03-03 13:08   ` akash goel
  0 siblings, 0 replies; 32+ messages in thread
From: akash goel @ 2015-03-03 13:08 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:16 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
> the base address to PML4, while the other PDP registers are ignored.
>
> Also, the addressing mode must be specified in every context descriptor.
>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 167 ++++++++++++++++++++++++++-------------
>  1 file changed, 114 insertions(+), 53 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f461631..2b6d262 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -255,7 +255,8 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
>  }
>
>  static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
> -                                        struct drm_i915_gem_object *ctx_obj)
> +                                        struct drm_i915_gem_object *ctx_obj,
> +                                        bool legacy_64bit_ctx)

The 'legacy_64bit_ctx' flag can be derived within the
execlists_ctx_descriptor function itself, through the
USES_FULL_48BIT_PPGTT macro, since the 'dev' pointer is already available
there. Doing so avoids modifying the function's prototype and changing
'execlists_elsp_write'.
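A user-space sketch of that suggestion (the mode values mirror the
LEGACY_CONTEXT / LEGACY_64B_CONTEXT enum and GEN8_CTX_MODE_SHIFT from the
quoted patch, but treat the exact constants as illustrative): deriving the
addressing mode inside the helper keeps the caller's prototype unchanged.

```c
#include <stdint.h>
#include <stdbool.h>

#define CTX_VALID        (1u << 0)
#define CTX_MODE_SHIFT   3
#define MODE_LEGACY      1  /* LEGACY_CONTEXT */
#define MODE_LEGACY_64B  3  /* LEGACY_64B_CONTEXT */

/* uses_48bit stands in for USES_FULL_48BIT_PPGTT(dev), which the real
 * function can evaluate itself since it already has the dev pointer;
 * no extra parameter needs to be threaded through the callers. */
static uint64_t ctx_descriptor(uint64_t lrca, bool uses_48bit)
{
	uint64_t desc = CTX_VALID;

	desc |= (uint64_t)(uses_48bit ? MODE_LEGACY_64B : MODE_LEGACY)
		<< CTX_MODE_SHIFT;
	desc |= lrca;
	return desc;
}
```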

>  {
>         struct drm_device *dev = ring->dev;
>         uint64_t desc;
> @@ -264,7 +265,10 @@ static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
>         WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
>
>         desc = GEN8_CTX_VALID;
> -       desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
> +       if (legacy_64bit_ctx)
> +               desc |= LEGACY_64B_CONTEXT << GEN8_CTX_MODE_SHIFT;
> +       else
> +               desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
>         desc |= GEN8_CTX_L3LLC_COHERENT;
>         desc |= GEN8_CTX_PRIVILEGE;
>         desc |= lrca;
> @@ -292,16 +296,17 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
>         struct drm_i915_private *dev_priv = dev->dev_private;
>         uint64_t temp = 0;
>         uint32_t desc[4];
> +       bool legacy_64bit_ctx = USES_FULL_48BIT_PPGTT(dev);
>
>         /* XXX: You must always write both descriptors in the order below. */
>         if (ctx_obj1)
> -               temp = execlists_ctx_descriptor(ring, ctx_obj1);
> +               temp = execlists_ctx_descriptor(ring, ctx_obj1, legacy_64bit_ctx);
>         else
>                 temp = 0;
>         desc[1] = (u32)(temp >> 32);
>         desc[0] = (u32)temp;
>
> -       temp = execlists_ctx_descriptor(ring, ctx_obj0);
> +       temp = execlists_ctx_descriptor(ring, ctx_obj0, legacy_64bit_ctx);
>         desc[3] = (u32)(temp >> 32);
>         desc[2] = (u32)temp;
>
> @@ -332,37 +337,60 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
>         reg_state[CTX_RING_TAIL+1] = tail;
>         reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
>
> -       /* True PPGTT with dynamic page allocation: update PDP registers and
> -        * point the unallocated PDPs to the scratch page
> -        */
> -       if (ppgtt) {
> +       if (ppgtt && USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +               /* True 64b PPGTT (48bit canonical)
> +                * PDP0_DESCRIPTOR contains the base address to PML4 and
> +                * other PDP Descriptors are ignored
> +                */
> +               reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pml4.daddr);
> +               reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pml4.daddr);
> +       } else if (ppgtt) {
> +               /* True 32b PPGTT with dynamic page allocation: update PDP
> +                * registers and point the unallocated PDPs to the scratch page
> +                */
>                 if (test_bit(3, ppgtt->pdp.used_pdpes)) {
> -                       reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
> -                       reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
> +                       reg_state[CTX_PDP3_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
> +                       reg_state[CTX_PDP3_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
>                 } else {
> -                       reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -                       reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP3_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP3_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
>                 }
>                 if (test_bit(2, ppgtt->pdp.used_pdpes)) {
> -                       reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
> -                       reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
> +                       reg_state[CTX_PDP2_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
> +                       reg_state[CTX_PDP2_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
>                 } else {
> -                       reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -                       reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP2_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP2_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
>                 }
>                 if (test_bit(1, ppgtt->pdp.used_pdpes)) {
> -                       reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
> -                       reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
> +                       reg_state[CTX_PDP1_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
> +                       reg_state[CTX_PDP1_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
>                 } else {
> -                       reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -                       reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP1_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP1_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
>                 }
>                 if (test_bit(0, ppgtt->pdp.used_pdpes)) {
> -                       reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
> -                       reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
> +                       reg_state[CTX_PDP0_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
> +                       reg_state[CTX_PDP0_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
>                 } else {
> -                       reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -                       reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP0_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP0_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
>                 }
>         }
>
> @@ -1771,36 +1799,69 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
>         reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
>         reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
>
> -       /* With dynamic page allocation, PDPs may not be allocated at this point,
> -        * Point the unallocated PDPs to the scratch page
> -        */
> -       if (test_bit(3, ppgtt->pdp.used_pdpes)) {
> -               reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
> -               reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
> -       } else {
> -               reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -               reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> -       }
> -       if (test_bit(2, ppgtt->pdp.used_pdpes)) {
> -               reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
> -               reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
> -       } else {
> -               reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -               reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> -       }
> -       if (test_bit(1, ppgtt->pdp.used_pdpes)) {
> -               reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
> -               reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
> -       } else {
> -               reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -               reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> -       }
> -       if (test_bit(0, ppgtt->pdp.used_pdpes)) {
> -               reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
> -               reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
> +       if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +               /* 64b PPGTT (48bit canonical)
> +                * PDP0_DESCRIPTOR contains the base address to PML4 and
> +                * other PDP Descriptors are ignored
> +                */
> +               reg_state[CTX_PDP3_UDW+1] = 0;
> +               reg_state[CTX_PDP3_LDW+1] = 0;
> +               reg_state[CTX_PDP2_UDW+1] = 0;
> +               reg_state[CTX_PDP2_LDW+1] = 0;
> +               reg_state[CTX_PDP1_UDW+1] = 0;
> +               reg_state[CTX_PDP1_LDW+1] = 0;
> +               reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pml4.daddr);
> +               reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pml4.daddr);
>         } else {
> -               reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr);
> -               reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr);
> +               /* 32b PPGTT
> +                * PDP*_DESCRIPTOR contains the base address of space supported.
> +                * With dynamic page allocation, PDPs may not be allocated at
> +                * this point. Point the unallocated PDPs to the scratch page
> +                */
> +               if (test_bit(3, ppgtt->pdp.used_pdpes)) {
> +                       reg_state[CTX_PDP3_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[3]->daddr);
> +                       reg_state[CTX_PDP3_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[3]->daddr);
> +               } else {
> +                       reg_state[CTX_PDP3_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP3_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
> +               }
> +               if (test_bit(2, ppgtt->pdp.used_pdpes)) {
> +                       reg_state[CTX_PDP2_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[2]->daddr);
> +                       reg_state[CTX_PDP2_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[2]->daddr);
> +               } else {
> +                       reg_state[CTX_PDP2_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP2_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
> +               }
> +               if (test_bit(1, ppgtt->pdp.used_pdpes)) {
> +                       reg_state[CTX_PDP1_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[1]->daddr);
> +                       reg_state[CTX_PDP1_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[1]->daddr);
> +               } else {
> +                       reg_state[CTX_PDP1_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP1_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
> +               }
> +               if (test_bit(0, ppgtt->pdp.used_pdpes)) {
> +                       reg_state[CTX_PDP0_UDW+1] =
> +                                       upper_32_bits(ppgtt->pdp.page_directory[0]->daddr);
> +                       reg_state[CTX_PDP0_LDW+1] =
> +                                       lower_32_bits(ppgtt->pdp.page_directory[0]->daddr);
> +               } else {
> +                       reg_state[CTX_PDP0_UDW+1] =
> +                                       upper_32_bits(ppgtt->scratch_pd->daddr);
> +                       reg_state[CTX_PDP0_LDW+1] =
> +                                       lower_32_bits(ppgtt->scratch_pd->daddr);
> +               }
>         }
>
>         if (ring->id == RCS) {
> --
> 2.1.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 00/12] PPGTT with 48b addressing
  2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
                   ` (12 preceding siblings ...)
  2015-02-24 10:54 ` [PATCH 00/12] PPGTT with 48b addressing Daniel Vetter
@ 2015-03-03 13:52 ` Damien Lespiau
  13 siblings, 0 replies; 32+ messages in thread
From: Damien Lespiau @ 2015-03-03 13:52 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Fri, Feb 20, 2015 at 05:45:54PM +0000, Michel Thierry wrote:
> These patches rely on "PPGTT dynamic page allocations", currently under review,
> to provide GEN8 dynamic page table support with 64b addresses. As the review
> progresses, these patches may be combined.
> 
> In order to expand the GPU address space, a 4th level translation is added, the
> Page Map Level 4 (PML4). This PML4 has 256 PML4 Entries (PML4E), PML4[0-255],
> each pointing to a PDP.
> 
> For now, this feature will only be available in BDW, in LRC submission mode
> (execlists) and when i915.enable_ppgtt=3 is set.
> Also note that this expanded address space is only available for full PPGTT,
> aliasing PPGTT remains 32b.

FWIW, I don't think it's a good idea to enable 48-bit address
spaces without having implemented Wa32bitGeneralStateOffset and
Wa32bitInstructionBaseOffset.

-- 
Damien


* Re: [PATCH 10/12] drm/i915/bdw: Add 4 level support in insert_entries and clear_range
  2015-02-20 17:46 ` [PATCH 10/12] drm/i915/bdw: Add 4 level support in insert_entries and clear_range Michel Thierry
@ 2015-03-03 16:39   ` akash goel
  0 siblings, 0 replies; 32+ messages in thread
From: akash goel @ 2015-03-03 16:39 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:16 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map
> Level 4 (PML4), before it selects which Page Directory Pointer (PDP)
> it will write to.
>
> Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range.
>
> Also add a scratch page for PML4.
>
> This patch was inspired by Ben's "Depend exclusively on map and
> unmap_vma".
>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 66 ++++++++++++++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_gem_gtt.h | 12 +++++++
>  2 files changed, 67 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index a1396cb..0954827 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -676,24 +676,52 @@ static void gen8_ppgtt_clear_pte_range(struct i915_page_directory_pointer_entry
>         }
>  }
>
> +static void gen8_ppgtt_clear_range_4lvl(struct i915_hw_ppgtt *ppgtt,
> +                                       gen8_gtt_pte_t scratch_pte,
> +                                       uint64_t start,
> +                                       uint64_t length)
> +{
> +       struct i915_page_directory_pointer_entry *pdp;
> +       uint64_t templ4, templ3, pml4e, pdpe;
> +
> +       gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
> +               struct i915_page_directory_entry *pd;
> +               uint64_t pdp_len = gen8_clamp_pdp(start, length);
> +               uint64_t pdp_start = start;
> +
> +               gen8_for_each_pdpe(pd, pdp, pdp_start, pdp_len, templ3, pdpe) {

The 'gen8_ppgtt_clear_pte_range' function is already equipped to switch to
a new page directory as it advances, so an outer loop over the PML4
entries, combined with gen8_clamp_pdp, should suffice.
The inner 'gen8_for_each_pdpe' loop is not really needed.

> +                       uint64_t pd_len = gen8_clamp_pd(pdp_start, pdp_len);
> +                       uint64_t pd_start = pdp_start;
> +
> +                       gen8_ppgtt_clear_pte_range(pdp, pd_start, pd_len,
> +                                                  scratch_pte, !HAS_LLC(ppgtt->base.dev));
> +               }
> +       }
> +}
> +
>  static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
> -                                  uint64_t start,
> -                                  uint64_t length,
> +                                  uint64_t start, uint64_t length,
>                                    bool use_scratch)
>  {
>         struct i915_hw_ppgtt *ppgtt =
> -               container_of(vm, struct i915_hw_ppgtt, base);
> -       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
> -
> +                       container_of(vm, struct i915_hw_ppgtt, base);
>         gen8_gtt_pte_t scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
>                                                      I915_CACHE_LLC, use_scratch);
>
> -       gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte, !HAS_LLC(vm->dev));
> +       if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
> +               struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp;
> +
> +               gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte,
> +                                          !HAS_LLC(ppgtt->base.dev));
> +       } else {
> +               gen8_ppgtt_clear_range_4lvl(ppgtt, scratch_pte, start, length);
> +       }
>  }
>
>  static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_entry *pdp,
>                                           struct sg_page_iter *sg_iter,
>                                           uint64_t start,
> +                                         size_t pages,
>                                           enum i915_cache_level cache_level,
>                                           const bool flush)
>  {
> @@ -704,7 +732,7 @@ static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_ent
>
>         pt_vaddr = NULL;
>
> -       while (__sg_page_iter_next(sg_iter)) {
> +       while (pages-- && __sg_page_iter_next(sg_iter)) {
>                 if (pt_vaddr == NULL) {
>                         struct i915_page_directory_entry *pd = pdp->page_directory[pdpe];
>                         struct i915_page_table_entry *pt = pd->page_tables[pde];
> @@ -742,11 +770,26 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>                                       u32 unused)
>  {
>         struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base);
> -       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
> +       struct i915_page_directory_pointer_entry *pdp;
>         struct sg_page_iter sg_iter;
>
>         __sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0);
> -       gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start, cache_level, !HAS_LLC(vm->dev));
> +
> +       if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
> +               pdp = &ppgtt->pdp;
> +               gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start,
> +                               sg_nents(pages->sgl),
> +                               cache_level, !HAS_LLC(vm->dev));
> +       } else {
> +               struct i915_pml4 *pml4;
> +               unsigned pml4e = gen8_pml4e_index(start);
> +
> +               pml4 = &ppgtt->pml4;
> +               pdp = pml4->pdps[pml4e];

Since an object could get mapped such that it straddles a pdp boundary,
and 'gen8_ppgtt_insert_pte_entries' can't switch to a new page directory
pointer table (pdp) on its own, a modification is required here: loop
with 'gen8_for_each_pml4e', using 'gen8_clamp_pdp', similar to how
gen8_ppgtt_clear_range_4lvl is implemented.

> +               gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start,
> +                               sg_nents(pages->sgl),
> +                               cache_level, !HAS_LLC(vm->dev));
> +       }
>  }
>
>  static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde,
> @@ -1185,7 +1228,8 @@ static int __gen8_alloc_vma_range_3lvl(struct i915_address_space *vm,
>                         if (sg_iter) {
>                                 BUG_ON(!sg_iter->__nents);
>                                 gen8_ppgtt_insert_pte_entries(pdp, sg_iter, pd_start,
> -                                                             flags, !HAS_LLC(vm->dev));
> +                                               gen8_pte_count(pd_start, pd_len),
> +                                               flags, !HAS_LLC(vm->dev));
>                         }
>                         set_bit(pde, pd->used_pdes);
>                 }
> @@ -1330,7 +1374,7 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>         if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
>                 int ret = pml4_init(ppgtt);
>                 if (ret) {
> -                       unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> +                       unmap_and_free_pt(ppgtt->scratch_pml4, ppgtt->base.dev);
>                         return ret;
>                 }
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 1f4cdb1..602d446c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -332,6 +332,7 @@ struct i915_hw_ppgtt {
>         union {
>                 struct i915_page_table_entry *scratch_pt;
>                 struct i915_page_table_entry *scratch_pd; /* Just need the daddr */
> +               struct i915_page_table_entry *scratch_pml4;
>         };
>
>         struct drm_i915_file_private *file_priv;
> @@ -452,6 +453,17 @@ static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length)
>         return next_pd - start;
>  }
>
> +/* Clamp length to the next page_directory pointer boundary */
> +static inline uint64_t gen8_clamp_pdp(uint64_t start, uint64_t length)
> +{
> +       uint64_t next_pdp = ALIGN(start + 1, 1ULL << GEN8_PML4E_SHIFT);
> +
> +       if (next_pdp > (start + length))
> +               return length;
> +
> +       return next_pdp - start;
> +}
> +
>  static inline uint32_t gen8_pte_index(uint64_t address)
>  {
>         return i915_pte_index(address, GEN8_PDE_SHIFT);
> --
> 2.1.1
>


* Re: [PATCH 11/12] drm/i915: Expand error state's address width to 64b
  2015-02-20 17:46 ` [PATCH 11/12] drm/i915: Expand error state's address width to 64b Michel Thierry
@ 2015-03-03 16:42   ` akash goel
  0 siblings, 0 replies; 32+ messages in thread
From: akash goel @ 2015-03-03 16:42 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:16 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> v2: 0 pad the new 8B fields or else intel_error_decode has a hard time.
> Note, regardless we need an igt update.
>
> v3: Make reloc_offset 64b also.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h       |  4 ++--
>  drivers/gpu/drm/i915/i915_gpu_error.c | 17 +++++++++--------
>  2 files changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index af0d149..056ced5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -459,7 +459,7 @@ struct drm_i915_error_state {
>
>                 struct drm_i915_error_object {
>                         int page_count;
> -                       u32 gtt_offset;
> +                       u64 gtt_offset;
>                         u32 *pages[0];
>                 } *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
>
> @@ -485,7 +485,7 @@ struct drm_i915_error_state {
>                 u32 size;
>                 u32 name;
>                 u32 rseqno, wseqno;
> -               u32 gtt_offset;
> +               u64 gtt_offset;
>                 u32 read_domains;
>                 u32 write_domain;
>                 s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index a982849..bbf25d0 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -195,7 +195,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
>         err_printf(m, "  %s [%d]:\n", name, count);
>
>         while (count--) {
> -               err_printf(m, "    %08x %8u %02x %02x %x %x",
> +               err_printf(m, "    %016llx %8u %02x %02x %x %x",
>                            err->gtt_offset,
>                            err->size,
>                            err->read_domains,
> @@ -415,7 +415,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>                                 err_printf(m, " (submitted by %s [%d])",
>                                            error->ring[i].comm,
>                                            error->ring[i].pid);
> -                       err_printf(m, " --- gtt_offset = 0x%08x\n",
> +                       err_printf(m, " --- gtt_offset = 0x%016llx\n",
>                                    obj->gtt_offset);
>                         print_error_obj(m, obj);
>                 }
> @@ -423,7 +423,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>                 obj = error->ring[i].wa_batchbuffer;
>                 if (obj) {
>                         err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
> -                                  dev_priv->ring[i].name, obj->gtt_offset);
> +                                  dev_priv->ring[i].name,
> +                                  lower_32_bits(obj->gtt_offset));
>                         print_error_obj(m, obj);
>                 }
>
> @@ -442,14 +443,14 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>                 if ((obj = error->ring[i].ringbuffer)) {
>                         err_printf(m, "%s --- ringbuffer = 0x%08x\n",
>                                    dev_priv->ring[i].name,
> -                                  obj->gtt_offset);
> +                                  lower_32_bits(obj->gtt_offset));
>                         print_error_obj(m, obj);
>                 }
>
>                 if ((obj = error->ring[i].hws_page)) {
>                         err_printf(m, "%s --- HW Status = 0x%08x\n",
>                                    dev_priv->ring[i].name,
> -                                  obj->gtt_offset);
> +                                  lower_32_bits(obj->gtt_offset));
>                         offset = 0;
>                         for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
>                                 err_printf(m, "[%04x] %08x %08x %08x %08x\n",
> @@ -465,13 +466,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>                 if ((obj = error->ring[i].ctx)) {
>                         err_printf(m, "%s --- HW Context = 0x%08x\n",
>                                    dev_priv->ring[i].name,
> -                                  obj->gtt_offset);
> +                                  lower_32_bits(obj->gtt_offset));
>                         print_error_obj(m, obj);
>                 }
>         }
>
>         if ((obj = error->semaphore_obj)) {
> -               err_printf(m, "Semaphore page = 0x%08x\n", obj->gtt_offset);
> +               err_printf(m, "Semaphore page = 0x%016llx\n", obj->gtt_offset);

Can 'lower_32_bits' be used for the semaphore object as well?
It's mapped into the GGTT at render ring init time, so it should never
have an offset above 4 GB.

>                 for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
>                         err_printf(m, "[%04x] %08x %08x %08x %08x\n",
>                                    elt * 4,
> @@ -571,7 +572,7 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
>         int num_pages;
>         bool use_ggtt;
>         int i = 0;
> -       u32 reloc_offset;
> +       u64 reloc_offset;
>
>         if (src == NULL || src->pages == NULL)
>                 return NULL;
> --
> 2.1.1
>


* Re: [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl
  2015-02-20 17:45 ` [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl Michel Thierry
  2015-03-03 12:55   ` akash goel
@ 2015-03-04  2:48   ` akash goel
  1 sibling, 0 replies; 32+ messages in thread
From: akash goel @ 2015-03-04  2:48 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> The code for 4lvl works just as one would expect, and nicely it is able
> to call into the existing 3lvl page table code to handle all of the
> lower levels.
>
> PML4 has no special attributes, and there will always be a PML4.
> So simply initialize it at creation, and destroy it at the end.
>
> v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
> compiler happy. And define ret only in one place.
> Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 240 +++++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  11 +-
>  2 files changed, 217 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 1edcc17..edada33 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -483,9 +483,12 @@ static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp)
>  static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp,
>                             struct drm_device *dev)
>  {
> -       __pdp_fini(pdp);
> -       if (USES_FULL_48BIT_PPGTT(dev))
> +       if (USES_FULL_48BIT_PPGTT(dev)) {
> +               __pdp_fini(pdp);
> +               i915_dma_unmap_single(pdp, dev);
> +               __free_page(pdp->page);
>                 kfree(pdp);
> +       }
>  }
>
>  static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
> @@ -511,6 +514,60 @@ static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
>         return 0;
>  }
>
> +static struct i915_page_directory_pointer_entry *alloc_pdp_single(struct i915_hw_ppgtt *ppgtt,
> +                                              struct i915_pml4 *pml4)
> +{
> +       struct drm_device *dev = ppgtt->base.dev;
> +       struct i915_page_directory_pointer_entry *pdp;
> +       int ret;
> +
> +       BUG_ON(!USES_FULL_48BIT_PPGTT(dev));
> +
> +       pdp = kmalloc(sizeof(*pdp), GFP_KERNEL);
> +       if (!pdp)
> +               return ERR_PTR(-ENOMEM);
> +
> +       pdp->page = alloc_page(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO);
> +       if (!pdp->page) {
> +               kfree(pdp);
> +               return ERR_PTR(-ENOMEM);
> +       }
> +
> +       ret = __pdp_init(pdp, dev);
> +       if (ret) {
> +               __free_page(pdp->page);
> +               kfree(pdp);
> +               return ERR_PTR(ret);
> +       }
> +
> +       i915_dma_map_px_single(pdp, dev);
> +
> +       return pdp;
> +}
> +
> +static void pml4_fini(struct i915_pml4 *pml4)
> +{
> +       struct i915_hw_ppgtt *ppgtt =
> +               container_of(pml4, struct i915_hw_ppgtt, pml4);
> +       i915_dma_unmap_single(pml4, ppgtt->base.dev);
> +       __free_page(pml4->page);
> +       /* HACK */
> +       pml4->page = NULL;
> +}
> +
> +static int pml4_init(struct i915_hw_ppgtt *ppgtt)
> +{
> +       struct i915_pml4 *pml4 = &ppgtt->pml4;
> +
> +       pml4->page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +       if (!pml4->page)
> +               return -ENOMEM;
> +
> +       i915_dma_map_px_single(pml4, ppgtt->base.dev);
> +
> +       return 0;
> +}
> +
>  /* Broadwell Page Directory Pointer Descriptors */
>  static int gen8_write_pdp(struct intel_engine_cs *ring,
>                           unsigned entry,
> @@ -712,14 +769,13 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d
>         }
>  }
>
> -static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
> +static void gen8_ppgtt_unmap_pages_3lvl(struct i915_page_directory_pointer_entry *pdp,
> +                                       struct drm_device *dev)
>  {
> -       struct pci_dev *hwdev = ppgtt->base.dev->pdev;
> -       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
> +       struct pci_dev *hwdev = dev->pdev;
>         int i, j;
>
> -       for_each_set_bit(i, pdp->used_pdpes,
> -                       I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> +       for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
>                 struct i915_page_directory_entry *pd;
>
>                 if (WARN_ON(!pdp->page_directory[i]))
> @@ -747,27 +803,73 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
>         }
>  }
>
> -static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
> +static void gen8_ppgtt_unmap_pages_4lvl(struct i915_hw_ppgtt *ppgtt)
>  {
> +       struct pci_dev *hwdev = ppgtt->base.dev->pdev;
>         int i;
>
> -       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> -               for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> -                                I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> -                       if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> -                               continue;
> +       for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
> +               struct i915_page_directory_pointer_entry *pdp;
>
> -                       gen8_free_page_tables(ppgtt->pdp.page_directory[i],
> -                                             ppgtt->base.dev);
> -                       unmap_and_free_pd(ppgtt->pdp.page_directory[i],
> -                                         ppgtt->base.dev);
> -               }
> -               unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> -       } else {
> -               BUG(); /* to be implemented later */
> +               if (WARN_ON(!ppgtt->pml4.pdps[i]))
> +                       continue;
> +
> +               pdp = ppgtt->pml4.pdps[i];
> +               if (!pdp->daddr)

Should this 'if' condition be the other way round, i.e. 'if (pdp->daddr)'?
The same applies to gen8_ppgtt_unmap_pages_3lvl, for the unmapping of the pd page.


> +                       pci_unmap_page(hwdev, pdp->daddr, PAGE_SIZE,
> +                                      PCI_DMA_BIDIRECTIONAL);
> +
> +               gen8_ppgtt_unmap_pages_3lvl(ppgtt->pml4.pdps[i],
> +                                           ppgtt->base.dev);
>         }
>  }
>
> +static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
> +{
> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> +               gen8_ppgtt_unmap_pages_3lvl(&ppgtt->pdp, ppgtt->base.dev);
> +       else
> +               gen8_ppgtt_unmap_pages_4lvl(ppgtt);
> +}
> +
> +static void gen8_ppgtt_free_3lvl(struct i915_page_directory_pointer_entry *pdp,
> +                                struct drm_device *dev)
> +{
> +       int i;
> +
> +       for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
> +               if (WARN_ON(!pdp->page_directory[i]))
> +                       continue;
> +
> +               gen8_free_page_tables(pdp->page_directory[i], dev);
> +               unmap_and_free_pd(pdp->page_directory[i], dev);
> +       }
> +
> +       unmap_and_free_pdp(pdp, dev);
> +}
> +
> +static void gen8_ppgtt_free_4lvl(struct i915_hw_ppgtt *ppgtt)
> +{
> +       int i;
> +
> +       for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
> +               if (WARN_ON(!ppgtt->pml4.pdps[i]))
> +                       continue;
> +
> +               gen8_ppgtt_free_3lvl(ppgtt->pml4.pdps[i], ppgtt->base.dev);
> +       }
> +
> +       pml4_fini(&ppgtt->pml4);
> +}
> +
> +static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
> +{
> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> +               gen8_ppgtt_free_3lvl(&ppgtt->pdp, ppgtt->base.dev);
> +       else
> +               gen8_ppgtt_free_4lvl(ppgtt);
> +}
> +
>  static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>  {
>         struct i915_hw_ppgtt *ppgtt =
> @@ -1040,12 +1142,74 @@ err_out:
>         return ret;
>  }
>
> -static int __noreturn gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> -                                              struct i915_pml4 *pml4,
> -                                              uint64_t start,
> -                                              uint64_t length)
> +static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> +                                   struct i915_pml4 *pml4,
> +                                   uint64_t start,
> +                                   uint64_t length)
>  {
> -       BUG(); /* to be implemented later */
> +       DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
> +       struct i915_hw_ppgtt *ppgtt =
> +               container_of(vm, struct i915_hw_ppgtt, base);
> +       struct i915_page_directory_pointer_entry *pdp;
> +       const uint64_t orig_start = start;
> +       const uint64_t orig_length = length;
> +       uint64_t temp, pml4e;
> +       int ret = 0;
> +
> +       /* Do the pml4 allocations first, so we don't need to track the newly
> +        * allocated tables below the pdp */
> +       bitmap_zero(new_pdps, GEN8_PML4ES_PER_PML4);
> +
> +       /* The page directory and page table allocations are done in the shared 3
> +        * and 4 level code. Just allocate the pdps.
> +        */
> +       gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> +               if (!pdp) {
> +                       WARN_ON(test_bit(pml4e, pml4->used_pml4es));
> +                       pdp = alloc_pdp_single(ppgtt, pml4);
> +                       if (IS_ERR(pdp))
> +                               goto err_alloc;
> +
> +                       pml4->pdps[pml4e] = pdp;
> +                       set_bit(pml4e, new_pdps);
> +                       trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, pml4e,
> +                                                  pml4e << GEN8_PML4E_SHIFT,
> +                                                  GEN8_PML4E_SHIFT);
> +
> +               }
> +       }
> +
> +       WARN(bitmap_weight(new_pdps, GEN8_PML4ES_PER_PML4) > 2,
> +            "The allocation has spanned more than 512GB. "
> +            "It is highly likely this is incorrect.");
> +
> +       start = orig_start;
> +       length = orig_length;
> +
> +       gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> +               BUG_ON(!pdp);
> +
> +               ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
> +               if (ret)
> +                       goto err_out;
> +       }
> +
> +       bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
> +                 GEN8_PML4ES_PER_PML4);
> +
> +       return 0;
> +
> +err_out:
> +       start = orig_start;
> +       length = orig_length;
> +       gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e)
> +               gen8_ppgtt_free_3lvl(pdp, vm->dev);
> +
> +err_alloc:
> +       for_each_set_bit(pml4e, new_pdps, GEN8_PML4ES_PER_PML4)
> +               unmap_and_free_pdp(pdp, vm->dev);
> +
> +       return ret;
>  }
>
>  static int gen8_alloc_va_range(struct i915_address_space *vm,
> @@ -1054,16 +1218,19 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>         struct i915_hw_ppgtt *ppgtt =
>                 container_of(vm, struct i915_hw_ppgtt, base);
>
> -       if (!USES_FULL_48BIT_PPGTT(vm->dev))
> -               return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
> -       else
> +       if (USES_FULL_48BIT_PPGTT(vm->dev))
>                 return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
> +       else
> +               return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
>  }
>
>  static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
>  {
>         unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> -       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> +       if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> +               pml4_fini(&ppgtt->pml4);
> +       else
> +               unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
>  }
>
>  /**
> @@ -1086,14 +1253,21 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>
>         ppgtt->switch_mm = gen8_mm_switch;
>
> -       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +       if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +               int ret = pml4_init(ppgtt);
> +               if (ret) {
> +                       unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> +                       return ret;
> +               }
> +       } else {
>                 int ret = __pdp_init(&ppgtt->pdp, false);
>                 if (ret) {
>                         unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
>                         return ret;
>                 }
> -       } else
> -               return -EPERM; /* Not yet implemented */
> +
> +               trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, 0, 0, GEN8_PML4E_SHIFT);
> +       }
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 144858e..1477f54 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -87,6 +87,7 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>   */
>  #define GEN8_PML4ES_PER_PML4           512
>  #define GEN8_PML4E_SHIFT               39
> +#define GEN8_PML4E_MASK                        (GEN8_PML4ES_PER_PML4 - 1)
>  #define GEN8_PDPE_SHIFT                        30
>  /* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
>   * tables */
> @@ -427,6 +428,14 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>              temp = min(temp, length),                                  \
>              start += temp, length -= temp)
>
> +#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
> +       for (iter = gen8_pml4e_index(start), pdp = (pml4)->pdps[iter];  \
> +            length > 0 && iter < GEN8_PML4ES_PER_PML4;                 \
> +            pdp = (pml4)->pdps[++iter],                                \
> +            temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,   \
> +            temp = min(temp, length),                                  \
> +            start += temp, length -= temp)
> +
>  #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
>         gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
>
> @@ -458,7 +467,7 @@ static inline uint32_t gen8_pdpe_index(uint64_t address)
>
>  static inline uint32_t gen8_pml4e_index(uint64_t address)
>  {
> -       BUG(); /* For 64B */
> +       return (address >> GEN8_PML4E_SHIFT) & GEN8_PML4E_MASK;
>  }
>
>  static inline size_t gen8_pte_count(uint64_t addr, uint64_t length)
> --
> 2.1.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 02/12] drm/i915/bdw: Abstract PDP usage
  2015-02-20 17:45 ` [PATCH 02/12] drm/i915/bdw: Abstract PDP usage Michel Thierry
  2015-03-03 12:16   ` akash goel
@ 2015-03-04  3:07   ` akash goel
  1 sibling, 0 replies; 32+ messages in thread
From: akash goel @ 2015-03-04  3:07 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
<michel.thierry@intel.com> wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
>
> Up until now, ppgtt->pdp has always been the root of our page tables.
> Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.
>
> In preparation for 4 level page tables, we need to stop use ppgtt->pdp
> directly unless we know it's what we want. The future structure will use
> ppgtt->pml4 for the top level, and the pdp is just one of the entries
> being pointed to by a pml4e.
>
> v2: Updated after dynamic page allocation changes.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 123 ++++++++++++++++++++----------------
>  1 file changed, 70 insertions(+), 53 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 489f8db..d3ad517 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -560,6 +560,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>  {
>         struct i915_hw_ppgtt *ppgtt =
>                 container_of(vm, struct i915_hw_ppgtt, base);
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         gen8_gtt_pte_t *pt_vaddr, scratch_pte;
>         unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
>         unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
> @@ -575,10 +576,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>                 struct i915_page_table_entry *pt;
>                 struct page *page_table;
>
> -               if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
> +               if (WARN_ON(!pdp->page_directory[pdpe]))
>                         continue;
>
> -               pd = ppgtt->pdp.page_directory[pdpe];
> +               pd = pdp->page_directory[pdpe];
>
>                 if (WARN_ON(!pd->page_tables[pde]))
>                         continue;
> @@ -620,6 +621,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>  {
>         struct i915_hw_ppgtt *ppgtt =
>                 container_of(vm, struct i915_hw_ppgtt, base);
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         gen8_gtt_pte_t *pt_vaddr;
>         unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
>         unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
> @@ -630,7 +632,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>
>         for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
>                 if (pt_vaddr == NULL) {
> -                       struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe];
> +                       struct i915_page_directory_entry *pd = pdp->page_directory[pdpe];
>                         struct i915_page_table_entry *pt = pd->page_tables[pde];
>                         struct page *page_table = pt->page;
>
> @@ -708,16 +710,17 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d
>  static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
>  {
>         struct pci_dev *hwdev = ppgtt->base.dev->pdev;
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         int i, j;
>
> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> +       for_each_set_bit(i, pdp->used_pdpes,
>                         I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>                 struct i915_page_directory_entry *pd;
>
> -               if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> +               if (WARN_ON(!pdp->page_directory[i]))
>                         continue;
>
> -               pd = ppgtt->pdp.page_directory[i];
> +               pd = pdp->page_directory[i];
>                 if (!pd->daddr)
>                         pci_unmap_page(hwdev, pd->daddr, PAGE_SIZE,
>                                         PCI_DMA_BIDIRECTIONAL);
> @@ -743,15 +746,21 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
>  {
>         int i;
>
> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> -                               I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> -               if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> -                       continue;
> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> +               for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> +                                I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> +                       if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> +                               continue;
>
> -               gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
> -               unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
> +                       gen8_free_page_tables(ppgtt->pdp.page_directory[i],
> +                                             ppgtt->base.dev);
> +                       unmap_and_free_pd(ppgtt->pdp.page_directory[i],
> +                                         ppgtt->base.dev);
> +               }
> +               unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> +       } else {
> +               BUG(); /* to be implemented later */
>         }
> -       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
>  }
>
>  static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
> @@ -765,7 +774,7 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>
>  /**
>   * gen8_ppgtt_alloc_pagetabs() - Allocate page tables for VA range.
> - * @ppgtt:     Master ppgtt structure.
> + * @vm:                Master vm structure.
>   * @pd:                Page directory for this address range.
>   * @start:     Starting virtual address to begin allocations.
>   * @length     Size of the allocations.
> @@ -781,12 +790,13 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>   *
>   * Return: 0 if success; negative error code otherwise.
>   */
> -static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
> +static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
>                                      struct i915_page_directory_entry *pd,
>                                      uint64_t start,
>                                      uint64_t length,
>                                      unsigned long *new_pts)
>  {
> +       struct drm_device *dev = vm->dev;
>         struct i915_page_table_entry *pt;
>         uint64_t temp;
>         uint32_t pde;
> @@ -799,7 +809,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
>                         continue;
>                 }
>
> -               pt = alloc_pt_single(ppgtt->base.dev);
> +               pt = alloc_pt_single(dev);
>                 if (IS_ERR(pt))
>                         goto unwind_out;
>
> @@ -811,14 +821,14 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
>
>  unwind_out:
>         for_each_set_bit(pde, new_pts, GEN8_PDES_PER_PAGE)
> -               unmap_and_free_pt(pd->page_tables[pde], ppgtt->base.dev);
> +               unmap_and_free_pt(pd->page_tables[pde], dev);
>
>         return -ENOMEM;
>  }
>
>  /**
>   * gen8_ppgtt_alloc_page_directories() - Allocate page directories for VA range.
> - * @ppgtt:     Master ppgtt structure.
> + * @vm:                Master vm structure.
>   * @pdp:       Page directory pointer for this address range.
>   * @start:     Starting virtual address to begin allocations.
>   * @length     Size of the allocations.
> @@ -839,16 +849,17 @@ unwind_out:
>   *
>   * Return: 0 if success; negative error code otherwise.
>   */
> -static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
> +static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
>                                      struct i915_page_directory_pointer_entry *pdp,
>                                      uint64_t start,
>                                      uint64_t length,
>                                      unsigned long *new_pds)
>  {
> +       struct drm_device *dev = vm->dev;
>         struct i915_page_directory_entry *pd;
>         uint64_t temp;
>         uint32_t pdpe;
> -       size_t pdpes =  I915_PDPES_PER_PDP(ppgtt->base.dev);
> +       size_t pdpes =  I915_PDPES_PER_PDP(vm->dev);
>
>         BUG_ON(!bitmap_empty(new_pds, pdpes));
>
> @@ -859,7 +870,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>                 if (pd)
>                         continue;
>
> -               pd = alloc_pd_single(ppgtt->base.dev);
> +               pd = alloc_pd_single(dev);
>                 if (IS_ERR(pd))
>                         goto unwind_out;
>
> @@ -871,7 +882,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>
>  unwind_out:
>         for_each_set_bit(pdpe, new_pds, pdpes)
> -               unmap_and_free_pd(pdp->page_directory[pdpe], ppgtt->base.dev);
> +               unmap_and_free_pd(pdp->page_directory[pdpe], dev);
>
>         return -ENOMEM;
>  }
> @@ -926,13 +937,13 @@ err_out:
>         return -ENOMEM;
>  }
>
> -static int gen8_alloc_va_range(struct i915_address_space *vm,
> -                              uint64_t start,
> -                              uint64_t length)
> +static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
> +                                   struct i915_page_directory_pointer_entry *pdp,
> +                                   uint64_t start,
> +                                   uint64_t length)
>  {
> -       struct i915_hw_ppgtt *ppgtt =
> -               container_of(vm, struct i915_hw_ppgtt, base);
>         unsigned long *new_page_dirs, **new_page_tables;
> +       struct drm_device *dev = vm->dev;
>         struct i915_page_directory_entry *pd;
>         const uint64_t orig_start = start;
>         const uint64_t orig_length = length;
> @@ -961,17 +972,15 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>                 return ret;
>
>         /* Do the allocations first so we can easily bail out */
> -       ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
> -                                       new_page_dirs);
> +       ret = gen8_ppgtt_alloc_page_directories(vm, pdp, start, length, new_page_dirs);
>         if (ret) {
>                 free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>                 return ret;
>         }
>
> -       /* For every page directory referenced, allocate page tables */
> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
> +       gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
>                 bitmap_zero(new_page_tables[pdpe], GEN8_PDES_PER_PAGE);
> -               ret = gen8_ppgtt_alloc_pagetabs(ppgtt, pd, start, length,
> +               ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
>                                                 new_page_tables[pdpe]);
>                 if (ret)
>                         goto err_out;
> @@ -980,10 +989,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>         start = orig_start;
>         length = orig_length;
>
> -       /* Allocations have completed successfully, so set the bitmaps, and do
> -        * the mappings. */
> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
> -               gen8_ppgtt_pde_t *const page_directory = kmap_atomic(pd->page);
> +       gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
>                 struct i915_page_table_entry *pt;
>                 uint64_t pd_len = gen8_clamp_pd(start, length);

Sorry, this comment is not relevant for this patch.
Since the following loop 'gen8_for_each_pde' has a check to stop at the PD
boundary (iter < GEN8_PDES_PER_PAGE), there is no real necessity to use
'gen8_clamp_pd' here.

>                 uint64_t pd_start = start;
> @@ -1005,20 +1011,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>
>                         /* Our pde is now pointing to the pagetable, pt */
>                         set_bit(pde, pd->used_pdes);
> -
> -                       /* Map the PDE to the page table */
> -                       __gen8_do_map_pt(page_directory + pde, pt, vm->dev);
> -
> -                       /* NB: We haven't yet mapped ptes to pages. At this
> -                        * point we're still relying on insert_entries() */
>                 }
>
> -               if (!HAS_LLC(vm->dev))
> -                       drm_clflush_virt_range(page_directory, PAGE_SIZE);
> -
> -               kunmap_atomic(page_directory);
> -
> -               set_bit(pdpe, ppgtt->pdp.used_pdpes);
> +               set_bit(pdpe, pdp->used_pdpes);
> +               gen8_map_pagetable_range(pd, start, length, dev);
>         }
>
>         free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
> @@ -1027,16 +1023,36 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>  err_out:
>         while (pdpe--) {
>                 for_each_set_bit(temp, new_page_tables[pdpe], GEN8_PDES_PER_PAGE)
> -                       unmap_and_free_pt(pd->page_tables[temp], vm->dev);
> +                       unmap_and_free_pt(pd->page_tables[temp], dev);
>         }
>
>         for_each_set_bit(pdpe, new_page_dirs, pdpes)
> -               unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
> +               unmap_and_free_pd(pdp->page_directory[pdpe], dev);
>
>         free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>         return ret;
>  }
>
> +static int __noreturn gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> +                                              struct i915_pml4 *pml4,
> +                                              uint64_t start,
> +                                              uint64_t length)
> +{
> +       BUG(); /* to be implemented later */
> +}
> +
> +static int gen8_alloc_va_range(struct i915_address_space *vm,
> +                              uint64_t start, uint64_t length)
> +{
> +       struct i915_hw_ppgtt *ppgtt =
> +               container_of(vm, struct i915_hw_ppgtt, base);
> +
> +       if (!USES_FULL_48BIT_PPGTT(vm->dev))
> +               return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
> +       else
> +               return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
> +}
> +
>  static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
>  {
>         unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
> @@ -1079,12 +1095,13 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  {
>         struct drm_device *dev = ppgtt->base.dev;
>         struct drm_i915_private *dev_priv = dev->dev_private;
> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>         struct i915_page_directory_entry *pd;
>         uint64_t temp, start = 0, size = dev_priv->gtt.base.total;
>         uint32_t pdpe;
>         int ret;
>
> -       ret = gen8_ppgtt_init_common(ppgtt, dev_priv->gtt.base.total);
> +       ret = gen8_ppgtt_init_common(ppgtt, size);
>         if (ret)
>                 return ret;
>
> @@ -1097,8 +1114,8 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>                 return ret;
>         }
>
> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, size, temp, pdpe)
> -               gen8_map_pagetable_range(pd, start, size, ppgtt->base.dev);
> +       gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
> +               gen8_map_pagetable_range(pd, start, size, dev);
>
>         ppgtt->base.allocate_va_range = NULL;
>         ppgtt->base.clear_range = gen8_ppgtt_clear_range;
> --
> 2.1.1
>


* Re: [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl
  2015-03-03 12:55   ` akash goel
@ 2015-03-04 13:00     ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2015-03-04 13:00 UTC (permalink / raw)
  To: akash goel; +Cc: intel-gfx

On Tue, Mar 03, 2015 at 06:25:27PM +0530, akash goel wrote:
> On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
> > +               pdp = ppgtt->pml4.pdps[i];
> > +               if (!pdp->daddr)
> > +                       pci_unmap_page(hwdev, pdp->daddr, PAGE_SIZE,
> > +                                      PCI_DMA_BIDIRECTIONAL);
> > +
> 
> For consistency & cleanup, the call to pci_unmap_page can be replaced
> with i915_dma_unmap_single.
> Same can be done inside the gen8_ppgtt_unmap_pages_3lvl function also.

Everything but the dma api interfaces (dma_unmap_page) is deprecated. A
follow-up patch to go through all the i915 code and do these replacements
would be nice. After all this landed ofc.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 06/12] drm/i915/bdw: Add 4 level switching infrastructure
  2015-03-03 13:01   ` akash goel
@ 2015-03-04 13:08     ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2015-03-04 13:08 UTC (permalink / raw)
  To: akash goel; +Cc: intel-gfx, Goel, Akash

On Tue, Mar 03, 2015 at 06:31:03PM +0530, akash goel wrote:
> On Fri, Feb 20, 2015 at 11:16 PM, Michel Thierry
> <michel.thierry@intel.com> wrote:
> > +static void gen8_map_page_directory(struct i915_page_directory_pointer_entry *pdp,
> > +                                   struct i915_page_directory_entry *pd,
> > +                                   int index,
> > +                                   struct drm_device *dev)
> > +{
> > +       gen8_ppgtt_pdpe_t *page_directorypo;
> > +       gen8_ppgtt_pdpe_t pdpe;
> > +
> > +       /* We do not need to clflush because no platform requiring flush
> > +        * supports 64b pagetables. */
> 
> Would be more appropriate to place this comment, either after the ‘if’
> condition or
> at the end of the function (where clflush would have been placed, had
> LLC not been there for platforms supporting 64 bit).
> And same comment can be probably added, at the end of
> gen8_map_page_directory_pointer function also.
> 
> > +       if (!USES_FULL_48BIT_PPGTT(dev))
> > +               return;

Ok, this is wrong on a lot of levels:
- a function called map_something which doesn't actually return a map.
  Must be renamed asap to something that makes sense; in the kernel,
  everything called map_foo actually maps foo somewhere and returns where.
- The comment is fairly useless since it doesn't mention which platforms
  flushing is required on. Either we need to split functions up more into
  4G and 48bit variants if this difference is due to the pagetable layout,
  or we need to replace it with appropriate platform checks.

Or maybe this if is just dead code and should be removed entirely?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic
  2015-03-03 11:48   ` akash goel
@ 2015-03-18 10:15     ` Michel Thierry
  0 siblings, 0 replies; 32+ messages in thread
From: Michel Thierry @ 2015-03-18 10:15 UTC (permalink / raw)
  To: akash goel; +Cc: intel-gfx, Goel, Akash

On 3/3/2015 11:48 AM, akash goel wrote:
> On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
> <michel.thierry@intel.com>  wrote:
>> From: Ben Widawsky<benjamin.widawsky@intel.com>
>>
>> This transitional patch doesn't do much for the existing code. However,
>> it should make upcoming patches to use the full 48b address space a bit
>> easier to swallow. The patch also introduces the PML4, ie. the new top
>> level structure of the page tables.
>>
>> v2: Renamed  pdp_free to be similar to  pd/pt (unmap_and_free_pdp),
>> To facilitate testing, 48b mode will be available on Broadwell, when
>> i915.enable_ppgtt = 3.
>>
>> Signed-off-by: Ben Widawsky<ben@bwidawsk.net>
>> Signed-off-by: Michel Thierry<michel.thierry@intel.com>  (v2)
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h     |   7 ++-
>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 108 +++++++++++++++++++++++++++++-------
>>   drivers/gpu/drm/i915/i915_gem_gtt.h |  41 +++++++++++---
>>   3 files changed, 126 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 2dedd43..af0d149 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2432,7 +2432,12 @@ struct drm_i915_cmd_table {
>>   #define HAS_HW_CONTEXTS(dev)   (INTEL_INFO(dev)->gen >= 6)
>>   #define HAS_LOGICAL_RING_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 8)
>>   #define USES_PPGTT(dev)                (i915.enable_ppgtt)
>> -#define USES_FULL_PPGTT(dev)   (i915.enable_ppgtt == 2)
>> +#define USES_FULL_PPGTT(dev)   (i915.enable_ppgtt >= 2)
>> +#ifdef CONFIG_64BIT
>> +# define USES_FULL_48BIT_PPGTT(dev)    (i915.enable_ppgtt == 3)
>> +#else
>> +# define USES_FULL_48BIT_PPGTT(dev)    false
>> +#endif
>>
>>   #define HAS_OVERLAY(dev)               (INTEL_INFO(dev)->has_overlay)
>>   #define OVERLAY_NEEDS_PHYSICAL(dev)    (INTEL_INFO(dev)->overlay_needs_physical)
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index ff86501..489f8db 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -100,10 +100,17 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>>   {
>>          bool has_aliasing_ppgtt;
>>          bool has_full_ppgtt;
>> +       bool has_full_64bit_ppgtt;
>>
>>          has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
>>          has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
>>
>> +#ifdef CONFIG_64BIT
>> +       has_full_64bit_ppgtt = IS_BROADWELL(dev) && false; /* FIXME: 64b */
>> +#else
>> +       has_full_64bit_ppgtt = false;
>> +#endif
>> +
>>          if (intel_vgpu_active(dev))
>>                  has_full_ppgtt = false; /* emulation is too hard */
>>
>> @@ -121,6 +128,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
>>          if (enable_ppgtt == 2 && has_full_ppgtt)
>>                  return 2;
>>
>> +       if (enable_ppgtt == 3 && has_full_64bit_ppgtt)
>> +               return 3;
>> +
>>   #ifdef CONFIG_INTEL_IOMMU
>>          /* Disable ppgtt on SNB if VT-d is on. */
>>          if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
>> @@ -462,6 +472,45 @@ free_pd:
>>          return ERR_PTR(ret);
>>   }
>>
>> +static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp)
>> +{
>> +       kfree(pdp->used_pdpes);
>> +       kfree(pdp->page_directory);
>> +       /* HACK */
>> +       pdp->page_directory = NULL;
>> +}
>> +
>> +static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp,
>> +                           struct drm_device *dev)
>> +{
>> +       __pdp_fini(pdp);
>> +       if (USES_FULL_48BIT_PPGTT(dev))
>> +               kfree(pdp);
>> +}
>> +
>> +static int __pdp_init(struct i915_page_directory_pointer_entry *pdp,
>> +                     struct drm_device *dev)
>> +{
>> +       size_t pdpes = I915_PDPES_PER_PDP(dev);
>> +
>> +       pdp->used_pdpes = kcalloc(BITS_TO_LONGS(pdpes),
>> +                                 sizeof(unsigned long),
>> +                                 GFP_KERNEL);
>> +       if (!pdp->used_pdpes)
>> +               return -ENOMEM;
>> +
>> +       pdp->page_directory = kcalloc(pdpes, sizeof(*pdp->page_directory), GFP_KERNEL);
>> +       if (!pdp->page_directory) {
>> +               kfree(pdp->used_pdpes);
>> +               /* the PDP might be the statically allocated top level. Keep it
>> +                * as clean as possible */
>> +               pdp->used_pdpes = NULL;
>> +               return -ENOMEM;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>>   /* Broadwell Page Directory Pointer Descriptors */
>>   static int gen8_write_pdp(struct intel_engine_cs *ring,
>>                            unsigned entry,
>> @@ -491,7 +540,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
>>   {
>>          int i, ret;
>>
>> -       for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) {
>> +       for (i = 3; i >= 0; i--) {
>>                  struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i];
>>                  dma_addr_t pd_daddr = pd ? pd->daddr : ppgtt->scratch_pd->daddr;
>>                  /* The page directory might be NULL, but we need to clear out
>> @@ -580,9 +629,6 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>>          pt_vaddr = NULL;
>>
>>          for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
>> -               if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES))
>> -                       break;
>> -
>>                  if (pt_vaddr == NULL) {
>>                          struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe];
>>                          struct i915_page_table_entry *pt = pd->page_tables[pde];
>> @@ -664,7 +710,8 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
>>          struct pci_dev *hwdev = ppgtt->base.dev->pdev;
>>          int i, j;
>>
>> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
>> +       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
>> +                       I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>>                  struct i915_page_directory_entry *pd;
>>
>>                  if (WARN_ON(!ppgtt->pdp.page_directory[i]))
>> @@ -696,13 +743,15 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
>>   {
>>          int i;
>>
>> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
>> +       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
>> +                               I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>>                  if (WARN_ON(!ppgtt->pdp.page_directory[i]))
>>                          continue;
>>
>>                  gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
>>                  unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
>>          }
>> +       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
> 'ppgtt->scratch_pd' is not being de-allocated.
>
> Probably it can be de-allocated explicitly from here, after the call
> to  unmap_and_free_pdp or
> from the gen8_ppgtt_cleanup function .
Right, unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev) was
missing. It was added in "drm/i915/bdw: Update pdp switch and point
unused PDPs to scratch page" v3. It'll still be there in the rebased
version of this patch.

>>   }
>>
>>   static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>> @@ -799,8 +848,9 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>>          struct i915_page_directory_entry *pd;
>>          uint64_t temp;
>>          uint32_t pdpe;
>> +       size_t pdpes =  I915_PDPES_PER_PDP(ppgtt->base.dev);
>>
>> -       BUG_ON(!bitmap_empty(new_pds, GEN8_LEGACY_PDPES));
>> +       BUG_ON(!bitmap_empty(new_pds, pdpes));
>>
>>          /* FIXME: PPGTT container_of won't work for 64b */
>>          BUG_ON((start + length) > 0x800000000ULL);
>> @@ -820,18 +870,19 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>>          return 0;
>>
>>   unwind_out:
>> -       for_each_set_bit(pdpe, new_pds, GEN8_LEGACY_PDPES)
>> +       for_each_set_bit(pdpe, new_pds, pdpes)
>>                  unmap_and_free_pd(pdp->page_directory[pdpe], ppgtt->base.dev);
>>
>>          return -ENOMEM;
>>   }
>>
>>   static inline void
>> -free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
>> +free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts,
>> +                      size_t pdpes)
>>   {
>>          int i;
>>
>> -       for (i = 0; i < GEN8_LEGACY_PDPES; i++)
>> +       for (i = 0; i < pdpes; i++)
>>                  kfree(new_pts[i]);
>>          kfree(new_pts);
>>          kfree(new_pds);
>> @@ -841,13 +892,14 @@ free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
>>    * of these are based on the number of PDPEs in the system.
>>    */
>>   int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
>> -                                        unsigned long ***new_pts)
>> +                                        unsigned long ***new_pts,
>> +                                        size_t pdpes)
>>   {
>>          int i;
>>          unsigned long *pds;
>>          unsigned long **pts;
>>
>> -       pds = kcalloc(BITS_TO_LONGS(GEN8_LEGACY_PDPES), sizeof(unsigned long), GFP_KERNEL);
>> +       pds = kcalloc(BITS_TO_LONGS(pdpes), sizeof(unsigned long), GFP_KERNEL);
>>          if (!pds)
>>                  return -ENOMEM;
>>
>> @@ -857,7 +909,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
>>                  return -ENOMEM;
>>          }
>>
>> -       for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
>> +       for (i = 0; i < pdpes; i++) {
>>                  pts[i] = kcalloc(BITS_TO_LONGS(GEN8_PDES_PER_PAGE),
>>                                   sizeof(unsigned long), GFP_KERNEL);
>>                  if (!pts[i])
>> @@ -870,7 +922,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
>>          return 0;
>>
>>   err_out:
>> -       free_gen8_temp_bitmaps(pds, pts);
>> +       free_gen8_temp_bitmaps(pds, pts, pdpes);
>>          return -ENOMEM;
>>   }
>>
>> @@ -886,6 +938,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>          const uint64_t orig_length = length;
>>          uint64_t temp;
>>          uint32_t pdpe;
>> +       size_t pdpes = I915_PDPES_PER_PDP(dev);
>>          int ret;
>>
>>   #ifndef CONFIG_64BIT
>> @@ -903,7 +956,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>          if (WARN_ON(start + length < start))
>>                  return -ERANGE;
>>
>> -       ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables);
>> +       ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
>>          if (ret)
>>                  return ret;
>>
>> @@ -911,7 +964,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>          ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
>>                                          new_page_dirs);
>>          if (ret) {
>> -               free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
>> +               free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>>                  return ret;
>>          }
>>
>> @@ -968,7 +1021,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>                  set_bit(pdpe, ppgtt->pdp.used_pdpes);
>>          }
>>
>> -       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
>> +       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>>          return 0;
>>
>>   err_out:
>> @@ -977,13 +1030,19 @@ err_out:
>>                          unmap_and_free_pt(pd->page_tables[temp], vm->dev);
>>          }
>>
>> -       for_each_set_bit(pdpe, new_page_dirs, GEN8_LEGACY_PDPES)
>> +       for_each_set_bit(pdpe, new_page_dirs, pdpes)
>>                  unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
>>
>> -       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
>> +       free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>>          return ret;
>>   }
>>
>> +static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
>> +{
>> +       unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
>> +       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
>> +}
>> +
>>   /**
>>    * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
>>    * with a net effect resembling a 2-level page table in normal x86 terms. Each
>> @@ -1004,6 +1063,15 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>>
>>          ppgtt->switch_mm = gen8_mm_switch;
>>
>> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
>> +               int ret = __pdp_init(&ppgtt->pdp, false);
>> +               if (ret) {
>> +                       unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
>> +                       return ret;
>> +               }
>> +       } else
>> +               return -EPERM; /* Not yet implemented */
>> +
>>          return 0;
>>   }
>>
>> @@ -1025,7 +1093,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>>           * eventually. */
>>          ret = gen8_alloc_va_range(&ppgtt->base, start, size);
>>          if (ret) {
>> -               unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
>> +               gen8_ppgtt_fini_common(ppgtt);
>>                  return ret;
>>          }
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> index c68ec3a..a33c6e9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> @@ -85,8 +85,12 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>>    * The difference as compared to normal x86 3 level page table is the PDPEs are
>>    * programmed via register.
>>    */
>> +#define GEN8_PML4ES_PER_PML4           512
>> +#define GEN8_PML4E_SHIFT               39
>>   #define GEN8_PDPE_SHIFT                        30
>> -#define GEN8_PDPE_MASK                 0x3
>> +/* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
>> + * tables */
>> +#define GEN8_PDPE_MASK                 0x1ff
>>   #define GEN8_PDE_SHIFT                 21
>>   #define GEN8_PDE_MASK                  0x1ff
>>   #define GEN8_PTE_SHIFT                 12
>> @@ -95,6 +99,13 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
>>   #define GEN8_PTES_PER_PAGE             (PAGE_SIZE / sizeof(gen8_gtt_pte_t))
>>   #define GEN8_PDES_PER_PAGE             (PAGE_SIZE / sizeof(gen8_ppgtt_pde_t))
>>
>> +#ifdef CONFIG_64BIT
>> +# define I915_PDPES_PER_PDP(dev) (USES_FULL_48BIT_PPGTT(dev) ?\
>> +               GEN8_PML4ES_PER_PML4 : GEN8_LEGACY_PDPES)
>> +#else
>> +# define I915_PDPES_PER_PDP            GEN8_LEGACY_PDPES
>> +#endif
>> +
>>   #define PPAT_UNCACHED_INDEX            (_PAGE_PWT | _PAGE_PCD)
>>   #define PPAT_CACHED_PDE_INDEX          0 /* WB LLC */
>>   #define PPAT_CACHED_INDEX              _PAGE_PAT /* WB LLCeLLC */
>> @@ -210,9 +221,17 @@ struct i915_page_directory_entry {
>>   };
>>
>>   struct i915_page_directory_pointer_entry {
>> -       /* struct page *page; */
>> -       DECLARE_BITMAP(used_pdpes, GEN8_LEGACY_PDPES);
>> -       struct i915_page_directory_entry *page_directory[GEN8_LEGACY_PDPES];
>> +       struct page *page;
>> +       dma_addr_t daddr;
>> +       unsigned long *used_pdpes;
>> +       struct i915_page_directory_entry **page_directory;
>> +};
>> +
>> +struct i915_pml4 {
>> +       struct page *page;
>> +       dma_addr_t daddr;
>> +       DECLARE_BITMAP(used_pml4es, GEN8_PML4ES_PER_PML4);
>> +       struct i915_page_directory_pointer_entry *pdps[GEN8_PML4ES_PER_PML4];
>>   };
>>
>>   struct i915_address_space {
>> @@ -302,8 +321,9 @@ struct i915_hw_ppgtt {
>>          struct drm_mm_node node;
>>          unsigned long pd_dirty_rings;
>>          union {
>> -               struct i915_page_directory_pointer_entry pdp;
>> -               struct i915_page_directory_entry pd;
>> +               struct i915_pml4 pml4;          /* GEN8+ & 64b PPGTT */
>> +               struct i915_page_directory_pointer_entry pdp;   /* GEN8+ */
>> +               struct i915_page_directory_entry pd;            /* GEN6-7 */
>>          };
>>
>>          union {
>> @@ -399,14 +419,17 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>>               temp = min(temp, length),                                  \
>>               start += temp, length -= temp)
>>
>> -#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
>> -       for (iter = gen8_pdpe_index(start), pd = (pdp)->page_directory[iter];   \
>> -            length > 0 && iter < GEN8_LEGACY_PDPES;                    \
>> +#define gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, b)    \
>> +       for (iter = gen8_pdpe_index(start), pd = (pdp)->page_directory[iter]; \
>> +            length > 0 && (iter < b);                                  \
>>               pd = (pdp)->page_directory[++iter],                                \
>>               temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,       \
>>               temp = min(temp, length),                                  \
>>               start += temp, length -= temp)
>>
>> +#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
>> +       gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
>> +
>>   /* Clamp length to the next page_directory boundary */
>>   static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length)
>>   {
>> --
>> 2.1.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 02/12] drm/i915/bdw: Abstract PDP usage
  2015-03-03 12:16   ` akash goel
@ 2015-03-18 10:16     ` Michel Thierry
  0 siblings, 0 replies; 32+ messages in thread
From: Michel Thierry @ 2015-03-18 10:16 UTC (permalink / raw)
  To: akash goel; +Cc: intel-gfx, Goel, Akash

On 3/3/2015 12:16 PM, akash goel wrote:
> On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
> <michel.thierry@intel.com>  wrote:
>> From: Ben Widawsky<benjamin.widawsky@intel.com>
>>
>> Up until now, ppgtt->pdp has always been the root of our page tables.
>> Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.
>>
>> In preparation for 4 level page tables, we need to stop use ppgtt->pdp
>> directly unless we know it's what we want. The future structure will use
>> ppgtt->pml4 for the top level, and the pdp is just one of the entries
>> being pointed to by a pml4e.
>>
>> v2: Updated after dynamic page allocation changes.
>>
>> Signed-off-by: Ben Widawsky<ben@bwidawsk.net>
>> Signed-off-by: Michel Thierry<michel.thierry@intel.com>  (v2)
>> ---
>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 123 ++++++++++++++++++++----------------
>>   1 file changed, 70 insertions(+), 53 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 489f8db..d3ad517 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -560,6 +560,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>>   {
>>          struct i915_hw_ppgtt *ppgtt =
>>                  container_of(vm, struct i915_hw_ppgtt, base);
>> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>>          gen8_gtt_pte_t *pt_vaddr, scratch_pte;
>>          unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
>>          unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
>> @@ -575,10 +576,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>>                  struct i915_page_table_entry *pt;
>>                  struct page *page_table;
>>
>> -               if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
>> +               if (WARN_ON(!pdp->page_directory[pdpe]))
>>                          continue;
>>
>> -               pd = ppgtt->pdp.page_directory[pdpe];
>> +               pd = pdp->page_directory[pdpe];
>>
>>                  if (WARN_ON(!pd->page_tables[pde]))
>>                          continue;
>> @@ -620,6 +621,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>>   {
>>          struct i915_hw_ppgtt *ppgtt =
>>                  container_of(vm, struct i915_hw_ppgtt, base);
>> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>>          gen8_gtt_pte_t *pt_vaddr;
>>          unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
>>          unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
>> @@ -630,7 +632,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>>
>>          for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
>>                  if (pt_vaddr == NULL) {
>> -                       struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe];
>> +                       struct i915_page_directory_entry *pd = pdp->page_directory[pdpe];
>>                          struct i915_page_table_entry *pt = pd->page_tables[pde];
>>                          struct page *page_table = pt->page;
>>
>> @@ -708,16 +710,17 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d
>>   static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
>>   {
>>          struct pci_dev *hwdev = ppgtt->base.dev->pdev;
>> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>>          int i, j;
>>
>> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
>> +       for_each_set_bit(i, pdp->used_pdpes,
>>                          I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>>                  struct i915_page_directory_entry *pd;
>>
>> -               if (WARN_ON(!ppgtt->pdp.page_directory[i]))
>> +               if (WARN_ON(!pdp->page_directory[i]))
>>                          continue;
>>
>> -               pd = ppgtt->pdp.page_directory[i];
>> +               pd = pdp->page_directory[i];
>>                  if (!pd->daddr)
>>                          pci_unmap_page(hwdev, pd->daddr, PAGE_SIZE,
>>                                          PCI_DMA_BIDIRECTIONAL);
>> @@ -743,15 +746,21 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
>>   {
>>          int i;
>>
>> -       for_each_set_bit(i, ppgtt->pdp.used_pdpes,
>> -                               I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>> -               if (WARN_ON(!ppgtt->pdp.page_directory[i]))
>> -                       continue;
>> +       if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
>> +               for_each_set_bit(i, ppgtt->pdp.used_pdpes,
>> +                                I915_PDPES_PER_PDP(ppgtt->base.dev)) {
>> +                       if (WARN_ON(!ppgtt->pdp.page_directory[i]))
>> +                               continue;
>>
>> -               gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
>> -               unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
>> +                       gen8_free_page_tables(ppgtt->pdp.page_directory[i],
>> +                                             ppgtt->base.dev);
>> +                       unmap_and_free_pd(ppgtt->pdp.page_directory[i],
>> +                                         ppgtt->base.dev);
>> +               }
>> +               unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
>> +       } else {
>> +               BUG(); /* to be implemented later */
>>          }
>> -       unmap_and_free_pdp(&ppgtt->pdp, ppgtt->base.dev);
>>   }
>>
>>   static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>> @@ -765,7 +774,7 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>>
>>   /**
>>    * gen8_ppgtt_alloc_pagetabs() - Allocate page tables for VA range.
>> - * @ppgtt:     Master ppgtt structure.
>> + * @vm:                Master vm structure.
>>    * @pd:                Page directory for this address range.
>>    * @start:     Starting virtual address to begin allocations.
>>    * @length     Size of the allocations.
>> @@ -781,12 +790,13 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
>>    *
>>    * Return: 0 if success; negative error code otherwise.
>>    */
>> -static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
>> +static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
>>                                       struct i915_page_directory_entry *pd,
>>                                       uint64_t start,
>>                                       uint64_t length,
>>                                       unsigned long *new_pts)
>>   {
>> +       struct drm_device *dev = vm->dev;
>>          struct i915_page_table_entry *pt;
>>          uint64_t temp;
>>          uint32_t pde;
>> @@ -799,7 +809,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
>>                          continue;
>>                  }
>>
>> -               pt = alloc_pt_single(ppgtt->base.dev);
>> +               pt = alloc_pt_single(dev);
>>                  if (IS_ERR(pt))
>>                          goto unwind_out;
>>
>> @@ -811,14 +821,14 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
>>
>>   unwind_out:
>>          for_each_set_bit(pde, new_pts, GEN8_PDES_PER_PAGE)
>> -               unmap_and_free_pt(pd->page_tables[pde], ppgtt->base.dev);
>> +               unmap_and_free_pt(pd->page_tables[pde], dev);
>>
>>          return -ENOMEM;
>>   }
>>
>>   /**
>>    * gen8_ppgtt_alloc_page_directories() - Allocate page directories for VA range.
>> - * @ppgtt:     Master ppgtt structure.
>> + * @vm:                Master vm structure.
>>    * @pdp:       Page directory pointer for this address range.
>>    * @start:     Starting virtual address to begin allocations.
>>    * @length     Size of the allocations.
>> @@ -839,16 +849,17 @@ unwind_out:
>>    *
>>    * Return: 0 if success; negative error code otherwise.
>>    */
>> -static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>> +static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
>>                                       struct i915_page_directory_pointer_entry *pdp,
>>                                       uint64_t start,
>>                                       uint64_t length,
>>                                       unsigned long *new_pds)
>>   {
>> +       struct drm_device *dev = vm->dev;
>>          struct i915_page_directory_entry *pd;
>>          uint64_t temp;
>>          uint32_t pdpe;
>> -       size_t pdpes =  I915_PDPES_PER_PDP(ppgtt->base.dev);
>> +       size_t pdpes =  I915_PDPES_PER_PDP(vm->dev);
>>
>>          BUG_ON(!bitmap_empty(new_pds, pdpes));
>>
>> @@ -859,7 +870,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>>                  if (pd)
>>                          continue;
>>
>> -               pd = alloc_pd_single(ppgtt->base.dev);
>> +               pd = alloc_pd_single(dev);
>>                  if (IS_ERR(pd))
>>                          goto unwind_out;
>>
>> @@ -871,7 +882,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
>>
>>   unwind_out:
>>          for_each_set_bit(pdpe, new_pds, pdpes)
>> -               unmap_and_free_pd(pdp->page_directory[pdpe], ppgtt->base.dev);
>> +               unmap_and_free_pd(pdp->page_directory[pdpe], dev);
>>
>>          return -ENOMEM;
>>   }
>> @@ -926,13 +937,13 @@ err_out:
>>          return -ENOMEM;
>>   }
>>
>> -static int gen8_alloc_va_range(struct i915_address_space *vm,
>> -                              uint64_t start,
>> -                              uint64_t length)
>> +static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>> +                                   struct i915_page_directory_pointer_entry *pdp,
>> +                                   uint64_t start,
>> +                                   uint64_t length)
>>   {
>> -       struct i915_hw_ppgtt *ppgtt =
>> -               container_of(vm, struct i915_hw_ppgtt, base);
>>          unsigned long *new_page_dirs, **new_page_tables;
>> +       struct drm_device *dev = vm->dev;
>>          struct i915_page_directory_entry *pd;
>>          const uint64_t orig_start = start;
>>          const uint64_t orig_length = length;
>> @@ -961,17 +972,15 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>                  return ret;
>>
>>          /* Do the allocations first so we can easily bail out */
>> -       ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
>> -                                       new_page_dirs);
>> +       ret = gen8_ppgtt_alloc_page_directories(vm, pdp, start, length, new_page_dirs);
>>          if (ret) {
>>                  free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>>                  return ret;
>>          }
>>
>> -       /* For every page directory referenced, allocate page tables */
>> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
>> +       gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
>>                  bitmap_zero(new_page_tables[pdpe], GEN8_PDES_PER_PAGE);
>> -               ret = gen8_ppgtt_alloc_pagetabs(ppgtt, pd, start, length,
>> +               ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
>>                                                  new_page_tables[pdpe]);
>>                  if (ret)
>>                          goto err_out;
>> @@ -980,10 +989,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>          start = orig_start;
>>          length = orig_length;
>>
>> -       /* Allocations have completed successfully, so set the bitmaps, and do
>> -        * the mappings. */
>> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
>> -               gen8_ppgtt_pde_t *const page_directory = kmap_atomic(pd->page);
>> +       gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
>>                  struct i915_page_table_entry *pt;
>>                  uint64_t pd_len = gen8_clamp_pd(start, length);
>>                  uint64_t pd_start = start;
>> @@ -1005,20 +1011,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>
>>                          /* Our pde is now pointing to the pagetable, pt */
>>                          set_bit(pde, pd->used_pdes);
>> -
>> -                       /* Map the PDE to the page table */
>> -                       __gen8_do_map_pt(page_directory + pde, pt, vm->dev);
>> -
>> -                       /* NB: We haven't yet mapped ptes to pages. At this
>> -                        * point we're still relying on insert_entries() */
>>                  }
>>
>> -               if (!HAS_LLC(vm->dev))
>> -                       drm_clflush_virt_range(page_directory, PAGE_SIZE);
>> -
>> -               kunmap_atomic(page_directory);
>> -
>> -               set_bit(pdpe, ppgtt->pdp.used_pdpes);
>> +               set_bit(pdpe, pdp->used_pdpes);
>> +               gen8_map_pagetable_range(pd, start, length, dev);
>>          }
>>
>>          free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>> @@ -1027,16 +1023,36 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>>   err_out:
>>          while (pdpe--) {
>>                  for_each_set_bit(temp, new_page_tables[pdpe], GEN8_PDES_PER_PAGE)
>> -                       unmap_and_free_pt(pd->page_tables[temp], vm->dev);
>> +                       unmap_and_free_pt(pd->page_tables[temp], dev);
> Sorry, this review comment may not be completely pertinent to this very patch.
> In the while loop, when the 'pdpe' value changes, 'pd' is not
> updated accordingly.
> The above call to 'unmap_and_free_pt(pd->page_tables[temp], dev);'
> should be replaced with
>      'unmap_and_free_pt(pdp->page_directory[pdpe]->page_tables[temp], dev);'
> so that the right page directory is used.
Fixed in the patch that brings this code (drm/i915/bdw: Dynamic page 
table allocations).

>>          }
>>
>>          for_each_set_bit(pdpe, new_page_dirs, pdpes)
>> -               unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
>> +               unmap_and_free_pd(pdp->page_directory[pdpe], dev);
>>
>>          free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
>>          return ret;
>>   }
>>
>> +static int __noreturn gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
>> +                                              struct i915_pml4 *pml4,
>> +                                              uint64_t start,
>> +                                              uint64_t length)
>> +{
>> +       BUG(); /* to be implemented later */
>> +}
>> +
>> +static int gen8_alloc_va_range(struct i915_address_space *vm,
>> +                              uint64_t start, uint64_t length)
>> +{
>> +       struct i915_hw_ppgtt *ppgtt =
>> +               container_of(vm, struct i915_hw_ppgtt, base);
>> +
>> +       if (!USES_FULL_48BIT_PPGTT(vm->dev))
>> +               return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
>> +       else
>> +               return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
>> +}
>> +
>>   static void gen8_ppgtt_fini_common(struct i915_hw_ppgtt *ppgtt)
>>   {
>>          unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
>> @@ -1079,12 +1095,13 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>>   {
>>          struct drm_device *dev = ppgtt->base.dev;
>>          struct drm_i915_private *dev_priv = dev->dev_private;
>> +       struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
>>          struct i915_page_directory_entry *pd;
>>          uint64_t temp, start = 0, size = dev_priv->gtt.base.total;
>>          uint32_t pdpe;
>>          int ret;
>>
>> -       ret = gen8_ppgtt_init_common(ppgtt, dev_priv->gtt.base.total);
>> +       ret = gen8_ppgtt_init_common(ppgtt, size);
>>          if (ret)
>>                  return ret;
>>
>> @@ -1097,8 +1114,8 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>>                  return ret;
>>          }
>>
>> -       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, size, temp, pdpe)
>> -               gen8_map_pagetable_range(pd, start, size, ppgtt->base.dev);
>> +       gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe)
>> +               gen8_map_pagetable_range(pd, start, size, dev);
> Sorry, again this comment may not be relevant to this patch.
> Is the explicit call to map the page of page tables really needed here?
> Prior to this, there is already a call to gen8_alloc_va_range,
> which maps the page of page tables into the PDEs for the entire
> virtual range.
Thanks, it was a leftover from a previous rebase (before the aliasing and 
full PPGTT init split). As you say, it isn't needed.
Fixed in the same patch as your previous comment (drm/i915/bdw: Dynamic page 
table allocations).

>>          ppgtt->base.allocate_va_range = NULL;
>>          ppgtt->base.clear_range = gen8_ppgtt_clear_range;
>> --
>> 2.1.1
>>


* Re: [PATCH 04/12] drm/i915/bdw: Add ppgtt info for dynamic pages
  2015-03-03 12:23   ` akash goel
@ 2015-03-18 10:17     ` Michel Thierry
  0 siblings, 0 replies; 32+ messages in thread
From: Michel Thierry @ 2015-03-18 10:17 UTC (permalink / raw)
  To: akash goel; +Cc: intel-gfx, Goel, Akash

On 3/3/2015 12:23 PM, akash goel wrote:
> On Fri, Feb 20, 2015 at 11:15 PM, Michel Thierry
> <michel.thierry@intel.com>  wrote:
>> From: Ben Widawsky<benjamin.widawsky@intel.com>
>>
>> Note that there is no gen8 ppgtt debug_dump function yet.
>>
>> Signed-off-by: Ben Widawsky<ben@bwidawsk.net>
>> Signed-off-by: Michel Thierry<michel.thierry@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c | 19 ++++++++++---------
>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 32 ++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_gem_gtt.h |  9 +++++++++
>>   3 files changed, 51 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 40630bd..93c34ab 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2165,7 +2165,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>>   {
>>          struct drm_i915_private *dev_priv = dev->dev_private;
>>          struct intel_engine_cs *ring;
>> -       struct drm_file *file;
>>          int i;
>>
>>          if (INTEL_INFO(dev)->gen == 6)
>> @@ -2189,14 +2188,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>>
>>                  ppgtt->debug_dump(ppgtt, m);
>>          }
>> -
>> -       list_for_each_entry_reverse(file, &dev->filelist, lhead) {
>> -               struct drm_i915_file_private *file_priv = file->driver_priv;
>> -
>> -               seq_printf(m, "proc: %s\n",
>> -                          get_pid_task(file->pid, PIDTYPE_PID)->comm);
>> -               idr_for_each(&file_priv->context_idr, per_file_ctx, m);
>> -       }
>>   }
>>
>>   static int i915_ppgtt_info(struct seq_file *m, void *data)
>> @@ -2204,6 +2195,7 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
>>          struct drm_info_node *node = m->private;
>>          struct drm_device *dev = node->minor->dev;
>>          struct drm_i915_private *dev_priv = dev->dev_private;
>> +       struct drm_file *file;
>>
>>          int ret = mutex_lock_interruptible(&dev->struct_mutex);
>>          if (ret)
>> @@ -2215,6 +2207,15 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
>>          else if (INTEL_INFO(dev)->gen >= 6)
>>                  gen6_ppgtt_info(m, dev);
>>
>> +       list_for_each_entry_reverse(file, &dev->filelist, lhead) {
>> +               struct drm_i915_file_private *file_priv = file->driver_priv;
>> +
>> +               seq_printf(m, "\nproc: %s\n",
>> +                          get_pid_task(file->pid, PIDTYPE_PID)->comm);
>> +               idr_for_each(&file_priv->context_idr, per_file_ctx,
>> +                            (void *)(unsigned long)m);
>> +       }
>> +
>>          intel_runtime_pm_put(dev_priv);
>>          mutex_unlock(&dev->struct_mutex);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index ecfb62a..1edcc17 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -2125,6 +2125,38 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
>>          readl(gtt_base);
>>   }
>>
>> +void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt,
>> +                            void (*callback)(struct i915_page_directory_pointer_entry *pdp,
>> +                                             struct i915_page_directory_entry *pd,
>> +                                             struct i915_page_table_entry *pt,
>> +                                             unsigned pdpe,
>> +                                             unsigned pde,
>> +                                             void *data),
>> +                            void *data)
>> +{
>> +       uint64_t start = ppgtt->base.start;
>> +       uint64_t length = ppgtt->base.total;
>> +       uint64_t pdpe, pde, temp;
>> +
>> +       struct i915_page_directory_entry *pd;
>> +       struct i915_page_table_entry *pt;
>> +
>> +       gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
>> +               uint64_t pd_start = start, pd_length = length;
>> +               int i;
>> +
>> +               if (pd == NULL) {
>> +                       for (i = 0; i < GEN8_PDES_PER_PAGE; i++)
>> +                               callback(&ppgtt->pdp, NULL, NULL, pdpe, i, data);
>> +                       continue;
>> +               }
>> +
>> +               gen8_for_each_pde(pt, pd, pd_start, pd_length, temp, pde) {
>> +                       callback(&ppgtt->pdp, pd, pt, pdpe, pde, data);
>> +               }
>> +       }
>> +}
>> +
>>   static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>>                                    uint64_t start,
>>                                    uint64_t length,
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> index a33c6e9..144858e 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> @@ -483,6 +483,15 @@ static inline size_t gen8_pde_count(uint64_t addr, uint64_t length)
>>          return i915_pde_index(end, GEN8_PDE_SHIFT) - i915_pde_index(addr, GEN8_PDE_SHIFT);
>>   }
>>
>> +void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt,
>> +                            void (*callback)(struct i915_page_directory_pointer_entry *pdp,
>> +                                             struct i915_page_directory_entry *pd,
>> +                                             struct i915_page_table_entry *pt,
>> +                                             unsigned pdpe,
>> +                                             unsigned pde,
>> +                                             void *data),
>> +                            void *data);
>> +
> There is no caller of gen8_for_every_pdpe_pde.
> What is the envisaged usage of this function?
Stale code from a previous rebase. I'll remove it.
>>   int i915_gem_gtt_init(struct drm_device *dev);
>>   void i915_gem_init_global_gtt(struct drm_device *dev);
>>   void i915_global_gtt_cleanup(struct drm_device *dev);
>> --
>> 2.1.1
>>


end of thread, other threads:[~2015-03-18 10:17 UTC | newest]

Thread overview: 32+ messages
-- links below jump to the message on this page --
2015-02-20 17:45 [PATCH 00/12] PPGTT with 48b addressing Michel Thierry
2015-02-20 17:45 ` [PATCH 01/12] drm/i915/bdw: Make pdp allocation more dynamic Michel Thierry
2015-03-03 11:48   ` akash goel
2015-03-18 10:15     ` Michel Thierry
2015-02-20 17:45 ` [PATCH 02/12] drm/i915/bdw: Abstract PDP usage Michel Thierry
2015-03-03 12:16   ` akash goel
2015-03-18 10:16     ` Michel Thierry
2015-03-04  3:07   ` akash goel
2015-02-20 17:45 ` [PATCH 03/12] drm/i915/bdw: Add dynamic page trace events Michel Thierry
2015-02-24 10:56   ` Daniel Vetter
2015-02-24 10:59   ` Daniel Vetter
2015-02-20 17:45 ` [PATCH 04/12] drm/i915/bdw: Add ppgtt info for dynamic pages Michel Thierry
2015-03-03 12:23   ` akash goel
2015-03-18 10:17     ` Michel Thierry
2015-02-20 17:45 ` [PATCH 05/12] drm/i915/bdw: implement alloc/free for 4lvl Michel Thierry
2015-03-03 12:55   ` akash goel
2015-03-04 13:00     ` Daniel Vetter
2015-03-04  2:48   ` akash goel
2015-02-20 17:46 ` [PATCH 06/12] drm/i915/bdw: Add 4 level switching infrastructure Michel Thierry
2015-03-03 13:01   ` akash goel
2015-03-04 13:08     ` Daniel Vetter
2015-02-20 17:46 ` [PATCH 07/12] drm/i915/bdw: Support 64 bit PPGTT in lrc mode Michel Thierry
2015-03-03 13:08   ` akash goel
2015-02-20 17:46 ` [PATCH 08/12] drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT Michel Thierry
2015-02-20 17:46 ` [PATCH 09/12] drm/i915: Plumb sg_iter through va allocation ->maps Michel Thierry
2015-02-20 17:46 ` [PATCH 10/12] drm/i915/bdw: Add 4 level support in insert_entries and clear_range Michel Thierry
2015-03-03 16:39   ` akash goel
2015-02-20 17:46 ` [PATCH 11/12] drm/i915: Expand error state's address width to 64b Michel Thierry
2015-03-03 16:42   ` akash goel
2015-02-20 17:46 ` [PATCH 12/12] drm/i915/bdw: Flip the 48b switch Michel Thierry
2015-02-24 10:54 ` [PATCH 00/12] PPGTT with 48b addressing Daniel Vetter
2015-03-03 13:52 ` Damien Lespiau
