* [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2
@ 2017-04-04 22:11 Matthew Auld
  2017-04-04 22:11 ` [PATCH 01/18] drm/i915: add page_size_mask to dev_info Matthew Auld
                   ` (18 more replies)
  0 siblings, 19 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Same as before, with the review comments folded in. Notably, we now hook in
transparent huge pages through shmemfs, and *attempt* to deal with all the fun
that brings. Again, this should be considered very much RFC.

So far I have only tested 2M pages, on my BDW machine.

Thanks,
Matt

Matthew Auld (18):
  drm/i915: add page_size_mask to dev_info
  drm/i915: introduce drm_i915_gem_object page_size members
  drm/i915: pass page_size to insert_entries
  drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust
  drm/i915: clean up cache coloring
  drm/i915: export color_differs
  drm/i915: introduce ppgtt page coloring
  drm/i915: handle evict-for-node with page coloring
  drm/i915: support inserting 64K pages in the ppgtt
  drm/i915: support inserting 2M pages in the ppgtt
  drm/i915: support inserting 1G pages in the ppgtt
  drm/i915: disable GTT cache for huge-pages
  drm/i915/selftests: exercise 4K and 64K mm insertion
  drm/i915/selftests: modify the gtt tests to also exercise huge pages
  drm/i915/selftests: exercise evict-for-node page coloring
  drm/i915/debugfs: include some huge-page metrics
  mm/shmem: tweak the huge-page interface
  drm/i915: support transparent-huge-pages through shmemfs

 drivers/gpu/drm/i915/i915_debugfs.c             |  38 +++-
 drivers/gpu/drm/i915/i915_drv.h                 |   8 +-
 drivers/gpu/drm/i915/i915_gem.c                 | 195 ++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_evict.c           |  36 +++-
 drivers/gpu/drm/i915/i915_gem_gtt.c             | 236 ++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_gtt.h             |  35 +++-
 drivers/gpu/drm/i915/i915_gem_object.h          |   3 +
 drivers/gpu/drm/i915/i915_pci.c                 |  23 ++-
 drivers/gpu/drm/i915/i915_vma.c                 |  32 +++-
 drivers/gpu/drm/i915/i915_vma.h                 |   6 +
 drivers/gpu/drm/i915/intel_pm.c                 |  12 +-
 drivers/gpu/drm/i915/selftests/i915_gem_evict.c | 125 ++++++++++++-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c   | 194 +++++++++++++++----
 drivers/gpu/drm/i915/selftests/mock_gtt.c       |   4 +
 include/linux/shmem_fs.h                        |   1 +
 mm/shmem.c                                      |  10 +-
 16 files changed, 836 insertions(+), 122 deletions(-)

-- 
2.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/18] drm/i915: add page_size_mask to dev_info
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05  6:19   ` Joonas Lahtinen
  2017-04-05  8:43   ` Chris Wilson
  2017-04-04 22:11 ` [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members Matthew Auld
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

v2:
  - move out pde/pdpe bit definitions until later
  - tidyup the page size definitions, use BIT
  - introduce helper for detecting invalid page sizes

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h     |  3 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h | 17 ++++++++++++++++-
 drivers/gpu/drm/i915/i915_pci.c     | 23 ++++++++++++++++++++++-
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c9b0949f6c1a..ab7a1072e7b5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -901,6 +901,7 @@ struct intel_device_info {
 	enum intel_platform platform;
 	u8 ring_mask; /* Rings supported by the HW */
 	u8 num_rings;
+	unsigned int page_size_mask; /* page sizes supported by the HW */
 #define DEFINE_FLAG(name) u8 name:1
 	DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG);
 #undef DEFINE_FLAG
@@ -2876,6 +2877,8 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define USES_PPGTT(dev_priv)		(i915.enable_ppgtt)
 #define USES_FULL_PPGTT(dev_priv)	(i915.enable_ppgtt >= 2)
 #define USES_FULL_48BIT_PPGTT(dev_priv)	(i915.enable_ppgtt == 3)
+#define SUPPORTS_PAGE_SIZE(dev_priv, page_size) \
+	((dev_priv)->info.page_size_mask & (page_size))
 
 #define HAS_OVERLAY(dev_priv)		 ((dev_priv)->info.has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index fb15684c1d83..27b2b9e681db 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -42,7 +42,22 @@
 #include "i915_gem_request.h"
 #include "i915_selftest.h"
 
-#define I915_GTT_PAGE_SIZE 4096UL
+#define I915_GTT_PAGE_SIZE_4K BIT(12)
+#define I915_GTT_PAGE_SIZE_64K BIT(16)
+#define I915_GTT_PAGE_SIZE_2M BIT(21)
+#define I915_GTT_PAGE_SIZE_1G BIT(30)
+
+#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K
+
+#define I915_GTT_PAGE_SIZE_MASK (I915_GTT_PAGE_SIZE_4K | \
+				 I915_GTT_PAGE_SIZE_64K | \
+				 I915_GTT_PAGE_SIZE_2M | \
+				 I915_GTT_PAGE_SIZE_1G)
+
+#define is_valid_gtt_page_size(page_size) \
+	(is_power_of_2(page_size) && \
+	 (page_size) & I915_GTT_PAGE_SIZE_MASK)
+
 #define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
 
 #define I915_FENCE_REG_NONE -1
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index f87b0c4e564d..25de64dfe732 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -56,6 +56,10 @@
 	.color = { .degamma_lut_size = 65, .gamma_lut_size = 257 }
 
 /* Keep in gen based order, and chronological order within a gen */
+
+#define GEN_DEFAULT_PAGE_SZ \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K
+
 #define GEN2_FEATURES \
 	.gen = 2, .num_pipes = 1, \
 	.has_overlay = 1, .overlay_needs_physical = 1, \
@@ -64,6 +68,7 @@
 	.unfenced_needs_alignment = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SZ, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i830_info = {
@@ -96,6 +101,7 @@ static const struct intel_device_info intel_i865g_info = {
 	.has_gmch_display = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SZ, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i915g_info = {
@@ -158,6 +164,7 @@ static const struct intel_device_info intel_pineview_info = {
 	.has_gmch_display = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SZ, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i965g_info = {
@@ -198,6 +205,7 @@ static const struct intel_device_info intel_gm45_info = {
 	.has_gmbus_irq = 1, \
 	.ring_mask = RENDER_RING | BSD_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SZ, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ironlake_d_info = {
@@ -223,6 +231,7 @@ static const struct intel_device_info intel_ironlake_m_info = {
 	.has_hw_contexts = 1, \
 	.has_aliasing_ppgtt = 1, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SZ, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_sandybridge_d_info = {
@@ -249,6 +258,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
 	.has_aliasing_ppgtt = 1, \
 	.has_full_ppgtt = 1, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SZ, \
 	IVB_CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ivybridge_d_info = {
@@ -287,6 +297,7 @@ static const struct intel_device_info intel_valleyview_info = {
 	.has_full_ppgtt = 1,
 	.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
+	GEN_DEFAULT_PAGE_SZ,
 	GEN_DEFAULT_PIPEOFFSETS,
 	CURSOR_OFFSETS
 };
@@ -313,7 +324,8 @@ static const struct intel_device_info intel_haswell_info = {
 	BDW_COLORS, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
-	.has_64bit_reloc = 1
+	.has_64bit_reloc = 1, \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_2M | I915_GTT_PAGE_SIZE_1G
 
 static const struct intel_device_info intel_broadwell_info = {
 	BDW_FEATURES,
@@ -346,13 +358,18 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_aliasing_ppgtt = 1,
 	.has_full_ppgtt = 1,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_2M | I915_GTT_PAGE_SIZE_1G,
 	GEN_CHV_PIPEOFFSETS,
 	CURSOR_OFFSETS,
 	CHV_COLORS,
 };
 
+#define GEN9_DEFAULT_PAGE_SZ \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_2M | I915_GTT_PAGE_SIZE_1G
+
 static const struct intel_device_info intel_skylake_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SZ,
 	.platform = INTEL_SKYLAKE,
 	.gen = 9,
 	.has_csr = 1,
@@ -362,6 +379,7 @@ static const struct intel_device_info intel_skylake_info = {
 
 static const struct intel_device_info intel_skylake_gt3_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SZ,
 	.platform = INTEL_SKYLAKE,
 	.gen = 9,
 	.has_csr = 1,
@@ -394,6 +412,7 @@ static const struct intel_device_info intel_skylake_gt3_info = {
 	.has_aliasing_ppgtt = 1, \
 	.has_full_ppgtt = 1, \
 	.has_full_48bit_ppgtt = 1, \
+	GEN9_DEFAULT_PAGE_SZ, \
 	GEN_DEFAULT_PIPEOFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	BDW_COLORS
@@ -414,6 +433,7 @@ static const struct intel_device_info intel_geminilake_info = {
 
 static const struct intel_device_info intel_kabylake_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SZ,
 	.platform = INTEL_KABYLAKE,
 	.gen = 9,
 	.has_csr = 1,
@@ -423,6 +443,7 @@ static const struct intel_device_info intel_kabylake_info = {
 
 static const struct intel_device_info intel_kabylake_gt3_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SZ,
 	.platform = INTEL_KABYLAKE,
 	.gen = 9,
 	.has_csr = 1,
-- 
2.9.3


* [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
  2017-04-04 22:11 ` [PATCH 01/18] drm/i915: add page_size_mask to dev_info Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05  6:26   ` Joonas Lahtinen
  2017-04-05  6:49   ` Daniel Vetter
  2017-04-04 22:11 ` [PATCH 03/18] drm/i915: pass page_size to insert_entries Matthew Auld
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c        | 5 +++++
 drivers/gpu/drm/i915/i915_gem_object.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4ca88f2539c0..cbf97f4bbb72 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2441,6 +2441,8 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	struct sg_table *pages;
 
 	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
+	GEM_BUG_ON(!is_valid_gtt_page_size(obj->page_size));
+	GEM_BUG_ON(!is_valid_gtt_page_size(obj->gtt_page_size));
 
 	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
 		DRM_DEBUG("Attempting to obtain a purgeable object\n");
@@ -4159,6 +4161,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 
 	obj->ops = ops;
 
+	obj->page_size = PAGE_SIZE;
+	obj->gtt_page_size = I915_GTT_PAGE_SIZE;
+
 	reservation_object_init(&obj->__builtin_resv);
 	obj->resv = &obj->__builtin_resv;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 174cf923c236..b1dacbfe5173 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -107,6 +107,9 @@ struct drm_i915_gem_object {
 	unsigned int cache_level:3;
 	unsigned int cache_dirty:1;
 
+	unsigned int page_size; /* CPU pov - 4K(default), 2M, 1G */
+	unsigned int gtt_page_size; /* GPU pov - 4K(default), 64K, 2M, 1G */
+
 	atomic_t frontbuffer_bits;
 	unsigned int frontbuffer_ggtt_origin; /* write once */
 	struct i915_gem_active frontbuffer_write;
-- 
2.9.3


* [PATCH 03/18] drm/i915: pass page_size to insert_entries
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
  2017-04-04 22:11 ` [PATCH 01/18] drm/i915: add page_size_mask to dev_info Matthew Auld
  2017-04-04 22:11 ` [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 04/18] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust Matthew Auld
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 33 ++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  1 +
 drivers/gpu/drm/i915/i915_vma.c               |  1 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  3 ++-
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |  1 +
 5 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 8bab4aea63e6..0c8350f709da 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -205,7 +205,7 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 		pte_flags |= PTE_READ_ONLY;
 
 	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+				vma->obj->gtt_page_size, cache_level, pte_flags);
 
 	return 0;
 }
@@ -906,6 +906,7 @@ gen8_ppgtt_insert_pte_entries(struct i915_hw_ppgtt *ppgtt,
 static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 				   struct sg_table *pages,
 				   u64 start,
+				   unsigned int page_size,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
@@ -924,6 +925,7 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 				   struct sg_table *pages,
 				   u64 start,
+				   unsigned int page_size,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
@@ -935,9 +937,24 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	};
 	struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
 	struct gen8_insert_pte idx = gen8_insert_pte(start);
+	bool (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
+			       struct i915_page_directory_pointer *pdp,
+			       struct sgt_dma *iter,
+			       struct gen8_insert_pte *idx,
+			       enum i915_cache_level cache_level);
+
+	/* TODO: turn this into vfunc */
+	switch (page_size) {
+	case I915_GTT_PAGE_SIZE_4K:
+		insert_entries = gen8_ppgtt_insert_pte_entries;
+		break;
+	default:
+		MISSING_CASE(page_size);
+		return;
+	}
 
-	while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter,
-					     &idx, cache_level))
+	while (insert_entries(ppgtt, pdps[idx.pml4e++], &iter, &idx,
+			      cache_level))
 		GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
 }
 
@@ -1620,6 +1637,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct sg_table *pages,
 				      u64 start,
+				      unsigned int page_size,
 				      enum i915_cache_level cache_level,
 				      u32 flags)
 {
@@ -2093,6 +2111,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     u64 start,
+				     unsigned int page_size,
 				     enum i915_cache_level level,
 				     u32 unused)
 {
@@ -2140,6 +2159,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     u64 start,
+				     unsigned int page_size,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
@@ -2224,6 +2244,7 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
 static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *pages,
 				     u64 start,
+				     unsigned int page_size,
 				     enum i915_cache_level cache_level,
 				     u32 unused)
 {
@@ -2260,7 +2281,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 
 	intel_runtime_pm_get(i915);
 	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+				I915_GTT_PAGE_SIZE, cache_level, pte_flags);
 	intel_runtime_pm_put(i915);
 
 	/*
@@ -2314,14 +2335,14 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 
 		appgtt->base.insert_entries(&appgtt->base,
 					    vma->pages, vma->node.start,
-					    cache_level, pte_flags);
+					    I915_GTT_PAGE_SIZE, cache_level, pte_flags);
 	}
 
 	if (flags & I915_VMA_GLOBAL_BIND) {
 		intel_runtime_pm_get(i915);
 		vma->vm->insert_entries(vma->vm,
 					vma->pages, vma->node.start,
-					cache_level, pte_flags);
+					I915_GTT_PAGE_SIZE, cache_level, pte_flags);
 		intel_runtime_pm_put(i915);
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 27b2b9e681db..232f7ef4c21b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -329,6 +329,7 @@ struct i915_address_space {
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       u64 start,
+			       unsigned int page_size,
 			       enum i915_cache_level cache_level,
 			       u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 1aba47024656..0cf9c0a98c19 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -544,6 +544,7 @@ int __i915_vma_do_pin(struct i915_vma *vma,
 	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
 	GEM_BUG_ON((flags & (PIN_GLOBAL | PIN_USER)) == 0);
 	GEM_BUG_ON((flags & PIN_GLOBAL) && !i915_vma_is_ggtt(vma));
+	GEM_BUG_ON(!is_valid_gtt_page_size(vma->obj->gtt_page_size));
 
 	if (WARN_ON(bound & I915_VMA_PIN_OVERFLOW)) {
 		ret = -EBUSY;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 50710e3f1caa..259b5e139df1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -256,7 +256,8 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 				break;
 
 			vm->insert_entries(vm, obj->mm.pages, addr,
-					   I915_CACHE_NONE, 0);
+					   I915_GTT_PAGE_SIZE, I915_CACHE_NONE,
+					   0);
 		}
 		count = n;
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index a61309c7cb3e..38532a008387 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -35,6 +35,7 @@ static void mock_insert_page(struct i915_address_space *vm,
 static void mock_insert_entries(struct i915_address_space *vm,
 				struct sg_table *st,
 				u64 start,
+				unsigned int page_size,
 				enum i915_cache_level level, u32 flags)
 {
 }
-- 
2.9.3


* [PATCH 04/18] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (2 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 03/18] drm/i915: pass page_size to insert_entries Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05  6:30   ` Joonas Lahtinen
  2017-04-04 22:11 ` [PATCH 05/18] drm/i915: clean up cache coloring Matthew Auld
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0c8350f709da..0989af4a17e4 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2395,10 +2395,10 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj,
 	dma_unmap_sg(kdev, pages->sgl, pages->nents, PCI_DMA_BIDIRECTIONAL);
 }
 
-static void i915_gtt_color_adjust(const struct drm_mm_node *node,
-				  unsigned long color,
-				  u64 *start,
-				  u64 *end)
+static void i915_ggtt_color_adjust(const struct drm_mm_node *node,
+				   unsigned long color,
+				   u64 *start,
+				   u64 *end)
 {
 	if (node->allocated && node->color != color)
 		*start += I915_GTT_PAGE_SIZE;
@@ -2970,7 +2970,7 @@ int i915_ggtt_init_hw(struct drm_i915_private *dev_priv)
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	i915_address_space_init(&ggtt->base, dev_priv, "[global]");
 	if (!HAS_LLC(dev_priv) && !USES_PPGTT(dev_priv))
-		ggtt->base.mm.color_adjust = i915_gtt_color_adjust;
+		ggtt->base.mm.color_adjust = i915_ggtt_color_adjust;
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	if (!io_mapping_init_wc(&dev_priv->ggtt.mappable,
-- 
2.9.3


* [PATCH 05/18] drm/i915: clean up cache coloring
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (3 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 04/18] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05  6:35   ` Joonas Lahtinen
  2017-04-04 22:11 ` [PATCH 06/18] drm/i915: export color_differs Matthew Auld
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Rid the code of any mm.color_adjust assumptions to allow adding another
flavour of coloring.

v2: better naming

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h                 |  2 +-
 drivers/gpu/drm/i915/i915_gem.c                 |  3 ++-
 drivers/gpu/drm/i915/i915_gem_evict.c           | 12 +++++-------
 drivers/gpu/drm/i915/i915_gem_gtt.h             |  6 ++++++
 drivers/gpu/drm/i915/i915_vma.c                 | 10 +++++++---
 drivers/gpu/drm/i915/selftests/i915_gem_evict.c |  8 +++++---
 6 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ab7a1072e7b5..838ce22a0a40 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3543,7 +3543,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  u64 min_size, u64 alignment,
-					  unsigned cache_level,
+					  unsigned long color,
 					  u64 start, u64 end,
 					  unsigned flags);
 int __must_check i915_gem_evict_for_node(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cbf97f4bbb72..5362f4d18689 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3578,7 +3578,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		obj->cache_dirty = true;
 
 	list_for_each_entry(vma, &obj->vma_list, obj_link)
-		vma->node.color = cache_level;
+		if (i915_vm_has_cache_coloring(vma->vm))
+			vma->node.color = cache_level;
 	obj->cache_level = cache_level;
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 51e365f70464..0c9c51be0f6a 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -74,7 +74,7 @@ mark_free(struct drm_mm_scan *scan,
  * @vm: address space to evict from
  * @min_size: size of the desired free space
  * @alignment: alignment constraint of the desired free space
- * @cache_level: cache_level for the desired space
+ * @color: color for the desired space
  * @start: start (inclusive) of the range from which to evict objects
  * @end: end (exclusive) of the range from which to evict objects
  * @flags: additional flags to control the eviction algorithm
@@ -95,7 +95,7 @@ mark_free(struct drm_mm_scan *scan,
 int
 i915_gem_evict_something(struct i915_address_space *vm,
 			 u64 min_size, u64 alignment,
-			 unsigned cache_level,
+			 unsigned long color,
 			 u64 start, u64 end,
 			 unsigned flags)
 {
@@ -134,7 +134,7 @@ i915_gem_evict_something(struct i915_address_space *vm,
 	if (flags & PIN_MAPPABLE)
 		mode = DRM_MM_INSERT_LOW;
 	drm_mm_scan_init_with_range(&scan, &vm->mm,
-				    min_size, alignment, cache_level,
+				    min_size, alignment, color,
 				    start, end, mode);
 
 	/* Retire before we search the active list. Although we have
@@ -253,7 +253,6 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 	u64 start = target->start;
 	u64 end = start + target->size;
 	struct i915_vma *vma, *next;
-	bool check_color;
 	int ret = 0;
 
 	lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -270,8 +269,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 	if (!(flags & PIN_NONBLOCK))
 		i915_gem_retire_requests(vm->i915);
 
-	check_color = vm->mm.color_adjust;
-	if (check_color) {
+	if (i915_vm_has_cache_coloring(vm)) {
 		/* Expand search to cover neighbouring guard pages (or lack!) */
 		if (start)
 			start -= I915_GTT_PAGE_SIZE;
@@ -297,7 +295,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 		 * abutt and conflict. If they are in conflict, then we evict
 		 * those as well to make room for our guard pages.
 		 */
-		if (check_color) {
+		if (i915_vm_has_cache_coloring(vm)) {
 			if (node->start + node->size == target->start) {
 				if (node->color == target->color)
 					continue;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 232f7ef4c21b..9c592e2de516 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -347,6 +347,12 @@ struct i915_address_space {
 #define i915_is_ggtt(V) (!(V)->file)
 
 static inline bool
+i915_vm_has_cache_coloring(const struct i915_address_space *vm)
+{
+	return vm->mm.color_adjust && i915_is_ggtt(vm);
+}
+
+static inline bool
 i915_vm_is_48bit(const struct i915_address_space *vm)
 {
 	return (vm->total - 1) >> 32;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 0cf9c0a98c19..4ead7d075fd3 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -390,7 +390,7 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level)
 	 * these constraints apply and set the drm_mm.color_adjust
 	 * appropriately.
 	 */
-	if (vma->vm->mm.color_adjust == NULL)
+	if (!i915_vm_has_cache_coloring(vma->vm))
 		return true;
 
 	/* Only valid to be called on an already inserted vma */
@@ -429,6 +429,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	struct drm_i915_gem_object *obj = vma->obj;
 	u64 start, end;
 	int ret;
+	unsigned long color = 0;
 
 	GEM_BUG_ON(vma->flags & (I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND));
 	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -471,6 +472,9 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (ret)
 		return ret;
 
+	if (i915_vm_has_cache_coloring(vma->vm))
+		color = obj->cache_level;
+
 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
 		if (!IS_ALIGNED(offset, alignment) ||
@@ -480,13 +484,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		}
 
 		ret = i915_gem_gtt_reserve(vma->vm, &vma->node,
-					   size, offset, obj->cache_level,
+					   size, offset, color,
 					   flags);
 		if (ret)
 			goto err_unpin;
 	} else {
 		ret = i915_gem_gtt_insert(vma->vm, &vma->node,
-					  size, alignment, obj->cache_level,
+					  size, alignment, color,
 					  start, end, flags);
 		if (ret)
 			goto err_unpin;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 14e9c2fbc4e6..8c0b4dc9de3d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -224,11 +224,13 @@ static int igt_evict_for_cache_color(void *arg)
 	int err;
 
 	/* Currently the use of color_adjust is limited to cache domains within
-	 * the ggtt, and so the presence of mm.color_adjust is assumed to be
-	 * i915_gtt_color_adjust throughout our driver, so using a mock color
-	 * adjust will work just fine for our purposes.
+	 * the ggtt, and page sizes within the ppgtt, so the presence of
+	 * mm.color_adjust is assumed to be i915_gtt_color_adjust when the vm is
+	 * ggtt, so using a mock color adjust will work just fine for our
+	 * purposes.
 	 */
 	ggtt->base.mm.color_adjust = mock_color_adjust;
+	GEM_BUG_ON(!i915_vm_has_cache_coloring(&ggtt->base));
 
 	obj = i915_gem_object_create_internal(i915, I915_GTT_PAGE_SIZE);
 	if (IS_ERR(obj)) {
-- 
2.9.3


* [PATCH 06/18] drm/i915: export color_differs
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (4 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 05/18] drm/i915: clean up cache coloring Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05  6:39   ` Joonas Lahtinen
  2017-04-04 22:11 ` [PATCH 07/18] drm/i915: introduce ppgtt page coloring Matthew Auld
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Export color_differs so that we can use it elsewhere.

v2: better naming

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 11 ++++-------
 drivers/gpu/drm/i915/i915_vma.h |  6 ++++++
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4ead7d075fd3..8f0041ba328f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -373,11 +373,6 @@ void __i915_vma_set_map_and_fenceable(struct i915_vma *vma)
 		vma->flags &= ~I915_VMA_CAN_FENCE;
 }
 
-static bool color_differs(struct drm_mm_node *node, unsigned long color)
-{
-	return node->allocated && node->color != color;
-}
-
 bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level)
 {
 	struct drm_mm_node *node = &vma->node;
@@ -398,11 +393,13 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level)
 	GEM_BUG_ON(list_empty(&node->node_list));
 
 	other = list_prev_entry(node, node_list);
-	if (color_differs(other, cache_level) && !drm_mm_hole_follows(other))
+	if (i915_node_color_differs(other, cache_level) &&
+	    !drm_mm_hole_follows(other))
 		return false;
 
 	other = list_next_entry(node, node_list);
-	if (color_differs(other, cache_level) && !drm_mm_hole_follows(node))
+	if (i915_node_color_differs(other, cache_level) &&
+	    !drm_mm_hole_follows(node))
 		return false;
 
 	return true;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 2e03f81dddbe..6c95926f896f 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -284,6 +284,12 @@ static inline void i915_vma_unpin(struct i915_vma *vma)
 	__i915_vma_unpin(vma);
 }
 
+static inline bool i915_node_color_differs(const struct drm_mm_node *node,
+					   unsigned long color)
+{
+	return node->allocated && node->color != color;
+}
+
 /**
  * i915_vma_pin_iomap - calls ioremap_wc to map the GGTT VMA via the aperture
  * @vma: VMA to iomap
-- 
2.9.3


* [PATCH 07/18] drm/i915: introduce ppgtt page coloring
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (5 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 06/18] drm/i915: export color_differs Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05 13:41   ` Chris Wilson
  2017-04-04 22:11 ` [PATCH 08/18] drm/i915: handle evict-for-node with " Matthew Auld
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

To enable 64K pages we need to set the intermediate page size (IPS) bit
of the pde, so a page table operates in either 64K or 4K mode. To
accommodate this vm placement restriction we introduce a color for pages
and a corresponding color_adjust callback.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 25 +++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  6 ++++++
 drivers/gpu/drm/i915/i915_vma.c     |  2 ++
 3 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0989af4a17e4..ddc3db345b76 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1332,6 +1332,28 @@ static int gen8_preallocate_top_level_pdp(struct i915_hw_ppgtt *ppgtt)
 	return -ENOMEM;
 }
 
+static void i915_page_color_adjust(const struct drm_mm_node *node,
+				   unsigned long color,
+				   u64 *start,
+				   u64 *end)
+{
+	GEM_BUG_ON(!is_valid_gtt_page_size(color));
+
+	if (!(color & (I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K)))
+		return;
+
+	GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
+
+	if (i915_node_color_differs(node, color))
+		*start = roundup(*start, 1 << GEN8_PDE_SHIFT);
+
+	node = list_next_entry(node, node_list);
+	if (i915_node_color_differs(node, color))
+		*end = rounddown(*end, 1 << GEN8_PDE_SHIFT);
+
+	GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
+}
+
 /*
  * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
  * with a net effect resembling a 2-level page table in normal x86 terms. Each
@@ -1372,6 +1394,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 		ppgtt->base.allocate_va_range = gen8_ppgtt_alloc_4lvl;
 		ppgtt->base.insert_entries = gen8_ppgtt_insert_4lvl;
 		ppgtt->base.clear_range = gen8_ppgtt_clear_4lvl;
+
+		if (SUPPORTS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K))
+			ppgtt->base.mm.color_adjust = i915_page_color_adjust;
 	} else {
 		ret = __pdp_init(&ppgtt->base, &ppgtt->pdp);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 9c592e2de516..8d893ddd98f2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -353,6 +353,12 @@ i915_vm_has_cache_coloring(const struct i915_address_space *vm)
 }
 
 static inline bool
+i915_vm_has_page_coloring(const struct i915_address_space *vm)
+{
+	return vm->mm.color_adjust && !i915_is_ggtt(vm);
+}
+
+static inline bool
 i915_vm_is_48bit(const struct i915_address_space *vm)
 {
 	return (vm->total - 1) >> 32;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 8f0041ba328f..4043145b4310 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -471,6 +471,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 
 	if (i915_vm_has_cache_coloring(vma->vm))
 		color = obj->cache_level;
+	else if (i915_vm_has_page_coloring(vma->vm))
+		color = obj->gtt_page_size;
 
 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
-- 
2.9.3


* [PATCH 08/18] drm/i915: handle evict-for-node with page coloring
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (6 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 07/18] drm/i915: introduce ppgtt page coloring Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt Matthew Auld
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 0c9c51be0f6a..817acff2fb6c 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -276,6 +276,30 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 
 		/* Always look at the page afterwards to avoid the end-of-GTT */
 		end += I915_GTT_PAGE_SIZE;
+	} else if (i915_vm_has_page_coloring(vm)) {
+		u64 pt_start, pt_end;
+
+		GEM_BUG_ON(!is_valid_gtt_page_size(target->color));
+		GEM_BUG_ON(!IS_ALIGNED(start, target->color));
+		GEM_BUG_ON(!IS_ALIGNED(end, target->color));
+
+		/* We need to consider the page table coloring on both sides of
+		 * the range, where a mismatch would require extending our
+		 * range to evict nodes up to the page table boundary for each
+		 * side to ensure the underlying page table(s) for the range
+		 * match the target color.
+		 */
+		pt_start = rounddown(start, 1 << GEN8_PDE_SHIFT);
+		node = __drm_mm_interval_first(&vm->mm, pt_start, start);
+		if (i915_node_color_differs(node, target->color)) {
+			start = pt_start;
+		}
+
+		pt_end = roundup(end, 1 << GEN8_PDE_SHIFT);
+		node = __drm_mm_interval_first(&vm->mm, end, pt_end);
+		if (i915_node_color_differs(node, target->color)) {
+			end = pt_end;
+		}
 	}
 	GEM_BUG_ON(start >= end);
 
-- 
2.9.3


* [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (7 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 08/18] drm/i915: handle evict-for-node with " Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-06  3:25   ` kbuild test robot
  2017-04-09  0:27   ` kbuild test robot
  2017-04-04 22:11 ` [PATCH 10/18] drm/i915: support inserting 2M " Matthew Auld
                   ` (9 subsequent siblings)
  18 siblings, 2 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 70 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
 2 files changed, 72 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ddc3db345b76..fb822c0bd973 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -848,6 +848,73 @@ static __always_inline struct gen8_insert_pte gen8_insert_pte(u64 start)
 }
 
 static __always_inline bool
+gen8_ppgtt_insert_64K_pte_entries(struct i915_hw_ppgtt *ppgtt,
+				  struct i915_page_directory_pointer *pdp,
+				  struct sgt_dma *iter,
+				  struct gen8_insert_pte *idx,
+				  enum i915_cache_level cache_level)
+{
+	struct i915_page_directory *pd;
+	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
+	gen8_pte_t *vaddr;
+	bool ret;
+
+	GEM_BUG_ON(idx->pte % 16);
+	GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+	/* TODO: probably move this to the allocation phase.. */
+	pd = pdp->page_directory[idx->pdpe];
+	vaddr = kmap_atomic_px(pd);
+	vaddr[idx->pde] |= GEN8_PDE_IPS_64K;
+	kunmap_atomic(vaddr);
+
+	vaddr = kmap_atomic_px(pd->page_table[idx->pde]);
+	do {
+		vaddr[idx->pte] = pte_encode | iter->dma;
+		iter->dma += I915_GTT_PAGE_SIZE_64K;
+		if (iter->dma >= iter->max) {
+			iter->sg = __sg_next(iter->sg);
+			if (!iter->sg) {
+				ret = false;
+				break;
+			}
+
+			iter->dma = sg_dma_address(iter->sg);
+			iter->max = iter->dma + iter->sg->length;
+		}
+
+		idx->pte += 16;
+
+		if (idx->pte == GEN8_PTES) {
+			idx->pte = 0;
+
+			if (++idx->pde == I915_PDES) {
+				idx->pde = 0;
+
+				/* Limited by sg length for 3lvl */
+				if (++idx->pdpe == GEN8_PML4ES_PER_PML4) {
+					idx->pdpe = 0;
+					ret = true;
+					break;
+				}
+
+				GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+				pd = pdp->page_directory[idx->pdpe];
+			}
+
+			kunmap_atomic(vaddr);
+			vaddr = kmap_atomic_px(pd);
+			vaddr[idx->pde] |= GEN8_PDE_IPS_64K;
+			kunmap_atomic(vaddr);
+
+			vaddr = kmap_atomic_px(pd->page_table[idx->pde]);
+		}
+	} while (1);
+	kunmap_atomic(vaddr);
+
+	return ret;
+}
+
+static __always_inline bool
 gen8_ppgtt_insert_pte_entries(struct i915_hw_ppgtt *ppgtt,
 			      struct i915_page_directory_pointer *pdp,
 			      struct sgt_dma *iter,
@@ -948,6 +1015,9 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	case I915_GTT_PAGE_SIZE_4K:
 		insert_entries = gen8_ppgtt_insert_pte_entries;
 		break;
+	case I915_GTT_PAGE_SIZE_64K:
+		insert_entries = gen8_ppgtt_insert_64K_pte_entries;
+		break;
 	default:
 		MISSING_CASE(page_size);
 		return;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8d893ddd98f2..d948808fcf6a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -158,6 +158,8 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((u64)(x) << ((i) * 8))
 
+#define GEN8_PDE_IPS_64K BIT(11)
+
 struct sg_table;
 
 struct intel_rotation_info {
-- 
2.9.3


* [PATCH 10/18] drm/i915: support inserting 2M pages in the ppgtt
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (8 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 11/18] drm/i915: support inserting 1G " Matthew Auld
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 53 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
 2 files changed, 54 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fb822c0bd973..9dc12955f557 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -848,6 +848,56 @@ static __always_inline struct gen8_insert_pte gen8_insert_pte(u64 start)
 }
 
 static __always_inline bool
+gen8_ppgtt_insert_2M_pde_entries(struct i915_hw_ppgtt *ppgtt,
+				 struct i915_page_directory_pointer *pdp,
+				 struct sgt_dma *iter,
+				 struct gen8_insert_pte *idx,
+				 enum i915_cache_level cache_level)
+{
+	const gen8_pte_t pde_encode = gen8_pte_encode(GEN8_PDE_PS_2M,
+						      cache_level);
+	gen8_pte_t *vaddr;
+	bool ret;
+
+	GEM_BUG_ON(idx->pte);
+	GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+	vaddr = kmap_atomic_px(pdp->page_directory[idx->pdpe]);
+	do {
+		vaddr[idx->pde] = pde_encode | iter->dma;
+		iter->dma += I915_GTT_PAGE_SIZE_2M;
+		if (iter->dma >= iter->max) {
+			iter->sg = __sg_next(iter->sg);
+			if (!iter->sg) {
+				ret = false;
+				break;
+			}
+
+			iter->dma = sg_dma_address(iter->sg);
+			iter->max = iter->dma + iter->sg->length;
+		}
+
+		if (++idx->pde == I915_PDES) {
+			idx->pde = 0;
+
+			if (++idx->pdpe == GEN8_PML4ES_PER_PML4) {
+				idx->pdpe = 0;
+				ret = true;
+				break;
+			}
+
+			kunmap_atomic(vaddr);
+			vaddr = kmap_atomic_px(pdp->page_directory[idx->pdpe]);
+		}
+
+	} while (1);
+	kunmap_atomic(vaddr);
+
+	mark_tlbs_dirty(ppgtt);
+
+	return ret;
+}
+
+static __always_inline bool
 gen8_ppgtt_insert_64K_pte_entries(struct i915_hw_ppgtt *ppgtt,
 				  struct i915_page_directory_pointer *pdp,
 				  struct sgt_dma *iter,
@@ -1018,6 +1068,9 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	case I915_GTT_PAGE_SIZE_64K:
 		insert_entries = gen8_ppgtt_insert_64K_pte_entries;
 		break;
+	case I915_GTT_PAGE_SIZE_2M:
+		insert_entries = gen8_ppgtt_insert_2M_pde_entries;
+		break;
 	default:
 		MISSING_CASE(page_size);
 		return;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d948808fcf6a..cfe31db6b400 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -159,6 +159,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT(i, x)			((u64)(x) << ((i) * 8))
 
 #define GEN8_PDE_IPS_64K BIT(11)
+#define GEN8_PDE_PS_2M   BIT(7)
 
 struct sg_table;
 
-- 
2.9.3


* [PATCH 11/18] drm/i915: support inserting 1G pages in the ppgtt
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (9 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 10/18] drm/i915: support inserting 2M " Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 12/18] drm/i915: disable GTT cache for huge-pages Matthew Auld
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 45 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9dc12955f557..5269092ba048 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -848,6 +848,48 @@ static __always_inline struct gen8_insert_pte gen8_insert_pte(u64 start)
 }
 
 static __always_inline bool
+gen8_ppgtt_insert_1G_pdpe_entries(struct i915_hw_ppgtt *ppgtt,
+				  struct i915_page_directory_pointer *pdp,
+				  struct sgt_dma *iter,
+				  struct gen8_insert_pte *idx,
+				  enum i915_cache_level cache_level)
+{
+	const gen8_pte_t pdpe_encode = gen8_pte_encode(GEN8_PDPE_PS_1G,
+						       cache_level);
+	gen8_pte_t *vaddr;
+	bool ret;
+
+	GEM_BUG_ON(idx->pte);
+	GEM_BUG_ON(idx->pde);
+	GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+	vaddr = kmap_atomic_px(pdp);
+	do {
+		vaddr[idx->pdpe] = pdpe_encode | iter->dma;
+		iter->dma += I915_GTT_PAGE_SIZE_1G;
+		if (iter->dma >= iter->max) {
+			iter->sg = __sg_next(iter->sg);
+			if (!iter->sg) {
+				ret = false;
+				break;
+			}
+
+			iter->dma = sg_dma_address(iter->sg);
+			iter->max = iter->dma + iter->sg->length;
+		}
+
+		if (++idx->pdpe == GEN8_PML4ES_PER_PML4) {
+			idx->pdpe = 0;
+			ret = true;
+			break;
+		}
+
+	} while (1);
+	kunmap_atomic(vaddr);
+
+	return ret;
+}
+
+static __always_inline bool
 gen8_ppgtt_insert_2M_pde_entries(struct i915_hw_ppgtt *ppgtt,
 				 struct i915_page_directory_pointer *pdp,
 				 struct sgt_dma *iter,
@@ -1071,6 +1113,9 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	case I915_GTT_PAGE_SIZE_2M:
 		insert_entries = gen8_ppgtt_insert_2M_pde_entries;
 		break;
+	case I915_GTT_PAGE_SIZE_1G:
+		insert_entries = gen8_ppgtt_insert_1G_pdpe_entries;
+		break;
 	default:
 		MISSING_CASE(page_size);
 		return;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index cfe31db6b400..8b970e9e764c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -161,6 +161,8 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
+#define GEN8_PDPE_PS_1G  BIT(7)
+
 struct sg_table;
 
 struct intel_rotation_info {
-- 
2.9.3


* [PATCH 12/18] drm/i915: disable GTT cache for huge-pages
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (10 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 11/18] drm/i915: support inserting 1G " Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 13/18] drm/i915/selftests: exercise 4K and 64K mm insertion Matthew Auld
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

As hinted by the comment, and from actually testing 2M pages on a BDW
machine with the GTT cache enabled, we are definitely going to need to
keep it disabled.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 570bd603f401..2bb49bada4ea 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7544,10 +7544,10 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv)
 
 	/*
 	 * WaGttCachingOffByDefault:bdw
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
+	 * The GTT cache must be disabled if the system is planning to use
+	 * 2M/1G pages.
 	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	I915_WRITE(HSW_GTT_CACHE_EN, 0);
 
 	/* WaKVMNotificationOnConfigChange:bdw */
 	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
@@ -7823,10 +7823,10 @@ static void cherryview_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen8_set_l3sqc_credits(dev_priv, 38, 2);
 
 	/*
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
+	 * The GTT cache must be disabled if the system is planning to use
+	 * 2M/1G pages.
 	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	I915_WRITE(HSW_GTT_CACHE_EN, 0);
 }
 
 static void g4x_init_clock_gating(struct drm_i915_private *dev_priv)
-- 
2.9.3


* [PATCH 13/18] drm/i915/selftests: exercise 4K and 64K mm insertion
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (11 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 12/18] drm/i915: disable GTT cache for huge-pages Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 14/18] drm/i915/selftests: modify the gtt tests to also exercise huge pages Matthew Auld
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Mock test filling an address space with 4K and 64K objects, in the hope
of exercising the page color adjust logic.

v2: s/roundup/round_up

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 68 +++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 259b5e139df1..0963dcb67996 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -31,6 +31,7 @@
 #include "mock_context.h"
 #include "mock_drm.h"
 #include "mock_gem_device.h"
+#include "mock_gtt.h"
 
 static void fake_free_pages(struct drm_i915_gem_object *obj,
 			    struct sg_table *pages)
@@ -1307,6 +1308,72 @@ static int igt_gtt_reserve(void *arg)
 	return err;
 }
 
+static int igt_ppgtt_page_color(void *arg)
+{
+	struct drm_mm mm;
+	struct drm_mm_node *node, *prev, *next;
+	unsigned long page_colors[] = {
+		I915_GTT_PAGE_SIZE,
+		I915_GTT_PAGE_SIZE_64K,
+	};
+	int idx = 0;
+	u64 count = 0;
+	u64 size;
+
+	drm_mm_init(&mm, 0, U64_MAX);
+	mm.color_adjust = i915_page_color_adjust;
+
+	/* Running out of memory is okay. */
+
+	for_each_prime_number_from(size, 0, U64_MAX) {
+		node = kzalloc(sizeof(*node), GFP_KERNEL);
+		if (!node) {
+			pr_info("finished test early, unable to allocate node, count=%llu\n", count);
+			break;
+		}
+
+		size = round_up(size, page_colors[idx]);
+
+		if (drm_mm_insert_node_in_range(&mm, node, size,
+						page_colors[idx],
+						page_colors[idx],
+						0, U64_MAX,
+						DRM_MM_INSERT_BEST)) {
+			pr_info("test finished, unable to insert node: color=%lu, size=%llx, count=%llu\n",
+				page_colors[idx], size, count);
+			kfree(node);
+			break;
+		}
+
+		GEM_BUG_ON(!IS_ALIGNED(node->start, node->color));
+		GEM_BUG_ON(!IS_ALIGNED(node->size, node->color));
+
+		/* We can't mix 4K and 64K pte's in the same pt. */
+
+		prev = list_prev_entry(node, node_list);
+		if (i915_node_color_differs(prev, node->color))
+			GEM_BUG_ON(prev->start >> GEN8_PDE_SHIFT ==
+				   node->start >> GEN8_PDE_SHIFT);
+
+		next = list_next_entry(node, node_list);
+		if (i915_node_color_differs(next, node->color))
+			GEM_BUG_ON(((next->start + next->size) >> GEN8_PDE_SHIFT) ==
+				   ((node->start + node->size) >> GEN8_PDE_SHIFT));
+
+		idx ^= 1;
+		++count;
+	}
+
+	drm_mm_for_each_node_safe(node, next, &mm) {
+		drm_mm_remove_node(node);
+		kfree(node);
+	}
+
+	drm_mm_takedown(&mm);
+
+	return 0;
+}
+
 static int igt_gtt_insert(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -1523,6 +1590,7 @@ int i915_gem_gtt_mock_selftests(void)
 		SUBTEST(igt_mock_fill),
 		SUBTEST(igt_gtt_reserve),
 		SUBTEST(igt_gtt_insert),
+		SUBTEST(igt_ppgtt_page_color),
 	};
 	struct drm_i915_private *i915;
 	int err;
-- 
2.9.3


* [PATCH 14/18] drm/i915/selftests: modify the gtt tests to also exercise huge pages
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (12 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 13/18] drm/i915/selftests: exercise 4K and 64K mm insertion Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 15/18] drm/i915/selftests: exercise evict-for-node page coloring Matthew Auld
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

v2: s/roundup/round_up
    s/rounddown/round_down

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 123 ++++++++++++++++++--------
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |   3 +
 2 files changed, 89 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 0963dcb67996..e735de3d9975 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -92,12 +92,14 @@ static const struct drm_i915_gem_object_ops fake_ops = {
 };
 
 static struct drm_i915_gem_object *
-fake_dma_object(struct drm_i915_private *i915, u64 size)
+fake_dma_object(struct drm_i915_private *i915, u64 size, unsigned int page_size)
 {
 	struct drm_i915_gem_object *obj;
 
 	GEM_BUG_ON(!size);
-	GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE));
+	GEM_BUG_ON(!is_valid_gtt_page_size(page_size));
+
+	size = round_up(size, page_size);
 
 	if (overflows_type(size, obj->base.size))
 		return ERR_PTR(-E2BIG);
@@ -107,8 +109,13 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)
 		goto err;
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
+
 	i915_gem_object_init(obj, &fake_ops);
 
+	obj->gtt_page_size = obj->page_size = page_size;
+
+	GEM_BUG_ON(!IS_ALIGNED(obj->base.size, obj->page_size));
+
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 	obj->cache_level = I915_CACHE_NONE;
@@ -194,13 +201,14 @@ static int igt_ppgtt_alloc(void *arg)
 static int lowlevel_hole(struct drm_i915_private *i915,
 			 struct i915_address_space *vm,
 			 u64 hole_start, u64 hole_end,
+			 unsigned int page_size,
 			 unsigned long end_time)
 {
 	I915_RND_STATE(seed_prng);
 	unsigned int size;
 
 	/* Keep creating larger objects until one cannot fit into the hole */
-	for (size = 12; (hole_end - hole_start) >> size; size++) {
+	for (size = ilog2(page_size); (hole_end - hole_start) >> size; size++) {
 		I915_RND_SUBSTATE(prng, seed_prng);
 		struct drm_i915_gem_object *obj;
 		unsigned int *order, count, n;
@@ -226,7 +234,7 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 		 * memory. We expect to hit -ENOMEM.
 		 */
 
-		obj = fake_dma_object(i915, BIT_ULL(size));
+		obj = fake_dma_object(i915, BIT_ULL(size), page_size);
 		if (IS_ERR(obj)) {
 			kfree(order);
 			break;
@@ -303,18 +311,25 @@ static void close_object_list(struct list_head *objects,
 static int fill_hole(struct drm_i915_private *i915,
 		     struct i915_address_space *vm,
 		     u64 hole_start, u64 hole_end,
+		     unsigned int page_size,
 		     unsigned long end_time)
 {
 	const u64 hole_size = hole_end - hole_start;
 	struct drm_i915_gem_object *obj;
-	const unsigned long max_pages =
-		min_t(u64, ULONG_MAX - 1, hole_size/2 >> PAGE_SHIFT);
-	const unsigned long max_step = max(int_sqrt(max_pages), 2UL);
-	unsigned long npages, prime, flags;
+	const unsigned page_shift = ilog2(page_size);
+	unsigned long max_pages, max_step, npages, prime, flags;
 	struct i915_vma *vma;
 	LIST_HEAD(objects);
 	int err;
 
+	hole_start = round_up(hole_start, page_size);
+	hole_end = round_down(hole_end, page_size);
+
+	GEM_BUG_ON(hole_start >= hole_end);
+
+	max_pages = min_t(u64, ULONG_MAX - 1, hole_size/2 >> page_shift);
+	max_step = max(int_sqrt(max_pages), 2UL);
+
 	/* Try binding many VMA working inwards from either edge */
 
 	flags = PIN_OFFSET_FIXED | PIN_USER;
@@ -323,7 +338,7 @@ static int fill_hole(struct drm_i915_private *i915,
 
 	for_each_prime_number_from(prime, 2, max_step) {
 		for (npages = 1; npages <= max_pages; npages *= prime) {
-			const u64 full_size = npages << PAGE_SHIFT;
+			const u64 full_size = npages << page_shift;
 			const struct {
 				const char *name;
 				u64 offset;
@@ -334,7 +349,7 @@ static int fill_hole(struct drm_i915_private *i915,
 				{ }
 			}, *p;
 
-			obj = fake_dma_object(i915, full_size);
+			obj = fake_dma_object(i915, full_size, page_size);
 			if (IS_ERR(obj))
 				break;
 
@@ -359,7 +374,7 @@ static int fill_hole(struct drm_i915_private *i915,
 						offset -= obj->base.size;
 					}
 
-					err = i915_vma_pin(vma, 0, 0, offset | flags);
+					err = i915_vma_pin(vma, 0, page_size, offset | flags);
 					if (err) {
 						pr_err("%s(%s) pin (forward) failed with err=%d on size=%lu pages (prime=%lu), offset=%llx\n",
 						       __func__, p->name, err, npages, prime, offset);
@@ -367,7 +382,7 @@ static int fill_hole(struct drm_i915_private *i915,
 					}
 
 					if (!drm_mm_node_allocated(&vma->node) ||
-					    i915_vma_misplaced(vma, 0, 0, offset | flags)) {
+					    i915_vma_misplaced(vma, 0, page_size, offset | flags)) {
 						pr_err("%s(%s) (forward) insert failed: vma.node=%llx + %llx [allocated? %d], expected offset %llx\n",
 						       __func__, p->name, vma->node.start, vma->node.size, drm_mm_node_allocated(&vma->node),
 						       offset);
@@ -397,7 +412,7 @@ static int fill_hole(struct drm_i915_private *i915,
 					}
 
 					if (!drm_mm_node_allocated(&vma->node) ||
-					    i915_vma_misplaced(vma, 0, 0, offset | flags)) {
+					    i915_vma_misplaced(vma, 0, page_size, offset | flags)) {
 						pr_err("%s(%s) (forward) moved vma.node=%llx + %llx, expected offset %llx\n",
 						       __func__, p->name, vma->node.start, vma->node.size,
 						       offset);
@@ -432,7 +447,7 @@ static int fill_hole(struct drm_i915_private *i915,
 						offset -= obj->base.size;
 					}
 
-					err = i915_vma_pin(vma, 0, 0, offset | flags);
+					err = i915_vma_pin(vma, 0, page_size, offset | flags);
 					if (err) {
 						pr_err("%s(%s) pin (backward) failed with err=%d on size=%lu pages (prime=%lu), offset=%llx\n",
 						       __func__, p->name, err, npages, prime, offset);
@@ -440,7 +455,7 @@ static int fill_hole(struct drm_i915_private *i915,
 					}
 
 					if (!drm_mm_node_allocated(&vma->node) ||
-					    i915_vma_misplaced(vma, 0, 0, offset | flags)) {
+					    i915_vma_misplaced(vma, 0, page_size, offset | flags)) {
 						pr_err("%s(%s) (backward) insert failed: vma.node=%llx + %llx [allocated? %d], expected offset %llx\n",
 						       __func__, p->name, vma->node.start, vma->node.size, drm_mm_node_allocated(&vma->node),
 						       offset);
@@ -470,7 +485,7 @@ static int fill_hole(struct drm_i915_private *i915,
 					}
 
 					if (!drm_mm_node_allocated(&vma->node) ||
-					    i915_vma_misplaced(vma, 0, 0, offset | flags)) {
+					    i915_vma_misplaced(vma, 0, page_size, offset | flags)) {
 						pr_err("%s(%s) (backward) moved vma.node=%llx + %llx [allocated? %d], expected offset %llx\n",
 						       __func__, p->name, vma->node.start, vma->node.size, drm_mm_node_allocated(&vma->node),
 						       offset);
@@ -514,11 +529,13 @@ static int fill_hole(struct drm_i915_private *i915,
 static int walk_hole(struct drm_i915_private *i915,
 		     struct i915_address_space *vm,
 		     u64 hole_start, u64 hole_end,
+		     unsigned int page_size,
 		     unsigned long end_time)
 {
 	const u64 hole_size = hole_end - hole_start;
+	const unsigned page_shift = ilog2(page_size);
 	const unsigned long max_pages =
-		min_t(u64, ULONG_MAX - 1, hole_size >> PAGE_SHIFT);
+		min_t(u64, ULONG_MAX - 1, hole_size >> page_shift);
 	unsigned long flags;
 	u64 size;
 
@@ -534,7 +551,7 @@ static int walk_hole(struct drm_i915_private *i915,
 		u64 addr;
 		int err = 0;
 
-		obj = fake_dma_object(i915, size << PAGE_SHIFT);
+		obj = fake_dma_object(i915, size << page_shift, page_size);
 		if (IS_ERR(obj))
 			break;
 
@@ -547,7 +564,7 @@ static int walk_hole(struct drm_i915_private *i915,
 		for (addr = hole_start;
 		     addr + obj->base.size < hole_end;
 		     addr += obj->base.size) {
-			err = i915_vma_pin(vma, 0, 0, addr | flags);
+			err = i915_vma_pin(vma, 0, page_size, addr | flags);
 			if (err) {
 				pr_err("%s bind failed at %llx + %llx [hole %llx- %llx] with err=%d\n",
 				       __func__, addr, vma->size,
@@ -557,7 +574,7 @@ static int walk_hole(struct drm_i915_private *i915,
 			i915_vma_unpin(vma);
 
 			if (!drm_mm_node_allocated(&vma->node) ||
-			    i915_vma_misplaced(vma, 0, 0, addr | flags)) {
+			    i915_vma_misplaced(vma, 0, page_size, addr | flags)) {
 				pr_err("%s incorrect at %llx + %llx\n",
 				       __func__, addr, vma->size);
 				err = -EINVAL;
@@ -596,6 +613,7 @@ static int walk_hole(struct drm_i915_private *i915,
 static int pot_hole(struct drm_i915_private *i915,
 		    struct i915_address_space *vm,
 		    u64 hole_start, u64 hole_end,
+		    unsigned int page_size,
 		    unsigned long end_time)
 {
 	struct drm_i915_gem_object *obj;
@@ -608,7 +626,7 @@ static int pot_hole(struct drm_i915_private *i915,
 	if (i915_is_ggtt(vm))
 		flags |= PIN_GLOBAL;
 
-	obj = i915_gem_object_create_internal(i915, 2 * I915_GTT_PAGE_SIZE);
+	obj = fake_dma_object(i915, 2 * page_size, page_size);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
@@ -620,15 +638,15 @@ static int pot_hole(struct drm_i915_private *i915,
 
 	/* Insert a pair of pages across every pot boundary within the hole */
 	for (pot = fls64(hole_end - 1) - 1;
-	     pot > ilog2(2 * I915_GTT_PAGE_SIZE);
+	     pot > ilog2(2 * page_size);
 	     pot--) {
 		u64 step = BIT_ULL(pot);
 		u64 addr;
 
-		for (addr = round_up(hole_start + I915_GTT_PAGE_SIZE, step) - I915_GTT_PAGE_SIZE;
-		     addr <= round_down(hole_end - 2*I915_GTT_PAGE_SIZE, step) - I915_GTT_PAGE_SIZE;
+		for (addr = round_up(hole_start + page_size, step) - page_size;
+		     addr <= round_down(hole_end - 2*page_size, step) - page_size;
 		     addr += step) {
-			err = i915_vma_pin(vma, 0, 0, addr | flags);
+			err = i915_vma_pin(vma, 0, page_size, addr | flags);
 			if (err) {
 				pr_err("%s failed to pin object at %llx in hole [%llx - %llx], with err=%d\n",
 				       __func__,
@@ -672,6 +690,7 @@ static int pot_hole(struct drm_i915_private *i915,
 static int drunk_hole(struct drm_i915_private *i915,
 		      struct i915_address_space *vm,
 		      u64 hole_start, u64 hole_end,
+		      unsigned int page_size,
 		      unsigned long end_time)
 {
 	I915_RND_STATE(prng);
@@ -683,7 +702,7 @@ static int drunk_hole(struct drm_i915_private *i915,
 		flags |= PIN_GLOBAL;
 
 	/* Keep creating larger objects until one cannot fit into the hole */
-	for (size = 12; (hole_end - hole_start) >> size; size++) {
+	for (size = ilog2(page_size); (hole_end - hole_start) >> size; size++) {
 		struct drm_i915_gem_object *obj;
 		unsigned int *order, count, n;
 		struct i915_vma *vma;
@@ -707,7 +726,7 @@ static int drunk_hole(struct drm_i915_private *i915,
 		 * memory. We expect to hit -ENOMEM.
 		 */
 
-		obj = fake_dma_object(i915, BIT_ULL(size));
+		obj = fake_dma_object(i915, BIT_ULL(size), page_size);
 		if (IS_ERR(obj)) {
 			kfree(order);
 			break;
@@ -724,6 +743,8 @@ static int drunk_hole(struct drm_i915_private *i915,
 		for (n = 0; n < count; n++) {
 			u64 addr = hole_start + order[n] * BIT_ULL(size);
 
+			GEM_BUG_ON(!IS_ALIGNED(addr, page_size));
+
 			err = i915_vma_pin(vma, 0, 0, addr | flags);
 			if (err) {
 				pr_err("%s failed to pin object at %llx + %llx in hole [%llx - %llx], with err=%d\n",
@@ -735,7 +756,7 @@ static int drunk_hole(struct drm_i915_private *i915,
 			}
 
 			if (!drm_mm_node_allocated(&vma->node) ||
-			    i915_vma_misplaced(vma, 0, 0, addr | flags)) {
+			    i915_vma_misplaced(vma, 0, page_size, addr | flags)) {
 				pr_err("%s incorrect at %llx + %llx\n",
 				       __func__, addr, BIT_ULL(size));
 				i915_vma_unpin(vma);
@@ -772,11 +793,12 @@ static int drunk_hole(struct drm_i915_private *i915,
 static int __shrink_hole(struct drm_i915_private *i915,
 			 struct i915_address_space *vm,
 			 u64 hole_start, u64 hole_end,
+			 unsigned int page_size,
 			 unsigned long end_time)
 {
 	struct drm_i915_gem_object *obj;
 	unsigned long flags = PIN_OFFSET_FIXED | PIN_USER;
-	unsigned int order = 12;
+	unsigned int order = ilog2(page_size);
 	LIST_HEAD(objects);
 	int err = 0;
 	u64 addr;
@@ -787,7 +809,7 @@ static int __shrink_hole(struct drm_i915_private *i915,
 		u64 size = BIT_ULL(order++);
 
 		size = min(size, hole_end - addr);
-		obj = fake_dma_object(i915, size);
+		obj = fake_dma_object(i915, size, page_size);
 		if (IS_ERR(obj)) {
 			err = PTR_ERR(obj);
 			break;
@@ -803,7 +825,7 @@ static int __shrink_hole(struct drm_i915_private *i915,
 
 		GEM_BUG_ON(vma->size != size);
 
-		err = i915_vma_pin(vma, 0, 0, addr | flags);
+		err = i915_vma_pin(vma, 0, page_size, addr | flags);
 		if (err) {
 			pr_err("%s failed to pin object at %llx + %llx in hole [%llx - %llx], with err=%d\n",
 			       __func__, addr, size, hole_start, hole_end, err);
@@ -811,7 +833,7 @@ static int __shrink_hole(struct drm_i915_private *i915,
 		}
 
 		if (!drm_mm_node_allocated(&vma->node) ||
-		    i915_vma_misplaced(vma, 0, 0, addr | flags)) {
+		    i915_vma_misplaced(vma, 0, page_size, addr | flags)) {
 			pr_err("%s incorrect at %llx + %llx\n",
 			       __func__, addr, size);
 			i915_vma_unpin(vma);
@@ -838,6 +860,7 @@ static int __shrink_hole(struct drm_i915_private *i915,
 static int shrink_hole(struct drm_i915_private *i915,
 		       struct i915_address_space *vm,
 		       u64 hole_start, u64 hole_end,
+		       unsigned int page_size,
 		       unsigned long end_time)
 {
 	unsigned long prime;
@@ -848,7 +871,8 @@ static int shrink_hole(struct drm_i915_private *i915,
 
 	for_each_prime_number_from(prime, 0, ULONG_MAX - 1) {
 		vm->fault_attr.interval = prime;
-		err = __shrink_hole(i915, vm, hole_start, hole_end, end_time);
+		err = __shrink_hole(i915, vm, hole_start, hole_end, page_size,
+				    end_time);
 		if (err)
 			break;
 	}
@@ -862,12 +886,20 @@ static int exercise_ppgtt(struct drm_i915_private *dev_priv,
 			  int (*func)(struct drm_i915_private *i915,
 				      struct i915_address_space *vm,
 				      u64 hole_start, u64 hole_end,
+				      unsigned int page_size,
 				      unsigned long end_time))
 {
 	struct drm_file *file;
 	struct i915_hw_ppgtt *ppgtt;
 	IGT_TIMEOUT(end_time);
-	int err;
+	unsigned int page_sizes[] = {
+		I915_GTT_PAGE_SIZE_4K,
+		I915_GTT_PAGE_SIZE_64K,
+		I915_GTT_PAGE_SIZE_2M,
+		I915_GTT_PAGE_SIZE_1G,
+	};
+	int err = 0;
+	int i;
 
 	if (!USES_FULL_PPGTT(dev_priv))
 		return 0;
@@ -885,7 +917,11 @@ static int exercise_ppgtt(struct drm_i915_private *dev_priv,
 	GEM_BUG_ON(offset_in_page(ppgtt->base.total));
 	GEM_BUG_ON(ppgtt->base.closed);
 
-	err = func(dev_priv, &ppgtt->base, 0, ppgtt->base.total, end_time);
+	for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) {
+		if (SUPPORTS_PAGE_SIZE(dev_priv, page_sizes[i]))
+			err = func(dev_priv, &ppgtt->base, 0, ppgtt->base.total,
+				   page_sizes[i], end_time);
+	}
 
 	i915_ppgtt_close(&ppgtt->base);
 	i915_ppgtt_put(ppgtt);
@@ -941,6 +977,7 @@ static int exercise_ggtt(struct drm_i915_private *i915,
 			 int (*func)(struct drm_i915_private *i915,
 				     struct i915_address_space *vm,
 				     u64 hole_start, u64 hole_end,
+				     unsigned int page_size,
 				     unsigned long end_time))
 {
 	struct i915_ggtt *ggtt = &i915->ggtt;
@@ -962,7 +999,8 @@ static int exercise_ggtt(struct drm_i915_private *i915,
 		if (hole_start >= hole_end)
 			continue;
 
-		err = func(i915, &ggtt->base, hole_start, hole_end, end_time);
+		err = func(i915, &ggtt->base, hole_start, hole_end,
+			   I915_GTT_PAGE_SIZE, end_time);
 		if (err)
 			break;
 
@@ -1105,12 +1143,20 @@ static int exercise_mock(struct drm_i915_private *i915,
 			 int (*func)(struct drm_i915_private *i915,
 				     struct i915_address_space *vm,
 				     u64 hole_start, u64 hole_end,
+				     unsigned int page_size,
 				     unsigned long end_time))
 {
 	struct i915_gem_context *ctx;
 	struct i915_hw_ppgtt *ppgtt;
 	IGT_TIMEOUT(end_time);
+	unsigned int page_sizes[] = {
+		I915_GTT_PAGE_SIZE_4K,
+		I915_GTT_PAGE_SIZE_64K,
+		I915_GTT_PAGE_SIZE_2M,
+		I915_GTT_PAGE_SIZE_1G,
+	};
 	int err;
+	int i;
 
 	ctx = mock_context(i915, "mock");
 	if (!ctx)
@@ -1119,7 +1165,10 @@ static int exercise_mock(struct drm_i915_private *i915,
 	ppgtt = ctx->ppgtt;
 	GEM_BUG_ON(!ppgtt);
 
-	err = func(i915, &ppgtt->base, 0, ppgtt->base.total, end_time);
+	for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) {
+		err = func(i915, &ppgtt->base, 0, ppgtt->base.total,
+			   page_sizes[i], end_time);
+	}
 
 	mock_context_close(ctx);
 	return err;
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 38532a008387..688d4f554a48 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -88,6 +88,9 @@ mock_ppgtt(struct drm_i915_private *i915,
 	ppgtt->base.unbind_vma = mock_unbind_ppgtt;
 	ppgtt->base.cleanup = mock_cleanup;
 
+	/* For mock testing huge-page support */
+	ppgtt->base.mm.color_adjust = i915_page_color_adjust;
+
 	return ppgtt;
 }
 
-- 
2.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 15/18] drm/i915/selftests: exercise evict-for-node page coloring
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (13 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 14/18] drm/i915/selftests: modify the gtt tests to also exercise huge pages Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 16/18] drm/i915/debugfs: include some huge-page metrics Matthew Auld
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_gem_evict.c | 117 ++++++++++++++++++++++--
 1 file changed, 111 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 8c0b4dc9de3d..780585a202ba 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -25,6 +25,7 @@
 #include "../i915_selftest.h"
 
 #include "mock_gem_device.h"
+#include "mock_gtt.h"
 
 static int populate_ggtt(struct drm_i915_private *i915)
 {
@@ -62,11 +63,11 @@ static int populate_ggtt(struct drm_i915_private *i915)
 	return 0;
 }
 
-static void unpin_ggtt(struct drm_i915_private *i915)
+static void unpin_vm(struct i915_address_space *vm)
 {
 	struct i915_vma *vma;
 
-	list_for_each_entry(vma, &i915->ggtt.base.inactive_list, vm_link)
+	list_for_each_entry(vma, &vm->inactive_list, vm_link)
 		i915_vma_unpin(vma);
 }
 
@@ -110,7 +111,7 @@ static int igt_evict_something(void *arg)
 		goto cleanup;
 	}
 
-	unpin_ggtt(i915);
+	unpin_vm(&ggtt->base);
 
 	/* Everything is unpinned, we should be able to evict something */
 	err = i915_gem_evict_something(&ggtt->base,
@@ -187,7 +188,7 @@ static int igt_evict_for_vma(void *arg)
 		goto cleanup;
 	}
 
-	unpin_ggtt(i915);
+	unpin_vm(&ggtt->base);
 
 	/* Everything is unpinned, we should be able to evict the node */
 	err = i915_gem_evict_for_node(&ggtt->base, &target, 0);
@@ -287,12 +288,115 @@ static int igt_evict_for_cache_color(void *arg)
 	err = 0;
 
 cleanup:
-	unpin_ggtt(i915);
+	unpin_vm(&ggtt->base);
 	cleanup_objects(i915);
 	ggtt->base.mm.color_adjust = NULL;
 	return err;
 }
 
+static int igt_evict_for_page_color(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct i915_hw_ppgtt *ppgtt = mock_ppgtt(i915, "mock-page-color");
+	const unsigned long flags = PIN_USER | PIN_OFFSET_FIXED;
+	struct drm_mm_node target = {
+		/* Straddle the end of the first page-table boundary */
+		.start = (1 << GEN8_PDE_SHIFT) - I915_GTT_PAGE_SIZE_64K,
+		.size = I915_GTT_PAGE_SIZE_64K * 2,
+		.color = I915_GTT_PAGE_SIZE_4K,
+	};
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int err;
+
+	/* The mock ppgtt mm.color_adjust should be set to
+	 * i915_page_color_adjust.
+	 */
+	GEM_BUG_ON(!i915_vm_has_page_coloring(&ppgtt->base));
+
+	obj = i915_gem_object_create_internal(i915, I915_GTT_PAGE_SIZE);
+	if (IS_ERR(obj)) {
+		err = PTR_ERR(obj);
+		goto cleanup;
+	}
+
+	vma = i915_vma_instance(obj, &ppgtt->base, NULL);
+	if (IS_ERR(vma)) {
+		pr_err("[0]i915_vma_instance failed\n");
+		err = PTR_ERR(vma);
+		goto cleanup;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, flags);
+	if (err) {
+		pr_err("[0]i915_vma_pin failed with err=%d\n", err);
+		goto cleanup;
+	}
+
+	obj = i915_gem_object_create_internal(i915, I915_GTT_PAGE_SIZE);
+	if (IS_ERR(obj)) {
+		unpin_vm(&ppgtt->base);
+		err = PTR_ERR(obj);
+		goto cleanup;
+	}
+
+	vma = i915_vma_instance(obj, &ppgtt->base, NULL);
+	if (IS_ERR(vma)) {
+		pr_err("[1]i915_vma_instance failed\n");
+		err = PTR_ERR(vma);
+		goto cleanup;
+	}
+
+	err = i915_vma_pin(vma, 0, 0,
+			   ((2 << GEN8_PDE_SHIFT) - I915_GTT_PAGE_SIZE_4K) |
+			   flags);
+	if (err) {
+		unpin_vm(&ppgtt->base);
+		pr_err("[1]i915_vma_pin failed with err=%d\n", err);
+		goto cleanup;
+	}
+
+	/* Target the page-table boundary between the two already *pinned* gem
+	 * objects with the same page color - should succeed.
+	 */
+	err = i915_gem_evict_for_node(&ppgtt->base, &target, 0);
+	if (err) {
+		unpin_vm(&ppgtt->base);
+		pr_err("[0]i915_gem_evict_for_node returned err=%d\n", err);
+		goto cleanup;
+	}
+
+	target.color = I915_GTT_PAGE_SIZE_64K;
+
+	/* Again target the page-table boundary between the two already *pinned*
+	 * gem objects, but this time with conflicting page colors - should
+	 * fail.
+	 */
+	err = i915_gem_evict_for_node(&ppgtt->base, &target, 0);
+	if (!err) {
+		unpin_vm(&ppgtt->base);
+		pr_err("[1]i915_gem_evict_for_node returned err=%d\n", err);
+		err = -EINVAL;
+		goto cleanup;
+	}
+
+	unpin_vm(&ppgtt->base);
+
+	/* And finally target the page-table boundary between the two now
+	 * *unpinned* gem objects, again with conflicting page colors - should
+	 * now succeed.
+	 */
+	err = i915_gem_evict_for_node(&ppgtt->base, &target, 0);
+	if (err)
+		pr_err("[2]i915_gem_evict_for_node returned err=%d\n", err);
+
+cleanup:
+	i915_ppgtt_close(&ppgtt->base);
+	i915_ppgtt_put(ppgtt);
+	cleanup_objects(i915);
+	return err;
+}
+
 static int igt_evict_vm(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -313,7 +417,7 @@ static int igt_evict_vm(void *arg)
 		goto cleanup;
 	}
 
-	unpin_ggtt(i915);
+	unpin_vm(&ggtt->base);
 
 	err = i915_gem_evict_vm(&ggtt->base, false);
 	if (err) {
@@ -333,6 +437,7 @@ int i915_gem_evict_mock_selftests(void)
 		SUBTEST(igt_evict_something),
 		SUBTEST(igt_evict_for_vma),
 		SUBTEST(igt_evict_for_cache_color),
+		SUBTEST(igt_evict_for_page_color),
 		SUBTEST(igt_evict_vm),
 		SUBTEST(igt_overcommit),
 	};
-- 
2.9.3


* [PATCH 16/18] drm/i915/debugfs: include some huge-page metrics
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (14 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 15/18] drm/i915/selftests: exercise evict-for-node page coloring Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-04 22:11 ` [PATCH 17/18] mm/shmem: tweak the huge-page interface Matthew Auld
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 38 ++++++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d689e511744e..e8a50481c703 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -117,6 +117,23 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
 	return size;
 }
 
+static const char *stringify_page_size(unsigned int page_size)
+{
+	switch (page_size) {
+	case I915_GTT_PAGE_SIZE_4K:
+		return "4K";
+	case I915_GTT_PAGE_SIZE_64K:
+		return "64K";
+	case I915_GTT_PAGE_SIZE_2M:
+		return "2M";
+	case I915_GTT_PAGE_SIZE_1G:
+		return "1G";
+	default:
+		MISSING_CASE(page_size);
+		return "";
+	}
+}
+
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
@@ -128,7 +145,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 
 	lockdep_assert_held(&obj->base.dev->struct_mutex);
 
-	seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x %s%s%s",
+	seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %s %s %02x %02x %s%s%s",
 		   &obj->base,
 		   get_active_flag(obj),
 		   get_pin_flag(obj),
@@ -136,6 +153,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		   get_global_flag(obj),
 		   get_pin_mapped_flag(obj),
 		   obj->base.size / 1024,
+		   stringify_page_size(obj->page_size),
+		   stringify_page_size(obj->gtt_page_size),
 		   obj->base.read_domains,
 		   obj->base.write_domain,
 		   i915_cache_level_str(dev_priv, obj->cache_level),
@@ -399,8 +418,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct drm_device *dev = &dev_priv->drm;
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	u32 count, mapped_count, purgeable_count, dpy_count;
-	u64 size, mapped_size, purgeable_size, dpy_size;
+	u32 count, mapped_count, purgeable_count, dpy_count, huge_count;
+	u64 size, mapped_size, purgeable_size, dpy_size, huge_size;
 	struct drm_i915_gem_object *obj;
 	struct drm_file *file;
 	int ret;
@@ -416,6 +435,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	size = count = 0;
 	mapped_size = mapped_count = 0;
 	purgeable_size = purgeable_count = 0;
+	huge_size = huge_count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_link) {
 		size += obj->base.size;
 		++count;
@@ -429,6 +449,11 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 			mapped_count++;
 			mapped_size += obj->base.size;
 		}
+
+		if (obj->page_size > PAGE_SIZE) {
+			huge_count++;
+			huge_size += obj->base.size;
+		}
 	}
 	seq_printf(m, "%u unbound objects, %llu bytes\n", count, size);
 
@@ -451,6 +476,11 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 			mapped_count++;
 			mapped_size += obj->base.size;
 		}
+
+		if (obj->page_size > PAGE_SIZE) {
+			huge_count++;
+			huge_size += obj->base.size;
+		}
 	}
 	seq_printf(m, "%u bound objects, %llu bytes\n",
 		   count, size);
@@ -458,6 +488,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   purgeable_count, purgeable_size);
 	seq_printf(m, "%u mapped objects, %llu bytes\n",
 		   mapped_count, mapped_size);
+	seq_printf(m, "%u huge-paged objects, %llu bytes\n",
+		   huge_count, huge_size);
 	seq_printf(m, "%u display objects (pinned), %llu bytes\n",
 		   dpy_count, dpy_size);
 
-- 
2.9.3


* [PATCH 17/18] mm/shmem: tweak the huge-page interface
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (15 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 16/18] drm/i915/debugfs: include some huge-page metrics Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05  6:42   ` Daniel Vetter
  2017-04-04 22:11 ` [PATCH 18/18] drm/i915: support transparent-huge-pages through shmemfs Matthew Auld
  2017-04-05  8:53 ` [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Chris Wilson
  18 siblings, 1 reply; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

In its current form, huge-page support in shmemfs is controlled at the
super-block level and is disabled by default, so to enable huge-pages
for a shmem-backed gem object we would need to re-mount the fs with the
huge= argument. But for drm the mount is not user visible, so good luck
with that. The other option is the global sysfs knob shmem_enabled,
which exposes the same huge= options with the addition of DENY and
FORCE.

Neither option seems really workable; what we probably want is to be
able to control the use of huge-pages at the time of pinning the backing
storage for a particular gem object, and only where it makes sense given
the size of the object. One caveat is when we write into the page cache
prior to pinning the backing storage. I played around with a bunch of
ideas but in the end just settled on a driver-overridable huge option
embedded in shmem_inode_info. Thoughts?

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 include/linux/shmem_fs.h |  1 +
 mm/shmem.c               | 10 ++++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index a7d6bd2a918f..001be751420d 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -21,6 +21,7 @@ struct shmem_inode_info {
 	struct shared_policy	policy;		/* NUMA memory alloc policy */
 	struct simple_xattrs	xattrs;		/* list of xattrs */
 	struct inode		vfs_inode;
+	bool                    huge;           /* driver override shmem_huge */
 };
 
 struct shmem_sb_info {
diff --git a/mm/shmem.c b/mm/shmem.c
index e67d6ba4e98e..879a9e514afe 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1723,6 +1723,9 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 		/* shmem_symlink() */
 		if (mapping->a_ops != &shmem_aops)
 			goto alloc_nohuge;
+		/* driver override shmem_huge */
+		if (info->huge)
+			goto alloc_huge;
 		if (shmem_huge == SHMEM_HUGE_DENY || sgp_huge == SGP_NOHUGE)
 			goto alloc_nohuge;
 		if (shmem_huge == SHMEM_HUGE_FORCE)
@@ -2000,6 +2003,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	unsigned long inflated_len;
 	unsigned long inflated_addr;
 	unsigned long inflated_offset;
+	struct shmem_inode_info *info = SHMEM_I(file_inode(file));
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
@@ -2016,7 +2020,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (addr > TASK_SIZE - len)
 		return addr;
 
-	if (shmem_huge == SHMEM_HUGE_DENY)
+	if (!info->huge && shmem_huge == SHMEM_HUGE_DENY)
 		return addr;
 	if (len < HPAGE_PMD_SIZE)
 		return addr;
@@ -2030,7 +2034,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (uaddr)
 		return addr;
 
-	if (shmem_huge != SHMEM_HUGE_FORCE) {
+	if (!info->huge && shmem_huge != SHMEM_HUGE_FORCE) {
 		struct super_block *sb;
 
 		if (file) {
@@ -4034,6 +4038,8 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
 	loff_t i_size;
 	pgoff_t off;
 
+	if (SHMEM_I(inode)->huge)
+		return true;
 	if (shmem_huge == SHMEM_HUGE_FORCE)
 		return true;
 	if (shmem_huge == SHMEM_HUGE_DENY)
-- 
2.9.3


* [PATCH 18/18] drm/i915: support transparent-huge-pages through shmemfs
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (16 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 17/18] mm/shmem: tweak the huge-page interface Matthew Auld
@ 2017-04-04 22:11 ` Matthew Auld
  2017-04-05  8:53 ` [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Chris Wilson
  18 siblings, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-04 22:11 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |   3 +
 drivers/gpu/drm/i915/i915_gem.c | 187 +++++++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_vma.c |   8 ++
 3 files changed, 166 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 838ce22a0a40..07dd4d24b93e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2672,6 +2672,9 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
  * @__pp:	page pointer (output)
  * @__iter:	'struct sgt_iter' (iterator state, internal)
  * @__sgt:	sg_table to iterate over (input)
+ *
+ * Be warned: if we are using huge-pages, @__pp may be part of a compound
+ * page, so care must be taken. Too thorny?
  */
 #define for_each_sgt_page(__pp, __iter, __sgt)				\
 	for ((__iter) = __sgt_iter((__sgt)->sgl, false);		\
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5362f4d18689..1dde01676d37 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -171,7 +171,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	struct sg_table *st;
 	struct scatterlist *sg;
 	char *vaddr;
-	int i;
+	int i, j;
 
 	if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
 		return ERR_PTR(-EINVAL);
@@ -187,7 +187,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 		return ERR_PTR(-ENOMEM);
 
 	vaddr = phys->vaddr;
-	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
+	for (i = 0; i < obj->base.size / PAGE_SIZE; ) {
 		struct page *page;
 		char *src;
 
@@ -197,13 +197,15 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 			goto err_phys;
 		}
 
-		src = kmap_atomic(page);
-		memcpy(vaddr, src, PAGE_SIZE);
-		drm_clflush_virt_range(vaddr, PAGE_SIZE);
-		kunmap_atomic(src);
+		for (j = 0; j < hpage_nr_pages(page); ++j, ++i) {
+			src = kmap_atomic(page + j);
+			memcpy(vaddr, src, PAGE_SIZE);
+			drm_clflush_virt_range(vaddr, PAGE_SIZE);
+			kunmap_atomic(src);
+			vaddr += PAGE_SIZE;
+		}
 
 		put_page(page);
-		vaddr += PAGE_SIZE;
 	}
 
 	i915_gem_chipset_flush(to_i915(obj->base.dev));
@@ -263,9 +265,9 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 	if (obj->mm.dirty) {
 		struct address_space *mapping = obj->base.filp->f_mapping;
 		char *vaddr = obj->phys_handle->vaddr;
-		int i;
+		int i, j;
 
-		for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
+		for (i = 0; i < obj->base.size / PAGE_SIZE; ) {
 			struct page *page;
 			char *dst;
 
@@ -273,16 +275,18 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 			if (IS_ERR(page))
 				continue;
 
-			dst = kmap_atomic(page);
-			drm_clflush_virt_range(vaddr, PAGE_SIZE);
-			memcpy(dst, vaddr, PAGE_SIZE);
-			kunmap_atomic(dst);
+			for (j = 0; j < hpage_nr_pages(page); ++j, ++i) {
+				dst = kmap_atomic(page + j);
+				drm_clflush_virt_range(vaddr, PAGE_SIZE);
+				memcpy(dst, vaddr, PAGE_SIZE);
+				kunmap_atomic(dst);
+				vaddr += PAGE_SIZE;
+			}
 
 			set_page_dirty(page);
 			if (obj->mm.madv == I915_MADV_WILLNEED)
 				mark_page_accessed(page);
 			put_page(page);
-			vaddr += PAGE_SIZE;
 		}
 		obj->mm.dirty = false;
 	}
@@ -2179,6 +2183,8 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
 		i915_gem_object_save_bit_17_swizzle(obj, pages);
 
 	for_each_sgt_page(page, sgt_iter, pages) {
+		if (PageTail(page))
+			continue;
 		if (obj->mm.dirty)
 			set_page_dirty(page);
 
@@ -2272,6 +2278,15 @@ static bool i915_sg_trim(struct sg_table *orig_st)
 	return true;
 }
 
+static inline unsigned int i915_shmem_page_size(struct page *page)
+{
+#ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
+	return PageTransHuge(page) ? HPAGE_PMD_SIZE : PAGE_SIZE;
+#else
+	return PAGE_SIZE;
+#endif
+}
+
 static struct sg_table *
 i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
@@ -2287,6 +2302,14 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	unsigned int max_segment;
 	int ret;
 	gfp_t gfp;
+	const unsigned int gtt_page_sizes[] = {
+		I915_GTT_PAGE_SIZE_1G,
+		I915_GTT_PAGE_SIZE_2M,
+		I915_GTT_PAGE_SIZE_64K,
+		I915_GTT_PAGE_SIZE_4K,
+	};
+	unsigned int page_size;
+	int j;
 
 	/* Assert that the object is not currently in any GPU domain. As it
 	 * wasn't in the GTT, there shouldn't be any way it could have been in
@@ -2299,6 +2322,25 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	if (!max_segment)
 		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
 
+	/* max_segment is the maximum number of contiguous PAGE_SIZE pages we
+	 * can have in the bounce buffer, assuming swiotlb. So optimistically
+	 * select the largest supported gtt page size which can fit into the
+	 * max_segment. Also take care to properly align the max_segment to
+	 * said page size to avoid any huge pages spilling across sg entries.
+	 */
+	for (j = 0; j < ARRAY_SIZE(gtt_page_sizes); ++j) {
+		unsigned int page_size = gtt_page_sizes[j];
+		unsigned int nr_pages = page_size >> PAGE_SHIFT;
+
+		if (SUPPORTS_PAGE_SIZE(dev_priv, page_size) &&
+		    page_size <= obj->page_size &&
+		    nr_pages <= max_segment) {
+			max_segment = rounddown(max_segment, nr_pages);
+			obj->gtt_page_size = page_size;
+			break;
+		}
+	}
+
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
@@ -2309,6 +2351,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 		return ERR_PTR(-ENOMEM);
 	}
 
+	GEM_BUG_ON(!SUPPORTS_PAGE_SIZE(dev_priv, obj->gtt_page_size));
+	GEM_BUG_ON(!IS_ALIGNED(max_segment << PAGE_SHIFT, obj->gtt_page_size));
+
 	/* Get the list of pages out of our struct file.  They'll be pinned
 	 * at this point until we release them.
 	 *
@@ -2319,7 +2364,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	gfp |= __GFP_NORETRY | __GFP_NOWARN;
 	sg = st->sgl;
 	st->nents = 0;
-	for (i = 0; i < page_count; i++) {
+	for (i = 0; i < page_count; i += hpage_nr_pages(page)) {
 		page = shmem_read_mapping_page_gfp(mapping, i, gfp);
 		if (unlikely(IS_ERR(page))) {
 			i915_gem_shrink(dev_priv,
@@ -2349,17 +2394,36 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 				goto err_sg;
 			}
 		}
+
+		/* If we don't have enough huge pages in the pool, fall back to the
+		 * minimum page size. We can still allocate huge-pages but now
+		 * obj->page_size and obj->gtt_page_size will reflect the
+		 * minimum page size in the mapping.
+		 */
+		page_size = i915_shmem_page_size(page);
+		if (page_size < obj->page_size) {
+			obj->page_size = PAGE_SIZE;
+			obj->gtt_page_size = I915_GTT_PAGE_SIZE;
+		}
+
+		/* TODO: if we don't use huge-pages or the object is small
+		 * we can probably do something clever with contiguous pages
+		 * here, if we have enough of them and they fit nicely into a
+		 * gtt page size and max_segment. Imagine a 64K object, and we
+		 * get 16 contiguous 4K pages, we could get away with a single
+		 * 64K pte.
+		 */
 		if (!i ||
 		    sg->length >= max_segment ||
 		    page_to_pfn(page) != last_pfn + 1) {
 			if (i)
 				sg = sg_next(sg);
 			st->nents++;
-			sg_set_page(sg, page, PAGE_SIZE, 0);
+			sg_set_page(sg, page, page_size, 0);
 		} else {
-			sg->length += PAGE_SIZE;
+			sg->length += page_size;
 		}
-		last_pfn = page_to_pfn(page);
+		last_pfn = page_to_pfn(page) + hpage_nr_pages(page) - 1;
 
 		/* Check that the i965g/gm workaround works. */
 		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
@@ -2372,25 +2436,43 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 
 	ret = i915_gem_gtt_prepare_pages(obj, st);
 	if (ret) {
-		/* DMA remapping failed? One possible cause is that
-		 * it could not reserve enough large entries, asking
-		 * for PAGE_SIZE chunks instead may be helpful.
-		 */
-		if (max_segment > PAGE_SIZE) {
-			for_each_sgt_page(page, sgt_iter, st)
-				put_page(page);
-			sg_free_table(st);
-
-			max_segment = PAGE_SIZE;
-			goto rebuild_st;
-		} else {
+		if (max_segment == PAGE_SIZE) {
 			dev_warn(&dev_priv->drm.pdev->dev,
 				 "Failed to DMA remap %lu pages\n",
 				 page_count);
 			goto err_pages;
 		}
+
+		for_each_sgt_page(page, sgt_iter, st) {
+			if (!PageTail(page))
+				put_page(page);
+		}
+		sg_free_table(st);
+
+		/* DMA remapping failed? One possible cause is that
+		 * it could not reserve enough large entries, trying
+		 * smaller page size chunks instead may be helpful.
+		 *
+		 * We really don't know what the max_segment should be,
+		 * just go with the simple premise that the next
+		 * smallest segment will be at least half the size of
+		 * the previous.
+		 */
+		for (; j < ARRAY_SIZE(gtt_page_sizes); ++j) {
+			unsigned int page_size = gtt_page_sizes[j];
+
+			if (SUPPORTS_PAGE_SIZE(dev_priv, page_size) &&
+			    page_size < max_segment) {
+				obj->gtt_page_size = max_segment = page_size;
+				break;
+			}
+		}
+
+		goto rebuild_st;
 	}
 
+	GEM_BUG_ON(obj->gtt_page_size > obj->page_size);
+
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj, st);
 
@@ -2399,8 +2481,10 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 err_sg:
 	sg_mark_end(sg);
 err_pages:
-	for_each_sgt_page(page, sgt_iter, st)
-		put_page(page);
+	for_each_sgt_page(page, sgt_iter, st) {
+		if (!PageTail(page))
+			put_page(page);
+	}
 	sg_free_table(st);
 	kfree(st);
 
@@ -4192,10 +4276,36 @@ struct drm_i915_gem_object *
 i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 {
 	struct drm_i915_gem_object *obj;
+	unsigned int page_size = PAGE_SIZE;
 	struct address_space *mapping;
 	gfp_t mask;
 	int ret;
 
+	/* If configured *attempt* to use THP through shmemfs. HPAGE_PMD_SIZE
+	 * will either be 2M or 1G depending on the default hugepage_sz. This
+	 * is best effort and will of course depend on how many huge-pages we
+	 * have available in the pool. We determine the gtt page size when we
+	 * actually try pinning the backing storage, where gtt_page_size <=
+	 * page_size.
+	 *
+	 * XXX Some musings:
+	 *
+	 * - We don't know if the object will be inserted into the ppgtt where
+	 *   it will be most beneficial to have huge-pages, or the ggtt where
+	 *   the object will always be treated like a 4K object.
+	 *
+	 * - Similarly should we care if the gtt doesn't support page sizes >
+	 *   4K? If it does then great, if it doesn't then we do at least see
+	 *   the benefit of reduced fragmentation, so it's not a complete
+	 *   waste...thoughts?
+	 */
+#ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
+	if (has_transparent_hugepage() && size >= HPAGE_PMD_SIZE) {
+		page_size = HPAGE_PMD_SIZE;
+		size = round_up(size, page_size);
+	}
+#endif
+
 	/* There is a prevalence of the assumption that we fit the object's
 	 * page count inside a 32bit _signed_ variable. Let's document this and
 	 * catch if we ever need to fix it. In the meantime, if you do spot
@@ -4227,6 +4337,19 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 
 	i915_gem_object_init(obj, &i915_gem_object_ops);
 
+	/* In a few places we interact with shmemfs implicitly by writing
+	 * through the page cache prior to pinning the backing storage; this
+	 * is an optimisation which prevents shmemfs from needlessly clearing
+	 * pages. So in order to control the use of huge-pages, for both the
+	 * pinning of the backing store and any implicit interaction which
+	 * may end up allocating pages, we require more control than the
+	 * read_mapping or getpage interfaces provided by shmem. This should
+	 * effectively default to huge-page allocations in shmem for this
+	 * mapping.
+	 */
+	SHMEM_I(mapping->host)->huge = page_size > PAGE_SIZE;
+	obj->page_size = page_size;
+
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4043145b4310..af295aa3b49c 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -469,6 +469,14 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (ret)
 		return ret;
 
+	/* We don't know the final gtt page size until *after* we pin the
+	 * backing store, or that's at least the case for the shmem backend.
+	 * Therefore re-adjust the alignment if needed. This is only relevant
+	 * for huge-pages being inserted into the ppgtt.
+	 */
+	if (!i915_is_ggtt(vma->vm) && alignment < obj->gtt_page_size)
+		alignment = obj->gtt_page_size;
+
 	if (i915_vm_has_cache_coloring(vma->vm))
 		color = obj->cache_level;
 	else if (i915_vm_has_page_coloring(vma->vm))
-- 
2.9.3
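The page-size selection and the demotion bookkeeping in the patch above can be sketched as a standalone C program. This is an illustrative userspace model, not the kernel code: pick_gtt_page_size() and the GTT_PAGE_SIZE_* macros are stand-ins for SUPPORTS_PAGE_SIZE() and the I915_GTT_PAGE_SIZE_* definitions, and max_segment is treated in PAGE_SIZE units as the comment describes.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12u

/* Hypothetical stand-ins for the I915_GTT_PAGE_SIZE_* definitions. */
#define GTT_PAGE_SIZE_4K  (1u << 12)
#define GTT_PAGE_SIZE_64K (1u << 16)
#define GTT_PAGE_SIZE_2M  (1u << 21)
#define GTT_PAGE_SIZE_1G  (1u << 30)

static const unsigned int gtt_page_sizes[] = {
	GTT_PAGE_SIZE_1G, GTT_PAGE_SIZE_2M, GTT_PAGE_SIZE_64K, GTT_PAGE_SIZE_4K,
};

/*
 * Optimistically pick the largest supported gtt page size which fits both
 * the object's backing page size and max_segment (in PAGE_SIZE units),
 * rounding max_segment down so no huge page spills across sg entries.
 */
static unsigned int pick_gtt_page_size(unsigned int supported_mask,
				       unsigned int obj_page_size,
				       unsigned int *max_segment)
{
	size_t j;

	for (j = 0; j < sizeof(gtt_page_sizes) / sizeof(gtt_page_sizes[0]); j++) {
		unsigned int page_size = gtt_page_sizes[j];
		unsigned int nr_pages = page_size >> PAGE_SHIFT;

		if ((supported_mask & page_size) &&
		    page_size <= obj_page_size &&
		    nr_pages <= *max_segment) {
			*max_segment -= *max_segment % nr_pages; /* rounddown() */
			return page_size;
		}
	}
	return GTT_PAGE_SIZE_4K;
}
```

With a 2M-backed object and a 1000-page max_segment this selects 2M and trims max_segment to 512 pages, which is exactly the alignment the later GEM_BUG_ON(!IS_ALIGNED(max_segment << PAGE_SHIFT, obj->gtt_page_size)) asserts.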

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 01/18] drm/i915: add page_size_mask to dev_info
  2017-04-04 22:11 ` [PATCH 01/18] drm/i915: add page_size_mask to dev_info Matthew Auld
@ 2017-04-05  6:19   ` Joonas Lahtinen
  2017-04-05  8:45     ` Chris Wilson
  2017-04-05  8:43   ` Chris Wilson
  1 sibling, 1 reply; 42+ messages in thread
From: Joonas Lahtinen @ 2017-04-05  6:19 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Add commit message.

On ti, 2017-04-04 at 23:11 +0100, Matthew Auld wrote:
> v2:
>   - move out pde/pdpe bit definitions until later
>   - tidyup the page size definitions, use BIT
>   - introduce helper for detecting invalid page sizes
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

<SNIP>

> @@ -2876,6 +2877,8 @@ intel_info(const struct drm_i915_private *dev_priv)
>  #define USES_PPGTT(dev_priv)		(i915.enable_ppgtt)
>  #define USES_FULL_PPGTT(dev_priv)	(i915.enable_ppgtt >= 2)
>  #define USES_FULL_48BIT_PPGTT(dev_priv)	(i915.enable_ppgtt == 3)
> +#define SUPPORTS_PAGE_SIZE(dev_priv, page_size) \
> +	((dev_priv)->info.page_size_mask & (page_size))

Why not HAS_PAGE_SIZE()?

> +#define is_valid_gtt_page_size(page_size) \
> +	(is_power_of_2(page_size) && \
> +	 (page_size) & I915_GTT_PAGE_SIZE_MASK)

When would this matter? I'd assume we always gotta rely on device info.

> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -56,6 +56,10 @@
>  	.color = { .degamma_lut_size = 65, .gamma_lut_size = 257 }
>  
>  /* Keep in gen based order, and chronological order within a gen */
> +
> +#define GEN_DEFAULT_PAGE_SZ \
> +	.page_size_mask = I915_GTT_PAGE_SIZE_4K

GEN_DEFAULT_PAGE_SIZES

> @@ -346,13 +358,18 @@ static const struct intel_device_info intel_cherryview_info = {
>  	.has_aliasing_ppgtt = 1,
>  	.has_full_ppgtt = 1,
>  	.display_mmio_offset = VLV_DISPLAY_BASE,
> +	.page_size_mask = I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_2M | I915_GTT_PAGE_SIZE_1G,

Split long line.

>  	GEN_CHV_PIPEOFFSETS,
>  	CURSOR_OFFSETS,
> >  	CHV_COLORS,
>  };
>  
> +#define GEN9_DEFAULT_PAGE_SZ \
> +	.page_size_mask = I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_2M | I915_GTT_PAGE_SIZE_1G

GEN9_DEFAULT_PAGE_SIZES, also split long line.

With above,

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-04 22:11 ` [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members Matthew Auld
@ 2017-04-05  6:26   ` Joonas Lahtinen
  2017-04-05  6:49   ` Daniel Vetter
  1 sibling, 0 replies; 42+ messages in thread
From: Joonas Lahtinen @ 2017-04-05  6:26 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Commit message to explain why.

On ti, 2017-04-04 at 23:11 +0100, Matthew Auld wrote:
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>

<SNIP>

> 
> @@ -2441,6 +2441,8 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  	struct sg_table *pages;
>  
>  	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
> +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->page_size));
> +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->gtt_page_size));

GEM_BUG_ON(!HAS_PAGE_SIZE(obj->i915, obj->page_size));
GEM_BUG_ON(!HAS_PAGE_SIZE(obj->i915, obj->gtt_page_size));

Patches should be split functionally, not for the sake of splitting.
It's rather hard to complete review when the code being added is pretty
much a no-op, so for the next revision you could squash them a bit. For
now, as a leap of faith;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: [PATCH 04/18] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust
  2017-04-04 22:11 ` [PATCH 04/18] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust Matthew Auld
@ 2017-04-05  6:30   ` Joonas Lahtinen
  0 siblings, 0 replies; 42+ messages in thread
From: Joonas Lahtinen @ 2017-04-05  6:30 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Commit message for why we do this.

On ti, 2017-04-04 at 23:11 +0100, Matthew Auld wrote:
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: [PATCH 05/18] drm/i915: clean up cache coloring
  2017-04-04 22:11 ` [PATCH 05/18] drm/i915: clean up cache coloring Matthew Auld
@ 2017-04-05  6:35   ` Joonas Lahtinen
  0 siblings, 0 replies; 42+ messages in thread
From: Joonas Lahtinen @ 2017-04-05  6:35 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

On ti, 2017-04-04 at 23:11 +0100, Matthew Auld wrote:
> Rid the code of any mm.color_adjust assumptions to allow adding another
> flavour of coloring.
> 
> v2: better naming
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

<SNIP>

> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -347,6 +347,12 @@ struct i915_address_space {
>  #define i915_is_ggtt(V) (!(V)->file)
>  
>  static inline bool
> +i915_vm_has_cache_coloring(const struct i915_address_space *vm)
> +{
> +	return vm->mm.color_adjust && i915_is_ggtt(vm);
> +}

I'd first check the is_ggtt() because it's more important one, and drop
a comment here as to why we can make the decision.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: [PATCH 06/18] drm/i915: export color_differs
  2017-04-04 22:11 ` [PATCH 06/18] drm/i915: export color_differs Matthew Auld
@ 2017-04-05  6:39   ` Joonas Lahtinen
  0 siblings, 0 replies; 42+ messages in thread
From: Joonas Lahtinen @ 2017-04-05  6:39 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

On ti, 2017-04-04 at 23:11 +0100, Matthew Auld wrote:
> Export color_differs so that we can use it elsewhere.
> 
> v2: better naming
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: [PATCH 17/18] mm/shmem: tweak the huge-page interface
  2017-04-04 22:11 ` [PATCH 17/18] mm/shmem: tweak the huge-page interface Matthew Auld
@ 2017-04-05  6:42   ` Daniel Vetter
  0 siblings, 0 replies; 42+ messages in thread
From: Daniel Vetter @ 2017-04-05  6:42 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Apr 04, 2017 at 11:11:27PM +0100, Matthew Auld wrote:
> In its current form huge-pages through shmemfs are controlled at the
> super-block level, and are currently disabled by default, so to enable
> huge-pages for a shmem backed gem object we would need to re-mount the
> fs with the huge= argument, but for drm the mount is not user visible,
> so good luck with that. The other option is the global sysfs knob
> shmem_enabled which exposes the same huge= options, with the addition of
> DENY and FORCE.
> 
> Neither option seems really workable; what we probably want is to be
> able to control the use of huge-pages at the time of pinning the backing
> storage for a particular gem object, and only where it makes sense given
> the size of the object. One caveat is when we write into the page cache
> prior to pinning the backing storage. I played around with a bunch of
> ideas but in the end just settled with driver overridable huge option
> embedded in shmem_inode_info. Thoughts?
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>

You need to Cc: mm folks and mailing lists for this. Ask
scripts/get_maintainers.pl for the full list please. Otherwise this can't
ever land (and we're looking at along time of bikeshedding anyway).
-Daniel

> ---
>  include/linux/shmem_fs.h |  1 +
>  mm/shmem.c               | 10 ++++++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index a7d6bd2a918f..001be751420d 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -21,6 +21,7 @@ struct shmem_inode_info {
>  	struct shared_policy	policy;		/* NUMA memory alloc policy */
>  	struct simple_xattrs	xattrs;		/* list of xattrs */
>  	struct inode		vfs_inode;
> +	bool                    huge;           /* driver override shmem_huge */
>  };
>  
>  struct shmem_sb_info {
> diff --git a/mm/shmem.c b/mm/shmem.c
> index e67d6ba4e98e..879a9e514afe 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1723,6 +1723,9 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
>  		/* shmem_symlink() */
>  		if (mapping->a_ops != &shmem_aops)
>  			goto alloc_nohuge;
> +		/* driver override shmem_huge */
> +		if (info->huge)
> +			goto alloc_huge;
>  		if (shmem_huge == SHMEM_HUGE_DENY || sgp_huge == SGP_NOHUGE)
>  			goto alloc_nohuge;
>  		if (shmem_huge == SHMEM_HUGE_FORCE)
> @@ -2000,6 +2003,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
>  	unsigned long inflated_len;
>  	unsigned long inflated_addr;
>  	unsigned long inflated_offset;
> +	struct shmem_inode_info *info = SHMEM_I(file_inode(file));
>  
>  	if (len > TASK_SIZE)
>  		return -ENOMEM;
> @@ -2016,7 +2020,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
>  	if (addr > TASK_SIZE - len)
>  		return addr;
>  
> -	if (shmem_huge == SHMEM_HUGE_DENY)
> +	if (!info->huge && shmem_huge == SHMEM_HUGE_DENY)
>  		return addr;
>  	if (len < HPAGE_PMD_SIZE)
>  		return addr;
> @@ -2030,7 +2034,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
>  	if (uaddr)
>  		return addr;
>  
> -	if (shmem_huge != SHMEM_HUGE_FORCE) {
> +	if (!info->huge && shmem_huge != SHMEM_HUGE_FORCE) {
>  		struct super_block *sb;
>  
>  		if (file) {
> @@ -4034,6 +4038,8 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
>  	loff_t i_size;
>  	pgoff_t off;
>  
> +	if (SHMEM_I(inode)->huge)
> +		return true;
>  	if (shmem_huge == SHMEM_HUGE_FORCE)
>  		return true;
>  	if (shmem_huge == SHMEM_HUGE_DENY)
> -- 
> 2.9.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
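The precedence that patch 17 ends up with can be sketched as below. This is a deliberately simplified model with assumed names and values: the real shmem_getpage_gfp() also consults the per-mount huge= setting, SGP hints and the WITHIN_SIZE/ADVISE modes, none of which are modelled here. The point is only that the per-inode driver override beats the global sysfs knob, including DENY.

```c
#include <assert.h>

/* Simplified stand-ins for the shmem_huge sysfs modes. */
enum { SHMEM_HUGE_NEVER, SHMEM_HUGE_ALWAYS, SHMEM_HUGE_DENY, SHMEM_HUGE_FORCE };

/*
 * The driver-set per-inode flag (info->huge in the patch) wins over the
 * global knob: an inode the driver marked huge gets huge pages even
 * under SHMEM_HUGE_DENY.
 */
static int use_huge_page(int inode_huge, int shmem_huge)
{
	if (inode_huge)
		return 1;
	if (shmem_huge == SHMEM_HUGE_DENY)
		return 0;
	return shmem_huge == SHMEM_HUGE_FORCE || shmem_huge == SHMEM_HUGE_ALWAYS;
}
```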


* Re: [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-04 22:11 ` [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members Matthew Auld
  2017-04-05  6:26   ` Joonas Lahtinen
@ 2017-04-05  6:49   ` Daniel Vetter
  2017-04-05  8:48     ` Chris Wilson
  1 sibling, 1 reply; 42+ messages in thread
From: Daniel Vetter @ 2017-04-05  6:49 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Apr 04, 2017 at 11:11:12PM +0100, Matthew Auld wrote:
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c        | 5 +++++
>  drivers/gpu/drm/i915/i915_gem_object.h | 3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 4ca88f2539c0..cbf97f4bbb72 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2441,6 +2441,8 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  	struct sg_table *pages;
>  
>  	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
> +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->page_size));
> +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->gtt_page_size));
>  
>  	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
>  		DRM_DEBUG("Attempting to obtain a purgeable object\n");
> @@ -4159,6 +4161,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
>  
>  	obj->ops = ops;
>  
> +	obj->page_size = PAGE_SIZE;
> +	obj->gtt_page_size = I915_GTT_PAGE_SIZE;
> +
>  	reservation_object_init(&obj->__builtin_resv);
>  	obj->resv = &obj->__builtin_resv;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> index 174cf923c236..b1dacbfe5173 100644
> --- a/drivers/gpu/drm/i915/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/i915_gem_object.h
> @@ -107,6 +107,9 @@ struct drm_i915_gem_object {
>  	unsigned int cache_level:3;
>  	unsigned int cache_dirty:1;
>  
> +	unsigned int page_size; /* CPU pov - 4K(default), 2M, 1G */
> +	unsigned int gtt_page_size; /* GPU pov - 4K(default), 64K, 2M, 1G */

Just kinda architecture review, with a long-term view: Is the plan to
eventually become more flexible here, i.e. allow mixed mode? We can of
course ask shmem to try really hard to give us huge pages, but at the end
it might not be able to give us a huge page (if the obj size isn't rounded
to 2M), and there's no hw reason to not map everything else as hugepage.
Through sg table coalescing we can cope with that, and we can check fairly
cheaply whether an entry is big enough to be eligible for huge page
mapping.

That also means in the pte functions we'd not make a top-level decision
whether to use huge entries or not, but do that at each level by looking
at the sg table. This should also make it easier for stolen, which is
always contiguous but rather often not size-rounded.

It's a bit more tricky for 64kb pages, but I think those only can be used
for an object which already has huge pages/is contiguous, but where the
size is only rounded to 64kb and not 2m (because 2m would waste too much
space). Then we can map the partial 2m using 64kb entries.

Just some long-term thoughts on this, where I expect things will head
eventually.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
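The per-entry decision Daniel describes can be sketched roughly as follows. This is a hedged illustration with made-up names (entry_page_size() is not an i915 function): instead of a top-level obj->page_size, each contiguous sg chunk is mapped with the largest supported page size that divides its start address and fits in its length, so an unaligned head or tail naturally falls back to 64K or 4K entries.

```c
#include <assert.h>
#include <stddef.h>

/*
 * For one contiguous chunk of backing store, pick the largest hw-supported
 * page size that both divides the chunk's start address and fits in its
 * remaining length; smaller entries cover the unaligned head/tail.
 */
static unsigned int entry_page_size(unsigned long long addr,
				    unsigned long long len,
				    unsigned int supported_mask)
{
	static const unsigned int sizes[] = {
		1u << 30, 1u << 21, 1u << 16, 1u << 12, /* 1G, 2M, 64K, 4K */
	};
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		unsigned int ps = sizes[i];

		if ((supported_mask & ps) && addr % ps == 0 && len >= ps)
			return ps;
	}
	return 1u << 12;
}
```

A pte-insertion loop would call this per chunk and advance by the returned size, which also covers the stolen-memory case Daniel mentions (contiguous but not size-rounded).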


* Re: [PATCH 01/18] drm/i915: add page_size_mask to dev_info
  2017-04-04 22:11 ` [PATCH 01/18] drm/i915: add page_size_mask to dev_info Matthew Auld
  2017-04-05  6:19   ` Joonas Lahtinen
@ 2017-04-05  8:43   ` Chris Wilson
  1 sibling, 0 replies; 42+ messages in thread
From: Chris Wilson @ 2017-04-05  8:43 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Apr 04, 2017 at 11:11:11PM +0100, Matthew Auld wrote:
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index fb15684c1d83..27b2b9e681db 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -42,7 +42,22 @@
>  #include "i915_gem_request.h"
>  #include "i915_selftest.h"
>  
> -#define I915_GTT_PAGE_SIZE 4096UL
> +#define I915_GTT_PAGE_SIZE_4K BIT(12)
> +#define I915_GTT_PAGE_SIZE_64K BIT(16)
> +#define I915_GTT_PAGE_SIZE_2M BIT(21)
> +#define I915_GTT_PAGE_SIZE_1G BIT(30)
> +
> +#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K
> +
> +#define I915_GTT_PAGE_SIZE_MASK (I915_GTT_PAGE_SIZE_4K | \
> +				 I915_GTT_PAGE_SIZE_64K | \
> +				 I915_GTT_PAGE_SIZE_2M | \
> +				 I915_GTT_PAGE_SIZE_1G)
> +
> +#define is_valid_gtt_page_size(page_size) \
> +	(is_power_of_2(page_size) && \
> +	 (page_size) & I915_GTT_PAGE_SIZE_MASK)

({ unsigned int __size = (page_size); (__size & SIZE_MASK) == __size; })

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 01/18] drm/i915: add page_size_mask to dev_info
  2017-04-05  6:19   ` Joonas Lahtinen
@ 2017-04-05  8:45     ` Chris Wilson
  2017-04-05 12:57       ` Joonas Lahtinen
  0 siblings, 1 reply; 42+ messages in thread
From: Chris Wilson @ 2017-04-05  8:45 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx, Matthew Auld

On Wed, Apr 05, 2017 at 09:19:49AM +0300, Joonas Lahtinen wrote:
> On ti, 2017-04-04 at 23:11 +0100, Matthew Auld wrote:
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -56,6 +56,10 @@
> >  	.color = { .degamma_lut_size = 65, .gamma_lut_size = 257 }
> >  
> >  /* Keep in gen based order, and chronological order within a gen */
> > +
> > +#define GEN_DEFAULT_PAGE_SZ \
> > +	.page_size_mask = I915_GTT_PAGE_SIZE_4K
> 
> GEN_DEFAULT_PAGE_SIZES
> 
> > @@ -346,13 +358,18 @@ static const struct intel_device_info intel_cherryview_info = {
> >  	.has_aliasing_ppgtt = 1,
> >  	.has_full_ppgtt = 1,
> >  	.display_mmio_offset = VLV_DISPLAY_BASE,
> > +	.page_size_mask = I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_2M | I915_GTT_PAGE_SIZE_1G,
> 
> Split long line.
> 
> >  	GEN_CHV_PIPEOFFSETS,
> >  	CURSOR_OFFSETS,
> > >  	CHV_COLORS,
> >  };
> >  
> > +#define GEN9_DEFAULT_PAGE_SZ \
> > +	.page_size_mask = I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_2M | I915_GTT_PAGE_SIZE_1G
> 
> GEN9_DEFAULT_PAGE_SIZES, also split long line.

Also not in this patch. The first patch is to set everything to the status
quo. The last patch will be to enable the (completed) feature on the
platforms that support it. In testing, that enabling patch comes early on
to check bisection of the series.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-05  6:49   ` Daniel Vetter
@ 2017-04-05  8:48     ` Chris Wilson
  2017-04-05 10:07       ` Matthew Auld
  0 siblings, 1 reply; 42+ messages in thread
From: Chris Wilson @ 2017-04-05  8:48 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Matthew Auld

On Wed, Apr 05, 2017 at 08:49:17AM +0200, Daniel Vetter wrote:
> On Tue, Apr 04, 2017 at 11:11:12PM +0100, Matthew Auld wrote:
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c        | 5 +++++
> >  drivers/gpu/drm/i915/i915_gem_object.h | 3 +++
> >  2 files changed, 8 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 4ca88f2539c0..cbf97f4bbb72 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2441,6 +2441,8 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> >  	struct sg_table *pages;
> >  
> >  	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
> > +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->page_size));
> > +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->gtt_page_size));
> >  
> >  	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
> >  		DRM_DEBUG("Attempting to obtain a purgeable object\n");
> > @@ -4159,6 +4161,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
> >  
> >  	obj->ops = ops;
> >  
> > +	obj->page_size = PAGE_SIZE;
> > +	obj->gtt_page_size = I915_GTT_PAGE_SIZE;
> > +
> >  	reservation_object_init(&obj->__builtin_resv);
> >  	obj->resv = &obj->__builtin_resv;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> > index 174cf923c236..b1dacbfe5173 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_object.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_object.h
> > @@ -107,6 +107,9 @@ struct drm_i915_gem_object {
> >  	unsigned int cache_level:3;
> >  	unsigned int cache_dirty:1;
> >  
> > +	unsigned int page_size; /* CPU pov - 4K(default), 2M, 1G */
> > +	unsigned int gtt_page_size; /* GPU pov - 4K(default), 64K, 2M, 1G */
> 
> Just kinda architecture review, with a long-term view: Is the plan to
> eventually become more flexible here, i.e. allow mixed mode?

Simply put we can not support obj->page_size. Every object will be
composed of a mixture of page sizes, often outside of our control and
those page sizes may vary over the lifetime of the object.

Trying to design around an a priori static page_size is a bad idea, imo.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2
  2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
                   ` (17 preceding siblings ...)
  2017-04-04 22:11 ` [PATCH 18/18] drm/i915: support transparent-huge-pages through shmemfs Matthew Auld
@ 2017-04-05  8:53 ` Chris Wilson
  18 siblings, 0 replies; 42+ messages in thread
From: Chris Wilson @ 2017-04-05  8:53 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Apr 04, 2017 at 11:11:10PM +0100, Matthew Auld wrote:
> Same as before, folding in review comments. Notably we now hook in transparent
> huge pages through by shmem, and *attempt* to deal with all the fun which that
> brings. Again should be considered very much RFC.

But where's the explanation for persisting with an inflexible design? I
still do not like the static page size assignments, and trying to force
that down into the GTT, rather than it percolating up.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-05  8:48     ` Chris Wilson
@ 2017-04-05 10:07       ` Matthew Auld
  2017-04-05 12:15         ` Daniel Vetter
  2017-04-05 12:32         ` Chris Wilson
  0 siblings, 2 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-05 10:07 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Matthew Auld, intel-gfx

On 04/05, Chris Wilson wrote:
> On Wed, Apr 05, 2017 at 08:49:17AM +0200, Daniel Vetter wrote:
> > On Tue, Apr 04, 2017 at 11:11:12PM +0100, Matthew Auld wrote:
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c        | 5 +++++
> > >  drivers/gpu/drm/i915/i915_gem_object.h | 3 +++
> > >  2 files changed, 8 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 4ca88f2539c0..cbf97f4bbb72 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2441,6 +2441,8 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> > >  	struct sg_table *pages;
> > >  
> > >  	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
> > > +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->page_size));
> > > +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->gtt_page_size));
> > >  
> > >  	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
> > >  		DRM_DEBUG("Attempting to obtain a purgeable object\n");
> > > @@ -4159,6 +4161,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
> > >  
> > >  	obj->ops = ops;
> > >  
> > > +	obj->page_size = PAGE_SIZE;
> > > +	obj->gtt_page_size = I915_GTT_PAGE_SIZE;
> > > +
> > >  	reservation_object_init(&obj->__builtin_resv);
> > >  	obj->resv = &obj->__builtin_resv;
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> > > index 174cf923c236..b1dacbfe5173 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_object.h
> > > +++ b/drivers/gpu/drm/i915/i915_gem_object.h
> > > @@ -107,6 +107,9 @@ struct drm_i915_gem_object {
> > >  	unsigned int cache_level:3;
> > >  	unsigned int cache_dirty:1;
> > >  
> > > +	unsigned int page_size; /* CPU pov - 4K(default), 2M, 1G */
> > > +	unsigned int gtt_page_size; /* GPU pov - 4K(default), 64K, 2M, 1G */
> > 
> > Just kinda architecture review, with a long-term view: Is the plan to
> > eventually become more flexible here, i.e. allow mixed mode?
> 
> Simply put we can not support obj->page_size. Every object will be
> composed of a mixture of page sizes, often outside of our control and
> those page sizes may vary over the lifetime of the object.
> 
> Trying to design around an a priori static page_size is a bad idea, imo.

I think I've misrepresented the intention of obj->page_size: it merely
serves as a hint when getting pages; thereafter it represents the minimum
page size in the mapping and is just bookkeeping, so mixed pages are
totally fine and expected. I mostly wanted to make it clear to the reader
that we have both a gtt and a cpu page size. I also wanted to know
whether an object is entirely composed of huge pages, for debugfs
purposes. I'll try to rework this to make it less terrible.

> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-05 10:07       ` Matthew Auld
@ 2017-04-05 12:15         ` Daniel Vetter
  2017-04-05 12:32         ` Chris Wilson
  1 sibling, 0 replies; 42+ messages in thread
From: Daniel Vetter @ 2017-04-05 12:15 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, Matthew Auld

On Wed, Apr 05, 2017 at 11:07:55AM +0100, Matthew Auld wrote:
> On 04/05, Chris Wilson wrote:
> > On Wed, Apr 05, 2017 at 08:49:17AM +0200, Daniel Vetter wrote:
> > > On Tue, Apr 04, 2017 at 11:11:12PM +0100, Matthew Auld wrote:
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_gem.c        | 5 +++++
> > > >  drivers/gpu/drm/i915/i915_gem_object.h | 3 +++
> > > >  2 files changed, 8 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 4ca88f2539c0..cbf97f4bbb72 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -2441,6 +2441,8 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> > > >  	struct sg_table *pages;
> > > >  
> > > >  	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
> > > > +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->page_size));
> > > > +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->gtt_page_size));
> > > >  
> > > >  	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
> > > >  		DRM_DEBUG("Attempting to obtain a purgeable object\n");
> > > > @@ -4159,6 +4161,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
> > > >  
> > > >  	obj->ops = ops;
> > > >  
> > > > +	obj->page_size = PAGE_SIZE;
> > > > +	obj->gtt_page_size = I915_GTT_PAGE_SIZE;
> > > > +
> > > >  	reservation_object_init(&obj->__builtin_resv);
> > > >  	obj->resv = &obj->__builtin_resv;
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> > > > index 174cf923c236..b1dacbfe5173 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_object.h
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_object.h
> > > > @@ -107,6 +107,9 @@ struct drm_i915_gem_object {
> > > >  	unsigned int cache_level:3;
> > > >  	unsigned int cache_dirty:1;
> > > >  
> > > > +	unsigned int page_size; /* CPU pov - 4K(default), 2M, 1G */
> > > > +	unsigned int gtt_page_size; /* GPU pov - 4K(default), 64K, 2M, 1G */
> > > 
> > > Just kinda architecture review, with a long-term view: Is the plan to
> > > eventually become more flexible here, i.e. allow mixed mode?
> > 
> > Simply put, we cannot support obj->page_size. Every object will be
> > composed of a mixture of page sizes, often outside of our control and
> > those page sizes may vary over the lifetime of the object.
> > 
> > Trying to design around an a priori static page_size is a bad idea, imo.
> 
> I think I've misrepresented the intention of obj->page_size: it merely
> serves as a hint when getting pages; thereafter it represents the minimum
> page size in the mapping and is just bookkeeping, so mixed pages are
> totally fine and expected. I mostly wanted to make it clear to the reader
> that we have both a gtt and a cpu page size. I also wanted to know
> whether an object is entirely composed of huge pages, for debugfs
> purposes. I'll try to rework this to make it less terrible.

But if you already handle mixed pages, why do we still need this? As
this thread shows, it does seem to cause quite a bit of confusion.

And I have no idea why we need the cpu_page_size, we kmap at the page
level anyway, and shmem handles the cpu mmap stuff for us.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-05 10:07       ` Matthew Auld
  2017-04-05 12:15         ` Daniel Vetter
@ 2017-04-05 12:32         ` Chris Wilson
  2017-04-05 12:39           ` Chris Wilson
  1 sibling, 1 reply; 42+ messages in thread
From: Chris Wilson @ 2017-04-05 12:32 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, Matthew Auld

On Wed, Apr 05, 2017 at 11:07:55AM +0100, Matthew Auld wrote:
> On 04/05, Chris Wilson wrote:
> > On Wed, Apr 05, 2017 at 08:49:17AM +0200, Daniel Vetter wrote:
> > > On Tue, Apr 04, 2017 at 11:11:12PM +0100, Matthew Auld wrote:
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> > > > index 174cf923c236..b1dacbfe5173 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_object.h
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_object.h
> > > > @@ -107,6 +107,9 @@ struct drm_i915_gem_object {
> > > >  	unsigned int cache_level:3;
> > > >  	unsigned int cache_dirty:1;
> > > >  
> > > > +	unsigned int page_size; /* CPU pov - 4K(default), 2M, 1G */
> > > > +	unsigned int gtt_page_size; /* GPU pov - 4K(default), 64K, 2M, 1G */
> > > 
> > > Just kinda architecture review, with a long-term view: Is the plan to
> > > eventually become more flexible here, i.e. allow mixed mode?
> > 
> > Simply put, we cannot support obj->page_size. Every object will be
> > composed of a mixture of page sizes, often outside of our control and
> > those page sizes may vary over the lifetime of the object.
> > 
> > Trying to design around an a priori static page_size is a bad idea, imo.
> 
> I think I've misrepresented the intention of obj->page_size: it merely
> serves as a hint when getting pages; thereafter it represents the minimum
> page size in the mapping and is just bookkeeping, so mixed pages are
> totally fine and expected. I mostly wanted to make it clear to the reader
> that we have both a gtt and a cpu page size. I also wanted to know
> whether an object is entirely composed of huge pages, for debugfs
> purposes. I'll try to rework this to make it less terrible.

An approach that might be interesting: on pinning the pages
(i.e. ops->get_pages), fill in a bitmask of the page sizes present,
something like

for_each_sg() {
	obj->mm.page_sizes |= fls(sg->length);
}

obj->mm.page_sizes &= ~i915->info.page_sizes;

Then in insert_pages or probably better as vma_insert:

	gtt_page_alignment = fls(obj->mm.page_sizes);

I think that will be useful info even prior to trying to put it to good
use, i.e. that will be enough to start dumping debugfs stats over how
frequently we get large allocations.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members
  2017-04-05 12:32         ` Chris Wilson
@ 2017-04-05 12:39           ` Chris Wilson
  0 siblings, 0 replies; 42+ messages in thread
From: Chris Wilson @ 2017-04-05 12:39 UTC (permalink / raw)
  To: Matthew Auld, Daniel Vetter, Matthew Auld, intel-gfx

On Wed, Apr 05, 2017 at 01:32:30PM +0100, Chris Wilson wrote:
> An approach that might be interesting. On pinning the pages
> (i.e. ops->get_pages) if we fill in the bitmask of page sizes, something
> like
> 
> for_each_sg() {
> 	obj->mm.page_sizes |= fls(sg->length);
> }
> 
> obj->mm.page_sizes &= ~i915->info.page_sizes;

Well, that was written in haste. You'll need a precomputed translation
table to map the order to the nearest supported page size.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 01/18] drm/i915: add page_size_mask to dev_info
  2017-04-05  8:45     ` Chris Wilson
@ 2017-04-05 12:57       ` Joonas Lahtinen
  0 siblings, 0 replies; 42+ messages in thread
From: Joonas Lahtinen @ 2017-04-05 12:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Matthew Auld

On ke, 2017-04-05 at 09:45 +0100, Chris Wilson wrote:
> 
> Also not in this patch. First patch is to set everything to the status
> quo. The last patch will be to enable the (completed) feature on the
> platforms using it. In testing, that enabling patch comes early on to check
> bisection of the series.

Except when the feature is implemented in increments; I already
mentioned in another patch that the ordering could be shuffled to allow
for easier review (which will lead to easier bisecting, too).

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: [PATCH 07/18] drm/i915: introduce ppgtt page coloring
  2017-04-04 22:11 ` [PATCH 07/18] drm/i915: introduce ppgtt page coloring Matthew Auld
@ 2017-04-05 13:41   ` Chris Wilson
  2017-04-05 13:50     ` Matthew Auld
  0 siblings, 1 reply; 42+ messages in thread
From: Chris Wilson @ 2017-04-05 13:41 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Apr 04, 2017 at 11:11:17PM +0100, Matthew Auld wrote:
> To enable 64K pages we need to set the intermediate-page-size(IPS) bit
> of the pde, therefore a page table is said to be either operating in 64K
> or 4K mode. To accommodate this vm placement restriction we introduce a
> color for pages and corresponding color_adjust callback.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 25 +++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  6 ++++++
>  drivers/gpu/drm/i915/i915_vma.c     |  2 ++
>  3 files changed, 33 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 0989af4a17e4..ddc3db345b76 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1332,6 +1332,28 @@ static int gen8_preallocate_top_level_pdp(struct i915_hw_ppgtt *ppgtt)
>  	return -ENOMEM;
>  }
>  
> +static void i915_page_color_adjust(const struct drm_mm_node *node,
> +				   unsigned long color,
> +				   u64 *start,
> +				   u64 *end)
> +{
> +	GEM_BUG_ON(!is_valid_gtt_page_size(color));
> +
> +	if (!(color & (I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K)))
> +		return;
> +
> +	GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
> +
> +	if (i915_node_color_differs(node, color))
> +		*start = roundup(*start, 1 << GEN8_PDE_SHIFT);
> +
> +	node = list_next_entry(node, node_list);
> +	if (i915_node_color_differs(node, color))
> +		*end = rounddown(*end, 1 << GEN8_PDE_SHIFT);
> +
> +	GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
> +}
> +
>  /*
>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
>   * with a net effect resembling a 2-level page table in normal x86 terms. Each
> @@ -1372,6 +1394,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  		ppgtt->base.allocate_va_range = gen8_ppgtt_alloc_4lvl;
>  		ppgtt->base.insert_entries = gen8_ppgtt_insert_4lvl;
>  		ppgtt->base.clear_range = gen8_ppgtt_clear_4lvl;
> +
> +		if (SUPPORTS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K))
> +			ppgtt->base.mm.color_adjust = i915_page_color_adjust;
>  	} else {
>  		ret = __pdp_init(&ppgtt->base, &ppgtt->pdp);
>  		if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 9c592e2de516..8d893ddd98f2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -353,6 +353,12 @@ i915_vm_has_cache_coloring(const struct i915_address_space *vm)
>  }
>  
>  static inline bool
> +i915_vm_has_page_coloring(const struct i915_address_space *vm)
> +{
> +	return vm->mm.color_adjust && !i915_is_ggtt(vm);
> +}
> +
> +static inline bool
>  i915_vm_is_48bit(const struct i915_address_space *vm)
>  {
>  	return (vm->total - 1) >> 32;
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 8f0041ba328f..4043145b4310 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -471,6 +471,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>  
>  	if (i915_vm_has_cache_coloring(vma->vm))
>  		color = obj->cache_level;
> +	else if (i915_vm_has_page_coloring(vma->vm))
> +		color = obj->gtt_page_size;

This does not need color_adjust since you are just specifying an
alignment and size. Why the extra complications? I remember asking the
same last time.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 07/18] drm/i915: introduce ppgtt page coloring
  2017-04-05 13:41   ` Chris Wilson
@ 2017-04-05 13:50     ` Matthew Auld
  2017-04-05 14:02       ` Chris Wilson
  0 siblings, 1 reply; 42+ messages in thread
From: Matthew Auld @ 2017-04-05 13:50 UTC (permalink / raw)
  To: Chris Wilson, Matthew Auld, Intel Graphics Development

On 5 April 2017 at 14:41, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, Apr 04, 2017 at 11:11:17PM +0100, Matthew Auld wrote:
>> To enable 64K pages we need to set the intermediate-page-size(IPS) bit
>> of the pde, therefore a page table is said to be either operating in 64K
>> or 4K mode. To accommodate this vm placement restriction we introduce a
>> color for pages and corresponding color_adjust callback.
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_gtt.c | 25 +++++++++++++++++++++++++
>>  drivers/gpu/drm/i915/i915_gem_gtt.h |  6 ++++++
>>  drivers/gpu/drm/i915/i915_vma.c     |  2 ++
>>  3 files changed, 33 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 0989af4a17e4..ddc3db345b76 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -1332,6 +1332,28 @@ static int gen8_preallocate_top_level_pdp(struct i915_hw_ppgtt *ppgtt)
>>       return -ENOMEM;
>>  }
>>
>> +static void i915_page_color_adjust(const struct drm_mm_node *node,
>> +                                unsigned long color,
>> +                                u64 *start,
>> +                                u64 *end)
>> +{
>> +     GEM_BUG_ON(!is_valid_gtt_page_size(color));
>> +
>> +     if (!(color & (I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K)))
>> +             return;
>> +
>> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
>> +
>> +     if (i915_node_color_differs(node, color))
>> +             *start = roundup(*start, 1 << GEN8_PDE_SHIFT);
>> +
>> +     node = list_next_entry(node, node_list);
>> +     if (i915_node_color_differs(node, color))
>> +             *end = rounddown(*end, 1 << GEN8_PDE_SHIFT);
>> +
>> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
>> +}
>> +
>>  /*
>>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
>>   * with a net effect resembling a 2-level page table in normal x86 terms. Each
>> @@ -1372,6 +1394,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>>               ppgtt->base.allocate_va_range = gen8_ppgtt_alloc_4lvl;
>>               ppgtt->base.insert_entries = gen8_ppgtt_insert_4lvl;
>>               ppgtt->base.clear_range = gen8_ppgtt_clear_4lvl;
>> +
>> +             if (SUPPORTS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K))
>> +                     ppgtt->base.mm.color_adjust = i915_page_color_adjust;
>>       } else {
>>               ret = __pdp_init(&ppgtt->base, &ppgtt->pdp);
>>               if (ret)
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> index 9c592e2de516..8d893ddd98f2 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> @@ -353,6 +353,12 @@ i915_vm_has_cache_coloring(const struct i915_address_space *vm)
>>  }
>>
>>  static inline bool
>> +i915_vm_has_page_coloring(const struct i915_address_space *vm)
>> +{
>> +     return vm->mm.color_adjust && !i915_is_ggtt(vm);
>> +}
>> +
>> +static inline bool
>>  i915_vm_is_48bit(const struct i915_address_space *vm)
>>  {
>>       return (vm->total - 1) >> 32;
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> index 8f0041ba328f..4043145b4310 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -471,6 +471,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>>
>>       if (i915_vm_has_cache_coloring(vma->vm))
>>               color = obj->cache_level;
>> +     else if (i915_vm_has_page_coloring(vma->vm))
>> +             color = obj->gtt_page_size;
>
> This does not need color_adjust since you are just specifying an
> alignment and size. Why the extra complications? I remember asking the
> same last time.
Hmm, are you saying that the whole idea of needing a color_adjust for
4K/64K vm placement is completely unnecessary?


* Re: [PATCH 07/18] drm/i915: introduce ppgtt page coloring
  2017-04-05 13:50     ` Matthew Auld
@ 2017-04-05 14:02       ` Chris Wilson
  2017-04-05 15:05         ` Matthew Auld
  2017-04-10 12:08         ` Matthew Auld
  0 siblings, 2 replies; 42+ messages in thread
From: Chris Wilson @ 2017-04-05 14:02 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld

On Wed, Apr 05, 2017 at 02:50:41PM +0100, Matthew Auld wrote:
> On 5 April 2017 at 14:41, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Tue, Apr 04, 2017 at 11:11:17PM +0100, Matthew Auld wrote:
> >> To enable 64K pages we need to set the intermediate-page-size(IPS) bit
> >> of the pde, therefore a page table is said to be either operating in 64K
> >> or 4K mode. To accommodate this vm placement restriction we introduce a
> >> color for pages and corresponding color_adjust callback.
> >>
> >> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_gem_gtt.c | 25 +++++++++++++++++++++++++
> >>  drivers/gpu/drm/i915/i915_gem_gtt.h |  6 ++++++
> >>  drivers/gpu/drm/i915/i915_vma.c     |  2 ++
> >>  3 files changed, 33 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> index 0989af4a17e4..ddc3db345b76 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> @@ -1332,6 +1332,28 @@ static int gen8_preallocate_top_level_pdp(struct i915_hw_ppgtt *ppgtt)
> >>       return -ENOMEM;
> >>  }
> >>
> >> +static void i915_page_color_adjust(const struct drm_mm_node *node,
> >> +                                unsigned long color,
> >> +                                u64 *start,
> >> +                                u64 *end)
> >> +{
> >> +     GEM_BUG_ON(!is_valid_gtt_page_size(color));
> >> +
> >> +     if (!(color & (I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K)))
> >> +             return;
> >> +
> >> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
> >> +
> >> +     if (i915_node_color_differs(node, color))
> >> +             *start = roundup(*start, 1 << GEN8_PDE_SHIFT);
> >> +
> >> +     node = list_next_entry(node, node_list);
> >> +     if (i915_node_color_differs(node, color))
> >> +             *end = rounddown(*end, 1 << GEN8_PDE_SHIFT);
> >> +
> >> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
> >> +}
> >> +
> >>  /*
> >>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
> >>   * with a net effect resembling a 2-level page table in normal x86 terms. Each
> >> @@ -1372,6 +1394,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >>               ppgtt->base.allocate_va_range = gen8_ppgtt_alloc_4lvl;
> >>               ppgtt->base.insert_entries = gen8_ppgtt_insert_4lvl;
> >>               ppgtt->base.clear_range = gen8_ppgtt_clear_4lvl;
> >> +
> >> +             if (SUPPORTS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K))
> >> +                     ppgtt->base.mm.color_adjust = i915_page_color_adjust;
> >>       } else {
> >>               ret = __pdp_init(&ppgtt->base, &ppgtt->pdp);
> >>               if (ret)
> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >> index 9c592e2de516..8d893ddd98f2 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >> @@ -353,6 +353,12 @@ i915_vm_has_cache_coloring(const struct i915_address_space *vm)
> >>  }
> >>
> >>  static inline bool
> >> +i915_vm_has_page_coloring(const struct i915_address_space *vm)
> >> +{
> >> +     return vm->mm.color_adjust && !i915_is_ggtt(vm);
> >> +}
> >> +
> >> +static inline bool
> >>  i915_vm_is_48bit(const struct i915_address_space *vm)
> >>  {
> >>       return (vm->total - 1) >> 32;
> >> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> >> index 8f0041ba328f..4043145b4310 100644
> >> --- a/drivers/gpu/drm/i915/i915_vma.c
> >> +++ b/drivers/gpu/drm/i915/i915_vma.c
> >> @@ -471,6 +471,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
> >>
> >>       if (i915_vm_has_cache_coloring(vma->vm))
> >>               color = obj->cache_level;
> >> +     else if (i915_vm_has_page_coloring(vma->vm))
> >> +             color = obj->gtt_page_size;
> >
> > This does not need color_adjust since you are just specifying an
> > alignment and size. Why the extra complications? I remember asking the
> > same last time.
> Hmm, are you saying the whole idea of needing a color_adjust for
> 4K/64K vm placement is completely unnecessary?

As constructed here, yes. Since you just want to request an
obj->gtt_page_size aligned block:

	.size = round_up(size, obj->gtt_page_size),
	.align = max(align, obj->gtt_page_size).

(Hmm, now that I think about it, you shouldn't round the size up unless
insert_pages() is careful not to assume that the last page is a full
superpage. The more I think about this, the more I think you only want
to align the base and let insert_pages group up the superpages.)

Unless I have completely misunderstood, you do not need to insert
gaps between blocks. Both the color_adjust approach and this approach
still need lower level support to amalgamate pages.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 07/18] drm/i915: introduce ppgtt page coloring
  2017-04-05 14:02       ` Chris Wilson
@ 2017-04-05 15:05         ` Matthew Auld
  2017-04-10 12:08         ` Matthew Auld
  1 sibling, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-05 15:05 UTC (permalink / raw)
  To: Chris Wilson, Matthew Auld, Matthew Auld, Intel Graphics Development

On 5 April 2017 at 15:02, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, Apr 05, 2017 at 02:50:41PM +0100, Matthew Auld wrote:
>> On 5 April 2017 at 14:41, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> > On Tue, Apr 04, 2017 at 11:11:17PM +0100, Matthew Auld wrote:
>> >> To enable 64K pages we need to set the intermediate-page-size(IPS) bit
>> >> of the pde, therefore a page table is said to be either operating in 64K
>> >> or 4K mode. To accommodate this vm placement restriction we introduce a
>> >> color for pages and corresponding color_adjust callback.
>> >>
>> >> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> >> ---
>> >>  drivers/gpu/drm/i915/i915_gem_gtt.c | 25 +++++++++++++++++++++++++
>> >>  drivers/gpu/drm/i915/i915_gem_gtt.h |  6 ++++++
>> >>  drivers/gpu/drm/i915/i915_vma.c     |  2 ++
>> >>  3 files changed, 33 insertions(+)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> >> index 0989af4a17e4..ddc3db345b76 100644
>> >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> >> @@ -1332,6 +1332,28 @@ static int gen8_preallocate_top_level_pdp(struct i915_hw_ppgtt *ppgtt)
>> >>       return -ENOMEM;
>> >>  }
>> >>
>> >> +static void i915_page_color_adjust(const struct drm_mm_node *node,
>> >> +                                unsigned long color,
>> >> +                                u64 *start,
>> >> +                                u64 *end)
>> >> +{
>> >> +     GEM_BUG_ON(!is_valid_gtt_page_size(color));
>> >> +
>> >> +     if (!(color & (I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K)))
>> >> +             return;
>> >> +
>> >> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
>> >> +
>> >> +     if (i915_node_color_differs(node, color))
>> >> +             *start = roundup(*start, 1 << GEN8_PDE_SHIFT);
>> >> +
>> >> +     node = list_next_entry(node, node_list);
>> >> +     if (i915_node_color_differs(node, color))
>> >> +             *end = rounddown(*end, 1 << GEN8_PDE_SHIFT);
>> >> +
>> >> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
>> >> +}
>> >> +
>> >>  /*
>> >>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
>> >>   * with a net effect resembling a 2-level page table in normal x86 terms. Each
>> >> @@ -1372,6 +1394,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>> >>               ppgtt->base.allocate_va_range = gen8_ppgtt_alloc_4lvl;
>> >>               ppgtt->base.insert_entries = gen8_ppgtt_insert_4lvl;
>> >>               ppgtt->base.clear_range = gen8_ppgtt_clear_4lvl;
>> >> +
>> >> +             if (SUPPORTS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K))
>> >> +                     ppgtt->base.mm.color_adjust = i915_page_color_adjust;
>> >>       } else {
>> >>               ret = __pdp_init(&ppgtt->base, &ppgtt->pdp);
>> >>               if (ret)
>> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> >> index 9c592e2de516..8d893ddd98f2 100644
>> >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> >> @@ -353,6 +353,12 @@ i915_vm_has_cache_coloring(const struct i915_address_space *vm)
>> >>  }
>> >>
>> >>  static inline bool
>> >> +i915_vm_has_page_coloring(const struct i915_address_space *vm)
>> >> +{
>> >> +     return vm->mm.color_adjust && !i915_is_ggtt(vm);
>> >> +}
>> >> +
>> >> +static inline bool
>> >>  i915_vm_is_48bit(const struct i915_address_space *vm)
>> >>  {
>> >>       return (vm->total - 1) >> 32;
>> >> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> >> index 8f0041ba328f..4043145b4310 100644
>> >> --- a/drivers/gpu/drm/i915/i915_vma.c
>> >> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> >> @@ -471,6 +471,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>> >>
>> >>       if (i915_vm_has_cache_coloring(vma->vm))
>> >>               color = obj->cache_level;
>> >> +     else if (i915_vm_has_page_coloring(vma->vm))
>> >> +             color = obj->gtt_page_size;
>> >
>> > This does not need color_adjust since you are just specifying an
>> > alignment and size. Why the extra complications? I remember asking the
>> > same last time.
>> Hmm, are you saying the whole idea of needing a color_adjust for
>> 4K/64K vm placement is completely unnecessary?
>
> As constructed here, yes. Since you just want to request a
> obj->gtt_page_size aligned block:
>
>         .size = round_up(size, obj->gtt_page_size),
>         .align = max(align, obj->gtt_page_size).
Unless I've gone completely mad, I really don't think it's that
simple: we never want a 4K object and a 64K object to overlap the
same page table. We derive the pde, and hence the page table, from
node.start, so if we just request an obj->gtt_page_size aligned block,
we could end up with a page table containing a mixture of 64K and 4K
ptes, at which point we're screwed.

>
> (Hmm, now I think about it you shouldn't round size up unless the
> insert_pages() is careful not to assume that the last page is a full
> superpage. More I think about this, you only want to align the base and
> let insert_pages group up the superpages.)
>
> Unless I have completely misunderstood, you do not need to insert
> gaps between blocks. Both the color_adjust approach and this approach
> still need lower level support to amalgamate pages.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt
  2017-04-04 22:11 ` [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt Matthew Auld
@ 2017-04-06  3:25   ` kbuild test robot
  2017-04-09  0:27   ` kbuild test robot
  1 sibling, 0 replies; 42+ messages in thread
From: kbuild test robot @ 2017-04-06  3:25 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3499 bytes --]

Hi Matthew,

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20170405]
[cannot apply to v4.11-rc5]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Matthew-Auld/drm-i915-initial-support-for-huge-gtt-pages-V2/20170406-060958
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-s2-04061013 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem_gtt.c: In function 'gen8_ppgtt_insert_4lvl':
>> drivers/gpu/drm/i915/i915_gem_gtt.c:1002: warning: 'iter' is used uninitialized in this function
   drivers/gpu/drm/i915/i915_gem_gtt.c: In function 'gen8_ppgtt_insert_3lvl':
   drivers/gpu/drm/i915/i915_gem_gtt.c:983: warning: 'iter.sg' is used uninitialized in this function
   drivers/gpu/drm/i915/i915_gem_gtt.c:984: warning: 'iter.dma' is used uninitialized in this function

vim +/iter +1002 drivers/gpu/drm/i915/i915_gem_gtt.c

9e89f9ee3 Chris Wilson   2017-02-25   986  	struct gen8_insert_pte idx = gen8_insert_pte(start);
de5ba8eb9 Michel Thierry 2015-08-03   987  
9e89f9ee3 Chris Wilson   2017-02-25   988  	gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx,
9e89f9ee3 Chris Wilson   2017-02-25   989  				      cache_level);
de5ba8eb9 Michel Thierry 2015-08-03   990  }
894ccebee Chris Wilson   2017-02-15   991  
894ccebee Chris Wilson   2017-02-15   992  static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
894ccebee Chris Wilson   2017-02-15   993  				   struct sg_table *pages,
75c7b0b86 Chris Wilson   2017-02-15   994  				   u64 start,
c7a43c911 Matthew Auld   2017-04-04   995  				   unsigned int page_size,
894ccebee Chris Wilson   2017-02-15   996  				   enum i915_cache_level cache_level,
894ccebee Chris Wilson   2017-02-15   997  				   u32 unused)
894ccebee Chris Wilson   2017-02-15   998  {
894ccebee Chris Wilson   2017-02-15   999  	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
894ccebee Chris Wilson   2017-02-15  1000  	struct sgt_dma iter = {
894ccebee Chris Wilson   2017-02-15  1001  		.sg = pages->sgl,
894ccebee Chris Wilson   2017-02-15 @1002  		.dma = sg_dma_address(iter.sg),
894ccebee Chris Wilson   2017-02-15  1003  		.max = iter.dma + iter.sg->length,
894ccebee Chris Wilson   2017-02-15  1004  	};
894ccebee Chris Wilson   2017-02-15  1005  	struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
9e89f9ee3 Chris Wilson   2017-02-25  1006  	struct gen8_insert_pte idx = gen8_insert_pte(start);
c7a43c911 Matthew Auld   2017-04-04  1007  	bool (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
c7a43c911 Matthew Auld   2017-04-04  1008  			       struct i915_page_directory_pointer *pdp,
c7a43c911 Matthew Auld   2017-04-04  1009  			       struct sgt_dma *iter,
c7a43c911 Matthew Auld   2017-04-04  1010  			       struct gen8_insert_pte *idx,

:::::: The code at line 1002 was first introduced by commit
:::::: 894ccebee2b0e606ba9638d20dd87b33568482d7 drm/i915: Micro-optimise gen8_ppgtt_insert_entries()

:::::: TO: Chris Wilson <chris@chris-wilson.co.uk>
:::::: CC: Chris Wilson <chris@chris-wilson.co.uk>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24101 bytes --]


* Re: [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt
  2017-04-04 22:11 ` [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt Matthew Auld
  2017-04-06  3:25   ` kbuild test robot
@ 2017-04-09  0:27   ` kbuild test robot
  1 sibling, 0 replies; 42+ messages in thread
From: kbuild test robot @ 2017-04-09  0:27 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2342 bytes --]

Hi Matthew,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20170407]
[cannot apply to v4.11-rc5]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Matthew-Auld/drm-i915-initial-support-for-huge-gtt-pages-V2/20170406-060958
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-s0-04090705 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   cc1: warnings being treated as errors
   drivers/gpu/drm/i915/i915_gem_gtt.c: In function 'gen8_ppgtt_insert_4lvl':
>> drivers/gpu/drm/i915/i915_gem_gtt.c:1002: error: 'iter' is used uninitialized in this function
   drivers/gpu/drm/i915/i915_gem_gtt.c: In function 'gen8_ppgtt_insert_3lvl':
   drivers/gpu/drm/i915/i915_gem_gtt.c:983: error: 'iter.sg' is used uninitialized in this function
   drivers/gpu/drm/i915/i915_gem_gtt.c:984: error: 'iter.dma' is used uninitialized in this function

vim +/iter +1002 drivers/gpu/drm/i915/i915_gem_gtt.c

894ccebee Chris Wilson 2017-02-15   996  				   enum i915_cache_level cache_level,
894ccebee Chris Wilson 2017-02-15   997  				   u32 unused)
894ccebee Chris Wilson 2017-02-15   998  {
894ccebee Chris Wilson 2017-02-15   999  	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
894ccebee Chris Wilson 2017-02-15  1000  	struct sgt_dma iter = {
894ccebee Chris Wilson 2017-02-15  1001  		.sg = pages->sgl,
894ccebee Chris Wilson 2017-02-15 @1002  		.dma = sg_dma_address(iter.sg),
894ccebee Chris Wilson 2017-02-15  1003  		.max = iter.dma + iter.sg->length,
894ccebee Chris Wilson 2017-02-15  1004  	};
894ccebee Chris Wilson 2017-02-15  1005  	struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;

:::::: The code at line 1002 was first introduced by commit
:::::: 894ccebee2b0e606ba9638d20dd87b33568482d7 drm/i915: Micro-optimise gen8_ppgtt_insert_entries()

:::::: TO: Chris Wilson <chris@chris-wilson.co.uk>
:::::: CC: Chris Wilson <chris@chris-wilson.co.uk>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28910 bytes --]


* Re: [PATCH 07/18] drm/i915: introduce ppgtt page coloring
  2017-04-05 14:02       ` Chris Wilson
  2017-04-05 15:05         ` Matthew Auld
@ 2017-04-10 12:08         ` Matthew Auld
  1 sibling, 0 replies; 42+ messages in thread
From: Matthew Auld @ 2017-04-10 12:08 UTC (permalink / raw)
  To: Chris Wilson, Matthew Auld, Matthew Auld, Intel Graphics Development

On 5 April 2017 at 15:02, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, Apr 05, 2017 at 02:50:41PM +0100, Matthew Auld wrote:
>> On 5 April 2017 at 14:41, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> > On Tue, Apr 04, 2017 at 11:11:17PM +0100, Matthew Auld wrote:
>> >> To enable 64K pages we need to set the intermediate-page-size(IPS) bit
>> >> of the pde, therefore a page table is said to be either operating in 64K
>> >> or 4K mode. To accommodate this vm placement restriction we introduce a
>> >> color for pages and corresponding color_adjust callback.
>> >>
>> >> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> >> ---
>> >>  drivers/gpu/drm/i915/i915_gem_gtt.c | 25 +++++++++++++++++++++++++
>> >>  drivers/gpu/drm/i915/i915_gem_gtt.h |  6 ++++++
>> >>  drivers/gpu/drm/i915/i915_vma.c     |  2 ++
>> >>  3 files changed, 33 insertions(+)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> >> index 0989af4a17e4..ddc3db345b76 100644
>> >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> >> @@ -1332,6 +1332,28 @@ static int gen8_preallocate_top_level_pdp(struct i915_hw_ppgtt *ppgtt)
>> >>       return -ENOMEM;
>> >>  }
>> >>
>> >> +static void i915_page_color_adjust(const struct drm_mm_node *node,
>> >> +                                unsigned long color,
>> >> +                                u64 *start,
>> >> +                                u64 *end)
>> >> +{
>> >> +     GEM_BUG_ON(!is_valid_gtt_page_size(color));
>> >> +
>> >> +     if (!(color & (I915_GTT_PAGE_SIZE_4K | I915_GTT_PAGE_SIZE_64K)))
>> >> +             return;
>> >> +
>> >> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
>> >> +
>> >> +     if (i915_node_color_differs(node, color))
>> >> +             *start = roundup(*start, 1 << GEN8_PDE_SHIFT);
>> >> +
>> >> +     node = list_next_entry(node, node_list);
>> >> +     if (i915_node_color_differs(node, color))
>> >> +             *end = rounddown(*end, 1 << GEN8_PDE_SHIFT);
>> >> +
>> >> +     GEM_BUG_ON(node->allocated && !is_valid_gtt_page_size(node->color));
>> >> +}
>> >> +
>> >>  /*
>> >>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
>> >>   * with a net effect resembling a 2-level page table in normal x86 terms. Each
>> >> @@ -1372,6 +1394,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>> >>               ppgtt->base.allocate_va_range = gen8_ppgtt_alloc_4lvl;
>> >>               ppgtt->base.insert_entries = gen8_ppgtt_insert_4lvl;
>> >>               ppgtt->base.clear_range = gen8_ppgtt_clear_4lvl;
>> >> +
>> >> +             if (SUPPORTS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K))
>> >> +                     ppgtt->base.mm.color_adjust = i915_page_color_adjust;
>> >>       } else {
>> >>               ret = __pdp_init(&ppgtt->base, &ppgtt->pdp);
>> >>               if (ret)
>> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> >> index 9c592e2de516..8d893ddd98f2 100644
>> >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> >> @@ -353,6 +353,12 @@ i915_vm_has_cache_coloring(const struct i915_address_space *vm)
>> >>  }
>> >>
>> >>  static inline bool
>> >> +i915_vm_has_page_coloring(const struct i915_address_space *vm)
>> >> +{
>> >> +     return vm->mm.color_adjust && !i915_is_ggtt(vm);
>> >> +}
>> >> +
>> >> +static inline bool
>> >>  i915_vm_is_48bit(const struct i915_address_space *vm)
>> >>  {
>> >>       return (vm->total - 1) >> 32;
>> >> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> >> index 8f0041ba328f..4043145b4310 100644
>> >> --- a/drivers/gpu/drm/i915/i915_vma.c
>> >> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> >> @@ -471,6 +471,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>> >>
>> >>       if (i915_vm_has_cache_coloring(vma->vm))
>> >>               color = obj->cache_level;
>> >> +     else if (i915_vm_has_page_coloring(vma->vm))
>> >> +             color = obj->gtt_page_size;
>> >
>> > This does not need color_adjust since you are just specifying an
>> > alignment and size. Why the extra complications? I remember asking the
>> > same last time.
>> Hmm, are you saying the whole idea of needing a color_adjust for
>> 4K/64K vm placement is completely unnecessary?
>
> As constructed here, yes. Since you just want to request a
> obj->gtt_page_size aligned block:
>
>         .size = round_up(size, obj->gtt_page_size),
>         .align = max(align, obj->gtt_page_size).
>
> (Hmm, now I think about it you shouldn't round size up unless the
> insert_pages() is careful not to assume that the last page is a full
> superpage. More I think about this, you only want to align the base and
> let insert_pages group up the superpages.)
I feel like I must be missing your point, could you expand on what you
mean by grouping superpages?

>
> Unless I have completely misunderstood, you do not need to insert
> gaps between blocks. Both the color_adjust approach and this approach
> still need lower level support to amalgamate pages.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre

end of thread, other threads:[~2017-04-10 12:09 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-04 22:11 [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Matthew Auld
2017-04-04 22:11 ` [PATCH 01/18] drm/i915: add page_size_mask to dev_info Matthew Auld
2017-04-05  6:19   ` Joonas Lahtinen
2017-04-05  8:45     ` Chris Wilson
2017-04-05 12:57       ` Joonas Lahtinen
2017-04-05  8:43   ` Chris Wilson
2017-04-04 22:11 ` [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members Matthew Auld
2017-04-05  6:26   ` Joonas Lahtinen
2017-04-05  6:49   ` Daniel Vetter
2017-04-05  8:48     ` Chris Wilson
2017-04-05 10:07       ` Matthew Auld
2017-04-05 12:15         ` Daniel Vetter
2017-04-05 12:32         ` Chris Wilson
2017-04-05 12:39           ` Chris Wilson
2017-04-04 22:11 ` [PATCH 03/18] drm/i915: pass page_size to insert_entries Matthew Auld
2017-04-04 22:11 ` [PATCH 04/18] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust Matthew Auld
2017-04-05  6:30   ` Joonas Lahtinen
2017-04-04 22:11 ` [PATCH 05/18] drm/i915: clean up cache coloring Matthew Auld
2017-04-05  6:35   ` Joonas Lahtinen
2017-04-04 22:11 ` [PATCH 06/18] drm/i915: export color_differs Matthew Auld
2017-04-05  6:39   ` Joonas Lahtinen
2017-04-04 22:11 ` [PATCH 07/18] drm/i915: introduce ppgtt page coloring Matthew Auld
2017-04-05 13:41   ` Chris Wilson
2017-04-05 13:50     ` Matthew Auld
2017-04-05 14:02       ` Chris Wilson
2017-04-05 15:05         ` Matthew Auld
2017-04-10 12:08         ` Matthew Auld
2017-04-04 22:11 ` [PATCH 08/18] drm/i915: handle evict-for-node with " Matthew Auld
2017-04-04 22:11 ` [PATCH 09/18] drm/i915: support inserting 64K pages in the ppgtt Matthew Auld
2017-04-06  3:25   ` kbuild test robot
2017-04-09  0:27   ` kbuild test robot
2017-04-04 22:11 ` [PATCH 10/18] drm/i915: support inserting 2M " Matthew Auld
2017-04-04 22:11 ` [PATCH 11/18] drm/i915: support inserting 1G " Matthew Auld
2017-04-04 22:11 ` [PATCH 12/18] drm/i915: disable GTT cache for huge-pages Matthew Auld
2017-04-04 22:11 ` [PATCH 13/18] drm/i915/selftests: exercise 4K and 64K mm insertion Matthew Auld
2017-04-04 22:11 ` [PATCH 14/18] drm/i915/selftests: modify the gtt tests to also exercise huge pages Matthew Auld
2017-04-04 22:11 ` [PATCH 15/18] drm/i915/selftests: exercise evict-for-node page coloring Matthew Auld
2017-04-04 22:11 ` [PATCH 16/18] drm/i915/debugfs: include some huge-page metrics Matthew Auld
2017-04-04 22:11 ` [PATCH 17/18] mm/shmem: tweak the huge-page interface Matthew Auld
2017-04-05  6:42   ` Daniel Vetter
2017-04-04 22:11 ` [PATCH 18/18] drm/i915: support transparent-huge-pages through shmemfs Matthew Auld
2017-04-05  8:53 ` [RFC PATCH 00/18] drm/i915: initial support for huge gtt pages V2 Chris Wilson
