* [PATCH 00/17] add support for huge-gtt-pages
@ 2017-05-16  8:29 Matthew Auld
  2017-05-16  8:29 ` [PATCH 01/17] drm/i915: introduce page_size_mask to dev_info Matthew Auld
                   ` (17 more replies)
  0 siblings, 18 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

Add support for 64K, 2M and 1G pages for the 48-bit PPGTT. We select the
largest GTT page size that fits the layout of the sg table. To complement
this, we also request THP for our shmem-backed objects, which should be able
to give us 2M or 1G pages, depending on configuration.

Hopefully this addresses the concerns from the last version.

Matthew Auld (17):
  drm/i915: introduce page_size_mask to dev_info
  drm/i915: introduce gtt page size
  drm/i915: align the vma start to the gtt page size
  drm/i915: align 64K objects to 2M
  drm/i915: fallback to normal pages on vma insert failure
  mm/shmem: expose driver overridable huge option
  drm/i915: request THP for shmem backed objects
  drm/i915: pass gtt page size to insert_entries
  drm/i915: enable IPS bit for 64K pages
  drm/i915: support inserting 64K pages into the 48b PPGTT
  drm/i915: disable GTT cache for 2M/1G pages
  drm/i915: support inserting 2M pages into the 48b PPGTT
  drm/i915: support inserting 1G pages into the 48b PPGTT
  drm/i915/debugfs: include some gtt_page_size metrics
  drm/i915: enable platform support for 64K pages
  drm/i915: enable platform support for 2M pages
  drm/i915: enable platform support for 1G pages

 drivers/gpu/drm/i915/i915_debugfs.c              |  37 +++-
 drivers/gpu/drm/i915/i915_drv.h                  |   3 +
 drivers/gpu/drm/i915/i915_gem.c                  |  44 +++++
 drivers/gpu/drm/i915/i915_gem_gtt.c              | 206 ++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.h              |  14 +-
 drivers/gpu/drm/i915/i915_gem_object.h           |   2 +
 drivers/gpu/drm/i915/i915_pci.c                  |  29 ++++
 drivers/gpu/drm/i915/i915_reg.h                  |   3 +
 drivers/gpu/drm/i915/i915_vma.c                  |  36 ++++
 drivers/gpu/drm/i915/intel_pm.c                  |  12 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c    |   3 +-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |   6 +
 drivers/gpu/drm/i915/selftests/mock_gtt.c        |   1 +
 include/linux/shmem_fs.h                         |  20 +++
 mm/shmem.c                                       |  37 ++--
 15 files changed, 416 insertions(+), 37 deletions(-)

-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/17] drm/i915: introduce page_size_mask to dev_info
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 02/17] drm/i915: introduce gtt page size Matthew Auld
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

In preparation for huge GTT pages, expose a page_size_mask as part of the
device info to indicate the page sizes supported by the HW. Currently
only 4K is supported.
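
For illustration only, the mask can be read as a bitwise OR of
power-of-two page sizes, so that checking for a size is a single AND.
The sketch below is userspace C, not driver code: the names
device_info_sketch and has_page_size() are invented here, and only the
bit positions are taken from the I915_GTT_PAGE_SIZE_* definitions in
this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Page sizes as single bits, mirroring the I915_GTT_PAGE_SIZE_* values. */
#define PAGE_SIZE_4K  (UINT32_C(1) << 12)
#define PAGE_SIZE_64K (UINT32_C(1) << 16)
#define PAGE_SIZE_2M  (UINT32_C(1) << 21)
#define PAGE_SIZE_1G  (UINT32_C(1) << 30)

/* Hypothetical stand-in for the relevant part of intel_device_info. */
struct device_info_sketch {
	uint32_t page_size_mask; /* OR of the supported page sizes */
};

/* Hypothetical helper: true if the device advertises this page size. */
static bool has_page_size(const struct device_info_sketch *info,
			  uint32_t page_size)
{
	return (info->page_size_mask & page_size) != 0;
}
```

With the mask set to PAGE_SIZE_4K only, as every platform is in this
patch, a query for any larger size returns false.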

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h                  |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.h              |  8 +++++++-
 drivers/gpu/drm/i915/i915_pci.c                  | 21 +++++++++++++++++++++
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  3 +++
 4 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a6f20471b4cd..e18f11f77f35 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -791,6 +791,7 @@ struct intel_device_info {
 	enum intel_platform platform;
 	u8 ring_mask; /* Rings supported by the HW */
 	u8 num_rings;
+	unsigned int page_size_mask; /* page sizes supported by the HW */
 #define DEFINE_FLAG(name) u8 name:1
 	DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG);
 #undef DEFINE_FLAG
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index fb15684c1d83..f8db231c28aa 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -42,7 +42,13 @@
 #include "i915_gem_request.h"
 #include "i915_selftest.h"
 
-#define I915_GTT_PAGE_SIZE 4096UL
+#define I915_GTT_PAGE_SIZE_4K BIT(12)
+#define I915_GTT_PAGE_SIZE_64K BIT(16)
+#define I915_GTT_PAGE_SIZE_2M BIT(21)
+#define I915_GTT_PAGE_SIZE_1G BIT(30)
+
+#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K
+
 #define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
 
 #define I915_FENCE_REG_NONE -1
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index f80db2ccd92f..7caccb5bf963 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -56,6 +56,10 @@
 	.color = { .degamma_lut_size = 65, .gamma_lut_size = 257 }
 
 /* Keep in gen based order, and chronological order within a gen */
+
+#define GEN_DEFAULT_PAGE_SIZES \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K
+
 #define GEN2_FEATURES \
 	.gen = 2, .num_pipes = 1, \
 	.has_overlay = 1, .overlay_needs_physical = 1, \
@@ -64,6 +68,7 @@
 	.unfenced_needs_alignment = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i830_info = {
@@ -96,6 +101,7 @@ static const struct intel_device_info intel_i865g_info = {
 	.has_gmch_display = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i915g_info = {
@@ -158,6 +164,7 @@ static const struct intel_device_info intel_pineview_info = {
 	.has_gmch_display = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i965g_info = {
@@ -198,6 +205,7 @@ static const struct intel_device_info intel_gm45_info = {
 	.has_gmbus_irq = 1, \
 	.ring_mask = RENDER_RING | BSD_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ironlake_d_info = {
@@ -222,6 +230,7 @@ static const struct intel_device_info intel_ironlake_m_info = {
 	.has_gmbus_irq = 1, \
 	.has_aliasing_ppgtt = 1, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_sandybridge_d_info = {
@@ -247,6 +256,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
 	.has_aliasing_ppgtt = 1, \
 	.has_full_ppgtt = 1, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	IVB_CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ivybridge_d_info = {
@@ -284,6 +294,7 @@ static const struct intel_device_info intel_valleyview_info = {
 	.has_full_ppgtt = 1,
 	.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
+	GEN_DEFAULT_PAGE_SIZES,
 	GEN_DEFAULT_PIPEOFFSETS,
 	CURSOR_OFFSETS
 };
@@ -308,6 +319,7 @@ static const struct intel_device_info intel_haswell_info = {
 #define BDW_FEATURES \
 	HSW_FEATURES, \
 	BDW_COLORS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1
@@ -342,13 +354,18 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_aliasing_ppgtt = 1,
 	.has_full_ppgtt = 1,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
+	GEN_DEFAULT_PAGE_SIZES,
 	GEN_CHV_PIPEOFFSETS,
 	CURSOR_OFFSETS,
 	CHV_COLORS,
 };
 
+#define GEN9_DEFAULT_PAGE_SIZES \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K
+
 static const struct intel_device_info intel_skylake_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SIZES,
 	.platform = INTEL_SKYLAKE,
 	.gen = 9,
 	.has_csr = 1,
@@ -358,6 +375,7 @@ static const struct intel_device_info intel_skylake_info = {
 
 static const struct intel_device_info intel_skylake_gt3_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SIZES,
 	.platform = INTEL_SKYLAKE,
 	.gen = 9,
 	.has_csr = 1,
@@ -389,6 +407,7 @@ static const struct intel_device_info intel_skylake_gt3_info = {
 	.has_aliasing_ppgtt = 1, \
 	.has_full_ppgtt = 1, \
 	.has_full_48bit_ppgtt = 1, \
+	GEN9_DEFAULT_PAGE_SIZES, \
 	GEN_DEFAULT_PIPEOFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	BDW_COLORS
@@ -409,6 +428,7 @@ static const struct intel_device_info intel_geminilake_info = {
 
 static const struct intel_device_info intel_kabylake_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SIZES,
 	.platform = INTEL_KABYLAKE,
 	.gen = 9,
 	.has_csr = 1,
@@ -418,6 +438,7 @@ static const struct intel_device_info intel_kabylake_info = {
 
 static const struct intel_device_info intel_kabylake_gt3_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SIZES,
 	.platform = INTEL_KABYLAKE,
 	.gen = 9,
 	.has_csr = 1,
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 2c1500d0d55a..b7e4ba03e3bc 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -143,6 +143,9 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mkwrite_device_info(i915)->gen = -1;
 
+	mkwrite_device_info(i915)->page_size_mask =
+		I915_GTT_PAGE_SIZE_4K;
+
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
 
-- 
2.9.4


* [PATCH 02/17] drm/i915: introduce gtt page size
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
  2017-05-16  8:29 ` [PATCH 01/17] drm/i915: introduce page_size_mask to dev_info Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:41   ` Chris Wilson
  2017-05-16  9:59   ` Chris Wilson
  2017-05-16  8:29 ` [PATCH 03/17] drm/i915: align the vma start to the " Matthew Auld
                   ` (15 subsequent siblings)
  17 siblings, 2 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

In preparation for supporting huge GTT pages for the PPGTT, we introduce
a gtt_page_size member for GEM objects. We fill in the GTT page size by
scanning the sg table to determine the largest page size that satisfies
the alignment of every sg entry.
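
The selection can be sketched in userspace C roughly as follows. This is
an illustration under assumptions, not the driver code: the sg table is
reduced to a plain array of segment lengths, and pick_gtt_page_size() is
a name invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the largest supported page size that every
 * segment length is a multiple of. 'supported' is a mask of power-of-two
 * page sizes; returns 0 if no supported size fits.
 */
static uint32_t pick_gtt_page_size(const uint32_t *seg_len, size_t nsegs,
				   uint32_t supported)
{
	uint32_t len_mask = 0;
	uint32_t best = 0;
	size_t i;

	/* OR-ing the lengths preserves the lowest set bit of any segment. */
	for (i = 0; i < nsegs; i++)
		len_mask |= seg_len[i];

	/* Walk the supported sizes from smallest to largest. */
	for (unsigned int bit = 0; bit < 32; bit++) {
		uint32_t size = UINT32_C(1) << bit;

		if (!(supported & size))
			continue;
		if (len_mask & (size - 1))
			break; /* some segment is not a multiple of size */
		best = size;
	}
	return best;
}
```

Stopping at the first size that does not fit is safe because the sizes
are powers of two: a length that is not a multiple of 64K cannot be a
multiple of 2M either.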

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h        |  2 ++
 drivers/gpu/drm/i915/i915_gem.c        | 23 +++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_object.h |  2 ++
 3 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e18f11f77f35..a7a108d18a2d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2843,6 +2843,8 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define USES_PPGTT(dev_priv)		(i915.enable_ppgtt)
 #define USES_FULL_PPGTT(dev_priv)	(i915.enable_ppgtt >= 2)
 #define USES_FULL_48BIT_PPGTT(dev_priv)	(i915.enable_ppgtt == 3)
+#define HAS_PAGE_SIZE(dev_priv, page_size) \
+	((dev_priv)->info.page_size_mask & (page_size))
 
 #define HAS_OVERLAY(dev_priv)		 ((dev_priv)->info.has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0c1cbe98c994..6a5e864d7710 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2294,6 +2294,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	if (!IS_ERR(pages))
 		obj->ops->put_pages(obj, pages);
 
+	obj->gtt_page_size = 0;
+
 unlock:
 	mutex_unlock(&obj->mm.lock);
 }
@@ -2473,6 +2475,13 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 				 struct sg_table *pages)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	unsigned long supported_page_sizes = INTEL_INFO(i915)->page_size_mask;
+	struct scatterlist *sg;
+	unsigned int sg_mask = 0;
+	unsigned int i;
+	unsigned int bit;
+
 	lockdep_assert_held(&obj->mm.lock);
 
 	obj->mm.get_page.sg_pos = pages->sgl;
@@ -2486,6 +2495,20 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 		__i915_gem_object_pin_pages(obj);
 		obj->mm.quirked = true;
 	}
+
+	for_each_sg(pages->sgl, sg, pages->nents, i)
+		sg_mask |= sg->length;
+
+	GEM_BUG_ON(!sg_mask);
+
+	for_each_set_bit(bit, &supported_page_sizes, BITS_PER_LONG) {
+		if (!IS_ALIGNED(sg_mask, 1 << bit))
+			break;
+
+		obj->gtt_page_size = 1 << bit;
+	}
+
+	GEM_BUG_ON(!HAS_PAGE_SIZE(i915, obj->gtt_page_size));
 }
 
 static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 174cf923c236..75beb6a79635 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -107,6 +107,8 @@ struct drm_i915_gem_object {
 	unsigned int cache_level:3;
 	unsigned int cache_dirty:1;
 
+	unsigned int gtt_page_size;
+
 	atomic_t frontbuffer_bits;
 	unsigned int frontbuffer_ggtt_origin; /* write once */
 	struct i915_gem_active frontbuffer_write;
-- 
2.9.4


* [PATCH 03/17] drm/i915: align the vma start to the gtt page size
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
  2017-05-16  8:29 ` [PATCH 01/17] drm/i915: introduce page_size_mask to dev_info Matthew Auld
  2017-05-16  8:29 ` [PATCH 02/17] drm/i915: introduce gtt page size Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:40   ` Chris Wilson
  2017-05-16  8:29 ` [PATCH 04/17] drm/i915: align 64K objects to 2M Matthew Auld
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

When inserting into a 48-bit PPGTT we need to align the vma start address
to the required page size boundary. The size will already be aligned, so
no padding is needed.
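
As a small worked example of the arithmetic, assuming power-of-two
alignments: the effective alignment is the larger of the caller's
request and the object's GTT page size, and a candidate start address is
then rounded up to it. The helper names below are invented for this
sketch, and assert() stands in for GEM_BUG_ON().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the alignment for the start address is the larger
 * of what the caller asked for and the object's GTT page size. The size
 * is expected to be a multiple of the page size already.
 */
static uint64_t vma_start_alignment(uint64_t requested_alignment,
				    uint64_t size, uint32_t gtt_page_size)
{
	assert(gtt_page_size && (size % gtt_page_size) == 0);

	return requested_alignment > gtt_page_size ?
		requested_alignment : gtt_page_size;
}

/* Round addr up to the next multiple of a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t alignment)
{
	return (addr + alignment - 1) & ~(alignment - 1);
}
```

For a 4M object backed by 2M pages and a requested alignment of 4K, the
start must land on a 2M boundary, so a free hole beginning at 0x201000
would be used from 0x400000.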

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_vma.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 1aba47024656..53f6c94b2ee6 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -471,6 +471,14 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (ret)
 		return ret;
 
+	if (i915_vm_is_48bit(vma->vm) &&
+	    obj->gtt_page_size > I915_GTT_PAGE_SIZE) {
+		unsigned int page_alignment = obj->gtt_page_size;
+
+		alignment = max_t(typeof(alignment), alignment, page_alignment);
+		GEM_BUG_ON(!IS_ALIGNED(vma->size, obj->gtt_page_size));
+	}
+
 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
 		if (!IS_ALIGNED(offset, alignment) ||
-- 
2.9.4


* [PATCH 04/17] drm/i915: align 64K objects to 2M
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (2 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 03/17] drm/i915: align the vma start to the " Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 05/17] drm/i915: fallback to normal pages on vma insert failure Matthew Auld
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

We can't mix 64K and 4K PTEs in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful, but in practice shouldn't be too bad, since it
only costs virtual address space in a 48-bit PPGTT.
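
To make the cost concrete, here is a userspace sketch of the adjustment,
with invented names (placement_sketch, place_for_page_size). It assumes
one page-table covers 2M of address space, as the comment in the diff
states.

```c
#include <stdint.h>

#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Hypothetical result type: what to ask the address-space allocator for. */
struct placement_sketch {
	uint64_t alignment;
	uint64_t size;
};

/*
 * Hypothetical helper: a 64K object claims whole 2M blocks, so no page
 * table it touches can also hold 4K entries from another object.
 */
static struct placement_sketch place_for_page_size(uint64_t size,
						   uint64_t page_size)
{
	struct placement_sketch p = { .alignment = page_size, .size = size };

	if (page_size == SZ_64K) {
		p.alignment = SZ_2M;
		/* Round the size up to the next multiple of 2M. */
		p.size = (size + SZ_2M - 1) / SZ_2M * SZ_2M;
	}
	return p;
}
```

A 192K object using 64K pages therefore reserves a full 2M of virtual
address space, while objects using 4K or 2M pages keep their own size.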

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 53f6c94b2ee6..d2e8edd351cf 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -475,6 +475,15 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	    obj->gtt_page_size > I915_GTT_PAGE_SIZE) {
 		unsigned int page_alignment = obj->gtt_page_size;
 
+		/* We can't mix 64K and 4K pte's in the same page-table (2M
+		 * block), and so to avoid the ugliness and complexity of
+		 * coloring we opt for just aligning 64K objects to 2M.
+		 */
+		if (page_alignment == I915_GTT_PAGE_SIZE_64K) {
+			page_alignment = I915_GTT_PAGE_SIZE_2M;
+			size = roundup(size, page_alignment);
+		}
+
 		alignment = max_t(typeof(alignment), alignment, page_alignment);
 		GEM_BUG_ON(!IS_ALIGNED(vma->size, obj->gtt_page_size));
 	}
-- 
2.9.4


* [PATCH 05/17] drm/i915: fallback to normal pages on vma insert failure
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (3 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 04/17] drm/i915: align 64K objects to 2M Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:39   ` Chris Wilson
  2017-05-16  8:29 ` [PATCH 06/17] mm/shmem: expose driver overridable huge option Matthew Auld
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

Part of the cost of choosing huge-gtt-pages is potentially using a
larger alignment and/or size. Therefore, if our vma insert fails, either
because of the insert/reserve or the pin-offset-fixed check, we should
fall back to normal pages and retry before giving up.
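
The control flow can be sketched in userspace C as below. This is a
simplified model, not the driver logic: the allocator is abstracted
behind a callback, toy_insert() is a made-up allocator that only exists
to exercise the fallback, and all names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical allocator callback: returns 0 on success, non-zero on failure. */
typedef int (*insert_fn)(uint64_t size, uint64_t alignment, void *ctx);

/*
 * Hypothetical helper: try the huge-page placement first; if it fails and
 * it was stricter than what the caller requested, retry once with the
 * original size and alignment. '*fell_back' reports whether a retry ran.
 */
static int insert_with_fallback(insert_fn insert, void *ctx,
				uint64_t req_size, uint64_t req_alignment,
				uint64_t huge_size, uint64_t huge_alignment,
				bool *fell_back)
{
	int ret = insert(huge_size, huge_alignment, ctx);

	*fell_back = false;
	if (ret == 0)
		return 0;

	/* Nothing was inflated, so a retry would fail the same way. */
	if (huge_size <= req_size && huge_alignment <= req_alignment)
		return ret;

	*fell_back = true;
	return insert(req_size, req_alignment, ctx);
}

/* Toy allocator for illustration: rejects alignments above *ctx. */
static int toy_insert(uint64_t size, uint64_t alignment, void *ctx)
{
	const uint64_t *max_alignment = ctx;

	(void)size;
	return alignment > *max_alignment ? -1 : 0;
}
```

The retry is attempted only when the huge-page path actually asked for
more than the caller did, which mirrors the condition on alignment and
size in the diff.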

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_vma.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index d2e8edd351cf..9d4ffd76184e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -427,6 +427,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 {
 	struct drm_i915_private *dev_priv = vma->vm->i915;
 	struct drm_i915_gem_object *obj = vma->obj;
+	u64 requested_alignment;
+	u64 requested_size;
 	u64 start, end;
 	int ret;
 
@@ -471,6 +473,9 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (ret)
 		return ret;
 
+	requested_alignment = alignment;
+	requested_size = size;
+
 	if (i915_vm_is_48bit(vma->vm) &&
 	    obj->gtt_page_size > I915_GTT_PAGE_SIZE) {
 		unsigned int page_alignment = obj->gtt_page_size;
@@ -488,6 +493,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		GEM_BUG_ON(!IS_ALIGNED(vma->size, obj->gtt_page_size));
 	}
 
+retry_insert:
 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
 		if (!IS_ALIGNED(offset, alignment) ||
@@ -522,6 +528,19 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	return 0;
 
 err_unpin:
+	/* We try to use huge-gtt-pages whenever we can, but part of the cost
+	 * is that we may need to adjust the alignment and possibly the size
+	 * before we insert into a vm, and so we should always fall back and
+	 * retry without huge-gtt-pages if we ever encounter a failure, before
+	 * giving up.
+	 */
+	if (alignment > requested_alignment || size > requested_size) {
+		obj->gtt_page_size = I915_GTT_PAGE_SIZE;
+		alignment = requested_alignment;
+		size = requested_size;
+		goto retry_insert;
+	}
+
 	i915_gem_object_unpin_pages(obj);
 	return ret;
 }
-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 06/17] mm/shmem: expose driver overridable huge option
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (4 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 05/17] drm/i915: fallback to normal pages on vma insert failure Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16 10:02   ` Kirill A. Shutemov
  2017-05-16  8:29   ` Matthew Auld
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx
  Cc: Joonas Lahtinen, Dave Hansen, Daniel Vetter, Hugh Dickins, linux-mm

In i915 we are aiming to support huge GTT pages for the GPU, and to
complement this we also want to enable THP for our shmem-backed objects.
Although THP is supported in shmemfs, it can only be enabled through
the huge= mount option, which leaves users of the kernel-mounted shm_mnt,
such as i915, a little stuck. There is the sysfs knob shmem_enabled to
forcefully enable or disable the feature, but that seems useful only for
testing purposes. What we propose is to expose a driver-overridable huge
option as part of shmem_inode_info to control the use of THP for a given
mapping.
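
The override rule used in the diff can be modelled in a few lines of
userspace C. The enum values match the SHMEM_HUGE_* definitions moved by
this patch; effective_huge() is a name invented for this sketch.

```c
/* Same numeric values as the SHMEM_HUGE_* definitions in this patch. */
enum {
	HUGE_NEVER = 0,
	HUGE_ALWAYS = 1,
	HUGE_WITHIN_SIZE = 2,
	HUGE_ADVISE = 3,
};

/*
 * Hypothetical helper: a non-zero per-inode value overrides the mount-wide
 * setting. Because HUGE_NEVER is 0, an inode cannot use this field to force
 * "never" on a mount that enables huge pages.
 */
static unsigned char effective_huge(unsigned char mount_huge,
				    unsigned char inode_huge)
{
	return inode_huge ? inode_huge : mount_huge;
}
```

One consequence of encoding "no override" as zero is that the field can
only opt a mapping in to a huge policy; it cannot opt a mapping out.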

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
---
 include/linux/shmem_fs.h | 20 ++++++++++++++++++++
 mm/shmem.c               | 37 +++++++++++++++----------------------
 2 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index a7d6bd2a918f..4cfdb2e8e1d8 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -21,8 +21,28 @@ struct shmem_inode_info {
 	struct shared_policy	policy;		/* NUMA memory alloc policy */
 	struct simple_xattrs	xattrs;		/* list of xattrs */
 	struct inode		vfs_inode;
+	unsigned char		huge;           /* driver override sbinfo->huge */
 };
 
+/*
+ * Definitions for "huge tmpfs": tmpfs mounted with the huge= option
+ *
+ * SHMEM_HUGE_NEVER:
+ *	disables huge pages for the mount;
+ * SHMEM_HUGE_ALWAYS:
+ *	enables huge pages for the mount;
+ * SHMEM_HUGE_WITHIN_SIZE:
+ *	only allocate huge pages if the page will be fully within i_size,
+ *	also respect fadvise()/madvise() hints;
+ * SHMEM_HUGE_ADVISE:
+ *	only allocate huge pages if requested with fadvise()/madvise();
+ */
+
+#define SHMEM_HUGE_NEVER	0
+#define SHMEM_HUGE_ALWAYS	1
+#define SHMEM_HUGE_WITHIN_SIZE	2
+#define SHMEM_HUGE_ADVISE	3
+
 struct shmem_sb_info {
 	unsigned long max_blocks;   /* How many blocks are allowed */
 	struct percpu_counter used_blocks;  /* How many are allocated */
diff --git a/mm/shmem.c b/mm/shmem.c
index e67d6ba4e98e..4fa042694957 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -346,25 +346,6 @@ static bool shmem_confirm_swap(struct address_space *mapping,
 }
 
 /*
- * Definitions for "huge tmpfs": tmpfs mounted with the huge= option
- *
- * SHMEM_HUGE_NEVER:
- *	disables huge pages for the mount;
- * SHMEM_HUGE_ALWAYS:
- *	enables huge pages for the mount;
- * SHMEM_HUGE_WITHIN_SIZE:
- *	only allocate huge pages if the page will be fully within i_size,
- *	also respect fadvise()/madvise() hints;
- * SHMEM_HUGE_ADVISE:
- *	only allocate huge pages if requested with fadvise()/madvise();
- */
-
-#define SHMEM_HUGE_NEVER	0
-#define SHMEM_HUGE_ALWAYS	1
-#define SHMEM_HUGE_WITHIN_SIZE	2
-#define SHMEM_HUGE_ADVISE	3
-
-/*
  * Special values.
  * Only can be set via /sys/kernel/mm/transparent_hugepage/shmem_enabled:
  *
@@ -1715,6 +1696,8 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 		swap_free(swap);
 
 	} else {
+		unsigned char sbinfo_huge = sbinfo->huge;
+
 		if (vma && userfaultfd_missing(vma)) {
 			*fault_type = handle_userfault(vmf, VM_UFFD_MISSING);
 			return 0;
@@ -1727,7 +1710,10 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 			goto alloc_nohuge;
 		if (shmem_huge == SHMEM_HUGE_FORCE)
 			goto alloc_huge;
-		switch (sbinfo->huge) {
+		/* driver override sbinfo->huge */
+		if (info->huge)
+			sbinfo_huge = info->huge;
+		switch (sbinfo_huge) {
 			loff_t i_size;
 			pgoff_t off;
 		case SHMEM_HUGE_NEVER:
@@ -2032,10 +2018,13 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 
 	if (shmem_huge != SHMEM_HUGE_FORCE) {
 		struct super_block *sb;
+		unsigned char sbinfo_huge = 0;
 
 		if (file) {
 			VM_BUG_ON(file->f_op != &shmem_file_operations);
 			sb = file_inode(file)->i_sb;
+			/* driver override sbinfo->huge */
+			sbinfo_huge = SHMEM_I(file_inode(file))->huge;
 		} else {
 			/*
 			 * Called directly from mm/mmap.c, or drivers/char/mem.c
@@ -2045,7 +2034,8 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 				return addr;
 			sb = shm_mnt->mnt_sb;
 		}
-		if (SHMEM_SB(sb)->huge == SHMEM_HUGE_NEVER)
+		if (SHMEM_SB(sb)->huge == SHMEM_HUGE_NEVER &&
+		    sbinfo_huge == SHMEM_HUGE_NEVER)
 			return addr;
 	}
 
@@ -4031,6 +4021,7 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
 {
 	struct inode *inode = file_inode(vma->vm_file);
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+	unsigned char sbinfo_huge = sbinfo->huge;
 	loff_t i_size;
 	pgoff_t off;
 
@@ -4038,7 +4029,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
 		return true;
 	if (shmem_huge == SHMEM_HUGE_DENY)
 		return false;
-	switch (sbinfo->huge) {
+	if (SHMEM_I(inode)->huge)
+		sbinfo_huge = SHMEM_I(inode)->huge;
+	switch (sbinfo_huge) {
 		case SHMEM_HUGE_NEVER:
 			return false;
 		case SHMEM_HUGE_ALWAYS:
-- 
2.9.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 07/17] drm/i915: request THP for shmem backed objects
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
@ 2017-05-16  8:29   ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 02/17] drm/i915: introduce gtt page size Matthew Auld
                     ` (16 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx
  Cc: Joonas Lahtinen, Dave Hansen, Daniel Vetter, Hugh Dickins, linux-mm

Default to transparent-huge-pages for shmem backed objects through the
SHMEM_HUGE_WITHIN_SIZE huge option. Best effort only.
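
As a rough reading of what "within size" means for an object, the sketch
below checks whether the huge page covering a given byte offset lies
entirely inside the file. It is one interpretation of the
SHMEM_HUGE_WITHIN_SIZE description ("only allocate huge pages if the
page will be fully within i_size"), not a copy of the shmem logic; the
2M huge page size, the 4K base page size, and the function name are all
assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE  UINT64_C(4096)      /* assumed base page size */
#define SKETCH_HPAGE_SIZE (UINT64_C(2) << 20) /* assumed 2M huge page */

/*
 * Hypothetical check: would the huge page containing byte 'offset' lie
 * entirely within a file of 'file_size' bytes (rounded up to whole pages)?
 */
static bool huge_page_within_size(uint64_t offset, uint64_t file_size)
{
	uint64_t size = (file_size + SKETCH_PAGE_SIZE - 1) /
			SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
	uint64_t huge_start = offset / SKETCH_HPAGE_SIZE * SKETCH_HPAGE_SIZE;

	return huge_start + SKETCH_HPAGE_SIZE <= size;
}
```

Under this reading, an object smaller than 2M never gets a huge page,
and the tail of a 3M object stays on normal pages, which is consistent
with the request being best effort.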

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
---
 drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6a5e864d7710..e4ee54f0f55f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4308,6 +4308,16 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 	mapping = obj->base.filp->f_mapping;
 	mapping_set_gfp_mask(mapping, mask);
 
+	/* If configured attempt to use THP through shmemfs. This will
+	 * effectively default to huge-pages for this mapping if it makes sense
+	 * given the object size and HPAGE_PMD_SIZE. This is best effort only.
+	 */
+#ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
+	if (has_transparent_hugepage() &&
+	    HAS_PAGE_SIZE(dev_priv, HPAGE_PMD_SIZE))
+		SHMEM_I(mapping->host)->huge = SHMEM_HUGE_WITHIN_SIZE;
+#endif
+
 	i915_gem_object_init(obj, &i915_gem_object_ops);
 
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
-- 
2.9.4



* [PATCH 08/17] drm/i915: pass gtt page size to insert_entries
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (6 preceding siblings ...)
  2017-05-16  8:29   ` Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 09/17] drm/i915: enable IPS bit for 64K pages Matthew Auld
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

Expose a page size parameter for insert_entries. This is only relevant
for inserting into the 4lvl ppgtt, where we pass the gtt_page_size of the
object.
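
The dispatch-by-page-size pattern can be sketched in userspace C as
below. Only the 4K case is taken from the diff; the 64K, 2M and 1G
branches are assumptions based on the titles of later patches in the
series, and every function here is a toy that just counts entries.

```c
#include <stddef.h>
#include <stdint.h>

#define SZ_4K  (UINT32_C(1) << 12)
#define SZ_64K (UINT32_C(1) << 16)
#define SZ_2M  (UINT32_C(1) << 21)
#define SZ_1G  (UINT32_C(1) << 30)

/* Hypothetical inserter: returns the number of entries for 'len' bytes. */
typedef uint64_t (*insert_entries_fn)(uint64_t len);

static uint64_t insert_4k(uint64_t len)  { return len / SZ_4K; }
static uint64_t insert_64k(uint64_t len) { return len / SZ_64K; }
static uint64_t insert_2m(uint64_t len)  { return len / SZ_2M; }
static uint64_t insert_1g(uint64_t len)  { return len / SZ_1G; }

/* Pick the inserter for a page size; NULL for an unsupported size. */
static insert_entries_fn select_inserter(uint32_t page_size)
{
	switch (page_size) {
	case SZ_4K:
		return insert_4k;
	case SZ_64K:
		return insert_64k;
	case SZ_2M:
		return insert_2m;
	case SZ_1G:
		return insert_1g;
	default:
		return NULL;
	}
}
```

Mapping 2M of memory takes 512 entries with the 4K inserter, 32 with the
64K one, and a single entry with the 2M one.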

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 32 +++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  1 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  3 ++-
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |  1 +
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index bc3c63e92c16..3be3cbfb6d28 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -210,7 +210,8 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 		pte_flags |= PTE_READ_ONLY;
 
 	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+				vma->obj->gtt_page_size, cache_level,
+				pte_flags);
 
 	return 0;
 }
@@ -911,6 +912,7 @@ gen8_ppgtt_insert_pte_entries(struct i915_hw_ppgtt *ppgtt,
 static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 				   struct sg_table *pages,
 				   u64 start,
+				   unsigned int page_size,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
@@ -929,6 +931,7 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 				   struct sg_table *pages,
 				   u64 start,
+				   unsigned int page_size,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
@@ -940,9 +943,24 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	};
 	struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
 	struct gen8_insert_pte idx = gen8_insert_pte(start);
+	bool (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
+			       struct i915_page_directory_pointer *pdp,
+			       struct sgt_dma *iter,
+			       struct gen8_insert_pte *idx,
+			       enum i915_cache_level cache_level);
+
+	/* TODO: turn this into vfunc */
+	switch (page_size) {
+	case I915_GTT_PAGE_SIZE_4K:
+		insert_entries = gen8_ppgtt_insert_pte_entries;
+		break;
+	default:
+		MISSING_CASE(page_size);
+		return;
+	}
 
-	while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter,
-					     &idx, cache_level))
+	while (insert_entries(ppgtt, pdps[idx.pml4e++], &iter, &idx,
+			      cache_level))
 		GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
 }
 
@@ -1625,6 +1643,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct sg_table *pages,
 				      u64 start,
+				      unsigned int page_size,
 				      enum i915_cache_level cache_level,
 				      u32 flags)
 {
@@ -2098,6 +2117,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     u64 start,
+				     unsigned int page_size,
 				     enum i915_cache_level level,
 				     u32 unused)
 {
@@ -2145,6 +2165,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     u64 start,
+				     unsigned int page_size,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
@@ -2229,6 +2250,7 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
 static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *pages,
 				     u64 start,
+				     unsigned int page_size,
 				     enum i915_cache_level cache_level,
 				     u32 unused)
 {
@@ -2265,7 +2287,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 
 	intel_runtime_pm_get(i915);
 	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+				I915_GTT_PAGE_SIZE, cache_level, pte_flags);
 	intel_runtime_pm_put(i915);
 
 	/*
@@ -2320,6 +2342,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 
 		appgtt->base.insert_entries(&appgtt->base,
 					    vma->pages, vma->node.start,
+					    I915_GTT_PAGE_SIZE,
 					    cache_level, pte_flags);
 	}
 
@@ -2327,6 +2350,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 		intel_runtime_pm_get(i915);
 		vma->vm->insert_entries(vma->vm,
 					vma->pages, vma->node.start,
+					I915_GTT_PAGE_SIZE,
 					cache_level, pte_flags);
 		intel_runtime_pm_put(i915);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f8db231c28aa..5a2a3907d266 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -320,6 +320,7 @@ struct i915_address_space {
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
 			       u64 start,
+			       unsigned int page_size,
 			       enum i915_cache_level cache_level,
 			       u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 50710e3f1caa..259b5e139df1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -256,7 +256,8 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 				break;
 
 			vm->insert_entries(vm, obj->mm.pages, addr,
-					   I915_CACHE_NONE, 0);
+					   I915_GTT_PAGE_SIZE, I915_CACHE_NONE,
+					   0);
 		}
 		count = n;
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index a61309c7cb3e..38532a008387 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -35,6 +35,7 @@ static void mock_insert_page(struct i915_address_space *vm,
 static void mock_insert_entries(struct i915_address_space *vm,
 				struct sg_table *st,
 				u64 start,
+				unsigned int page_size,
 				enum i915_cache_level level, u32 flags)
 {
 }
-- 
2.9.4

* [PATCH 09/17] drm/i915: enable IPS bit for 64K pages
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (7 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 08/17] drm/i915: pass gtt page size to insert_entries Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 10/17] drm/i915: support inserting 64K pages into the 48b PPGTT Matthew Auld
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

Before we can use 64K pages through the IPS bit, we must first enable the
IPS field through MMIO; otherwise the page-walker will simply ignore the bit.
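
The enable step is a plain read-modify-write of one register. A minimal
sketch, assuming a simulated register in place of real MMIO: sketch_reg,
sketch_read() and sketch_write() are hypothetical stand-ins for the register
at GEN8_GAMW_ECO_DEV_RW_IA and for I915_READ()/I915_WRITE(). Only the 0xF
field value is taken from the patch.

```c
#include <stdint.h>

/* Value taken from the patch: GAMW_ECO_ENABLE_64K_IPS_FIELD is 0xF. */
#define SKETCH_ENABLE_64K_IPS_FIELD 0xFu

/* Hypothetical stand-in for the hardware register. */
static uint32_t sketch_reg;

static uint32_t sketch_read(void)
{
	return sketch_reg;
}

static void sketch_write(uint32_t val)
{
	sketch_reg = val;
}

/* Set the enable field without disturbing any other bits in the register. */
static void sketch_enable_64k_ips(void)
{
	sketch_write(sketch_read() | SKETCH_ENABLE_64K_IPS_FIELD);
}
```

OR-ing the field in keeps the operation idempotent, so repeating it on every
hardware init is harmless.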

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 11 +++++++++++
 drivers/gpu/drm/i915/i915_reg.h |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e4ee54f0f55f..fa133aa61261 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4717,6 +4717,17 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 		}
 	}
 
+	/* To support 64K PTE's we need to first enable the use of the
+	 * Intermediate-Page-Size(IPS) bit of the PDE field via some magical
+	 * mmio, otherwise the page-walker will simply ignore the IPS bit. This
+	 * shouldn't be needed after GEN10.
+	 */
+	if (HAS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K) &&
+	    INTEL_GEN(dev_priv) <= 10)
+		I915_WRITE(GEN8_GAMW_ECO_DEV_RW_IA,
+			   I915_READ(GEN8_GAMW_ECO_DEV_RW_IA) |
+			   GAMW_ECO_ENABLE_64K_IPS_FIELD);
+
 	i915_gem_init_swizzling(dev_priv);
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index ee144ec57935..7416e7d7d472 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1993,6 +1993,9 @@ enum skl_disp_power_wells {
 #define GEN9_GAMT_ECO_REG_RW_IA _MMIO(0x4ab0)
 #define   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS	(1<<18)
 
+#define GEN8_GAMW_ECO_DEV_RW_IA _MMIO(0x4080)
+#define   GAMW_ECO_ENABLE_64K_IPS_FIELD 0xF
+
 #define GAMT_CHKN_BIT_REG	_MMIO(0x4ab8)
 #define   GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING	(1<<28)
 
-- 
2.9.4

* [PATCH 10/17] drm/i915: support inserting 64K pages into the 48b PPGTT
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (8 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 09/17] drm/i915: enable IPS bit for 64K pages Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 11/17] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

If we set the IPS bit, aka PDE[11], then only every 16th entry of the
page-table is used as an index; the HW makes no assumptions about any of
the other PTEs.
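
The "every 16th entry" rule follows from the arithmetic: a page-table holds
512 entries of 4K each, so a 64K page spans 16 of those slots. A small
sketch of that indexing, assuming the usual gen8 layout where the PTE index
is taken from address bits 20:12; the sketch_ names are hypothetical and
not driver symbols.

```c
#include <stdint.h>

#define SKETCH_PTES         512u	/* 4K entries per page-table */
#define SKETCH_64K_PTE_STEP 16u		/* 64K / 4K */

/* Index of the PTE slot covering a GTT offset (address bits 20:12). */
static unsigned int sketch_pte_index(uint64_t offset)
{
	return (unsigned int)((offset >> 12) & (SKETCH_PTES - 1));
}

/* Number of 64K entries that fit in one page-table. */
static unsigned int sketch_64k_entries_per_pt(void)
{
	return SKETCH_PTES / SKETCH_64K_PTE_STEP;
}
```

A 64K-aligned offset therefore always lands on a slot that is a multiple of
16, and one page-table holds 32 such entries covering the same 2M range as
512 4K entries.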

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 74 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 +
 2 files changed, 76 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3be3cbfb6d28..874854e77247 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -854,6 +854,77 @@ static __always_inline struct gen8_insert_pte gen8_insert_pte(u64 start)
 }
 
 static __always_inline bool
+gen8_ppgtt_insert_64K_pte_entries(struct i915_hw_ppgtt *ppgtt,
+				  struct i915_page_directory_pointer *pdp,
+				  struct sgt_dma *iter,
+				  struct gen8_insert_pte *idx,
+				  enum i915_cache_level cache_level)
+{
+	struct i915_page_directory *pd;
+	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
+	gen8_pte_t *vaddr;
+	bool ret;
+
+	/* Currently 64K objects should be aligned to 2M to prevent mixing 4K
+	 * and 64K pte's in the same page-table.
+	 */
+	GEM_BUG_ON(idx->pte);
+	GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+	pd = pdp->page_directory[idx->pdpe];
+
+	vaddr = kmap_atomic_px(pd);
+	vaddr[idx->pde] |= GEN8_PDE_IPS_64K;
+	kunmap_atomic(vaddr);
+
+	vaddr = kmap_atomic_px(pd->page_table[idx->pde]);
+	do {
+		vaddr[idx->pte] = pte_encode | iter->dma;
+		iter->dma += I915_GTT_PAGE_SIZE_64K;
+		if (iter->dma >= iter->max) {
+			iter->sg = __sg_next(iter->sg);
+			if (!iter->sg) {
+				ret = false;
+				break;
+			}
+
+			iter->dma = sg_dma_address(iter->sg);
+			iter->max = iter->dma + iter->sg->length;
+		}
+
+		idx->pte += 16;
+
+		if (idx->pte == GEN8_PTES) {
+			idx->pte = 0;
+
+			if (++idx->pde == I915_PDES) {
+				idx->pde = 0;
+
+				if (++idx->pdpe == GEN8_PML4ES_PER_PML4) {
+					idx->pdpe = 0;
+					ret = true;
+					break;
+				}
+
+				GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+				pd = pdp->page_directory[idx->pdpe];
+			}
+
+			kunmap_atomic(vaddr);
+			vaddr = kmap_atomic_px(pd);
+			vaddr[idx->pde] |= GEN8_PDE_IPS_64K;
+			kunmap_atomic(vaddr);
+
+			vaddr = kmap_atomic_px(pd->page_table[idx->pde]);
+		}
+	} while (1);
+	kunmap_atomic(vaddr);
+
+	mark_tlbs_dirty(ppgtt);
+
+	return ret;
+}
+
+static __always_inline bool
 gen8_ppgtt_insert_pte_entries(struct i915_hw_ppgtt *ppgtt,
 			      struct i915_page_directory_pointer *pdp,
 			      struct sgt_dma *iter,
@@ -954,6 +1025,9 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	case I915_GTT_PAGE_SIZE_4K:
 		insert_entries = gen8_ppgtt_insert_pte_entries;
 		break;
+	case I915_GTT_PAGE_SIZE_64K:
+		insert_entries = gen8_ppgtt_insert_64K_pte_entries;
+		break;
 	default:
 		MISSING_CASE(page_size);
 		return;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 5a2a3907d266..04d37c62c3ef 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -149,6 +149,8 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((u64)(x) << ((i) * 8))
 
+#define GEN8_PDE_IPS_64K BIT(11)
+
 struct sg_table;
 
 struct intel_rotation_info {
-- 
2.9.4

* [PATCH 11/17] drm/i915: disable GTT cache for 2M/1G pages
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (9 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 10/17] drm/i915: support inserting 64K pages into the 48b PPGTT Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16 10:04   ` Ville Syrjälä
  2017-05-16  8:29 ` [PATCH 12/17] drm/i915: support inserting 2M pages into the 48b PPGTT Matthew Auld
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

When SW enables the use of 2M/1G pages, it must disable the GTT cache.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ef0e9f8d4dbd..b39b8d394179 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8178,10 +8178,10 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv)
 
 	/*
 	 * WaGttCachingOffByDefault:bdw
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
+	 * The GTT cache must be disabled if the system is planning to use
+	 * 2M/1G pages.
 	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	I915_WRITE(HSW_GTT_CACHE_EN, 0);
 
 	/* WaKVMNotificationOnConfigChange:bdw */
 	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
@@ -8457,10 +8457,10 @@ static void cherryview_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen8_set_l3sqc_credits(dev_priv, 38, 2);
 
 	/*
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
+	 * The GTT cache must be disabled if the system is planning to use
+	 * 2M/1G pages.
 	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	I915_WRITE(HSW_GTT_CACHE_EN, 0);
 }
 
 static void g4x_init_clock_gating(struct drm_i915_private *dev_priv)
-- 
2.9.4

* [PATCH 12/17] drm/i915: support inserting 2M pages into the 48b PPGTT
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (10 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 11/17] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 13/17] drm/i915: support inserting 1G " Matthew Auld
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

To enable 2M pages we set the PS bit of the PDE, aka PDE[7], to indicate
that the entry points to a 2M page and not a page-table.
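
A minimal sketch of what such an entry looks like, assuming x86-style
present and writable bits in bits 0 and 1 and ignoring the cache-level bits
that gen8_pte_encode() would add. Only bit 7 is taken from this patch; the
sketch_ names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PRESENT   (1ull << 0)	/* assumed: present bit */
#define SKETCH_RW        (1ull << 1)	/* assumed: writable bit */
#define SKETCH_PDE_PS_2M (1ull << 7)	/* from the patch: GEN8_PDE_PS_2M */
#define SKETCH_2M        (1ull << 21)

/* A 2M PDE can only point at a 2M-aligned DMA address. */
static int sketch_is_2m_aligned(uint64_t dma)
{
	return (dma & (SKETCH_2M - 1)) == 0;
}

/* Build a PDE that maps a 2M page directly instead of a page-table. */
static uint64_t sketch_encode_2m_pde(uint64_t dma)
{
	return dma | SKETCH_PDE_PS_2M | SKETCH_PRESENT | SKETCH_RW;
}
```

Because the low 21 bits of an aligned address are zero, the flag bits never
collide with the address bits.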

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 53 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
 2 files changed, 54 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 874854e77247..3dadb501daa6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -854,6 +854,56 @@ static __always_inline struct gen8_insert_pte gen8_insert_pte(u64 start)
 }
 
 static __always_inline bool
+gen8_ppgtt_insert_2M_pde_entries(struct i915_hw_ppgtt *ppgtt,
+				 struct i915_page_directory_pointer *pdp,
+				 struct sgt_dma *iter,
+				 struct gen8_insert_pte *idx,
+				 enum i915_cache_level cache_level)
+{
+	const gen8_pte_t pde_encode = gen8_pte_encode(GEN8_PDE_PS_2M,
+						      cache_level);
+	gen8_pte_t *vaddr;
+	bool ret;
+
+	GEM_BUG_ON(idx->pte);
+	GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+	vaddr = kmap_atomic_px(pdp->page_directory[idx->pdpe]);
+	do {
+		vaddr[idx->pde] = pde_encode | iter->dma;
+		iter->dma += I915_GTT_PAGE_SIZE_2M;
+		if (iter->dma >= iter->max) {
+			iter->sg = __sg_next(iter->sg);
+			if (!iter->sg) {
+				ret = false;
+				break;
+			}
+
+			iter->dma = sg_dma_address(iter->sg);
+			iter->max = iter->dma + iter->sg->length;
+		}
+
+		if (++idx->pde == I915_PDES) {
+			idx->pde = 0;
+
+			if (++idx->pdpe == GEN8_PML4ES_PER_PML4) {
+				idx->pdpe = 0;
+				ret = true;
+				break;
+			}
+
+			kunmap_atomic(vaddr);
+			vaddr = kmap_atomic_px(pdp->page_directory[idx->pdpe]);
+		}
+
+	} while (1);
+	kunmap_atomic(vaddr);
+
+	mark_tlbs_dirty(ppgtt);
+
+	return ret;
+}
+
+static __always_inline bool
 gen8_ppgtt_insert_64K_pte_entries(struct i915_hw_ppgtt *ppgtt,
 				  struct i915_page_directory_pointer *pdp,
 				  struct sgt_dma *iter,
@@ -1028,6 +1078,9 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	case I915_GTT_PAGE_SIZE_64K:
 		insert_entries = gen8_ppgtt_insert_64K_pte_entries;
 		break;
+	case I915_GTT_PAGE_SIZE_2M:
+		insert_entries = gen8_ppgtt_insert_2M_pde_entries;
+		break;
 	default:
 		MISSING_CASE(page_size);
 		return;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 04d37c62c3ef..840d08be8fa3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -150,6 +150,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT(i, x)			((u64)(x) << ((i) * 8))
 
 #define GEN8_PDE_IPS_64K BIT(11)
+#define GEN8_PDE_PS_2M   BIT(7)
 
 struct sg_table;
 
-- 
2.9.4

* [PATCH 13/17] drm/i915: support inserting 1G pages into the 48b PPGTT
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (11 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 12/17] drm/i915: support inserting 2M pages into the 48b PPGTT Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics Matthew Auld
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

To enable 1G pages we set the PS bit in the PDPE, aka PDPE[7], to
indicate that the entry points to a 1G page and not a PD.
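
The GEM_BUG_ON(idx->pte) and GEM_BUG_ON(idx->pde) checks in the diff follow
from how a 48b address splits into indices. A sketch of that split, assuming
the usual gen8 layout of 9 bits per level above a 12-bit page offset;
struct sketch_idx and sketch_decode() are hypothetical and only loosely
modelled on gen8_insert_pte().

```c
#include <stdint.h>

struct sketch_idx {
	unsigned int pml4e;	/* address bits 47:39 */
	unsigned int pdpe;	/* address bits 38:30 */
	unsigned int pde;	/* address bits 29:21 */
	unsigned int pte;	/* address bits 20:12 */
};

/* Split a GTT start address into its four page-table indices. */
static struct sketch_idx sketch_decode(uint64_t start)
{
	struct sketch_idx idx;

	idx.pml4e = (unsigned int)((start >> 39) & 0x1ff);
	idx.pdpe = (unsigned int)((start >> 30) & 0x1ff);
	idx.pde = (unsigned int)((start >> 21) & 0x1ff);
	idx.pte = (unsigned int)((start >> 12) & 0x1ff);

	return idx;
}
```

A 1G-aligned start address has zero pde and pte indices, so only the pdpe
index advances as entries are written.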

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 47 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
 2 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3dadb501daa6..e81c78ffbea5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -854,6 +854,50 @@ static __always_inline struct gen8_insert_pte gen8_insert_pte(u64 start)
 }
 
 static __always_inline bool
+gen8_ppgtt_insert_1G_pdpe_entries(struct i915_hw_ppgtt *ppgtt,
+				  struct i915_page_directory_pointer *pdp,
+				  struct sgt_dma *iter,
+				  struct gen8_insert_pte *idx,
+				  enum i915_cache_level cache_level)
+{
+	const gen8_pte_t pdpe_encode = gen8_pte_encode(GEN8_PDPE_PS_1G,
+						       cache_level);
+	gen8_pte_t *vaddr;
+	bool ret;
+
+	GEM_BUG_ON(idx->pte);
+	GEM_BUG_ON(idx->pde);
+	GEM_BUG_ON(idx->pdpe >= i915_pdpes_per_pdp(&ppgtt->base));
+	vaddr = kmap_atomic_px(pdp);
+	do {
+		vaddr[idx->pdpe] = pdpe_encode | iter->dma;
+		iter->dma += I915_GTT_PAGE_SIZE_1G;
+		if (iter->dma >= iter->max) {
+			iter->sg = __sg_next(iter->sg);
+			if (!iter->sg) {
+				ret = false;
+				break;
+			}
+
+			iter->dma = sg_dma_address(iter->sg);
+			iter->max = iter->dma + iter->sg->length;
+		}
+
+		if (++idx->pdpe == GEN8_PML4ES_PER_PML4) {
+			idx->pdpe = 0;
+			ret = true;
+			break;
+		}
+
+	} while (1);
+	kunmap_atomic(vaddr);
+
+	mark_tlbs_dirty(ppgtt);
+
+	return ret;
+}
+
+static __always_inline bool
 gen8_ppgtt_insert_2M_pde_entries(struct i915_hw_ppgtt *ppgtt,
 				 struct i915_page_directory_pointer *pdp,
 				 struct sgt_dma *iter,
@@ -1081,6 +1125,9 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 	case I915_GTT_PAGE_SIZE_2M:
 		insert_entries = gen8_ppgtt_insert_2M_pde_entries;
 		break;
+	case I915_GTT_PAGE_SIZE_1G:
+		insert_entries = gen8_ppgtt_insert_1G_pdpe_entries;
+		break;
 	default:
 		MISSING_CASE(page_size);
 		return;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 840d08be8fa3..1517cfdbd5ce 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -152,6 +152,8 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
+#define GEN8_PDPE_PS_1G  BIT(7)
+
 struct sg_table;
 
 struct intel_rotation_info {
-- 
2.9.4

* [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (12 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 13/17] drm/i915: support inserting 1G " Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-19 10:48   ` Chris Wilson
  2017-05-19 10:51   ` Chris Wilson
  2017-05-16  8:29 ` [PATCH 15/17] drm/i915: enable platform support for 64K pages Matthew Auld
                   ` (3 subsequent siblings)
  17 siblings, 2 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

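Report the gtt_page_size of each object in describe_obj, and the number and
total size of huge-paged objects in i915_gem_object_info. Good to know,
mostly for debugging purposes.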

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index bd9abef40c66..dd36baa47667 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -117,6 +117,23 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
 	return size;
 }
 
+static const char *stringify_page_size(unsigned int page_size)
+{
+	switch (page_size) {
+	case I915_GTT_PAGE_SIZE_4K:
+		return "4K";
+	case I915_GTT_PAGE_SIZE_64K:
+		return "64K";
+	case I915_GTT_PAGE_SIZE_2M:
+		return "2M";
+	case I915_GTT_PAGE_SIZE_1G:
+		return "1G";
+	default:
+		MISSING_CASE(page_size);
+		return "";
+	}
+}
+
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
@@ -128,7 +145,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 
 	lockdep_assert_held(&obj->base.dev->struct_mutex);
 
-	seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x %s%s%s",
+	seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %s %02x %02x %s%s%s",
 		   &obj->base,
 		   get_active_flag(obj),
 		   get_pin_flag(obj),
@@ -136,6 +153,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		   get_global_flag(obj),
 		   get_pin_mapped_flag(obj),
 		   obj->base.size / 1024,
+		   stringify_page_size(obj->gtt_page_size),
 		   obj->base.read_domains,
 		   obj->base.write_domain,
 		   i915_cache_level_str(dev_priv, obj->cache_level),
@@ -399,8 +417,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct drm_device *dev = &dev_priv->drm;
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	u32 count, mapped_count, purgeable_count, dpy_count;
-	u64 size, mapped_size, purgeable_size, dpy_size;
+	u32 count, mapped_count, purgeable_count, dpy_count, huge_count;
+	u64 size, mapped_size, purgeable_size, dpy_size, huge_size;
 	struct drm_i915_gem_object *obj;
 	struct drm_file *file;
 	int ret;
@@ -416,6 +434,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	size = count = 0;
 	mapped_size = mapped_count = 0;
 	purgeable_size = purgeable_count = 0;
+	huge_size = huge_count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_link) {
 		size += obj->base.size;
 		++count;
@@ -429,6 +448,11 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 			mapped_count++;
 			mapped_size += obj->base.size;
 		}
+
+		if (obj->gtt_page_size > I915_GTT_PAGE_SIZE) {
+			huge_count++;
+			huge_size += obj->base.size;
+		}
 	}
 	seq_printf(m, "%u unbound objects, %llu bytes\n", count, size);
 
@@ -451,6 +475,11 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 			mapped_count++;
 			mapped_size += obj->base.size;
 		}
+
+		if (obj->gtt_page_size > I915_GTT_PAGE_SIZE) {
+			huge_count++;
+			huge_size += obj->base.size;
+		}
 	}
 	seq_printf(m, "%u bound objects, %llu bytes\n",
 		   count, size);
@@ -458,6 +487,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   purgeable_count, purgeable_size);
 	seq_printf(m, "%u mapped objects, %llu bytes\n",
 		   mapped_count, mapped_size);
+	seq_printf(m, "%u huge-paged objects, %llu bytes\n",
+		   huge_count, huge_size);
 	seq_printf(m, "%u display objects (pinned), %llu bytes\n",
 		   dpy_count, dpy_size);
 
-- 
2.9.4

* [PATCH 15/17] drm/i915: enable platform support for 64K pages
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (13 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 16/17] drm/i915: enable platform support for 2M pages Matthew Auld
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

Enable platform-level support for 64K pages on gen9+ and Cherryview. Also
enable it for the mock device used in selftests.
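
Since every page size is a power of two, page_size_mask is simply the OR of
the supported sizes. The sketch below shows how such a mask could be queried
and used to pick the largest usable size for a single contiguous chunk. It
is a simplified illustration and not the driver's selection logic: the
sketch_ helpers are hypothetical, and sketch_has_page_size() only
approximates what a HAS_PAGE_SIZE()-style check would do.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_4K  (1u << 12)
#define SKETCH_64K (1u << 16)
#define SKETCH_2M  (1u << 21)
#define SKETCH_1G  (1u << 30)

/* Non-zero if the platform mask advertises the given page size. */
static int sketch_has_page_size(unsigned int mask, unsigned int page_size)
{
	return (mask & page_size) != 0;
}

/*
 * Largest supported page size that fits in len and to which addr is
 * aligned; falls back to 4K when nothing larger qualifies.
 */
static unsigned int sketch_pick_page_size(unsigned int mask, uint64_t addr,
					  uint64_t len)
{
	static const unsigned int sizes[] = { SKETCH_1G, SKETCH_2M, SKETCH_64K };
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		unsigned int sz = sizes[i];

		if (sketch_has_page_size(mask, sz) && len >= sz &&
		    (addr & ((uint64_t)sz - 1)) == 0)
			return sz;
	}

	return SKETCH_4K;
}
```

With only 4K and 64K in the mask, as after this patch, a 2M-aligned 2M chunk
would still be mapped with 64K entries in this sketch.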

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c                  | 6 ++++--
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 7caccb5bf963..0a6940c3841d 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -354,14 +354,16 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_aliasing_ppgtt = 1,
 	.has_full_ppgtt = 1,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
-	GEN_DEFAULT_PAGE_SIZES,
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K |
+			  I915_GTT_PAGE_SIZE_64K,
 	GEN_CHV_PIPEOFFSETS,
 	CURSOR_OFFSETS,
 	CHV_COLORS,
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
-	.page_size_mask = I915_GTT_PAGE_SIZE_4K
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
+			  I915_GTT_PAGE_SIZE_64K
 
 static const struct intel_device_info intel_skylake_info = {
 	BDW_FEATURES,
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index b7e4ba03e3bc..d41ed2178e3e 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -144,7 +144,8 @@ struct drm_i915_private *mock_gem_device(void)
 	mkwrite_device_info(i915)->gen = -1;
 
 	mkwrite_device_info(i915)->page_size_mask =
-		I915_GTT_PAGE_SIZE_4K;
+		I915_GTT_PAGE_SIZE_4K |
+		I915_GTT_PAGE_SIZE_64K;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.9.4

* [PATCH 16/17] drm/i915: enable platform support for 2M pages
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (14 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 15/17] drm/i915: enable platform support for 64K pages Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:29 ` [PATCH 17/17] drm/i915: enable platform support for 1G pages Matthew Auld
  2017-05-16  8:49 ` ✓ Fi.CI.BAT: success for add support for huge-gtt-pages Patchwork
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

For gen8+, enable platform-level support for 2M pages. Also enable it for
the mock device used in selftests.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c                  | 9 ++++++---
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 0a6940c3841d..452f061fd7b3 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -319,7 +319,8 @@ static const struct intel_device_info intel_haswell_info = {
 #define BDW_FEATURES \
 	HSW_FEATURES, \
 	BDW_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES, \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
+			  I915_GTT_PAGE_SIZE_2M, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1
@@ -355,7 +356,8 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_full_ppgtt = 1,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K |
-			  I915_GTT_PAGE_SIZE_64K,
+			  I915_GTT_PAGE_SIZE_64K |
+			  I915_GTT_PAGE_SIZE_2M,
 	GEN_CHV_PIPEOFFSETS,
 	CURSOR_OFFSETS,
 	CHV_COLORS,
@@ -363,7 +365,8 @@ static const struct intel_device_info intel_cherryview_info = {
 
 #define GEN9_DEFAULT_PAGE_SIZES \
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
-			  I915_GTT_PAGE_SIZE_64K
+			  I915_GTT_PAGE_SIZE_64K | \
+			  I915_GTT_PAGE_SIZE_2M
 
 static const struct intel_device_info intel_skylake_info = {
 	BDW_FEATURES,
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index d41ed2178e3e..23f0db2dbb5d 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -145,7 +145,8 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mkwrite_device_info(i915)->page_size_mask =
 		I915_GTT_PAGE_SIZE_4K |
-		I915_GTT_PAGE_SIZE_64K;
+		I915_GTT_PAGE_SIZE_64K |
+		I915_GTT_PAGE_SIZE_2M;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 17/17] drm/i915: enable platform support for 1G pages
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (15 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 16/17] drm/i915: enable platform support for 2M pages Matthew Auld
@ 2017-05-16  8:29 ` Matthew Auld
  2017-05-16  8:49 ` ✓ Fi.CI.BAT: success for add support for huge-gtt-pages Patchwork
  17 siblings, 0 replies; 33+ messages in thread
From: Matthew Auld @ 2017-05-16  8:29 UTC (permalink / raw)
  To: intel-gfx

For gen8+ enable platform level support for 1G pages. Also enable for
mock testing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c                  | 9 ++++++---
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 452f061fd7b3..68baefe6566c 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -320,7 +320,8 @@ static const struct intel_device_info intel_haswell_info = {
 	HSW_FEATURES, \
 	BDW_COLORS, \
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
-			  I915_GTT_PAGE_SIZE_2M, \
+			  I915_GTT_PAGE_SIZE_2M | \
+			  I915_GTT_PAGE_SIZE_1G, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1
@@ -357,7 +358,8 @@ static const struct intel_device_info intel_cherryview_info = {
 	.display_mmio_offset = VLV_DISPLAY_BASE,
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K |
 			  I915_GTT_PAGE_SIZE_64K |
-			  I915_GTT_PAGE_SIZE_2M,
+			  I915_GTT_PAGE_SIZE_2M |
+			  I915_GTT_PAGE_SIZE_1G,
 	GEN_CHV_PIPEOFFSETS,
 	CURSOR_OFFSETS,
 	CHV_COLORS,
@@ -366,7 +368,8 @@ static const struct intel_device_info intel_cherryview_info = {
 #define GEN9_DEFAULT_PAGE_SIZES \
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
 			  I915_GTT_PAGE_SIZE_64K | \
-			  I915_GTT_PAGE_SIZE_2M
+			  I915_GTT_PAGE_SIZE_2M | \
+			  I915_GTT_PAGE_SIZE_1G
 
 static const struct intel_device_info intel_skylake_info = {
 	BDW_FEATURES,
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 23f0db2dbb5d..2535e211650c 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -146,7 +146,8 @@ struct drm_i915_private *mock_gem_device(void)
 	mkwrite_device_info(i915)->page_size_mask =
 		I915_GTT_PAGE_SIZE_4K |
 		I915_GTT_PAGE_SIZE_64K |
-		I915_GTT_PAGE_SIZE_2M;
+		I915_GTT_PAGE_SIZE_2M |
+		I915_GTT_PAGE_SIZE_1G;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 05/17] drm/i915: fallback to normal pages on vma insert failure
  2017-05-16  8:29 ` [PATCH 05/17] drm/i915: fallback to normal pages on vma insert failure Matthew Auld
@ 2017-05-16  8:39   ` Chris Wilson
  0 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2017-05-16  8:39 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, May 16, 2017 at 09:29:36AM +0100, Matthew Auld wrote:
> Part of the cost in choosing huge-gtt-pages is potentially using a
> larger alignment and/or size. Therefore if our vma insert fails either
> because of the insert/reserve or the pin-offset-fixed we should fallback
> to normal pages and retry before giving up.

Too late. By the point you do this, we will already have been evicting
from the GTT. We need a larger shakeup to take this trial-and-error
approach into consideration, though perhaps with just including NOEVICT,
it may make more sense.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 03/17] drm/i915: align the vma start to the gtt page size
  2017-05-16  8:29 ` [PATCH 03/17] drm/i915: align the vma start to the " Matthew Auld
@ 2017-05-16  8:40   ` Chris Wilson
  0 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2017-05-16  8:40 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, May 16, 2017 at 09:29:34AM +0100, Matthew Auld wrote:
> When inserting into a 48bit PPGTT we need to align the vma start address
> to the required page size boundary. The size will already be aligned so
> no padding is needed.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_vma.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 1aba47024656..53f6c94b2ee6 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -471,6 +471,14 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>  	if (ret)
>  		return ret;
>  
> +	if (i915_vm_is_48bit(vma->vm) &&
> +	    obj->gtt_page_size > I915_GTT_PAGE_SIZE) {
> +		unsigned int page_alignment = obj->gtt_page_size;
> +
> +		alignment = max_t(typeof(alignment), alignment, page_alignment);
> +		GEM_BUG_ON(!IS_ALIGNED(vma->size, obj->gtt_page_size));
> +	}
> +
>  	if (flags & PIN_OFFSET_FIXED) {

We should only increase the minimum alignment for !FIXED. Otherwise the
softpin user will not know what games we are playing and will not be able
to compensate.
-Chris
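
A minimal sketch of the point above, assuming the alignment bump is simply
skipped for fixed-offset (softpin) requests. The names SKETCH_PIN_OFFSET_FIXED,
SKETCH_GTT_PAGE_SIZE_4K and sketch_vma_alignment() are hypothetical stand-ins
for illustration, not the driver's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real flag and constant live in the i915 headers. */
#define SKETCH_PIN_OFFSET_FIXED (1u << 0)
#define SKETCH_GTT_PAGE_SIZE_4K 4096ull

/*
 * Return the alignment to use when placing a vma. The minimum is raised to
 * the object's gtt page size only when the kernel picks the offset; a
 * softpin (fixed offset) request keeps the caller's alignment untouched.
 */
static uint64_t sketch_vma_alignment(uint64_t alignment, uint64_t gtt_page_size,
				     int vm_is_48bit, unsigned int flags)
{
	/* A gtt page size is either unset (0) or a power of two. */
	assert((gtt_page_size & (gtt_page_size - 1)) == 0);

	if (flags & SKETCH_PIN_OFFSET_FIXED)
		return alignment;

	if (vm_is_48bit && gtt_page_size > SKETCH_GTT_PAGE_SIZE_4K &&
	    gtt_page_size > alignment)
		alignment = gtt_page_size;

	return alignment;
}
```

With a 2M object in a 48bit vm, a normal pin would be aligned to 2M, while
the same request with the fixed-offset flag keeps its original 4K alignment.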

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/17] drm/i915: introduce gtt page size
  2017-05-16  8:29 ` [PATCH 02/17] drm/i915: introduce gtt page size Matthew Auld
@ 2017-05-16  8:41   ` Chris Wilson
  2017-05-16  9:59   ` Chris Wilson
  1 sibling, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2017-05-16  8:41 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, May 16, 2017 at 09:29:33AM +0100, Matthew Auld wrote:
> In preparation for supporting huge gtt pages for the ppgtt, we introduce
> a gtt_page_size member for gem objects.  We fill in the gtt page size by
> scanning the sg table to determine the max page size which satisfies the
> alignment for each sg entry.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_drv.h        |  2 ++
>  drivers/gpu/drm/i915/i915_gem.c        | 23 +++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_object.h |  2 ++
>  3 files changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e18f11f77f35..a7a108d18a2d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2843,6 +2843,8 @@ intel_info(const struct drm_i915_private *dev_priv)
>  #define USES_PPGTT(dev_priv)		(i915.enable_ppgtt)
>  #define USES_FULL_PPGTT(dev_priv)	(i915.enable_ppgtt >= 2)
>  #define USES_FULL_48BIT_PPGTT(dev_priv)	(i915.enable_ppgtt == 3)
> +#define HAS_PAGE_SIZE(dev_priv, page_size) \
> +	((dev_priv)->info.page_size_mask & (page_size))
>  
>  #define HAS_OVERLAY(dev_priv)		 ((dev_priv)->info.has_overlay)
>  #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0c1cbe98c994..6a5e864d7710 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2294,6 +2294,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
>  	if (!IS_ERR(pages))
>  		obj->ops->put_pages(obj, pages);
>  
> +	obj->gtt_page_size = 0;
> +
>  unlock:
>  	mutex_unlock(&obj->mm.lock);
>  }
> @@ -2473,6 +2475,13 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
>  				 struct sg_table *pages)
>  {
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	unsigned long supported_page_sizes = INTEL_INFO(i915)->page_size_mask;
> +	struct scatterlist *sg;
> +	unsigned int sg_mask = 0;
> +	unsigned int i;
> +	unsigned int bit;
> +
>  	lockdep_assert_held(&obj->mm.lock);
>  
>  	obj->mm.get_page.sg_pos = pages->sgl;
> @@ -2486,6 +2495,20 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
>  		__i915_gem_object_pin_pages(obj);
>  		obj->mm.quirked = true;
>  	}
> +
> +	for_each_sg(pages->sgl, sg, pages->nents, i)
> +		sg_mask |= sg->length;
> +
> +	GEM_BUG_ON(!sg_mask);
> +
> +	for_each_set_bit(bit, &supported_page_sizes, BITS_PER_LONG) {
> +		if (!IS_ALIGNED(sg_mask, 1 << bit))
> +			break;
> +
> +		obj->gtt_page_size = 1 << bit;
> +	}

Not here. This is the synchronous part, and we really do not want to
loop again. However, we have just looped over and can compute this mask
inline in the asynchronous portion of get_pages.
-Chris
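
A rough sketch of computing the mask inline, assuming the segment list is
built by coalescing physically contiguous pages in a single pass. The helper
sketch_build_segments() and the plain arrays standing in for the sg table are
hypothetical and only illustrate folding the mask into the existing loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/*
 * Coalesce contiguous page frame numbers into segments and OR each finished
 * segment length into a mask in the same pass, so no second walk over the
 * segments is needed. seg_len must have room for n_pages entries.
 * Returns the number of segments; *sg_mask receives the OR of all lengths.
 */
static size_t sketch_build_segments(const uint64_t *pfns, size_t n_pages,
				    uint32_t *seg_len, uint32_t *sg_mask)
{
	size_t nsegs = 0;
	size_t i;

	assert(sg_mask != NULL);
	*sg_mask = 0;

	for (i = 0; i < n_pages; i++) {
		if (i > 0 && pfns[i] == pfns[i - 1] + 1) {
			/* Contiguous with the previous page: grow the segment. */
			seg_len[nsegs - 1] += SKETCH_PAGE_SIZE;
		} else {
			/* Previous segment is complete: fold it into the mask. */
			if (nsegs > 0)
				*sg_mask |= seg_len[nsegs - 1];
			seg_len[nsegs++] = SKETCH_PAGE_SIZE;
		}
	}

	/* Fold in the final segment. */
	if (nsegs > 0)
		*sg_mask |= seg_len[nsegs - 1];

	return nsegs;
}
```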

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* ✓ Fi.CI.BAT: success for add support for huge-gtt-pages
  2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
                   ` (16 preceding siblings ...)
  2017-05-16  8:29 ` [PATCH 17/17] drm/i915: enable platform support for 1G pages Matthew Auld
@ 2017-05-16  8:49 ` Patchwork
  17 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2017-05-16  8:49 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: add support for huge-gtt-pages
URL   : https://patchwork.freedesktop.org/series/24481/
State : success

== Summary ==

Series 24481v1 add support for huge-gtt-pages
https://patchwork.freedesktop.org/api/1.0/series/24481/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:442s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:432s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:581s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:504s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:509s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:507s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:422s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:418s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:420s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:505s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:468s
fi-kbl-7500u     total:278  pass:255  dwarn:5   dfail:0   fail:0   skip:18  time:469s
fi-kbl-7560u     total:278  pass:263  dwarn:5   dfail:0   fail:0   skip:10  time:577s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:456s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:588s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:470s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:495s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:433s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:555s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:409s

9b25870f9fa4548ec2bb40e42fa28f35db2189e1 drm-tip: 2017y-05m-15d-15h-47m-31s UTC integration manifest
cf035b2 drm/i915: enable platform support for 1G pages
ead918b drm/i915: enable platform support for 2M pages
ee0e57d drm/i915: enable platform support for 64K pages
d0bfa76 drm/i915/debugfs: include some gtt_page_size metrics
c6fb01e drm/i915: support inserting 1G pages into the 48b PPGTT
2dc2ed8 drm/i915: support inserting 2M pages into the 48b PPGTT
f06788c drm/i915: disable GTT cache for 2M/1G pages
86ce13d drm/i915: support inserting 64K pages into the 48b PPGTT
fcc9241 drm/i915: enable IPS bit for 64K pages
34b66cf drm/i915: pass gtt page size to insert_entries
61d5eb7 drm/i915: request THP for shmem backed objects
34a823b mm/shmem: expose driver overridable huge option
9b38f26 drm/i915: fallback to normal pages on vma insert failure
1fa7ce3 drm/i915: align 64K objects to 2M
d357c27 drm/i915: align the vma start to the gtt page size
0ecfdaa drm/i915: introduce gtt page size
176907c drm/i915: introduce page_size_mask to dev_info

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4705/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/17] drm/i915: introduce gtt page size
  2017-05-16  8:29 ` [PATCH 02/17] drm/i915: introduce gtt page size Matthew Auld
  2017-05-16  8:41   ` Chris Wilson
@ 2017-05-16  9:59   ` Chris Wilson
  2017-05-23 12:42     ` Matthew Auld
  1 sibling, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2017-05-16  9:59 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, May 16, 2017 at 09:29:33AM +0100, Matthew Auld wrote:
> In preparation for supporting huge gtt pages for the ppgtt, we introduce
> a gtt_page_size member for gem objects.  We fill in the gtt page size by
> scanning the sg table to determine the max page size which satisfies the
> alignment for each sg entry.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_drv.h        |  2 ++
>  drivers/gpu/drm/i915/i915_gem.c        | 23 +++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_object.h |  2 ++
>  3 files changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e18f11f77f35..a7a108d18a2d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2843,6 +2843,8 @@ intel_info(const struct drm_i915_private *dev_priv)
>  #define USES_PPGTT(dev_priv)		(i915.enable_ppgtt)
>  #define USES_FULL_PPGTT(dev_priv)	(i915.enable_ppgtt >= 2)
>  #define USES_FULL_48BIT_PPGTT(dev_priv)	(i915.enable_ppgtt == 3)
> +#define HAS_PAGE_SIZE(dev_priv, page_size) \
> +	((dev_priv)->info.page_size_mask & (page_size))
>  
>  #define HAS_OVERLAY(dev_priv)		 ((dev_priv)->info.has_overlay)
>  #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0c1cbe98c994..6a5e864d7710 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2294,6 +2294,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
>  	if (!IS_ERR(pages))
>  		obj->ops->put_pages(obj, pages);
>  
> +	obj->gtt_page_size = 0;
> +
>  unlock:
>  	mutex_unlock(&obj->mm.lock);
>  }
> @@ -2473,6 +2475,13 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
>  				 struct sg_table *pages)
>  {
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	unsigned long supported_page_sizes = INTEL_INFO(i915)->page_size_mask;
> +	struct scatterlist *sg;
> +	unsigned int sg_mask = 0;
> +	unsigned int i;
> +	unsigned int bit;
> +
>  	lockdep_assert_held(&obj->mm.lock);
>  
>  	obj->mm.get_page.sg_pos = pages->sgl;
> @@ -2486,6 +2495,20 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
>  		__i915_gem_object_pin_pages(obj);
>  		obj->mm.quirked = true;
>  	}
> +
> +	for_each_sg(pages->sgl, sg, pages->nents, i)
> +		sg_mask |= sg->length;
> +
> +	GEM_BUG_ON(!sg_mask);
> +

This should just be obj->gtt_page_sizes = sg_mask & supported_sizes;

And it should be obj->mm.gtt_page_sizes.

Then later steps can make decisions based on the strictest
requirements, or the least strict, etc.
-Chris
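
A small sketch of the suggestion above, under the assumption that every sg
length is exactly one of the supported page sizes. The SKETCH_* constants and
helper names are hypothetical; they only show how a stored set of page sizes
lets later steps pick either extreme.

```c
#include <assert.h>

/* Hypothetical page size bits, one bit per size. */
#define SKETCH_PAGE_SIZE_4K  (1u << 12)
#define SKETCH_PAGE_SIZE_64K (1u << 16)
#define SKETCH_PAGE_SIZE_2M  (1u << 21)
#define SKETCH_PAGE_SIZE_1G  (1u << 30)

/* Keep every page size that is both present in the sg table and supported. */
static unsigned int sketch_page_sizes(unsigned int sg_mask,
				      unsigned int supported)
{
	assert(sg_mask != 0);
	return sg_mask & supported;
}

/* Smallest size in the set: the lowest set bit (0 if the set is empty). */
static unsigned int sketch_smallest_page_size(unsigned int page_sizes)
{
	return page_sizes & -page_sizes;
}

/* Largest size in the set: the highest set bit (0 if the set is empty). */
static unsigned int sketch_largest_page_size(unsigned int page_sizes)
{
	unsigned int largest = 0;

	while (page_sizes) {
		largest = page_sizes & -page_sizes; /* current lowest bit */
		page_sizes &= page_sizes - 1;       /* clear it and continue */
	}

	return largest;
}
```

For an object made of 2M and 64K segments, the set holds both bits; a step
that needs one size valid for every segment would take the smallest, while
a step looking for the best case would take the largest.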

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 06/17] mm/shmem: expose driver overridable huge option
  2017-05-16  8:29 ` [PATCH 06/17] mm/shmem: expose driver overridable huge option Matthew Auld
@ 2017-05-16 10:02   ` Kirill A. Shutemov
  0 siblings, 0 replies; 33+ messages in thread
From: Kirill A. Shutemov @ 2017-05-16 10:02 UTC (permalink / raw)
  To: Matthew Auld
  Cc: intel-gfx, Joonas Lahtinen, Dave Hansen, Daniel Vetter,
	Hugh Dickins, linux-mm

On Tue, May 16, 2017 at 09:29:37AM +0100, Matthew Auld wrote:
> In i915 we are aiming to support huge GTT pages for the GPU, and to
> complement this we also want to enable THP for our shmem backed objects.
> Even though THP is supported in shmemfs it can only be enabled through
> the huge= mount option, but for users of the kernel mounted shm_mnt like
> i915, we are a little stuck. There is the sysfs knob shmem_enabled to
> either forcefully enable/disable the feature, but that seems to only be
> useful for testing purposes. What we propose is to expose a driver
> overridable huge option as part of shmem_inode_info to control the use
> of THP for a given mapping.

I don't like this. It's kinda hacky.

Is there a reason why i915 cannot mount a new tmpfs for its own use?

Another option would be to change the default to SHMEM_HUGE_ADVISE and wire
up fadvise handling to control the per-file allocation policy.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/17] drm/i915: disable GTT cache for 2M/1G pages
  2017-05-16  8:29 ` [PATCH 11/17] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
@ 2017-05-16 10:04   ` Ville Syrjälä
  2017-05-16 10:11     ` Chris Wilson
  0 siblings, 1 reply; 33+ messages in thread
From: Ville Syrjälä @ 2017-05-16 10:04 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, May 16, 2017 at 09:29:42AM +0100, Matthew Auld wrote:
> When SW enables the use of 2M/1G pages, it must disable the GTT cache.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index ef0e9f8d4dbd..b39b8d394179 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -8178,10 +8178,10 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv)
>  
>  	/*
>  	 * WaGttCachingOffByDefault:bdw
> -	 * GTT cache may not work with big pages, so if those
> -	 * are ever enabled GTT cache may need to be disabled.
> +	 * The GTT cache must be disabled if the system is planning to use
> +	 * 2M/1G pages.
>  	 */
> -	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> +	I915_WRITE(HSW_GTT_CACHE_EN, 0);
>  
>  	/* WaKVMNotificationOnConfigChange:bdw */
>  	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
> @@ -8457,10 +8457,10 @@ static void cherryview_init_clock_gating(struct drm_i915_private *dev_priv)
>  	gen8_set_l3sqc_credits(dev_priv, 38, 2);
>  
>  	/*
> -	 * GTT cache may not work with big pages, so if those
> -	 * are ever enabled GTT cache may need to be disabled.
> +	 * The GTT cache must be disabled if the system is planning to use
> +	 * 2M/1G pages.
>  	 */
> -	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> +	I915_WRITE(HSW_GTT_CACHE_EN, 0);
>  }

Should we perhaps have a modparam to make it easier to evaluate
whether big pages are actually beneficial or not? If so, it should also
affect whether we enable the GTT cache or not.

>  
>  static void g4x_init_clock_gating(struct drm_i915_private *dev_priv)
> -- 
> 2.9.4
> 

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/17] drm/i915: disable GTT cache for 2M/1G pages
  2017-05-16 10:04   ` Ville Syrjälä
@ 2017-05-16 10:11     ` Chris Wilson
  0 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2017-05-16 10:11 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Matthew Auld

On Tue, May 16, 2017 at 01:04:38PM +0300, Ville Syrjälä wrote:
> On Tue, May 16, 2017 at 09:29:42AM +0100, Matthew Auld wrote:
> > When SW enables the use of 2M/1G pages, it must disable the GTT cache.
> > 
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_pm.c | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index ef0e9f8d4dbd..b39b8d394179 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -8178,10 +8178,10 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv)
> >  
> >  	/*
> >  	 * WaGttCachingOffByDefault:bdw
> > -	 * GTT cache may not work with big pages, so if those
> > -	 * are ever enabled GTT cache may need to be disabled.
> > +	 * The GTT cache must be disabled if the system is planning to use
> > +	 * 2M/1G pages.
> >  	 */
> > -	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> > +	I915_WRITE(HSW_GTT_CACHE_EN, 0);
> >  
> >  	/* WaKVMNotificationOnConfigChange:bdw */
> >  	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
> > @@ -8457,10 +8457,10 @@ static void cherryview_init_clock_gating(struct drm_i915_private *dev_priv)
> >  	gen8_set_l3sqc_credits(dev_priv, 38, 2);
> >  
> >  	/*
> > -	 * GTT cache may not work with big pages, so if those
> > -	 * are ever enabled GTT cache may need to be disabled.
> > +	 * The GTT cache must be disabled if the system is planning to use
> > +	 * 2M/1G pages.
> >  	 */
> > -	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> > +	I915_WRITE(HSW_GTT_CACHE_EN, 0);
> >  }
> 
> Should we perhaps have a modparam to make it easier to evaluate
> whether big pages are actually beneficial or not? If so, it should also
> affect whether we enable the GTT cache or not.

If we are sticking to only using it on bdw 48b, then ppgtt=4? It doesn't
seem a good idea for a long term modparam, but who wants to keep a
modparam around where users might find it?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics
  2017-05-16  8:29 ` [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics Matthew Auld
@ 2017-05-19 10:48   ` Chris Wilson
  2017-05-19 10:51   ` Chris Wilson
  1 sibling, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2017-05-19 10:48 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, May 16, 2017 at 09:29:45AM +0100, Matthew Auld wrote:
> Good to know, mostly for debugging purposes.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Ok, we can do this along with setting obj->mm.page_sizes today and
that'll start giving us some information and clear a couple of patches.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics
  2017-05-16  8:29 ` [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics Matthew Auld
  2017-05-19 10:48   ` Chris Wilson
@ 2017-05-19 10:51   ` Chris Wilson
  1 sibling, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2017-05-19 10:51 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, May 16, 2017 at 09:29:45AM +0100, Matthew Auld wrote:
> Good to know, mostly for debugging purposes.

Can we also get similar information into the error state? Copying
obj->mm.page_sizes across is easy enough; recording whether the gtt was
using huge pages I leave as an exercise to the reader.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/17] drm/i915: introduce gtt page size
  2017-05-16  9:59   ` Chris Wilson
@ 2017-05-23 12:42     ` Matthew Auld
  2017-05-23 12:54       ` Chris Wilson
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Auld @ 2017-05-23 12:42 UTC (permalink / raw)
  To: Chris Wilson, Intel Graphics Development

On 16 May 2017 at 10:59, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, May 16, 2017 at 09:29:33AM +0100, Matthew Auld wrote:
>> In preparation for supporting huge gtt pages for the ppgtt, we introduce
>> a gtt_page_size member for gem objects.  We fill in the gtt page size by
>> scanning the sg table to determine the max page size which satisfies the
>> alignment for each sg entry.
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h        |  2 ++
>>  drivers/gpu/drm/i915/i915_gem.c        | 23 +++++++++++++++++++++++
>>  drivers/gpu/drm/i915/i915_gem_object.h |  2 ++
>>  3 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index e18f11f77f35..a7a108d18a2d 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2843,6 +2843,8 @@ intel_info(const struct drm_i915_private *dev_priv)
>>  #define USES_PPGTT(dev_priv)         (i915.enable_ppgtt)
>>  #define USES_FULL_PPGTT(dev_priv)    (i915.enable_ppgtt >= 2)
>>  #define USES_FULL_48BIT_PPGTT(dev_priv)      (i915.enable_ppgtt == 3)
>> +#define HAS_PAGE_SIZE(dev_priv, page_size) \
>> +     ((dev_priv)->info.page_size_mask & (page_size))
>>
>>  #define HAS_OVERLAY(dev_priv)                 ((dev_priv)->info.has_overlay)
>>  #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 0c1cbe98c994..6a5e864d7710 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2294,6 +2294,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
>>       if (!IS_ERR(pages))
>>               obj->ops->put_pages(obj, pages);
>>
>> +     obj->gtt_page_size = 0;
>> +
>>  unlock:
>>       mutex_unlock(&obj->mm.lock);
>>  }
>> @@ -2473,6 +2475,13 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>  void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
>>                                struct sg_table *pages)
>>  {
>> +     struct drm_i915_private *i915 = to_i915(obj->base.dev);
>> +     unsigned long supported_page_sizes = INTEL_INFO(i915)->page_size_mask;
>> +     struct scatterlist *sg;
>> +     unsigned int sg_mask = 0;
>> +     unsigned int i;
>> +     unsigned int bit;
>> +
>>       lockdep_assert_held(&obj->mm.lock);
>>
>>       obj->mm.get_page.sg_pos = pages->sgl;
>> @@ -2486,6 +2495,20 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
>>               __i915_gem_object_pin_pages(obj);
>>               obj->mm.quirked = true;
>>       }
>> +
>> +     for_each_sg(pages->sgl, sg, pages->nents, i)
>> +             sg_mask |= sg->length;
>> +
>> +     GEM_BUG_ON(!sg_mask);
>> +
>
> This should just be obj->gtt_page_sizes = sg_mask & supported_sizes;
But wouldn't this assume that sg->length is exactly a page size? I
would have imagined it would be possible for shmem to give us two or
more contiguous super-pages, or am I missing something?
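
To make the concern concrete, here is a small sketch comparing the two
approaches on a single 4M segment, i.e. two contiguous 2M pages merged into
one sg entry. The helper names and SKETCH_* constants are hypothetical; the
alignment scan mirrors the loop in the patch, written without kernel helpers.

```c
#include <assert.h>

/* Hypothetical page size bits, one bit per size. */
#define SKETCH_PAGE_SIZE_4K  (1u << 12)
#define SKETCH_PAGE_SIZE_64K (1u << 16)
#define SKETCH_PAGE_SIZE_2M  (1u << 21)
#define SKETCH_PAGE_SIZE_1G  (1u << 30)

/* Suggested form: plain intersection of the length mask and supported sizes. */
static unsigned int sketch_by_intersection(unsigned int sg_mask,
					   unsigned int supported)
{
	return sg_mask & supported;
}

/*
 * Form used in the patch: walk the supported sizes from smallest to
 * largest and keep the last one that every segment length is aligned to.
 */
static unsigned int sketch_by_alignment(unsigned int sg_mask,
					unsigned int supported)
{
	unsigned int page_size = 0;
	unsigned int bit;

	assert(sg_mask != 0);

	for (bit = 0; bit < 32; bit++) {
		unsigned int size = 1u << bit;

		if (!(supported & size))
			continue;
		if (sg_mask & (size - 1)) /* some length is not aligned */
			break;
		page_size = size;
	}

	return page_size;
}
```

For sg_mask = 4M, the alignment scan yields 2M, whereas the intersection is
empty because 4M itself is not one of the supported page size bits.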

>
> And it should be obj->mm.gtt_page_sizes.
>
> Then later steps can make decisions based on the strictest
> requirements, or the least strict, etc.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/17] drm/i915: introduce gtt page size
  2017-05-23 12:42     ` Matthew Auld
@ 2017-05-23 12:54       ` Chris Wilson
  2017-05-23 13:57         ` Matthew Auld
  0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2017-05-23 12:54 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development

On Tue, May 23, 2017 at 01:42:56PM +0100, Matthew Auld wrote:
> On 16 May 2017 at 10:59, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Tue, May 16, 2017 at 09:29:33AM +0100, Matthew Auld wrote:
> >> In preparation for supporting huge gtt pages for the ppgtt, we introduce
> >> a gtt_page_size member for gem objects.  We fill in the gtt page size by
> >> scanning the sg table to determine the max page size which satisfies the
> >> alignment for each sg entry.
> >>
> >> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> >> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Daniel Vetter <daniel@ffwll.ch>
> >> ---
> >>  drivers/gpu/drm/i915/i915_drv.h        |  2 ++
> >>  drivers/gpu/drm/i915/i915_gem.c        | 23 +++++++++++++++++++++++
> >>  drivers/gpu/drm/i915/i915_gem_object.h |  2 ++
> >>  3 files changed, 27 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index e18f11f77f35..a7a108d18a2d 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -2843,6 +2843,8 @@ intel_info(const struct drm_i915_private *dev_priv)
> >>  #define USES_PPGTT(dev_priv)         (i915.enable_ppgtt)
> >>  #define USES_FULL_PPGTT(dev_priv)    (i915.enable_ppgtt >= 2)
> >>  #define USES_FULL_48BIT_PPGTT(dev_priv)      (i915.enable_ppgtt == 3)
> >> +#define HAS_PAGE_SIZE(dev_priv, page_size) \
> >> +     ((dev_priv)->info.page_size_mask & (page_size))
> >>
> >>  #define HAS_OVERLAY(dev_priv)                 ((dev_priv)->info.has_overlay)
> >>  #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
> >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >> index 0c1cbe98c994..6a5e864d7710 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem.c
> >> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> @@ -2294,6 +2294,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> >>       if (!IS_ERR(pages))
> >>               obj->ops->put_pages(obj, pages);
> >>
> >> +     obj->gtt_page_size = 0;
> >> +
> >>  unlock:
> >>       mutex_unlock(&obj->mm.lock);
> >>  }
> >> @@ -2473,6 +2475,13 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> >>  void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
> >>                                struct sg_table *pages)
> >>  {
> >> +     struct drm_i915_private *i915 = to_i915(obj->base.dev);
> >> +     unsigned long supported_page_sizes = INTEL_INFO(i915)->page_size_mask;
> >> +     struct scatterlist *sg;
> >> +     unsigned int sg_mask = 0;
> >> +     unsigned int i;
> >> +     unsigned int bit;
> >> +
> >>       lockdep_assert_held(&obj->mm.lock);
> >>
> >>       obj->mm.get_page.sg_pos = pages->sgl;
> >> @@ -2486,6 +2495,20 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
> >>               __i915_gem_object_pin_pages(obj);
> >>               obj->mm.quirked = true;
> >>       }
> >> +
> >> +     for_each_sg(pages->sgl, sg, pages->nents, i)
> >> +             sg_mask |= sg->length;
> >> +
> >> +     GEM_BUG_ON(!sg_mask);
> >> +
> >
> > This should just be obj->gtt_page_sizes = sg_mask & supported_sizes;
> But wouldn't this assume that sg->length is exactly a page size? I
> would have imagined it would be possible for shmem to give us two or
> more contiguous super-pages, or am I missing something?

I'd actually report obj->mm.phys_page_sizes = sg_mask; and cook a value for
obj->mm.gtt_page_sizes:

	obj->mm.gtt_page_sizes = 0;
	for_each_set_bit(bit, &i915->info.supported_gtt_pages_size,
			 BITS_PER_LONG) { // add salt
		/* usable if some sg chunk is at least this large */
		if (obj->mm.phys_page_sizes & (~0u << bit))
			obj->mm.gtt_page_sizes |= BIT(bit);
	}

Certainly for the internal objects we will have a variety of different
orders.
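
To make the difference from a plain "sg_mask & supported_sizes" concrete, here
is a minimal userspace sketch of the same idea. The function names, the SZ_*
constants and the use of plain arrays in place of an sg table are assumptions
made for illustration; this is not the driver code. It only checks that some
chunk is large enough, and says nothing about address alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative page-size constants (assumed names, not taken from the patch). */
#define SZ_4K  (1u << 12)
#define SZ_64K (1u << 16)
#define SZ_2M  (1u << 21)
#define SZ_1G  (1u << 30)

/* OR together the lengths of all chunks, as the patch does with sg->length. */
static uint32_t phys_page_sizes(const uint32_t *sg_len, unsigned int nents)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < nents; i++)
		mask |= sg_len[i];

	assert(mask != 0); /* mirrors GEM_BUG_ON(!sg_mask) */
	return mask;
}

/*
 * For every supported GTT page size, mark it usable if any bit at or above
 * it is set in the physical mask, i.e. some chunk is at least that large.
 */
static uint32_t gtt_page_sizes(uint32_t phys, uint32_t supported)
{
	uint32_t gtt = 0;
	unsigned int bit;

	for (bit = 0; bit < 32; bit++) {
		if (!(supported & (1u << bit)))
			continue;
		if (phys & (~0u << bit))
			gtt |= 1u << bit;
	}
	return gtt;
}
```

With a single 4M chunk (two contiguous 2M super-pages), the physical mask has
only bit 22 set, so ANDing it directly with a 4K/64K/2M supported mask gives
zero, while gtt_page_sizes() still reports 4K, 64K and 2M as usable.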
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 02/17] drm/i915: introduce gtt page size
  2017-05-23 12:54       ` Chris Wilson
@ 2017-05-23 13:57         ` Matthew Auld
  2017-05-23 14:30           ` Chris Wilson
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Auld @ 2017-05-23 13:57 UTC (permalink / raw)
  To: Chris Wilson, Matthew Auld, Intel Graphics Development

On 23 May 2017 at 13:54, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, May 23, 2017 at 01:42:56PM +0100, Matthew Auld wrote:
>> On 16 May 2017 at 10:59, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> > On Tue, May 16, 2017 at 09:29:33AM +0100, Matthew Auld wrote:
>> > This should just be obj->gtt_page_sizes = sg_mask & supported_sizes;
>> But wouldn't this assume that sg->length is exactly a page size? I
>> would have imagined it would be possible for shmem to give us two or
>> more contiguous super-pages, or am I missing something?
>
> I'd actually report obj->mm.phys_page_sizes = sg_mask; and cook a value for
> obj->mm.gtt_page_sizes:
>
>         obj->mm.gtt_page_sizes = 0;
>         for_each_set_bit(bit, &i915->info.supported_gtt_pages_size,
>                          BITS_PER_LONG) { // add salt
>                 if (obj->mm.phys_page_sizes & (~0u << bit))
>                         obj->mm.gtt_page_sizes |= BIT(bit);
>         }
Nifty.

So in mixed mode, what would be the alignment strategy: align to the
largest, align to the smallest, don't align at all, etc.? For example,
what if we were unlucky and got something like 4K->2M? The
obj->mm.gtt_page_sizes should always be representative of how we end up
inserting it into the gtt, right? Would it not be more apt to move the
gtt_page_sizes tracking to when we do the insert?

>
> Certainly for the internal objects we will have a variety of different
> orders.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 02/17] drm/i915: introduce gtt page size
  2017-05-23 13:57         ` Matthew Auld
@ 2017-05-23 14:30           ` Chris Wilson
  0 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2017-05-23 14:30 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development

On Tue, May 23, 2017 at 02:57:16PM +0100, Matthew Auld wrote:
> On 23 May 2017 at 13:54, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Tue, May 23, 2017 at 01:42:56PM +0100, Matthew Auld wrote:
> >> On 16 May 2017 at 10:59, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >> > This should just be obj->gtt_page_sizes = sg_mask & supported_sizes;
> >> But wouldn't this assume that sg->length is exactly a page size? I
> >> would have imagined it would be possible for shmem to give us two or
> >> more contiguous super-pages, or am I missing something?
> >
> > I'd actually report obj->mm.phys_page_sizes = sg_mask; and cook a value for
> > obj->mm.gtt_page_sizes:
> >
> >         obj->mm.gtt_page_sizes = 0;
> >         for_each_set_bit(bit, &i915->info.supported_gtt_pages_size,
> >                          BITS_PER_LONG) { // add salt
> >                 if (obj->mm.phys_page_sizes & (~0u << bit))
> >                         obj->mm.gtt_page_sizes |= BIT(bit);
> >         }
> Nifty.
> 
> So in mixed mode, what would be the alignment strategy: align to the
> largest, align to the smallest, don't align at all, etc.? For example,
> what if we were unlucky and got something like 4K->2M? The
> obj->mm.gtt_page_sizes should always be representative of how we end up
> inserting it into the gtt, right? Would it not be more apt to move the
> gtt_page_sizes tracking to when we do the insert?

My first thought was to align to the worst case and then hope for the
best, i.e. that we can make use of that alignment. I'm still thinking
that even if we get a huge physical page, wasting that alignment in a 48b
aperture isn't terrible (caveats apply if we need to fit inside the low
4G!). Reporting the actual GTT sizes used might be useful for debug, but
the focus should be on tracking the values that make the code simpler :)
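
One possible reading of "align to the worst case" is to take the largest page
size present in the object's gtt page-size mask and round the GTT offset up to
it. The sketch below assumes that reading; the helper names are hypothetical
and it is plain userspace C, not the driver's vma code.

```c
#include <assert.h>
#include <stdint.h>

/* Largest page size in the mask, i.e. its highest set bit (0 if empty). */
static uint64_t worst_case_alignment(uint32_t gtt_page_sizes)
{
	uint64_t align = 0;
	unsigned int bit;

	for (bit = 0; bit < 32; bit++)
		if (gtt_page_sizes & (1u << bit))
			align = 1ull << bit;
	return align;
}

/* Round a candidate GTT offset up to a power-of-two alignment. */
static uint64_t align_up(uint64_t offset, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (offset + align - 1) & ~(align - 1);
}
```

For a mixed 4K/2M object this yields a 2M alignment, so an offset just past a
2M boundary is pushed to the next one. In a 48b address space the address
range lost this way is small; inside the low 4G it may matter more.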
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Thread overview: 33+ messages
2017-05-16  8:29 [PATCH 00/17] add support for huge-gtt-pages Matthew Auld
2017-05-16  8:29 ` [PATCH 01/17] drm/i915: introduce page_size_mask to dev_info Matthew Auld
2017-05-16  8:29 ` [PATCH 02/17] drm/i915: introduce gtt page size Matthew Auld
2017-05-16  8:41   ` Chris Wilson
2017-05-16  9:59   ` Chris Wilson
2017-05-23 12:42     ` Matthew Auld
2017-05-23 12:54       ` Chris Wilson
2017-05-23 13:57         ` Matthew Auld
2017-05-23 14:30           ` Chris Wilson
2017-05-16  8:29 ` [PATCH 03/17] drm/i915: align the vma start to the " Matthew Auld
2017-05-16  8:40   ` Chris Wilson
2017-05-16  8:29 ` [PATCH 04/17] drm/i915: align 64K objects to 2M Matthew Auld
2017-05-16  8:29 ` [PATCH 05/17] drm/i915: fallback to normal pages on vma insert failure Matthew Auld
2017-05-16  8:39   ` Chris Wilson
2017-05-16  8:29 ` [PATCH 06/17] mm/shmem: expose driver overridable huge option Matthew Auld
2017-05-16 10:02   ` Kirill A. Shutemov
2017-05-16  8:29 ` [PATCH 07/17] drm/i915: request THP for shmem backed objects Matthew Auld
2017-05-16  8:29   ` Matthew Auld
2017-05-16  8:29 ` [PATCH 08/17] drm/i915: pass gtt page size to insert_entries Matthew Auld
2017-05-16  8:29 ` [PATCH 09/17] drm/i915: enable IPS bit for 64K pages Matthew Auld
2017-05-16  8:29 ` [PATCH 10/17] drm/i915: support inserting 64K pages into the 48b PPGTT Matthew Auld
2017-05-16  8:29 ` [PATCH 11/17] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
2017-05-16 10:04   ` Ville Syrjälä
2017-05-16 10:11     ` Chris Wilson
2017-05-16  8:29 ` [PATCH 12/17] drm/i915: support inserting 2M pages into the 48b PPGTT Matthew Auld
2017-05-16  8:29 ` [PATCH 13/17] drm/i915: support inserting 1G " Matthew Auld
2017-05-16  8:29 ` [PATCH 14/17] drm/i915/debugfs: include some gtt_page_size metrics Matthew Auld
2017-05-19 10:48   ` Chris Wilson
2017-05-19 10:51   ` Chris Wilson
2017-05-16  8:29 ` [PATCH 15/17] drm/i915: enable platform support for 64K pages Matthew Auld
2017-05-16  8:29 ` [PATCH 16/17] drm/i915: enable platform support for 2M pages Matthew Auld
2017-05-16  8:29 ` [PATCH 17/17] drm/i915: enable platform support for 1G pages Matthew Auld
2017-05-16  8:49 ` ✓ Fi.CI.BAT: success for add support for huge-gtt-pages Patchwork
