* [PATCH 00/19] huge gtt pages
@ 2017-06-21 20:33 Matthew Auld
  2017-06-21 20:33 ` [PATCH 01/19] drm/i915: introduce simple gemfs Matthew Auld
                   ` (19 more replies)
  0 siblings, 20 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Hopefully addresses the feedback from the previous round.

Matthew Auld (19):
  drm/i915: introduce simple gemfs
  drm/i915/gemfs: enable THP
  drm/i915: introduce page_size_mask to dev_info
  drm/i915: introduce page_size members
  drm/i915: align the vma start to the largest gtt page size
  drm/i915: align 64K objects to 2M
  drm/i915: pass the vma to insert_entries
  drm/i915: enable IPS bit for 64K pages
  drm/i915: disable GTT cache for 2M/1G pages
  drm/i915: support 1G pages for the 48b PPGTT
  drm/i915: support 2M pages for the 48b PPGTT
  drm/i915: support 64K pages for the 48b PPGTT
  drm/i915: accurate page size tracking for the ppgtt
  drm/i915/debugfs: include some gtt page size metrics
  drm/i915/selftests: basic huge page tests
  drm/i915/selftests: mix huge pages
  drm/i915: enable platform support for 64K pages
  drm/i915: enable platform support for 2M pages
  drm/i915: enable platform support for 1G pages

 drivers/gpu/drm/i915/Makefile                      |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c                |  42 +-
 drivers/gpu/drm/i915/i915_drv.h                    |   9 +-
 drivers/gpu/drm/i915/i915_gem.c                    |  99 +++-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c             |  17 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c                | 188 +++++--
 drivers/gpu/drm/i915/i915_gem_gtt.h                |  16 +-
 drivers/gpu/drm/i915/i915_gem_internal.c           |   5 +-
 drivers/gpu/drm/i915/i915_gem_object.h             |  30 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c             |  13 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c            |  26 +-
 drivers/gpu/drm/i915/i915_gemfs.c                  | 130 +++++
 drivers/gpu/drm/i915/i915_gemfs.h                  |  40 ++
 drivers/gpu/drm/i915/i915_pci.c                    |  25 +
 drivers/gpu/drm/i915/i915_reg.h                    |   3 +
 drivers/gpu/drm/i915/i915_vma.c                    |  23 +
 drivers/gpu/drm/i915/i915_vma.h                    |   1 +
 drivers/gpu/drm/i915/intel_pm.c                    |   6 +-
 drivers/gpu/drm/i915/selftests/huge_gem_object.c   |   4 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c        | 604 +++++++++++++++++++++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c      |  13 +-
 .../gpu/drm/i915/selftests/i915_live_selftests.h   |   1 +
 drivers/gpu/drm/i915/selftests/mock_gem_device.c   |  16 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c          |   3 +-
 drivers/gpu/drm/i915/selftests/scatterlist.c       |  15 +
 25 files changed, 1243 insertions(+), 87 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h
 create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c

-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/19] drm/i915: introduce simple gemfs
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:19   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 02/19] drm/i915/gemfs: enable THP Matthew Auld
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Not a full-blown gemfs, just our very own tmpfs kernel mount. Doing so
moves us away from the shmemfs shm_mnt and gives us the much-needed
flexibility to set our own mount options, namely huge=, which should
allow us to enable transparent huge pages (THP) for our shmem-backed
objects.

v2: various improvements suggested by Joonas

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile                    |   1 +
 drivers/gpu/drm/i915/i915_drv.h                  |   3 +
 drivers/gpu/drm/i915/i915_gem.c                  |  44 ++++++++-
 drivers/gpu/drm/i915/i915_gemfs.c                | 114 +++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gemfs.h                |  40 ++++++++
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  10 +-
 6 files changed, 208 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c
 create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index f8227318dcaf..29e3cfdf56ce 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -46,6 +46,7 @@ i915-y += i915_cmd_parser.o \
 	  i915_gem_tiling.o \
 	  i915_gem_timeline.o \
 	  i915_gem_userptr.o \
+	  i915_gemfs.o \
 	  i915_trace_points.o \
 	  i915_vma.o \
 	  intel_breadcrumbs.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30e89456fc61..376cd93a973a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2248,6 +2248,9 @@ struct drm_i915_private {
 	DECLARE_HASHTABLE(mm_structs, 7);
 	struct mutex mm_lock;
 
+	/* Our tmpfs instance used for shmem backed objects */
+	struct vfsmount *gemfs;
+
 	/* Kernel Modesetting */
 
 	struct intel_crtc *plane_to_crtc_mapping[I915_MAX_PIPES];
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1353491c1010..30f04d3fc8c9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -35,6 +35,7 @@
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
 #include "intel_mocs.h"
+#include "i915_gemfs.h"
 #include <linux/dma-fence-array.h>
 #include <linux/kthread.h>
 #include <linux/reservation.h>
@@ -4308,6 +4309,32 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
 	.pwrite = i915_gem_object_pwrite_gtt,
 };
 
+static int i915_drm_gem_object_init(struct drm_device *dev,
+				    struct drm_gem_object *obj,
+				    size_t size)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct file *filp;
+
+	drm_gem_private_object_init(dev, obj, size);
+
+	filp = i915_gemfs_file_setup(i915, "i915 mm object", size);
+	if (IS_ERR(filp))
+		return PTR_ERR(filp);
+
+	obj->filp = filp;
+
+	return 0;
+}
+
+static void i915_drm_gem_object_release(struct drm_gem_object *obj)
+{
+	if (obj->filp)
+		i915_gemfs_unlink(obj->filp);
+
+	drm_gem_object_release(obj);
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 {
@@ -4331,7 +4358,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 	if (obj == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	ret = drm_gem_object_init(&dev_priv->drm, &obj->base, size);
+	ret = i915_drm_gem_object_init(&dev_priv->drm, &obj->base, size);
 	if (ret)
 		goto fail;
 
@@ -4449,7 +4476,8 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 			drm_prime_gem_destroy(&obj->base, NULL);
 
 		reservation_object_fini(&obj->__builtin_resv);
-		drm_gem_object_release(&obj->base);
+
+		i915_drm_gem_object_release(&obj->base);
 		i915_gem_info_remove_obj(i915, obj->base.size);
 
 		kfree(obj->bit_17);
@@ -4909,7 +4937,13 @@ i915_gem_load_init_fences(struct drm_i915_private *dev_priv)
 int
 i915_gem_load_init(struct drm_i915_private *dev_priv)
 {
-	int err = -ENOMEM;
+	int err;
+
+	err = i915_gemfs_init(dev_priv);
+	if (err)
+		return err;
+
+	err = -ENOMEM;
 
 	dev_priv->objects = KMEM_CACHE(drm_i915_gem_object, SLAB_HWCACHE_ALIGN);
 	if (!dev_priv->objects)
@@ -4975,6 +5009,8 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 err_objects:
 	kmem_cache_destroy(dev_priv->objects);
 err_out:
+	i915_gemfs_fini(dev_priv);
+
 	return err;
 }
 
@@ -4997,6 +5033,8 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv)
 
 	/* And ensure that our DESTROY_BY_RCU slabs are truly destroyed */
 	rcu_barrier();
+
+	i915_gemfs_fini(dev_priv);
 }
 
 int i915_gem_freeze(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c
new file mode 100644
index 000000000000..62b266d1d36d
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gemfs.c
@@ -0,0 +1,114 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/mount.h>
+
+#include "i915_drv.h"
+#include "i915_gemfs.h"
+
+static const struct dentry_operations anon_ops = {
+	.d_dname = simple_dname
+};
+
+int i915_gemfs_init(struct drm_i915_private *i915)
+{
+	struct file_system_type *type;
+	struct vfsmount *gemfs;
+
+	type = get_fs_type("tmpfs");
+	if (!type)
+		return -ENODEV;
+
+	gemfs = kern_mount(type);
+	if (IS_ERR(gemfs))
+		return PTR_ERR(gemfs);
+
+	i915->gemfs = gemfs;
+
+	return 0;
+}
+
+void i915_gemfs_fini(struct drm_i915_private *i915)
+{
+	kern_unmount(i915->gemfs);
+	i915->gemfs = NULL;
+}
+
+struct file *i915_gemfs_file_setup(struct drm_i915_private *i915,
+				   const char *name, size_t size)
+{
+	struct super_block *sb = i915->gemfs->mnt_sb;
+	struct inode *dir = d_inode(sb->s_root);
+	struct inode *inode;
+	struct path path;
+	struct qstr this;
+	struct file *res;
+	int ret;
+
+	if (size > MAX_LFS_FILESIZE) /* size_t is unsigned, so no < 0 check */
+		return ERR_PTR(-EINVAL);
+
+	this.name = name;
+	this.len = strlen(name);
+	this.hash = 0;
+
+	path.mnt = mntget(i915->gemfs);
+	path.dentry = d_alloc_pseudo(sb, &this);
+	if (!path.dentry) {
+		res = ERR_PTR(-ENOMEM);
+		goto put_path;
+	}
+	d_set_d_op(path.dentry, &anon_ops);
+
+	ret = dir->i_op->create(dir, path.dentry, S_IFREG | S_IRWXUGO, false);
+	if (ret) {
+		res = ERR_PTR(ret);
+		goto put_path;
+	}
+
+	inode = d_inode(path.dentry);
+	inode->i_size = size;
+
+	res = alloc_file(&path, FMODE_WRITE | FMODE_READ, inode->i_fop);
+	if (IS_ERR(res))
+		goto unlink;
+
+	return res;
+
+unlink:
+	dir->i_op->unlink(dir, path.dentry);
+put_path:
+	path_put(&path);
+
+	return res;
+}
+
+int i915_gemfs_unlink(struct file *filp)
+{
+	struct inode *dir = d_inode(filp->f_inode->i_sb->s_root);
+
+	return dir->i_op->unlink(dir, filp->f_path.dentry);
+}
diff --git a/drivers/gpu/drm/i915/i915_gemfs.h b/drivers/gpu/drm/i915/i915_gemfs.h
new file mode 100644
index 000000000000..34e0c79ca2de
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gemfs.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __I915_GEMFS_H__
+#define __I915_GEMFS_H__
+
+struct drm_i915_private;
+struct file;
+
+int i915_gemfs_init(struct drm_i915_private *i915);
+
+void i915_gemfs_fini(struct drm_i915_private *i915);
+
+struct file *i915_gemfs_file_setup(struct drm_i915_private *i915,
+				   const char *name, size_t size);
+
+int i915_gemfs_unlink(struct file *filp);
+
+#endif
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 47613d20bba8..0ac4efd5c7a2 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -81,6 +81,8 @@ static void mock_device_release(struct drm_device *dev)
 	kmem_cache_destroy(i915->vmas);
 	kmem_cache_destroy(i915->objects);
 
+	i915_gemfs_fini(i915);
+
 	drm_dev_fini(&i915->drm);
 	put_device(&i915->drm.pdev->dev);
 }
@@ -168,9 +170,13 @@ struct drm_i915_private *mock_gem_device(void)
 
 	i915->gt.awake = true;
 
+	err = i915_gemfs_init(i915);
+	if (err)
+		goto err_wq;
+
 	i915->objects = KMEM_CACHE(mock_object, SLAB_HWCACHE_ALIGN);
 	if (!i915->objects)
-		goto err_wq;
+		goto err_gemfs;
 
 	i915->vmas = KMEM_CACHE(i915_vma, SLAB_HWCACHE_ALIGN);
 	if (!i915->vmas)
@@ -228,6 +234,8 @@ struct drm_i915_private *mock_gem_device(void)
 	kmem_cache_destroy(i915->vmas);
 err_objects:
 	kmem_cache_destroy(i915->objects);
+err_gemfs:
+	i915_gemfs_fini(i915);
 err_wq:
 	destroy_workqueue(i915->wq);
 put_device:
-- 
2.9.4


* [PATCH 02/19] drm/i915/gemfs: enable THP
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
  2017-06-21 20:33 ` [PATCH 01/19] drm/i915: introduce simple gemfs Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 03/19] drm/i915: introduce page_size_mask to dev_info Matthew Auld
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Enable transparent-huge-pages through gemfs by mounting with
huge=within_size.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
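For readers unfamiliar with the mount option: the kernel's transparent
hugepage documentation describes within_size as only allocating a huge
page if it will be fully within i_size. The sketch below is a simplified
userspace model of that rule, not the shmem implementation. The 4K base
page and 2M huge page sizes, and the helper names, are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BASE_PAGE_SIZE (UINT64_C(1) << 12) /* assumed 4K base page */
#define HUGE_PAGE_SIZE (UINT64_C(1) << 21) /* assumed 2M huge page */

/* Round x up to a multiple of align, which must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: would the huge page covering byte `offset` lie
 * entirely within a file of `i_size` bytes (rounded up to a base page)?
 */
static bool huge_fits_within_size(uint64_t offset, uint64_t i_size)
{
	uint64_t start = offset & ~(HUGE_PAGE_SIZE - 1);
	uint64_t end = start + HUGE_PAGE_SIZE;

	return end <= round_up_pow2(i_size, BASE_PAGE_SIZE);
}
```

Under this model, a 3M object would be eligible for a huge page over its
first 2M only, with the tail falling back to base pages.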
 drivers/gpu/drm/i915/i915_gemfs.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/i915_gemfs.c
index 62b266d1d36d..8dc6e577a5c4 100644
--- a/drivers/gpu/drm/i915/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/i915_gemfs.c
@@ -25,6 +25,7 @@
 #include <linux/fs.h>
 #include <linux/file.h>
 #include <linux/mount.h>
+#include <linux/pagemap.h>
 
 #include "i915_drv.h"
 #include "i915_gemfs.h"
@@ -46,6 +47,21 @@ int i915_gemfs_init(struct drm_i915_private *i915)
 	if (IS_ERR(gemfs))
 		return PTR_ERR(gemfs);
 
+	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE) &&
+	    has_transparent_hugepage()) {
+		struct super_block *sb = gemfs->mnt_sb;
+		char options[] = "huge=within_size";
+		int flags = 0;
+
+		/* We don't consider failure to remount fatal, since this should
+		 * only ever attempt to modify the mount options of the sb, and
+		 * so should always leave us with a working mount upon failure.
+		 * Hence decoupling this from the actual kern_mount is probably
+		 * advisable.
+		 */
+		WARN_ON(sb->s_op->remount_fs(sb, &flags, options));
+	}
+
 	i915->gemfs = gemfs;
 
 	return 0;
-- 
2.9.4


* [PATCH 03/19] drm/i915: introduce page_size_mask to dev_info
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
  2017-06-21 20:33 ` [PATCH 01/19] drm/i915: introduce simple gemfs Matthew Auld
  2017-06-21 20:33 ` [PATCH 02/19] drm/i915/gemfs: enable THP Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 04/19] drm/i915: introduce page_size members Matthew Auld
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

In preparation for huge gtt pages, expose a page_size_mask as part of the
device info to indicate the page sizes supported by the HW. Currently
only 4K is supported.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
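As a rough userspace illustration of how a mask like page_size_mask can
be queried (a sketch, not driver code): the size constants mirror the
defines added in this patch, while struct device_info and
has_page_size() are hypothetical stand-ins for the driver's own types
and helpers.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1UL << (n))

/* Same values as the defines added to i915_gem_gtt.h in this patch. */
#define I915_GTT_PAGE_SIZE_4K  BIT(12)
#define I915_GTT_PAGE_SIZE_64K BIT(16)
#define I915_GTT_PAGE_SIZE_2M  BIT(21)
#define I915_GTT_PAGE_SIZE_1G  BIT(30)

/* Hypothetical stand-in for struct intel_device_info. */
struct device_info {
	unsigned int page_size_mask; /* page sizes supported by the HW */
};

/* Hypothetical helper: true if any size in page_size is supported. */
static bool has_page_size(const struct device_info *info,
			  unsigned long page_size)
{
	assert(page_size != 0);
	return (info->page_size_mask & page_size) != 0;
}
```

Because each page size is a distinct power of two, several sizes can be
OR-ed into one mask and tested with a single AND. With only the 4K bit
set, as in this patch, a query for 2M reports false.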
 drivers/gpu/drm/i915/i915_drv.h                  |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.h              |  8 +++++++-
 drivers/gpu/drm/i915/i915_pci.c                  | 20 ++++++++++++++++++++
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  3 +++
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 376cd93a973a..33fc2b1b11f6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -843,6 +843,7 @@ struct intel_device_info {
 	enum intel_platform platform;
 	u8 ring_mask; /* Rings supported by the HW */
 	u8 num_rings;
+	unsigned int page_size_mask; /* page sizes supported by the HW */
 #define DEFINE_FLAG(name) u8 name:1
 	DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG);
 #undef DEFINE_FLAG
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 1b2a56c3e5d3..e9c428d711aa 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -42,7 +42,13 @@
 #include "i915_gem_request.h"
 #include "i915_selftest.h"
 
-#define I915_GTT_PAGE_SIZE 4096UL
+#define I915_GTT_PAGE_SIZE_4K BIT(12)
+#define I915_GTT_PAGE_SIZE_64K BIT(16)
+#define I915_GTT_PAGE_SIZE_2M BIT(21)
+#define I915_GTT_PAGE_SIZE_1G BIT(30)
+
+#define I915_GTT_PAGE_SIZE I915_GTT_PAGE_SIZE_4K
+
 #define I915_GTT_MIN_ALIGNMENT I915_GTT_PAGE_SIZE
 
 #define I915_FENCE_REG_NONE -1
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 04aaf553e3fa..b73c1eb778d1 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -56,6 +56,10 @@
 	.color = { .degamma_lut_size = 65, .gamma_lut_size = 257 }
 
 /* Keep in gen based order, and chronological order within a gen */
+
+#define GEN_DEFAULT_PAGE_SIZES \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K
+
 #define GEN2_FEATURES \
 	.gen = 2, .num_pipes = 1, \
 	.has_overlay = 1, .overlay_needs_physical = 1, \
@@ -64,6 +68,7 @@
 	.unfenced_needs_alignment = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i830_info = {
@@ -96,6 +101,7 @@ static const struct intel_device_info intel_i865g_info = {
 	.has_gmch_display = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i915g_info = {
@@ -158,6 +164,7 @@ static const struct intel_device_info intel_pineview_info = {
 	.has_gmch_display = 1, \
 	.ring_mask = RENDER_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_i965g_info = {
@@ -198,6 +205,7 @@ static const struct intel_device_info intel_gm45_info = {
 	.has_gmbus_irq = 1, \
 	.ring_mask = RENDER_RING | BSD_RING, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ironlake_d_info = {
@@ -222,6 +230,7 @@ static const struct intel_device_info intel_ironlake_m_info = {
 	.has_gmbus_irq = 1, \
 	.has_aliasing_ppgtt = 1, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	CURSOR_OFFSETS
 
 static const struct intel_device_info intel_sandybridge_d_info = {
@@ -247,6 +256,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
 	.has_aliasing_ppgtt = 1, \
 	.has_full_ppgtt = 1, \
 	GEN_DEFAULT_PIPEOFFSETS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	IVB_CURSOR_OFFSETS
 
 static const struct intel_device_info intel_ivybridge_d_info = {
@@ -284,6 +294,7 @@ static const struct intel_device_info intel_valleyview_info = {
 	.has_full_ppgtt = 1,
 	.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
+	GEN_DEFAULT_PAGE_SIZES,
 	GEN_DEFAULT_PIPEOFFSETS,
 	CURSOR_OFFSETS
 };
@@ -308,6 +319,7 @@ static const struct intel_device_info intel_haswell_info = {
 #define BDW_FEATURES \
 	HSW_FEATURES, \
 	BDW_COLORS, \
+	GEN_DEFAULT_PAGE_SIZES, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1, \
@@ -345,13 +357,18 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_full_ppgtt = 1,
 	.has_reset_engine = 1,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
+	GEN_DEFAULT_PAGE_SIZES,
 	GEN_CHV_PIPEOFFSETS,
 	CURSOR_OFFSETS,
 	CHV_COLORS,
 };
 
+#define GEN9_DEFAULT_PAGE_SIZES \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K
+
 #define SKL_PLATFORM \
 	BDW_FEATURES, \
+	GEN9_DEFAULT_PAGE_SIZES, \
 	.gen = 9, \
 	.platform = INTEL_SKYLAKE, \
 	.has_csr = 1, \
@@ -390,6 +407,7 @@ static const struct intel_device_info intel_skylake_gt3_info = {
 	.has_full_ppgtt = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_reset_engine = 1, \
+	GEN9_DEFAULT_PAGE_SIZES, \
 	GEN_DEFAULT_PIPEOFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	BDW_COLORS
@@ -409,6 +427,7 @@ static const struct intel_device_info intel_geminilake_info = {
 
 #define KBL_PLATFORM \
 	BDW_FEATURES, \
+	GEN9_DEFAULT_PAGE_SIZES, \
 	.gen = 9, \
 	.platform = INTEL_KABYLAKE, \
 	.has_csr = 1, \
@@ -444,6 +463,7 @@ static const struct intel_device_info intel_coffeelake_gt3_info = {
 
 static const struct intel_device_info intel_cannonlake_info = {
 	BDW_FEATURES,
+	GEN9_DEFAULT_PAGE_SIZES,
 	.is_alpha_support = 1,
 	.platform = INTEL_CANNONLAKE,
 	.gen = 10,
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 0ac4efd5c7a2..0002ba28780c 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -148,6 +148,9 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mkwrite_device_info(i915)->gen = -1;
 
+	mkwrite_device_info(i915)->page_size_mask =
+		I915_GTT_PAGE_SIZE_4K;
+
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
 
-- 
2.9.4


* [PATCH 04/19] drm/i915: introduce page_size members
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (2 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 03/19] drm/i915: introduce page_size_mask to dev_info Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:26   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 05/19] drm/i915: align the vma start to the largest gtt page size Matthew Auld
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

In preparation for supporting huge gtt pages for the ppgtt, we introduce
page size members for gem objects.  We fill in the page sizes by
scanning the sg table.

v2: pass the sg_mask to set_pages

v3: calculate the sg_mask inline while populating the sg_table where
possible, and pass it to set_pages along with the pages.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
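To make the sg_mask idea concrete, here is a small userspace sketch of
the two steps described above: OR-ing the segment lengths into a mask,
then deriving the usable gtt page sizes from that mask. It follows the
loop added to __i915_gem_object_set_pages() in this patch, but the
function names and the plain array of lengths are hypothetical
simplifications of the scatterlist handling.

```c
#include <assert.h>
#include <stddef.h>

#define BIT(n) (1u << (n))

/* OR together the lengths of all segments, as the get_pages hooks do. */
static unsigned int compute_sg_mask(const unsigned int *lengths, size_t n)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= lengths[i];
	return mask;
}

/*
 * A supported page size is kept if the sg mask has any bit set at or
 * above that size, i.e. at least one segment is that large or larger.
 */
static unsigned int usable_page_sizes(unsigned int sg_mask,
				      unsigned int supported)
{
	unsigned int sizes = 0;
	unsigned int bit;

	assert(sg_mask != 0);
	for (bit = 0; bit < 32; bit++) {
		if (!(supported & BIT(bit)))
			continue;
		if (sg_mask & (~0u << bit))
			sizes |= BIT(bit);
	}
	return sizes;
}
```

For segments of 2M, 64K and 4K the mask is 0x211000. With only 4K
supported the result is 4K; with 4K, 64K and 2M supported all three
sizes are reported as usable.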
 drivers/gpu/drm/i915/i915_drv.h                  |  5 ++-
 drivers/gpu/drm/i915/i915_gem.c                  | 43 ++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_dmabuf.c           | 17 ++++++++--
 drivers/gpu/drm/i915/i915_gem_internal.c         |  5 ++-
 drivers/gpu/drm/i915/i915_gem_object.h           | 20 ++++++++++-
 drivers/gpu/drm/i915/i915_gem_stolen.c           | 13 ++++---
 drivers/gpu/drm/i915/i915_gem_userptr.c          | 26 ++++++++++----
 drivers/gpu/drm/i915/selftests/huge_gem_object.c |  4 ++-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c    |  3 +-
 9 files changed, 110 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 33fc2b1b11f6..f3bc1509998f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2951,6 +2951,8 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define USES_PPGTT(dev_priv)		(i915.enable_ppgtt)
 #define USES_FULL_PPGTT(dev_priv)	(i915.enable_ppgtt >= 2)
 #define USES_FULL_48BIT_PPGTT(dev_priv)	(i915.enable_ppgtt == 3)
+#define HAS_PAGE_SIZE(dev_priv, page_size) \
+	((dev_priv)->info.page_size_mask & (page_size))
 
 #define HAS_OVERLAY(dev_priv)		 ((dev_priv)->info.has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
@@ -3332,7 +3334,8 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj,
 				unsigned long n);
 
 void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
-				 struct sg_table *pages);
+				 struct sg_table *pages,
+				 unsigned int sg_mask);
 int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 
 static inline int __must_check
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 30f04d3fc8c9..d9bc8a07b0ca 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -163,7 +163,8 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 }
 
 static struct sg_table *
-i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
+i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj,
+			       unsigned int *sg_mask)
 {
 	struct address_space *mapping = obj->base.filp->f_mapping;
 	drm_dma_handle_t *phys;
@@ -223,6 +224,8 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	sg->offset = 0;
 	sg->length = obj->base.size;
 
+	*sg_mask = sg->length;
+
 	sg_dma_address(sg) = phys->busaddr;
 	sg_dma_len(sg) = obj->base.size;
 
@@ -2314,6 +2317,8 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	if (!IS_ERR(pages))
 		obj->ops->put_pages(obj, pages);
 
+	obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0;
+
 unlock:
 	mutex_unlock(&obj->mm.lock);
 }
@@ -2345,7 +2350,8 @@ static bool i915_sg_trim(struct sg_table *orig_st)
 }
 
 static struct sg_table *
-i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
+i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
+			      unsigned int *sg_mask)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 	const unsigned long page_count = obj->base.size / PAGE_SIZE;
@@ -2392,6 +2398,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 
 	sg = st->sgl;
 	st->nents = 0;
+	*sg_mask = 0;
 	for (i = 0; i < page_count; i++) {
 		const unsigned int shrink[] = {
 			I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | I915_SHRINK_PURGEABLE,
@@ -2443,8 +2450,10 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 		if (!i ||
 		    sg->length >= max_segment ||
 		    page_to_pfn(page) != last_pfn + 1) {
-			if (i)
+			if (i) {
+				*sg_mask |= sg->length;
 				sg = sg_next(sg);
+			}
 			st->nents++;
 			sg_set_page(sg, page, PAGE_SIZE, 0);
 		} else {
@@ -2455,8 +2464,10 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 		/* Check that the i965g/gm workaround works. */
 		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
 	}
-	if (sg) /* loop terminated early; short sg table */
+	if (sg) { /* loop terminated early; short sg table */
+		*sg_mask |= sg->length;
 		sg_mark_end(sg);
+	}
 
 	/* Trim unused sg entries to avoid wasting memory. */
 	i915_sg_trim(st);
@@ -2510,8 +2521,13 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 }
 
 void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
-				 struct sg_table *pages)
+				 struct sg_table *pages,
+				 unsigned int sg_mask)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	unsigned long supported_page_sizes = INTEL_INFO(i915)->page_size_mask;
+	unsigned int bit;
+
 	lockdep_assert_held(&obj->mm.lock);
 
 	obj->mm.get_page.sg_pos = pages->sgl;
@@ -2525,11 +2541,24 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 		__i915_gem_object_pin_pages(obj);
 		obj->mm.quirked = true;
 	}
+
+	GEM_BUG_ON(!sg_mask);
+
+	obj->mm.page_sizes.phys = sg_mask;
+
+	obj->mm.page_sizes.sg = 0;
+	for_each_set_bit(bit, &supported_page_sizes, BITS_PER_LONG) {
+		if (obj->mm.page_sizes.phys & (~0u << bit))
+			obj->mm.page_sizes.sg |= BIT(bit);
+	}
+
+	GEM_BUG_ON(!HAS_PAGE_SIZE(i915, obj->mm.page_sizes.sg));
 }
 
 static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 {
 	struct sg_table *pages;
+	unsigned int sg_mask = 0;
 
 	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
 
@@ -2538,11 +2567,11 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 		return -EFAULT;
 	}
 
-	pages = obj->ops->get_pages(obj);
+	pages = obj->ops->get_pages(obj, &sg_mask);
 	if (unlikely(IS_ERR(pages)))
 		return PTR_ERR(pages);
 
-	__i915_gem_object_set_pages(obj, pages);
+	__i915_gem_object_set_pages(obj, pages, sg_mask);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 6176e589cf09..2b3b16d88d4b 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -257,10 +257,21 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 }
 
 static struct sg_table *
-i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj)
+i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj,
+				 unsigned int *sg_mask)
 {
-	return dma_buf_map_attachment(obj->base.import_attach,
-				      DMA_BIDIRECTIONAL);
+	struct sg_table *pages;
+	struct scatterlist *sg;
+	int n;
+
+	pages = dma_buf_map_attachment(obj->base.import_attach,
+				       DMA_BIDIRECTIONAL);
+	if (!IS_ERR(pages)) {
+		for_each_sg(pages->sgl, sg, pages->nents, n)
+			*sg_mask |= sg->length;
+	}
+
+	return pages;
 }
 
 static void i915_gem_object_put_pages_dmabuf(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c
index 568bf83af1f5..a505cb82eb82 100644
--- a/drivers/gpu/drm/i915/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/i915_gem_internal.c
@@ -45,7 +45,8 @@ static void internal_free_pages(struct sg_table *st)
 }
 
 static struct sg_table *
-i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
+i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj,
+				   unsigned int *sg_mask)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct sg_table *st;
@@ -76,6 +77,7 @@ i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 	}
 
 create_st:
+	*sg_mask = 0;
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (!st)
 		return ERR_PTR(-ENOMEM);
@@ -105,6 +107,7 @@ i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 		} while (1);
 
 		sg_set_page(sg, page, PAGE_SIZE << order, 0);
+		*sg_mask |= PAGE_SIZE << order;
 		st->nents++;
 
 		npages -= 1 << order;
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 5b19a4916a4d..7fc8b8402897 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -53,7 +53,8 @@ struct drm_i915_gem_object_ops {
 	 * being released or under memory pressure (where we attempt to
 	 * reap pages for the shrinker).
 	 */
-	struct sg_table *(*get_pages)(struct drm_i915_gem_object *);
+	struct sg_table *(*get_pages)(struct drm_i915_gem_object *,
+				      unsigned int *sg_mask);
 	void (*put_pages)(struct drm_i915_gem_object *, struct sg_table *);
 
 	int (*pwrite)(struct drm_i915_gem_object *,
@@ -143,6 +144,23 @@ struct drm_i915_gem_object {
 		struct sg_table *pages;
 		void *mapping;
 
+		struct i915_page_sizes {
+			/**
+			 * The sg mask of the pages sg_table, i.e. the mask of
+			 * the lengths for each sg entry.
+			 */
+			unsigned int phys;
+
+			/**
+			 * The gtt page sizes we are allowed to use given the
+			 * sg mask and the supported page sizes. This will
+			 * express the smallest unit we can use for the whole
+			 * object, as well as the larger sizes we may be able
+			 * to use opportunistically.
+			 */
+			unsigned int sg;
+		} page_sizes;
+
 		struct i915_gem_object_page_iter {
 			struct scatterlist *sg_pos;
 			unsigned int sg_idx; /* in pages, but 32bit eek! */
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index a817b3e0b17e..2cc09517d46c 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -539,11 +539,16 @@ i915_pages_create_for_stolen(struct drm_device *dev,
 }
 
 static struct sg_table *
-i915_gem_object_get_pages_stolen(struct drm_i915_gem_object *obj)
+i915_gem_object_get_pages_stolen(struct drm_i915_gem_object *obj,
+				 unsigned int *sg_mask)
 {
-	return i915_pages_create_for_stolen(obj->base.dev,
-					    obj->stolen->start,
-					    obj->stolen->size);
+	struct sg_table *pages =
+		i915_pages_create_for_stolen(obj->base.dev,
+					     obj->stolen->start,
+					     obj->stolen->size);
+	*sg_mask = obj->stolen->size;
+
+	return pages;
 }
 
 static void i915_gem_object_put_pages_stolen(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index ccd09e8419f5..b4c15a847a63 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -406,7 +406,8 @@ struct get_pages_work {
 #endif
 
 static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
+st_set_pages(struct sg_table **st, struct page **pvec, int num_pages,
+	     unsigned int *sg_mask)
 {
 	struct scatterlist *sg;
 	int ret, n;
@@ -422,12 +423,17 @@ st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
 
 		for_each_sg((*st)->sgl, sg, num_pages, n)
 			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
+
+		*sg_mask = PAGE_SIZE;
 	} else {
 		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
 						0, num_pages << PAGE_SHIFT,
 						GFP_KERNEL);
 		if (ret)
 			goto err;
+
+		for_each_sg((*st)->sgl, sg, (*st)->nents, n)
+			*sg_mask |= sg->length;
 	}
 
 	return 0;
@@ -440,12 +446,13 @@ st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
 
 static struct sg_table *
 __i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-			     struct page **pvec, int num_pages)
+			     struct page **pvec, int num_pages,
+			     unsigned int *sg_mask)
 {
 	struct sg_table *pages;
 	int ret;
 
-	ret = st_set_pages(&pages, pvec, num_pages);
+	ret = st_set_pages(&pages, pvec, num_pages, sg_mask);
 	if (ret)
 		return ERR_PTR(ret);
 
@@ -540,9 +547,12 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 		struct sg_table *pages = ERR_PTR(ret);
 
 		if (pinned == npages) {
-			pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
+			unsigned int sg_mask = 0;
+
+			pages = __i915_gem_userptr_set_pages(obj, pvec, npages,
+							     &sg_mask);
 			if (!IS_ERR(pages)) {
-				__i915_gem_object_set_pages(obj, pages);
+				__i915_gem_object_set_pages(obj, pages, sg_mask);
 				pinned = 0;
 				pages = NULL;
 			}
@@ -604,7 +614,8 @@ __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
 }
 
 static struct sg_table *
-i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
+i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj,
+			   unsigned int *sg_mask)
 {
 	const int num_pages = obj->base.size >> PAGE_SHIFT;
 	struct mm_struct *mm = obj->userptr.mm->mm;
@@ -661,7 +672,8 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 		pages = __i915_gem_userptr_get_pages_schedule(obj);
 		active = pages == ERR_PTR(-EAGAIN);
 	} else {
-		pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
+		pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages,
+						     sg_mask);
 		active = !IS_ERR(pages);
 	}
 	if (active)
diff --git a/drivers/gpu/drm/i915/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
index caf76af36aba..3f1afe4b65f1 100644
--- a/drivers/gpu/drm/i915/selftests/huge_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
@@ -38,7 +38,7 @@ static void huge_free_pages(struct drm_i915_gem_object *obj,
 }
 
 static struct sg_table *
-huge_get_pages(struct drm_i915_gem_object *obj)
+huge_get_pages(struct drm_i915_gem_object *obj, unsigned int *sg_mask)
 {
 #define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY)
 	const unsigned long nreal = obj->scratch / PAGE_SIZE;
@@ -81,6 +81,8 @@ huge_get_pages(struct drm_i915_gem_object *obj)
 	if (i915_gem_gtt_prepare_pages(obj, pages))
 		goto err;
 
+	*sg_mask = PAGE_SIZE;
+
 	return pages;
 
 err:
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 50710e3f1caa..74fdb84f6843 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -40,7 +40,7 @@ static void fake_free_pages(struct drm_i915_gem_object *obj,
 }
 
 static struct sg_table *
-fake_get_pages(struct drm_i915_gem_object *obj)
+fake_get_pages(struct drm_i915_gem_object *obj, unsigned int *sg_mask)
 {
 #define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY)
 #define PFN_BIAS 0x1000
@@ -66,6 +66,7 @@ fake_get_pages(struct drm_i915_gem_object *obj)
 		sg_set_page(sg, pfn_to_page(PFN_BIAS), len, 0);
 		sg_dma_address(sg) = page_to_phys(sg_page(sg));
 		sg_dma_len(sg) = len;
+		*sg_mask |= len;
 
 		rem -= len;
 	}
-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 05/19] drm/i915: align the vma start to the largest gtt page size
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (3 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 04/19] drm/i915: introduce page_size members Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:35   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 06/19] drm/i915: align 64K objects to 2M Matthew Auld
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

For the 48b PPGTT, try to align the vma start address to the required
page size boundary to guarantee we use said page size in the gtt. If we
are dealing with multiple page sizes, we can't guarantee anything and
just align to the largest. For soft pinning, and for objects which need
to be tightly packed into the lower 32 bits, we don't force any alignment.
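
As a rough userspace illustration of the alignment choice described above
(not the driver code itself): the SZ_* constants, rounddown_pow2() and
pick_alignment() are hypothetical stand-ins for I915_GTT_PAGE_SIZE*,
rounddown_pow_of_two() and the logic in i915_vma_insert().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's page-size constants. */
#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Largest power of two <= x; x must be non-zero. */
static uint64_t rounddown_pow2(uint64_t x)
{
	assert(x != 0);
	while (x & (x - 1))
		x &= x - 1;	/* clear the lowest set bit */
	return x;
}

/*
 * Keep the caller's alignment unless the vma may be placed above 4G and
 * the page-size mask contains something larger than 4K; in that case
 * align to the largest page size in the mask.
 */
static uint64_t pick_alignment(uint64_t alignment, uint64_t end,
			       uint64_t page_sizes)
{
	if (end > (UINT64_C(1) << 32) && page_sizes > SZ_4K) {
		uint64_t page_alignment = rounddown_pow2(page_sizes);

		if (page_alignment > alignment)
			alignment = page_alignment;
	}
	return alignment;
}
```

With a mask of 2M | 4K and a 48b address range this picks 2M; when the
range ends at 4G the requested alignment is left untouched.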

v2: various improvements suggested by Chris

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_vma.c | 15 +++++++++++++++
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 958be0a95960..cee1d00dc085 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -471,6 +471,9 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (ret)
 		return ret;
 
+	vma->page_sizes.phys = obj->mm.page_sizes.phys;
+	vma->page_sizes.sg = obj->mm.page_sizes.sg;
+
 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
 		if (!IS_ALIGNED(offset, alignment) ||
@@ -485,6 +488,18 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		if (ret)
 			goto err_unpin;
 	} else {
+		/* We only support huge gtt pages through the 48b PPGTT,
+		 * however we also don't want to force any alignment for
+		 * objects which need to be tightly packed into the low 32bits.
+		 */
+		if (end > (1ULL << 32) &&
+		    vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+			u64 page_alignment =
+				rounddown_pow_of_two(vma->page_sizes.sg);
+
+			alignment = max(alignment, page_alignment);
+		}
+
 		ret = i915_gem_gtt_insert(vma->vm, &vma->node,
 					  size, alignment, obj->cache_level,
 					  start, end, flags);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 4a673fc1a432..834f7ca2ada2 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -52,6 +52,7 @@ struct i915_vma {
 	struct drm_i915_fence_reg *fence;
 	struct reservation_object *resv; /** Alias of obj->resv */
 	struct sg_table *pages;
+	struct i915_page_sizes page_sizes;
 	void __iomem *iomap;
 	u64 size;
 	u64 display_alignment;
-- 
2.9.4


* [PATCH 06/19] drm/i915: align 64K objects to 2M
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (4 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 05/19] drm/i915: align the vma start to the largest gtt page size Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:37   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 07/19] drm/i915: pass the vma to insert_entries Matthew Auld
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

We can't mix 64K and 4K PTEs in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful, but in practice it shouldn't be too bad since the
padding only consumes virtual address space in the 48b PPGTT.
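
A minimal userspace sketch of the padding step, assuming power-of-two
sizes; round_up_pow2() and pad_size_for_64k() are hypothetical names
standing in for round_up() and the size adjustment in i915_vma_insert().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's page-size constants. */
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * If the object may use 64K pages, pad its node size to a 2M multiple so
 * that it owns every page-table (2M block) it touches and never shares
 * one with 4K PTEs.
 */
static uint64_t pad_size_for_64k(uint64_t size, uint64_t page_sizes)
{
	if (page_sizes & SZ_64K)
		size = round_up_pow2(size, SZ_2M);
	return size;
}
```

A single 64K object therefore occupies a full 2M of address space, while
an object without 64K in its mask keeps its original size.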

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index cee1d00dc085..596269172cd2 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -495,7 +495,15 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 		if (end > (1ULL << 32) &&
 		    vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
 			u64 page_alignment =
-				rounddown_pow_of_two(vma->page_sizes.sg);
+				rounddown_pow_of_two(vma->page_sizes.sg |
+						     I915_GTT_PAGE_SIZE_2M);
+
+			/* We can't mix 64K and 4K PTEs in the same page-table (2M
+			 * block), and so to avoid the ugliness and complexity of
+			 * coloring we opt for just aligning 64K objects to 2M.
+			 */
+			if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K)
+				size = round_up(size, I915_GTT_PAGE_SIZE_2M);
 
 			alignment = max(alignment, page_alignment);
 		}
-- 
2.9.4


* [PATCH 07/19] drm/i915: pass the vma to insert_entries
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (5 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 06/19] drm/i915: align 64K objects to 2M Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:39   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 08/19] drm/i915: enable IPS bit for 64K pages Matthew Auld
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

The vma contains most of the information we need for insertion. In
preparation for supporting huge pages in the ppgtt, it is also useful to
know vma->page_sizes and the node size, so that we can easily determine
which page sizes we are allowed to use when inserting into the 48b
PPGTT. This is especially true for 64K, which we can't use arbitrarily
since it requires aligning/padding the vm space to 2M, something we
sometimes can't enforce in the upper levels.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 68 +++++++++++----------------
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 +++-
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |  3 +-
 4 files changed, 38 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 61fc7e90a7da..de67084d5fcf 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -207,8 +207,7 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 	if (vma->obj->gt_ro)
 		pte_flags |= PTE_READ_ONLY;
 
-	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 
 	return 0;
 }
@@ -907,37 +906,35 @@ gen8_ppgtt_insert_pte_entries(struct i915_hw_ppgtt *ppgtt,
 }
 
 static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
-				   struct sg_table *pages,
-				   u64 start,
+				   struct i915_vma *vma,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
-	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
+	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vma->vm);
 	struct sgt_dma iter = {
-		.sg = pages->sgl,
+		.sg = vma->pages->sgl,
 		.dma = sg_dma_address(iter.sg),
 		.max = iter.dma + iter.sg->length,
 	};
-	struct gen8_insert_pte idx = gen8_insert_pte(start);
+	struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
 
 	gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx,
 				      cache_level);
 }
 
 static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
-				   struct sg_table *pages,
-				   u64 start,
+				   struct i915_vma *vma,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
 	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
 	struct sgt_dma iter = {
-		.sg = pages->sgl,
+		.sg = vma->pages->sgl,
 		.dma = sg_dma_address(iter.sg),
 		.max = iter.dma + iter.sg->length,
 	};
 	struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
-	struct gen8_insert_pte idx = gen8_insert_pte(start);
+	struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
 
 	while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter,
 					     &idx, cache_level))
@@ -1621,13 +1618,12 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 }
 
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
-				      struct sg_table *pages,
-				      u64 start,
+				      struct i915_vma *vma,
 				      enum i915_cache_level cache_level,
 				      u32 flags)
 {
 	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
-	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned first_entry = vma->node.start >> PAGE_SHIFT;
 	unsigned act_pt = first_entry / GEN6_PTES;
 	unsigned act_pte = first_entry % GEN6_PTES;
 	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
@@ -1635,7 +1631,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 	gen6_pte_t *vaddr;
 
 	vaddr = kmap_atomic_px(ppgtt->pd.page_table[act_pt]);
-	iter.sg = pages->sgl;
+	iter.sg = vma->pages->sgl;
 	iter.dma = sg_dma_address(iter.sg);
 	iter.max = iter.dma + iter.sg->length;
 	do {
@@ -2090,8 +2086,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     u64 start,
+				     struct i915_vma *vma,
 				     enum i915_cache_level level,
 				     u32 unused)
 {
@@ -2102,8 +2097,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	dma_addr_t addr;
 
 	gtt_entries = (gen8_pte_t __iomem *)ggtt->gsm;
-	gtt_entries += start >> PAGE_SHIFT;
-	for_each_sgt_dma(addr, sgt_iter, st)
+	gtt_entries += vma->node.start >> PAGE_SHIFT;
+	for_each_sgt_dma(addr, sgt_iter, vma->pages)
 		gen8_set_pte(gtt_entries++, pte_encode | addr);
 
 	wmb();
@@ -2137,17 +2132,16 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
  * mapped BAR (dev_priv->mm.gtt->gtt).
  */
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     u64 start,
+				     struct i915_vma *vma,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen6_pte_t __iomem *entries = (gen6_pte_t __iomem *)ggtt->gsm;
-	unsigned int i = start >> PAGE_SHIFT;
+	unsigned int i = vma->node.start >> PAGE_SHIFT;
 	struct sgt_iter iter;
 	dma_addr_t addr;
-	for_each_sgt_dma(addr, iter, st)
+	for_each_sgt_dma(addr, iter, vma->pages)
 		iowrite32(vm->pte_encode(addr, level, flags), &entries[i++]);
 	wmb();
 
@@ -2229,8 +2223,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
 
 struct insert_entries {
 	struct i915_address_space *vm;
-	struct sg_table *st;
-	u64 start;
+	struct i915_vma *vma;
 	enum i915_cache_level level;
 };
 
@@ -2238,19 +2231,18 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
 {
 	struct insert_entries *arg = _arg;
 
-	gen8_ggtt_insert_entries(arg->vm, arg->st, arg->start, arg->level, 0);
+	gen8_ggtt_insert_entries(arg->vm, arg->vma, arg->level, 0);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
 }
 
 static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
-					     struct sg_table *st,
-					     u64 start,
+					     struct i915_vma *vma,
 					     enum i915_cache_level level,
 					     u32 unused)
 {
-	struct insert_entries arg = { vm, st, start, level };
+	struct insert_entries arg = { vma->vm, vma, level };
 
 	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
 }
@@ -2316,15 +2308,15 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *pages,
-				     u64 start,
+				     struct i915_vma *vma,
 				     enum i915_cache_level cache_level,
 				     u32 unused)
 {
 	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
-	intel_gtt_insert_sg_entries(pages, start >> PAGE_SHIFT, flags);
+	intel_gtt_insert_sg_entries(vma->pages, vma->node.start >> PAGE_SHIFT,
+				    flags);
 }
 
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
@@ -2353,8 +2345,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 		pte_flags |= PTE_READ_ONLY;
 
 	intel_runtime_pm_get(i915);
-	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 	intel_runtime_pm_put(i915);
 
 	/*
@@ -2407,16 +2398,13 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 				goto err_pages;
 		}
 
-		appgtt->base.insert_entries(&appgtt->base,
-					    vma->pages, vma->node.start,
-					    cache_level, pte_flags);
+		appgtt->base.insert_entries(&appgtt->base, vma, cache_level,
+					    pte_flags);
 	}
 
 	if (flags & I915_VMA_GLOBAL_BIND) {
 		intel_runtime_pm_get(i915);
-		vma->vm->insert_entries(vma->vm,
-					vma->pages, vma->node.start,
-					cache_level, pte_flags);
+		vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 		intel_runtime_pm_put(i915);
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e9c428d711aa..4c2f7d7c1e7d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -319,8 +319,7 @@ struct i915_address_space {
 			    enum i915_cache_level cache_level,
 			    u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
-			       struct sg_table *st,
-			       u64 start,
+			       struct i915_vma *vma,
 			       enum i915_cache_level cache_level,
 			       u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 74fdb84f6843..0e1ded4239f9 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -198,6 +198,9 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 {
 	I915_RND_STATE(seed_prng);
 	unsigned int size;
+	struct i915_vma mock_vma;
+
+	memset(&mock_vma, 0, sizeof(struct i915_vma));
 
 	/* Keep creating larger objects until one cannot fit into the hole */
 	for (size = 12; (hole_end - hole_start) >> size; size++) {
@@ -256,8 +259,11 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 			    vm->allocate_va_range(vm, addr, BIT_ULL(size)))
 				break;
 
-			vm->insert_entries(vm, obj->mm.pages, addr,
-					   I915_CACHE_NONE, 0);
+			mock_vma.pages = obj->mm.pages;
+			mock_vma.node.size = BIT_ULL(size);
+			mock_vma.node.start = addr;
+
+			vm->insert_entries(vm, &mock_vma, I915_CACHE_NONE, 0);
 		}
 		count = n;
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index a61309c7cb3e..f2118cf535a0 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -33,8 +33,7 @@ static void mock_insert_page(struct i915_address_space *vm,
 }
 
 static void mock_insert_entries(struct i915_address_space *vm,
-				struct sg_table *st,
-				u64 start,
+				struct i915_vma *vma,
 				enum i915_cache_level level, u32 flags)
 {
 }
-- 
2.9.4


* [PATCH 08/19] drm/i915: enable IPS bit for 64K pages
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (6 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 07/19] drm/i915: pass the vma to insert_entries Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 09/19] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Before we can enable 64K pages through the IPS bit, we must first enable
it through MMIO, otherwise the page-walker will simply ignore it.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 11 +++++++++++
 drivers/gpu/drm/i915/i915_reg.h |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9bc8a07b0ca..c4a27b4f419c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4804,6 +4804,17 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 		}
 	}
 
+	/* To support 64K PTE's we need to first enable the use of the
+	 * Intermediate-Page-Size(IPS) bit of the PDE field via some magical
+	 * mmio, otherwise the page-walker will simply ignore the IPS bit. This
+	 * shouldn't be needed after GEN10.
+	 */
+	if (HAS_PAGE_SIZE(dev_priv, I915_GTT_PAGE_SIZE_64K) &&
+	    INTEL_GEN(dev_priv) <= 10)
+		I915_WRITE(GEN8_GAMW_ECO_DEV_RW_IA,
+			   I915_READ(GEN8_GAMW_ECO_DEV_RW_IA) |
+			   GAMW_ECO_ENABLE_64K_IPS_FIELD);
+
 	i915_gem_init_swizzling(dev_priv);
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c8647cfa81ba..f2ecc92bdd48 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2190,6 +2190,9 @@ enum skl_disp_power_wells {
 #define GEN9_GAMT_ECO_REG_RW_IA _MMIO(0x4ab0)
 #define   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS	(1<<18)
 
+#define GEN8_GAMW_ECO_DEV_RW_IA _MMIO(0x4080)
+#define   GAMW_ECO_ENABLE_64K_IPS_FIELD 0xF
+
 #define GAMT_CHKN_BIT_REG	_MMIO(0x4ab8)
 #define   GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING	(1<<28)
 
-- 
2.9.4


* [PATCH 09/19] drm/i915: disable GTT cache for 2M/1G pages
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (7 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 08/19] drm/i915: enable IPS bit for 64K pages Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:41   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT Matthew Auld
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

When SW enables the use of 2M/1G pages, it must disable the GTT cache.

v2: don't disable for Cherryview which doesn't even support 48b PPGTT!

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_pm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b5b7372fcddc..3939977dddb8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8307,10 +8307,10 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv)
 
 	/*
 	 * WaGttCachingOffByDefault:bdw
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
+	 * The GTT cache must be disabled if the system is planning to use
+	 * 2M/1G pages.
 	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	I915_WRITE(HSW_GTT_CACHE_EN, 0);
 
 	/* WaKVMNotificationOnConfigChange:bdw */
 	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-- 
2.9.4


* [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (8 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 09/19] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:49   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 11/19] drm/i915: support 2M " Matthew Auld
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Support inserting 1G gtt pages into the 48b PPGTT.
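
For illustration only, the conditions under which a 1G entry can be
written might be sketched as below. can_use_1g() and its parameters are
hypothetical; the gtt_addr alignment test stands in for the
!(idx.pte | idx.pde) check and assumes the GTT address is already 4K
aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for I915_GTT_PAGE_SIZE_1G. */
#define SZ_1G (UINT64_C(1) << 30)

/*
 * A 1G entry is only usable when the object's page-size mask allows it,
 * the current DMA address is 1G aligned, at least 1G remains in the
 * current sg chunk, and the GTT address sits on a 1G boundary.
 */
static bool can_use_1g(uint64_t page_sizes, uint64_t dma, uint64_t rem,
		       uint64_t gtt_addr)
{
	return (page_sizes & SZ_1G) != 0 &&
	       (dma & (SZ_1G - 1)) == 0 &&
	       rem >= SZ_1G &&
	       (gtt_addr & (SZ_1G - 1)) == 0;
}
```

If any of the conditions fails, the insertion falls back to a smaller
page size for that stretch of the object.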

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 72 ++++++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
 2 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index de67084d5fcf..6fe10ee7dca8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -922,6 +922,65 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 				      cache_level);
 }
 
+static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
+					   struct i915_page_directory_pointer **pdps,
+					   struct sgt_dma *iter,
+					   enum i915_cache_level cache_level)
+{
+	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
+	u64 start = vma->node.start;
+
+	do {
+		struct gen8_insert_pte idx = gen8_insert_pte(start);
+		struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
+		struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
+		struct i915_page_table *pt = pd->page_table[idx.pde];
+		dma_addr_t rem = iter->max - iter->dma;
+		unsigned int page_size;
+		gen8_pte_t encode = pte_encode;
+		gen8_pte_t *vaddr;
+		u16 index, max;
+
+		if (unlikely(vma->page_sizes.sg & I915_GTT_PAGE_SIZE_1G) &&
+		    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_1G) &&
+		    rem >= I915_GTT_PAGE_SIZE_1G && !(idx.pte | idx.pde)) {
+			vaddr = kmap_atomic_px(pdp);
+			index = idx.pdpe;
+			max = GEN8_PML4ES_PER_PML4;
+			page_size = I915_GTT_PAGE_SIZE_1G;
+			encode |= GEN8_PDPE_PS_1G;
+		} else {
+			vaddr = kmap_atomic_px(pt);
+			index = idx.pte;
+			max = GEN8_PTES;
+			page_size = I915_GTT_PAGE_SIZE;
+		}
+
+		do {
+			vaddr[index++] = encode | iter->dma;
+
+			start += page_size;
+			iter->dma += page_size;
+			if (iter->dma >= iter->max) {
+				iter->sg = __sg_next(iter->sg);
+				if (!iter->sg)
+					break;
+
+				iter->dma = sg_dma_address(iter->sg);
+				iter->max = iter->dma + iter->sg->length;
+
+				if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
+					break;
+			}
+			rem = iter->max - iter->dma;
+
+		} while (rem >= page_size && index < max);
+
+		kunmap_atomic(vaddr);
+
+	} while (iter->sg);
+}
+
 static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 				   struct i915_vma *vma,
 				   enum i915_cache_level cache_level,
@@ -934,11 +993,16 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 		.max = iter.dma + iter.sg->length,
 	};
 	struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
-	struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
 
-	while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter,
-					     &idx, cache_level))
-		GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+	if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
+		gen8_ppgtt_insert_huge_entries(vma, pdps, &iter, cache_level);
+	} else {
+		struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
+
+		while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++],
+						     &iter, &idx, cache_level))
+			GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+	}
 }
 
 static void gen8_free_page_tables(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 4c2f7d7c1e7d..0d31b46cde03 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -149,6 +149,8 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((u64)(x) << ((i) * 8))
 
+#define GEN8_PDPE_PS_1G  BIT(7)
+
 struct sg_table;
 
 struct intel_rotation_info {
-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 11/19] drm/i915: support 2M pages for the 48b PPGTT
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (9 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 12/19] drm/i915: support 64K " Matthew Auld
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
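A minimal standalone sketch of the selection order that the hunk below
extends: try 1G first, then 2M, otherwise fall back to 4K. The helper
name pick_page_size() and the local SZ_* constants are hypothetical and
exist only for illustration; the driver does this inline in
gen8_ppgtt_insert_huge_entries().

```c
#include <stdint.h>

#define SZ_4K (1ull << 12)
#define SZ_2M (1ull << 21)
#define SZ_1G (1ull << 30)

/*
 * Hypothetical helper, not part of the patch. Returns the page size the
 * insertion loop would use for the next run of entries, given:
 *   sg_mask  - mask of page sizes present in the sg table
 *   dma      - current dma address
 *   rem      - bytes left in the current sg chunk
 *   pde, pte - current indices into the page directory / page table
 */
static uint64_t pick_page_size(uint64_t sg_mask, uint64_t dma, uint64_t rem,
			       unsigned int pde, unsigned int pte)
{
	/* 1G needs to start on a fresh page directory: pde and pte both 0. */
	if ((sg_mask & SZ_1G) && !(dma & (SZ_1G - 1)) && rem >= SZ_1G &&
	    !(pte | pde))
		return SZ_1G;

	/* 2M needs to start on a fresh page table: pte must be 0. */
	if ((sg_mask & SZ_2M) && !(dma & (SZ_2M - 1)) && rem >= SZ_2M && !pte)
		return SZ_2M;

	return SZ_4K;
}
```

A misaligned dma address, a short remainder, or a non-zero pte index each
push the choice down to the next smaller size.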
 drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6fe10ee7dca8..03c35097ef39 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -949,6 +949,14 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 			max = GEN8_PML4ES_PER_PML4;
 			page_size = I915_GTT_PAGE_SIZE_1G;
 			encode |= GEN8_PDPE_PS_1G;
+		} else if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
+			   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) &&
+			   rem >= I915_GTT_PAGE_SIZE_2M && !idx.pte) {
+			vaddr = kmap_atomic_px(pd);
+			index = idx.pde;
+			max = I915_PDES;
+			page_size = I915_GTT_PAGE_SIZE_2M;
+			encode |= GEN8_PDE_PS_2M;
 		} else {
 			vaddr = kmap_atomic_px(pt);
 			index = idx.pte;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0d31b46cde03..e9ec75b92f85 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -149,6 +149,8 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((u64)(x) << ((i) * 8))
 
+#define GEN8_PDE_PS_2M   BIT(7)
+
 #define GEN8_PDPE_PS_1G  BIT(7)
 
 struct sg_table;
-- 
2.9.4

* [PATCH 12/19] drm/i915: support 64K pages for the 48b PPGTT
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (10 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 11/19] drm/i915: support 2M " Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:55   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 13/19] drm/i915: accurate page size tracking for the ppgtt Matthew Auld
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
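A minimal standalone sketch of the rule for marking a page-table as 64K,
simplified to a single contiguous dma chunk that starts at pte index 0.
The helper name can_mark_64K() and the local constants are hypothetical;
the sketch also assumes len is a multiple of 4K. The driver tracks this
with the maybe_64K flag while walking the sg table.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K  (1ull << 12)
#define SZ_64K (1ull << 16)
#define SZ_2M  (1ull << 21)
#define PTES_PER_PT 512u

/*
 * Hypothetical helper, not part of the patch. Returns true if the page
 * table covering this chunk could be marked as 64K:
 *   dma     - dma address of the chunk
 *   len     - chunk length in bytes (assumed to be a multiple of 4K)
 *   vma_end - node.start + node.size of the vma
 */
static bool can_mark_64K(uint64_t dma, uint64_t len, uint64_t vma_end)
{
	unsigned int index = 0;
	uint64_t rem = len;

	while (rem >= SZ_4K && index < PTES_PER_PT) {
		/* Every 16th pte starts a 64K page: check alignment and size. */
		if (index % 16 == 0 &&
		    ((dma & (SZ_64K - 1)) || rem < SZ_64K))
			return false;

		dma += SZ_4K;
		rem -= SZ_4K;
		index++;
	}

	/* Either the whole table was filled, or we ran out of pages and the
	 * vma is padded out to the end of the 2M block. */
	return index == PTES_PER_PT || (rem == 0 && !(vma_end & (SZ_2M - 1)));
}
```

A 64K object padded to 2M qualifies, while a 64K + 4K object does not,
because the trailing 4K page cannot start a 64K page.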
 drivers/gpu/drm/i915/i915_gem_gtt.c | 26 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 03c35097ef39..9b89ec10f333 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -937,6 +937,7 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 		struct i915_page_table *pt = pd->page_table[idx.pde];
 		dma_addr_t rem = iter->max - iter->dma;
 		unsigned int page_size;
+		bool maybe_64K = false;
 		gen8_pte_t encode = pte_encode;
 		gen8_pte_t *vaddr;
 		u16 index, max;
@@ -962,9 +963,17 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 			index = idx.pte;
 			max = GEN8_PTES;
 			page_size = I915_GTT_PAGE_SIZE;
+
+			if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K && !idx.pte)
+				maybe_64K = true;
 		}
 
 		do {
+			if (maybe_64K && (index % 16 == 0) &&
+			    (!IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) ||
+			     rem < I915_GTT_PAGE_SIZE_64K))
+				maybe_64K = false;
+
 			vaddr[index++] = encode | iter->dma;
 
 			start += page_size;
@@ -986,6 +995,23 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 
 		kunmap_atomic(vaddr);
 
+
+		/* Is it safe to mark the 2M block as 64K? -- Either we have
+		 * filled the whole page-table with 64K entries, or filled part
+		 * of it, reached the end of the sg table, and have enough
+		 * padding.
+		 */
+		if (maybe_64K) {
+			if (index == max ||
+			    (!iter->sg && IS_ALIGNED(vma->node.start +
+						     vma->node.size,
+						     I915_GTT_PAGE_SIZE_2M))) {
+				vaddr = kmap_atomic_px(pd);
+				vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
+				kunmap_atomic(vaddr);
+			}
+		}
+
 	} while (iter->sg);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e9ec75b92f85..41df07e5e37a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -149,6 +149,7 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE		(0<<2)
 #define GEN8_PPAT(i, x)			((u64)(x) << ((i) * 8))
 
+#define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
 #define GEN8_PDPE_PS_1G  BIT(7)
-- 
2.9.4

* [PATCH 13/19] drm/i915: accurate page size tracking for the ppgtt
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (11 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 12/19] drm/i915: support 64K " Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:57   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 14/19] drm/i915/debugfs: include some gtt page size metrics Matthew Auld
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Now that we support multiple page sizes for the ppgtt, it would be
useful to track the real usage for debugging purposes.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 16 ++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_object.h | 10 ++++++++++
 2 files changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9b89ec10f333..b1bfbc062b98 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -215,6 +215,8 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 static void ppgtt_unbind_vma(struct i915_vma *vma)
 {
 	vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
+
+	vma->page_sizes.gtt = 0;
 }
 
 static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
@@ -920,6 +922,8 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
 
 	gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx,
 				      cache_level);
+
+	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
@@ -1009,8 +1013,10 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
 				vaddr = kmap_atomic_px(pd);
 				vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
 				kunmap_atomic(vaddr);
+				page_size = I915_GTT_PAGE_SIZE_64K;
 			}
 		}
+		vma->page_sizes.gtt |= page_size;
 
 	} while (iter->sg);
 }
@@ -1036,6 +1042,8 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 		while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++],
 						     &iter, &idx, cache_level))
 			GEM_BUG_ON(idx.pml4e >= GEN8_PML4ES_PER_PML4);
+
+		vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 	}
 }
 
@@ -1752,6 +1760,8 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 	} while (1);
 	kunmap_atomic(vaddr);
+
+	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
@@ -2446,6 +2456,8 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 	intel_runtime_pm_put(i915);
 
+	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+
 	/*
 	 * Without aliasing PPGTT there's no difference between
 	 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
@@ -2463,6 +2475,8 @@ static void ggtt_unbind_vma(struct i915_vma *vma)
 	intel_runtime_pm_get(i915);
 	vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
 	intel_runtime_pm_put(i915);
+
+	vma->page_sizes.gtt = 0;
 }
 
 static int aliasing_gtt_bind_vma(struct i915_vma *vma,
@@ -2535,6 +2549,8 @@ static void aliasing_gtt_unbind_vma(struct i915_vma *vma)
 
 		vm->clear_range(vm, vma->node.start, vma->size);
 	}
+
+	vma->page_sizes.gtt = 0;
 }
 
 void i915_gem_gtt_finish_pages(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 7fc8b8402897..2e0f3b48a81a 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -144,6 +144,7 @@ struct drm_i915_gem_object {
 		struct sg_table *pages;
 		void *mapping;
 
+		/* TODO: whack some of this into the error state */
 		struct i915_page_sizes {
 			/**
 			 * The sg mask of the pages sg_table. i.e the mask of
@@ -159,6 +160,15 @@ struct drm_i915_gem_object {
 			 * to use opportunistically.
 			 */
 			unsigned int sg;
+
+			/**
+			 * The actual gtt page size usage. Since we can have
+			 * multiple vmas associated with this object, we need
+			 * to prevent any trampling of state. Hence a copy of
+			 * this struct also lives in each vma, and the gtt value
+			 * here should only be read or written through the vma.
+			 */
+			unsigned int gtt;
 		} page_sizes;
 
 		struct i915_gem_object_page_iter {
-- 
2.9.4

* [PATCH 14/19] drm/i915/debugfs: include some gtt page size metrics
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (12 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 13/19] drm/i915: accurate page size tracking for the ppgtt Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 15/19] drm/i915/selftests: basic huge page tests Matthew Auld
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Report the gtt page sizes in use for each vma, and count the huge-paged
objects, in debugfs. Good to know, mostly for debugging purposes.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
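A minimal standalone sketch of how a page-size mask maps to the debugfs
string. A single set bit names that size, any combination prints "M" and
an empty mask prints nothing. The PS_* constants are local stand-ins for
the I915_GTT_PAGE_SIZE_* values, assumed here to be the sizes in bytes.

```c
#define PS_4K  (1u << 12)
#define PS_64K (1u << 16)
#define PS_2M  (1u << 21)
#define PS_1G  (1u << 30)

/* Same shape as the helper added below, built against local constants. */
static const char *stringify_page_sizes(unsigned int page_sizes)
{
	switch (page_sizes) {
	case PS_4K:
		return "4K";
	case PS_64K:
		return "64K";
	case PS_2M:
		return "2M";
	case PS_1G:
		return "1G";
	default:
		/* More than one bit set means mixed page sizes. */
		return page_sizes ? "M" : "";
	}
}
```

For example, a vma bound with both 2M and 4K entries has
page_sizes.gtt == (PS_2M | PS_4K) and is reported as "M".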
 drivers/gpu/drm/i915/i915_debugfs.c | 42 +++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f7aa6cbe3a2e..63d38c1328cf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -117,6 +117,26 @@ static u64 i915_gem_obj_total_ggtt_size(struct drm_i915_gem_object *obj)
 	return size;
 }
 
+static const char *stringify_page_sizes(unsigned int page_sizes)
+{
+	switch (page_sizes) {
+	case I915_GTT_PAGE_SIZE_4K:
+		return "4K";
+	case I915_GTT_PAGE_SIZE_64K:
+		return "64K";
+	case I915_GTT_PAGE_SIZE_2M:
+		return "2M";
+	case I915_GTT_PAGE_SIZE_1G:
+		return "1G";
+	default:
+		/* mixed-mode? */
+		if (page_sizes)
+			return "M";
+
+		return "";
+	}
+}
+
 static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
@@ -154,9 +174,10 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		if (!drm_mm_node_allocated(&vma->node))
 			continue;
 
-		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
+		seq_printf(m, " (%sgtt offset: %08llx, size: %08llx, pages: %s",
 			   i915_vma_is_ggtt(vma) ? "g" : "pp",
-			   vma->node.start, vma->node.size);
+			   vma->node.start, vma->node.size,
+			   stringify_page_sizes(vma->page_sizes.gtt));
 		if (i915_vma_is_ggtt(vma)) {
 			switch (vma->ggtt_view.type) {
 			case I915_GGTT_VIEW_NORMAL:
@@ -401,8 +422,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct drm_device *dev = &dev_priv->drm;
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	u32 count, mapped_count, purgeable_count, dpy_count;
-	u64 size, mapped_size, purgeable_size, dpy_size;
+	u32 count, mapped_count, purgeable_count, dpy_count, huge_count;
+	u64 size, mapped_size, purgeable_size, dpy_size, huge_size;
 	struct drm_i915_gem_object *obj;
 	struct drm_file *file;
 	int ret;
@@ -418,6 +439,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	size = count = 0;
 	mapped_size = mapped_count = 0;
 	purgeable_size = purgeable_count = 0;
+	huge_size = huge_count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_link) {
 		size += obj->base.size;
 		++count;
@@ -431,6 +453,11 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 			mapped_count++;
 			mapped_size += obj->base.size;
 		}
+
+		if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+			huge_count++;
+			huge_size += obj->base.size;
+		}
 	}
 	seq_printf(m, "%u unbound objects, %llu bytes\n", count, size);
 
@@ -453,6 +480,11 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 			mapped_count++;
 			mapped_size += obj->base.size;
 		}
+
+		if (obj->mm.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+			huge_count++;
+			huge_size += obj->base.size;
+		}
 	}
 	seq_printf(m, "%u bound objects, %llu bytes\n",
 		   count, size);
@@ -460,6 +492,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		   purgeable_count, purgeable_size);
 	seq_printf(m, "%u mapped objects, %llu bytes\n",
 		   mapped_count, mapped_size);
+	seq_printf(m, "%u huge-paged objects, %llu bytes\n",
+		   huge_count, huge_size);
 	seq_printf(m, "%u display objects (pinned), %llu bytes\n",
 		   dpy_count, dpy_size);
 
-- 
2.9.4

* [PATCH 15/19] drm/i915/selftests: basic huge page tests
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (13 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 14/19] drm/i915/debugfs: include some gtt page size metrics Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-22 14:17   ` Chris Wilson
  2017-06-22 14:21   ` Chris Wilson
  2017-06-21 20:33 ` [PATCH 16/19] drm/i915/selftests: mix huge pages Matthew Auld
                   ` (4 subsequent siblings)
  19 siblings, 2 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
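A minimal standalone sketch of how igt_ppgtt_huge_fill derives the gtt
page-size mask it expects for an object, given that the fake backing
store is filled from the largest to the smallest supported page size.
The helper name expected_gtt_mask() and the PS_* constants are
hypothetical; supported stands in for the platform page_size_mask.

```c
#include <stddef.h>
#include <stdint.h>

#define PS_4K  (1u << 12)
#define PS_64K (1u << 16)
#define PS_2M  (1u << 21)
#define PS_1G  (1u << 30)

/* Hypothetical helper, not part of the patch. */
static unsigned int expected_gtt_mask(unsigned int supported, uint64_t size)
{
	static const unsigned int sizes[] = { PS_1G, PS_2M, PS_64K, PS_4K };
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		unsigned int ps = sizes[i];

		if ((supported & ps) && size >= ps) {
			mask |= ps;
			/* Keep only what is left over after this size. */
			size &= ps - 1;
		}
	}

	/* A page-table holding any 4K entry cannot be marked as 64K. */
	if (mask & PS_4K)
		mask &= ~PS_64K;

	return mask;
}
```

With every size supported, a 2M + 64K object expects 2M | 64K, while a
2M + 64K + 4K object expects 2M | 4K because the tail falls back to 4K.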
 drivers/gpu/drm/i915/i915_gem.c                    |   1 +
 drivers/gpu/drm/i915/selftests/huge_pages.c        | 604 +++++++++++++++++++++
 .../gpu/drm/i915/selftests/i915_live_selftests.h   |   1 +
 3 files changed, 606 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/selftests/huge_pages.c

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c4a27b4f419c..6cced283e9de 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5413,6 +5413,7 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj,
 #include "selftests/scatterlist.c"
 #include "selftests/mock_gem_device.c"
 #include "selftests/huge_gem_object.c"
+#include "selftests/huge_pages.c"
 #include "selftests/i915_gem_object.c"
 #include "selftests/i915_gem_coherency.c"
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
new file mode 100644
index 000000000000..5e0437045804
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -0,0 +1,604 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "../i915_selftest.h"
+
+#include <linux/prime_numbers.h>
+
+#include "mock_drm.h"
+
+static unsigned int page_sizes[] = {
+	I915_GTT_PAGE_SIZE_1G,
+	I915_GTT_PAGE_SIZE_2M,
+	I915_GTT_PAGE_SIZE_64K,
+	I915_GTT_PAGE_SIZE_4K,
+};
+
+static unsigned int get_largest_page_size(struct drm_i915_private *i915,
+					  unsigned int rem)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) {
+		unsigned int page_size = page_sizes[i];
+
+		if (HAS_PAGE_SIZE(i915, page_size) && rem >= page_size)
+			return page_size;
+	}
+
+	GEM_BUG_ON(1);
+}
+
+static struct sg_table *
+fake_get_huge_pages(struct drm_i915_gem_object *obj,
+		    unsigned int *sg_mask)
+{
+#define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY)
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	const unsigned long max = obj->base.size >> PAGE_SHIFT;
+	struct sg_table *st;
+	struct scatterlist *sg;
+	typeof(obj->base.size) rem;
+
+	st = kmalloc(sizeof(*st), GFP);
+	if (!st)
+		return ERR_PTR(-ENOMEM);
+
+	if (sg_alloc_table(st, max, GFP)) {
+		kfree(st);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	/*
+	 * Use the optimal page sized chunks to fill in the sg table from large
+	 * to small.
+	 */
+	rem = obj->base.size;
+	sg = st->sgl;
+	st->nents = 0;
+	do {
+		unsigned int page_size = get_largest_page_size(i915, rem);
+		unsigned int len = page_size * (rem / page_size);
+
+		sg->offset = 0;
+		sg->length = len;
+		sg_dma_len(sg) = len;
+		sg_dma_address(sg) = page_size;
+
+		*sg_mask |= len;
+
+		st->nents++;
+
+		rem -= len;
+		if (!rem) {
+			sg_mark_end(sg);
+			break;
+		}
+
+		sg = sg_next(sg);
+	} while (1);
+
+	obj->mm.madv = I915_MADV_DONTNEED;
+
+	return st;
+#undef GFP
+}
+
+static void fake_free_huge_pages(struct drm_i915_gem_object *obj,
+				 struct sg_table *pages)
+{
+	sg_free_table(pages);
+	kfree(pages);
+}
+
+static void fake_put_huge_pages(struct drm_i915_gem_object *obj,
+				struct sg_table *pages)
+{
+	fake_free_huge_pages(obj, pages);
+	obj->mm.dirty = false;
+	obj->mm.madv = I915_MADV_WILLNEED;
+}
+
+static const struct drm_i915_gem_object_ops fake_ops = {
+	.flags = I915_GEM_OBJECT_IS_SHRINKABLE,
+	.get_pages = fake_get_huge_pages,
+	.put_pages = fake_put_huge_pages,
+};
+
+static struct drm_i915_gem_object *
+fake_huge_paged_object(struct drm_i915_private *i915, u64 size)
+{
+	struct drm_i915_gem_object *obj;
+
+	GEM_BUG_ON(!size);
+	GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE));
+
+	if (overflows_type(size, obj->base.size))
+		return ERR_PTR(-E2BIG);
+
+	obj = i915_gem_object_alloc(i915);
+	if (!obj)
+		return ERR_PTR(-ENOMEM);
+
+	drm_gem_private_object_init(&i915->drm, &obj->base, size);
+	i915_gem_object_init(obj, &fake_ops);
+
+	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+	obj->cache_level = I915_CACHE_NONE;
+
+	return obj;
+}
+
+static int igt_ppgtt_huge_fill(void *arg)
+{
+	struct i915_hw_ppgtt *ppgtt = arg;
+	struct drm_i915_private *i915 = ppgtt->base.i915;
+	struct drm_i915_gem_object *obj;
+	unsigned long max_pages = ppgtt->base.total >> PAGE_SHIFT;
+	unsigned long page_num;
+	IGT_TIMEOUT(end_time);
+	int err;
+
+	for_each_prime_number_from(page_num, 1, max_pages) {
+		unsigned int expected_gtt = 0;
+		struct i915_vma *vma;
+		typeof(obj->base.size) size;
+		int i;
+
+		obj = fake_huge_paged_object(i915, page_num << PAGE_SHIFT);
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+
+		GEM_BUG_ON(obj->base.size != page_num << PAGE_SHIFT);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto out_put;
+
+		GEM_BUG_ON(!obj->mm.page_sizes.sg);
+
+		vma = i915_vma_instance(obj, &ppgtt->base, NULL);
+		if (IS_ERR(vma)) {
+			err = PTR_ERR(vma);
+			goto out_unpin;
+		}
+
+		err = i915_vma_pin(vma, 0, 0, PIN_USER);
+		if (err) {
+			i915_vma_close(vma);
+			goto out_unpin;
+		}
+
+		GEM_BUG_ON(obj->mm.page_sizes.gtt);
+		GEM_BUG_ON(!vma->page_sizes.sg);
+		GEM_BUG_ON(!vma->page_sizes.phys);
+
+		size = obj->base.size;
+
+		/* Figure out the expected gtt page size knowing that we go from
+		 * largest to smallest page size sg chunks, and that we align to
+		 * the largest page size.
+		 */
+		for (i = 0; i < ARRAY_SIZE(page_sizes); ++i) {
+			unsigned int page_size = page_sizes[i];
+
+			if (HAS_PAGE_SIZE(i915, page_size) && size >= page_size) {
+				expected_gtt |= page_size;
+				size &= page_size-1;
+			}
+		}
+
+		GEM_BUG_ON(!expected_gtt);
+		GEM_BUG_ON(size);
+
+		if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
+			expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;
+
+		if (i915_vm_is_48bit(&ppgtt->base)) {
+			GEM_BUG_ON(vma->page_sizes.gtt != expected_gtt);
+
+			if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
+				GEM_BUG_ON(!IS_ALIGNED(vma->node.start,
+						       I915_GTT_PAGE_SIZE_2M));
+				GEM_BUG_ON(!IS_ALIGNED(vma->node.size,
+						       I915_GTT_PAGE_SIZE_2M));
+			}
+		} else {
+			GEM_BUG_ON(vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K);
+			GEM_BUG_ON(vma->node.size != page_num << PAGE_SHIFT);
+		}
+
+		i915_vma_unpin(vma);
+		i915_vma_close(vma);
+
+		i915_gem_object_unpin_pages(obj);
+		i915_gem_object_put(obj);
+
+		if (igt_timeout(end_time,
+				"%s timed out at size %lx\n",
+				__func__, page_num << PAGE_SHIFT))
+			break;
+	}
+
+	return 0;
+
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
+
+static int igt_ppgtt_misaligned_dma(void *arg)
+{
+	struct i915_hw_ppgtt *ppgtt = arg;
+	struct drm_i915_private *i915 = ppgtt->base.i915;
+	unsigned long supported = INTEL_INFO(i915)->page_size_mask;
+	struct drm_i915_gem_object *obj;
+	int err;
+	int bit;
+
+	/* Sanity check dma misalignment for huge pages -- the dma addresses we
+	 * insert into the paging structures need to always respect the page
+	 * size alignment.
+	 */
+
+	bit = ilog2(I915_GTT_PAGE_SIZE_64K);
+
+	for_each_set_bit_from(bit, &supported, BITS_PER_LONG) {
+		IGT_TIMEOUT(end_time);
+		unsigned int page_size = BIT(bit);
+		unsigned int flags = PIN_USER | PIN_OFFSET_FIXED;
+		struct i915_vma *vma;
+		unsigned int offset;
+		unsigned int size =
+			round_up(page_size, I915_GTT_PAGE_SIZE_2M) << 1;
+
+		obj = fake_huge_paged_object(i915, size);
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+
+		GEM_BUG_ON(obj->base.size != size);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto out_put;
+
+		GEM_BUG_ON(!(obj->mm.page_sizes.sg & page_size));
+
+		/* Force the page size for this object */
+		obj->mm.page_sizes.sg = page_size;
+
+		vma = i915_vma_instance(obj, &ppgtt->base, NULL);
+		if (IS_ERR(vma)) {
+			err = PTR_ERR(vma);
+			goto out_unpin;
+		}
+
+		err = i915_vma_pin(vma, 0, 0, flags);
+		if (err) {
+			i915_vma_close(vma);
+			goto out_unpin;
+		}
+
+		GEM_BUG_ON(vma->page_sizes.gtt != page_size);
+
+		i915_vma_unpin(vma);
+		err = i915_vma_unbind(vma);
+		if (err)
+			goto out_unpin;
+
+		/* Try all the other valid offsets until the next boundary --
+		 * should always fall back to using 4K pages.
+		 */
+		for (offset = 4096; offset < page_size; offset += 4096) {
+			err = i915_vma_pin(vma, 0, 0, flags | offset);
+			if (err) {
+				i915_vma_close(vma);
+				goto out_unpin;
+			}
+
+			GEM_BUG_ON(vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K);
+
+			i915_vma_unpin(vma);
+			err = i915_vma_unbind(vma);
+			if (err)
+				goto out_unpin;
+
+			if (igt_timeout(end_time,
+					"%s timed out at offset %x with page-size %x\n",
+					__func__, offset, page_size))
+				break;
+		}
+
+		i915_vma_close(vma);
+
+		i915_gem_object_unpin_pages(obj);
+		i915_gem_object_put(obj);
+	}
+
+	return 0;
+
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
+static int igt_ppgtt_64K(void *arg)
+{
+	struct i915_hw_ppgtt *ppgtt = arg;
+	struct drm_i915_private *i915 = ppgtt->base.i915;
+	struct drm_i915_gem_object *obj;
+	struct object_info {
+		unsigned int size;
+		unsigned int gtt;
+		unsigned int offset;
+	} objects[] = {
+		/* Cases with forced padding/alignment */
+		{
+			.size = SZ_64K,
+			.gtt = I915_GTT_PAGE_SIZE_64K,
+			.offset = 0,
+		},
+		{
+			.size = SZ_64K + SZ_4K,
+			.gtt = I915_GTT_PAGE_SIZE_4K,
+			.offset = 0,
+		},
+		{
+			.size = SZ_2M - SZ_4K,
+			.gtt = I915_GTT_PAGE_SIZE_4K,
+			.offset = 0,
+		},
+		{
+			.size = SZ_2M + SZ_64K,
+			.gtt = I915_GTT_PAGE_SIZE_64K,
+			.offset = 0,
+		},
+		{
+			.size = SZ_2M + SZ_4K,
+			.gtt = I915_GTT_PAGE_SIZE_64K | I915_GTT_PAGE_SIZE_4K,
+			.offset = 0,
+		},
+		/* Try without any forced padding/alignment */
+		{
+			.size = SZ_64K,
+			.offset = SZ_2M,
+			.gtt = I915_GTT_PAGE_SIZE_4K,
+		},
+		{
+			.size = SZ_128K,
+			.offset = SZ_2M - SZ_64K,
+			.gtt = I915_GTT_PAGE_SIZE_4K,
+		},
+	};
+	int i;
+	int err;
+
+	if (!HAS_PAGE_SIZE(i915, I915_GTT_PAGE_SIZE_64K))
+		return 0;
+
+	GEM_BUG_ON(!i915_vm_is_48bit(&ppgtt->base));
+
+	/* Sanity check some of the trickiness with 64K pages -- either we can
+	 * safely mark the whole page-table (2M block) as 64K, or we have to
+	 * always fall back to 4K.
+	 */
+
+	for (i = 0; i < ARRAY_SIZE(objects); ++i) {
+		unsigned int size = objects[i].size;
+		unsigned int expected_gtt = objects[i].gtt;
+		unsigned int offset = objects[i].offset;
+		struct i915_vma *vma;
+		int flags = PIN_USER;
+
+		obj = fake_huge_paged_object(i915, size);
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto out_put;
+
+		GEM_BUG_ON(!obj->mm.page_sizes.sg);
+
+		/* Disable 2M pages -- We only want to use 64K/4K pages for
+		 * this test.
+		 */
+		obj->mm.page_sizes.sg &= ~I915_GTT_PAGE_SIZE_2M;
+
+		vma = i915_vma_instance(obj, &ppgtt->base, NULL);
+		if (IS_ERR(vma)) {
+			err = PTR_ERR(vma);
+			goto out_unpin;
+		}
+
+		if (offset)
+			flags |= PIN_OFFSET_FIXED | offset;
+
+		err = i915_vma_pin(vma, 0, 0, flags);
+		if (err) {
+			i915_vma_close(vma);
+			goto out_unpin;
+		}
+
+		GEM_BUG_ON(obj->mm.page_sizes.gtt);
+		GEM_BUG_ON(!vma->page_sizes.sg);
+		GEM_BUG_ON(!vma->page_sizes.phys);
+
+		GEM_BUG_ON(vma->page_sizes.gtt != expected_gtt);
+
+		if (!offset && vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
+			GEM_BUG_ON(!IS_ALIGNED(vma->node.start,
+					       I915_GTT_PAGE_SIZE_2M));
+			GEM_BUG_ON(!IS_ALIGNED(vma->node.size,
+					       I915_GTT_PAGE_SIZE_2M));
+		}
+
+		i915_vma_unpin(vma);
+		i915_vma_close(vma);
+
+		i915_gem_object_unpin_pages(obj);
+		i915_gem_object_put(obj);
+	}
+
+	return 0;
+
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
+static int igt_ppgtt_gemfs_huge(void *arg)
+{
+	struct i915_hw_ppgtt *ppgtt = arg;
+	struct drm_i915_private *i915 = ppgtt->base.i915;
+	struct drm_i915_gem_object *obj;
+	unsigned int object_sizes[] = {
+		I915_GTT_PAGE_SIZE_2M,
+		I915_GTT_PAGE_SIZE_2M + I915_GTT_PAGE_SIZE_4K,
+	};
+	int err;
+	int i;
+
+	if (!HAS_PAGE_SIZE(i915, I915_GTT_PAGE_SIZE_2M) ||
+	    !IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE) ||
+	    !has_transparent_hugepage())
+		return 0;
+
+	/* Sanity check THP through gemfs */
+
+	for (i = 0; i < ARRAY_SIZE(object_sizes); ++i) {
+		unsigned int size = object_sizes[i];
+		struct i915_vma *vma;
+
+		obj = i915_gem_object_create(i915, size);
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto out_put;
+
+		GEM_BUG_ON(!obj->mm.page_sizes.sg);
+
+		if (obj->mm.page_sizes.sg < size) {
+			pr_err("Failed to allocate huge pages\n");
+			err = -EINVAL;
+			goto out_unpin;
+		}
+
+		vma = i915_vma_instance(obj, &ppgtt->base, NULL);
+		if (IS_ERR(vma)) {
+			err = PTR_ERR(vma);
+			goto out_unpin;
+		}
+
+		err = i915_vma_pin(vma, 0, 0, PIN_USER);
+		if (err) {
+			i915_vma_close(vma);
+			goto out_unpin;
+		}
+
+		/* TODO: maybe we should try writing to the obj from the gpu
+		 * pov, since we require various things to be enabled/disabled
+		 * for huge-pages to work correctly. Also if we force huge-pages
+		 * for gemfs we should be able to exercise 64K pages, since this
+		 * should always give us a 64K aligned memory region.
+		 */
+
+		GEM_BUG_ON(obj->mm.page_sizes.gtt);
+		GEM_BUG_ON(!vma->page_sizes.sg);
+		GEM_BUG_ON(!vma->page_sizes.phys);
+
+		if (i915_vm_is_48bit(&ppgtt->base)) {
+			GEM_BUG_ON(vma->page_sizes.gtt != size);
+			GEM_BUG_ON(!IS_ALIGNED(vma->node.start, I915_GTT_PAGE_SIZE_2M));
+			GEM_BUG_ON(!IS_ALIGNED(vma->node.size, I915_GTT_PAGE_SIZE_2M));
+		} else {
+			GEM_BUG_ON(vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K);
+			GEM_BUG_ON(vma->node.size != size);
+		}
+
+		i915_vma_unpin(vma);
+		i915_vma_close(vma);
+
+		i915_gem_object_unpin_pages(obj);
+		i915_gem_object_put(obj);
+	}
+
+	return 0;
+
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
+int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_ppgtt_huge_fill),
+		SUBTEST(igt_ppgtt_misaligned_dma),
+		SUBTEST(igt_ppgtt_64K),
+		SUBTEST(igt_ppgtt_gemfs_huge),
+	};
+	struct i915_hw_ppgtt *ppgtt;
+	struct drm_file *file;
+	int ret;
+
+	file = mock_file(dev_priv);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+
+	mutex_lock(&dev_priv->drm.struct_mutex);
+	ppgtt = i915_ppgtt_create(dev_priv, file->driver_priv, "mock");
+	if (IS_ERR(ppgtt)) {
+		ret = PTR_ERR(ppgtt);
+		goto out_unlock;
+	}
+
+	ret = i915_subtests(tests, ppgtt);
+
+	i915_ppgtt_close(&ppgtt->base);
+	i915_ppgtt_put(ppgtt);
+
+out_unlock:
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+
+	mock_file_free(dev_priv, file);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index 18b174d855ca..6b49cbb49535 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -15,5 +15,6 @@ selftest(objects, i915_gem_object_live_selftests)
 selftest(dmabuf, i915_gem_dmabuf_live_selftests)
 selftest(coherency, i915_gem_coherency_live_selftests)
 selftest(gtt, i915_gem_gtt_live_selftests)
+selftest(huge, i915_gem_huge_page_live_selftests)
 selftest(contexts, i915_gem_context_live_selftests)
 selftest(hangcheck, intel_hangcheck_live_selftests)
-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 16/19] drm/i915/selftests: mix huge pages
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (14 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 15/19] drm/i915/selftests: basic huge page tests Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 17/19] drm/i915: enable platform support for 64K pages Matthew Auld
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

Try to mix sg page sizes for 4K, 64K and 2M pages.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/selftests/scatterlist.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/scatterlist.c b/drivers/gpu/drm/i915/selftests/scatterlist.c
index 1cc5d2931753..340935abcd0a 100644
--- a/drivers/gpu/drm/i915/selftests/scatterlist.c
+++ b/drivers/gpu/drm/i915/selftests/scatterlist.c
@@ -189,6 +189,20 @@ static unsigned int random(unsigned long n,
 	return 1 + (prandom_u32_state(rnd) % 1024);
 }
 
+static unsigned int random_page_size_pages(unsigned long n,
+					   unsigned long count,
+					   struct rnd_state *rnd)
+{
+	/* 4K, 64K, 2M */
+	static unsigned int page_count[] = {
+		BIT(12) >> 12,
+		BIT(16) >> 12,
+		BIT(21) >> 12,
+	};
+
+	return page_count[(prandom_u32_state(rnd) % 3)];
+}
+
 static inline bool page_contiguous(struct page *first,
 				   struct page *last,
 				   unsigned long npages)
@@ -252,6 +266,7 @@ static const npages_fn_t npages_funcs[] = {
 	grow,
 	shrink,
 	random,
+	random_page_size_pages,
 	NULL,
 };
 
-- 
2.9.4
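
Note on the page counts above: the three table entries are the number of
4K pages in a 4K, 64K and 2M chunk respectively. The sketch below is a
standalone userspace illustration of that arithmetic, not the selftest
code itself. The helper names (pages_per_chunk, pick_page_count) are
hypothetical, and the random source is replaced by a caller-supplied value.

```c
#include <assert.h>

#define BIT(n)     (1UL << (n))
#define PAGE_SHIFT 12	/* assumes 4K CPU pages, matching the ">> 12" in the patch */

/* Chunk sizes expressed as shifts: 4K, 64K, 2M. */
static const unsigned int chunk_shift[] = { 12, 16, 21 };

/* Number of 4K pages in one chunk of size BIT(shift). */
static unsigned int pages_per_chunk(unsigned int shift)
{
	assert(shift >= PAGE_SHIFT);
	return (unsigned int)(BIT(shift) >> PAGE_SHIFT);
}

/* 'raw' stands in for the value prandom_u32_state() would return. */
static unsigned int pick_page_count(unsigned int raw)
{
	unsigned int n = sizeof(chunk_shift) / sizeof(chunk_shift[0]);

	return pages_per_chunk(chunk_shift[raw % n]);
}
```

So each scatterlist element produced by this npages function spans 1, 16
or 512 pages, which is what lets the test mix the three sizes.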


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 17/19] drm/i915: enable platform support for 64K pages
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (15 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 16/19] drm/i915/selftests: mix huge pages Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 18/19] drm/i915: enable platform support for 2M pages Matthew Auld
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c                  | 3 ++-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index b73c1eb778d1..d8fa06988914 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -364,7 +364,8 @@ static const struct intel_device_info intel_cherryview_info = {
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
-	.page_size_mask = I915_GTT_PAGE_SIZE_4K
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
+			  I915_GTT_PAGE_SIZE_64K
 
 #define SKL_PLATFORM \
 	BDW_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 0002ba28780c..370ded2a5166 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -149,7 +149,8 @@ struct drm_i915_private *mock_gem_device(void)
 	mkwrite_device_info(i915)->gen = -1;
 
 	mkwrite_device_info(i915)->page_size_mask =
-		I915_GTT_PAGE_SIZE_4K;
+		I915_GTT_PAGE_SIZE_4K |
+		I915_GTT_PAGE_SIZE_64K;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 18/19] drm/i915: enable platform support for 2M pages
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (16 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 17/19] drm/i915: enable platform support for 64K pages Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 20:33 ` [PATCH 19/19] drm/i915: enable platform support for 1G pages Matthew Auld
  2017-06-21 21:05 ` ✓ Fi.CI.BAT: success for huge gtt pages (rev2) Patchwork
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c                  | 6 ++++--
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index d8fa06988914..bfe6a79be969 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -319,7 +319,8 @@ static const struct intel_device_info intel_haswell_info = {
 #define BDW_FEATURES \
 	HSW_FEATURES, \
 	BDW_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES, \
+	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
+			  I915_GTT_PAGE_SIZE_2M, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1, \
@@ -365,7 +366,8 @@ static const struct intel_device_info intel_cherryview_info = {
 
 #define GEN9_DEFAULT_PAGE_SIZES \
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
-			  I915_GTT_PAGE_SIZE_64K
+			  I915_GTT_PAGE_SIZE_64K | \
+			  I915_GTT_PAGE_SIZE_2M
 
 #define SKL_PLATFORM \
 	BDW_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 370ded2a5166..499ec437cc15 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -150,7 +150,8 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mkwrite_device_info(i915)->page_size_mask =
 		I915_GTT_PAGE_SIZE_4K |
-		I915_GTT_PAGE_SIZE_64K;
+		I915_GTT_PAGE_SIZE_64K |
+		I915_GTT_PAGE_SIZE_2M;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 19/19] drm/i915: enable platform support for 1G pages
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (17 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 18/19] drm/i915: enable platform support for 2M pages Matthew Auld
@ 2017-06-21 20:33 ` Matthew Auld
  2017-06-21 21:05 ` ✓ Fi.CI.BAT: success for huge gtt pages (rev2) Patchwork
  19 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-21 20:33 UTC (permalink / raw)
  To: intel-gfx

For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 1G pages. Also enable for mock testing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c                  | 6 ++++--
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index bfe6a79be969..12363ae5da06 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -320,7 +320,8 @@ static const struct intel_device_info intel_haswell_info = {
 	HSW_FEATURES, \
 	BDW_COLORS, \
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
-			  I915_GTT_PAGE_SIZE_2M, \
+			  I915_GTT_PAGE_SIZE_2M | \
+			  I915_GTT_PAGE_SIZE_1G, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_64bit_reloc = 1, \
@@ -367,7 +368,8 @@ static const struct intel_device_info intel_cherryview_info = {
 #define GEN9_DEFAULT_PAGE_SIZES \
 	.page_size_mask = I915_GTT_PAGE_SIZE_4K | \
 			  I915_GTT_PAGE_SIZE_64K | \
-			  I915_GTT_PAGE_SIZE_2M
+			  I915_GTT_PAGE_SIZE_2M | \
+			  I915_GTT_PAGE_SIZE_1G
 
 #define SKL_PLATFORM \
 	BDW_FEATURES, \
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 499ec437cc15..a8327febf02f 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -151,7 +151,8 @@ struct drm_i915_private *mock_gem_device(void)
 	mkwrite_device_info(i915)->page_size_mask =
 		I915_GTT_PAGE_SIZE_4K |
 		I915_GTT_PAGE_SIZE_64K |
-		I915_GTT_PAGE_SIZE_2M;
+		I915_GTT_PAGE_SIZE_2M |
+		I915_GTT_PAGE_SIZE_1G;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
-- 
2.9.4
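
Note on the masks above: page_size_mask is a plain bitmask in which each
supported page size contributes one bit. The sketch below is a standalone
illustration under the assumption that every I915_GTT_PAGE_SIZE_* constant
is the single bit equal to its size in bytes (BIT(30) for 1G is an
assumption, the definition is not shown in this part of the thread). The
GTT_PAGE_SIZE_* names, the two mask variables and supports_page_size() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1UL << (n))

/* Assumed encodings: one bit per page size, equal to the size in bytes. */
#define GTT_PAGE_SIZE_4K  BIT(12)
#define GTT_PAGE_SIZE_64K BIT(16)
#define GTT_PAGE_SIZE_2M  BIT(21)
#define GTT_PAGE_SIZE_1G  BIT(30)

/* The BDW_FEATURES mask as it reads after this patch (no 64K). */
static const unsigned long bdw_mask =
	GTT_PAGE_SIZE_4K | GTT_PAGE_SIZE_2M | GTT_PAGE_SIZE_1G;

/* The GEN9_DEFAULT_PAGE_SIZES mask as it reads after this patch. */
static const unsigned long gen9_mask =
	GTT_PAGE_SIZE_4K | GTT_PAGE_SIZE_64K |
	GTT_PAGE_SIZE_2M | GTT_PAGE_SIZE_1G;

/* True if 'size' (a single power-of-two page size) is set in 'mask'. */
static bool supports_page_size(unsigned long mask, unsigned long size)
{
	assert(size && !(size & (size - 1)));
	return (mask & size) != 0;
}
```

With these values, a 64K query succeeds against the gen9 mask and fails
against the BDW one, while 1G succeeds against both.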


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* ✓ Fi.CI.BAT: success for huge gtt pages (rev2)
  2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
                   ` (18 preceding siblings ...)
  2017-06-21 20:33 ` [PATCH 19/19] drm/i915: enable platform support for 1G pages Matthew Auld
@ 2017-06-21 21:05 ` Patchwork
  19 siblings, 0 replies; 36+ messages in thread
From: Patchwork @ 2017-06-21 21:05 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: huge gtt pages (rev2)
URL   : https://patchwork.freedesktop.org/series/25118/
State : success

== Summary ==

Series 25118v2 huge gtt pages
https://patchwork.freedesktop.org/api/1.0/series/25118/revisions/2/mbox/

Test prime_busy:
        Subgroup basic-wait-after-default:
                pass       -> DMESG-WARN (fi-skl-6700hq) fdo#101515 +1

fdo#101515 https://bugs.freedesktop.org/show_bug.cgi?id=101515

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:436s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:423s
fi-bsw-n3050     total:278  pass:241  dwarn:1   dfail:0   fail:0   skip:36  time:531s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:499s
fi-byt-j1900     total:278  pass:252  dwarn:2   dfail:0   fail:0   skip:24  time:477s
fi-byt-n2820     total:278  pass:248  dwarn:2   dfail:0   fail:0   skip:28  time:482s
fi-glk-2a        total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:582s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:440s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:414s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:413s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:497s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:471s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:466s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:570s
fi-kbl-r         total:278  pass:259  dwarn:1   dfail:0   fail:0   skip:18  time:591s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:466s
fi-skl-6700hq    total:278  pass:220  dwarn:3   dfail:0   fail:30  skip:24  time:333s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:465s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:474s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:436s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:538s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:403s

0a9fd1712c87aac1b6065a58cb46582a6a117cee drm-tip: 2017y-06m-21d-15h-44m-13s UTC integration manifest
4fff039 drm/i915: enable platform support for 1G pages
d73d54a drm/i915: enable platform support for 2M pages
057d276 drm/i915: enable platform support for 64K pages
0336d86 drm/i915/selftests: mix huge pages
c9c7b20 drm/i915/selftests: basic huge page tests
3ef656f drm/i915/debugfs: include some gtt page size metrics
64f3afc drm/i915: accurate page size tracking for the ppgtt
18150de drm/i915: support 64K pages for the 48b PPGTT
4287a5d drm/i915: support 2M pages for the 48b PPGTT
1be63f4 drm/i915: support 1G pages for the 48b PPGTT
34f2960 drm/i915: disable GTT cache for 2M/1G pages
b013dd0 drm/i915: enable IPS bit for 64K pages
71b0e77 drm/i915: pass the vma to insert_entries
f6009f9 drm/i915: align 64K objects to 2M
340bd33 drm/i915: align the vma start to the largest gtt page size
00cb18d drm/i915: introduce page_size members
ea2960b drm/i915: introduce page_size_mask to dev_info
cb970c1 drm/i915/gemfs: enable THP
32e94fe drm/i915: introduce simple gemfs

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_5020/

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/19] drm/i915: introduce simple gemfs
  2017-06-21 20:33 ` [PATCH 01/19] drm/i915: introduce simple gemfs Matthew Auld
@ 2017-06-21 21:19   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:19 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:27)
> Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so
> moves us away from the shmemfs shm_mnt, and gives us the much needed
> flexibility to do things like set our own mount options, namely huge=
> which should allow us to enable the use of transparent-huge-pages for
> our shmem backed objects.
> 
> v2: various improvements suggested by Joonas
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/Makefile                    |   1 +
>  drivers/gpu/drm/i915/i915_drv.h                  |   3 +
>  drivers/gpu/drm/i915/i915_gem.c                  |  44 ++++++++-
>  drivers/gpu/drm/i915/i915_gemfs.c                | 114 +++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gemfs.h                |  40 ++++++++
>  drivers/gpu/drm/i915/selftests/mock_gem_device.c |  10 +-
>  6 files changed, 208 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_gemfs.c
>  create mode 100644 drivers/gpu/drm/i915/i915_gemfs.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index f8227318dcaf..29e3cfdf56ce 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -46,6 +46,7 @@ i915-y += i915_cmd_parser.o \
>           i915_gem_tiling.o \
>           i915_gem_timeline.o \
>           i915_gem_userptr.o \
> +         i915_gemfs.o \
>           i915_trace_points.o \
>           i915_vma.o \
>           intel_breadcrumbs.o \
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 30e89456fc61..376cd93a973a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2248,6 +2248,9 @@ struct drm_i915_private {
>         DECLARE_HASHTABLE(mm_structs, 7);
>         struct mutex mm_lock;
>  
> +       /* Our tmpfs instance used for shmem backed objects */
> +       struct vfsmount *gemfs;

You don't think this might be better off in drm_i915_private.mm for the
time being?

> +static int i915_drm_gem_object_init(struct drm_device *dev,
> +                                   struct drm_gem_object *obj,
> +                                   size_t size)
> +{
> +       struct drm_i915_private *i915 = to_i915(dev);
> +       struct file *filp;
> +
> +       drm_gem_private_object_init(dev, obj, size);
> +
> +       filp = i915_gemfs_file_setup(i915, "i915 mm object", size);

I'm betting spaces aren't expected. i915_gem_object or just i915?

> +       if (IS_ERR(filp))
> +               return PTR_ERR(filp);
> +
> +       obj->filp = filp;
> +
> +       return 0;
> +}
> +
> +static void i915_drm_gem_object_release(struct drm_gem_object *obj)
> +{
> +       if (obj->filp)
> +               i915_gemfs_unlink(obj->filp);
> +
> +       drm_gem_object_release(obj);

drm_gem_object_release() does fput() as one expects, but you add an
unlink. Why? Because of a lack of control over shmem_file_setup.

> +}
> +
>  struct drm_i915_gem_object *
>  i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
>  {
> @@ -4331,7 +4358,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
>         if (obj == NULL)
>                 return ERR_PTR(-ENOMEM);
>  
> -       ret = drm_gem_object_init(&dev_priv->drm, &obj->base, size);
> +       ret = i915_drm_gem_object_init(&dev_priv->drm, &obj->base, size);

This is in dire need of a naming overhaul. (Something like
i915_gem_object_create_shmem and carried through.) Not your problem.
Although i915_drm_gem is very, very odd and should not survive much
longer. (Odd because we have a lot of drm_i915_gem precedent.)

>         if (ret)
>                 goto fail;
>  
> @@ -4449,7 +4476,8 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
>                         drm_prime_gem_destroy(&obj->base, NULL);
>  
>                 reservation_object_fini(&obj->__builtin_resv);
> -               drm_gem_object_release(&obj->base);
> +
> +               i915_drm_gem_object_release(&obj->base);
>                 i915_gem_info_remove_obj(i915, obj->base.size);
>  
>                 kfree(obj->bit_17);

> +static const struct dentry_operations anon_ops = {
> +       .d_dname = simple_dname
> +};
> +
> +int i915_gemfs_init(struct drm_i915_private *i915)
> +{
> +       struct file_system_type *type;
> +       struct vfsmount *gemfs;
> +
> +       type = get_fs_type("tmpfs");
> +       if (!type)
> +               return -ENODEV;
> +
> +       gemfs = kern_mount(type);
> +       if (IS_ERR(gemfs))
> +               return PTR_ERR(gemfs);
> +
> +       i915->gemfs = gemfs;
> +
> +       return 0;
> +}
> +
> +void i915_gemfs_fini(struct drm_i915_private *i915)
> +{
> +       kern_unmount(i915->gemfs);
> +       i915->gemfs = NULL;

Why the nullify?

> +}
> +
> +struct file *i915_gemfs_file_setup(struct drm_i915_private *i915,
> +                                  const char *name, size_t size)
> +{
> +       struct super_block *sb = i915->gemfs->mnt_sb;
> +       struct inode *dir = d_inode(sb->s_root);
> +       struct inode *inode;
> +       struct path path;
> +       struct qstr this;
> +       struct file *res;
> +       int ret;
> +
> +       if (size < 0 || size > MAX_LFS_FILESIZE)
> +               return ERR_PTR(-EINVAL);

You will want to run through smatch. (size < 0 can never be true.)
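
To illustrate the point: size_t is unsigned, so a negative value passed
in wraps to a very large one and "size < 0" is dead code. Only the upper
bound can reject it. A minimal userspace sketch, where DEMO_MAX_FILESIZE
is a hypothetical stand-in for MAX_LFS_FILESIZE and size_is_valid() is an
illustrative helper:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit standing in for MAX_LFS_FILESIZE. */
#define DEMO_MAX_FILESIZE ((size_t)INT32_MAX)

/*
 * With an unsigned size there is nothing to check at the low end:
 * (size_t)-1 is SIZE_MAX, so the upper bound alone catches values
 * that started out negative.
 */
static bool size_is_valid(size_t size)
{
	return size <= DEMO_MAX_FILESIZE;
}
```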

> +
> +       this.name = name;
> +       this.len = strlen(name);
> +       this.hash = 0;
> +
> +       path.mnt = mntget(i915->gemfs);
> +       path.dentry = d_alloc_pseudo(sb, &this);
> +       if (!path.dentry) {
> +               res = ERR_PTR(-ENOMEM);
> +               goto put_path;
> +       }
> +       d_set_d_op(path.dentry, &anon_ops);
> +
> +       ret = dir->i_op->create(dir, path.dentry, S_IFREG | S_IRWXUGO, false);
> +       if (ret) {
> +               res = ERR_PTR(ret);
> +               goto put_path;
> +       }

Ah, so this leaves it linked. How about just dropping the link right
away? And petition for shmem_file_setup() to take the super_block.
Hmm, I think that will be better from the start.
-Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 04/19] drm/i915: introduce page_size members
  2017-06-21 20:33 ` [PATCH 04/19] drm/i915: introduce page_size members Matthew Auld
@ 2017-06-21 21:26   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:26 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:30)
> @@ -2538,11 +2567,11 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>                 return -EFAULT;
>         }
>  
> -       pages = obj->ops->get_pages(obj);
> +       pages = obj->ops->get_pages(obj, &sg_mask);

Grr. I need to push set_pages down to the callers. The inconsistency now
between the async and sync paths is silly.
-Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 05/19] drm/i915: align the vma start to the largest gtt page size
  2017-06-21 20:33 ` [PATCH 05/19] drm/i915: align the vma start to the largest gtt page size Matthew Auld
@ 2017-06-21 21:35   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:35 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:31)
> For the 48b PPGTT try to align the vma start address to the required
> page size boundary to guarantee we use said page size in the gtt. If we
> are dealing with multiple page sizes, we can't guarantee anything and
> just align to the largest. For soft pinning and objects which need to be
> tightly packed into the lower 32bits we don't force any alignment.
> 
> v2: various improvements suggested by Chris
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_vma.c | 15 +++++++++++++++
>  drivers/gpu/drm/i915/i915_vma.h |  1 +
>  2 files changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 958be0a95960..cee1d00dc085 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -471,6 +471,9 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>         if (ret)
>                 return ret;
>  
> +       vma->page_sizes.phys = obj->mm.page_sizes.phys;
> +       vma->page_sizes.sg = obj->mm.page_sizes.sg;

I expected this to be in the same place as where we assigned vma->pages.
That'll take a bit of rejigging to make it look neat. First thought is a
if (!vma->pages) vma->vm->set_pages(vma);

> +
>         if (flags & PIN_OFFSET_FIXED) {
>                 u64 offset = flags & PIN_OFFSET_MASK;
>                 if (!IS_ALIGNED(offset, alignment) ||
> @@ -485,6 +488,18 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>                 if (ret)
>                         goto err_unpin;
>         } else {
> +               /* We only support huge gtt pages through the 48b PPGTT,
> +                * however we also don't want to force any alignment for
> +                * objects which need to be tightly packed into the low 32bits.
> +                */
> +               if (end > (1ULL << 32) &&
> +                   vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
> +                       u64 page_alignment =
> +                               rounddown_pow_of_two(vma->page_sizes.sg);
> +
> +                       alignment = max(alignment, page_alignment);
> +               }
> +
>                 ret = i915_gem_gtt_insert(vma->vm, &vma->node,
>                                           size, alignment, obj->cache_level,
>                                           start, end, flags);
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index 4a673fc1a432..834f7ca2ada2 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -52,6 +52,7 @@ struct i915_vma {
>         struct drm_i915_fence_reg *fence;
>         struct reservation_object *resv; /** Alias of obj->resv */
>         struct sg_table *pages;
> +       struct i915_page_sizes page_sizes;

In the middle of a bunch of pointers! Have some decency please!

>         void __iomem *iomap;
>         u64 size;
>         u64 display_alignment;

Especially when there are some related variables right here :)
-Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/19] drm/i915: align 64K objects to 2M
  2017-06-21 20:33 ` [PATCH 06/19] drm/i915: align 64K objects to 2M Matthew Auld
@ 2017-06-21 21:37   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:37 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:32)
> We can't mix 64K and 4K pte's in the same page-table, so for now we
> align 64K objects to 2M to avoid any potential mixing. This is
> potentially wasteful but in reality shouldn't be too bad since this only
> applies to the virtual address space of a 48b PPGTT.
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_vma.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index cee1d00dc085..596269172cd2 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -495,7 +495,15 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>                 if (end > (1ULL << 32) &&
>                     vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
>                         u64 page_alignment =
> -                               rounddown_pow_of_two(vma->page_sizes.sg);
> +                               rounddown_pow_of_two(vma->page_sizes.sg |
> +                                                    I915_GTT_PAGE_SIZE_2M);
> +
> +                       /* We can't mix 64K and 4K PTEs in the same page-table (2M
> +                        * block), and so to avoid the ugliness and complexity of
> +                        * coloring we opt for just aligning 64K objects to 2M.
> +                        */
> +                       if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K)
> +                               size = round_up(size, I915_GTT_PAGE_SIZE_2M);

Why separate the logically connected ops? i.e. put the round_up after the
alignment = max()
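
A rough userspace sketch of that ordering, assuming the outer check
(end above 4G and sg larger than 4K) has already passed. All names here
are hypothetical stand-ins; rounddown_pow2() mimics what
rounddown_pow_of_two() computes for a non-zero input.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Largest power of two <= n, i.e. the highest set bit of n. */
static uint64_t rounddown_pow2(uint64_t n)
{
	assert(n != 0);
	while (n & (n - 1))
		n &= n - 1;	/* clear the lowest set bit until one remains */
	return n;
}

/* Round x up to a multiple of the power-of-two 'align'. */
static uint64_t round_up_to(uint64_t x, uint64_t align)
{
	assert(align && !(align & (align - 1)));
	return (x + align - 1) & ~(align - 1);
}

struct placement {
	uint64_t size;
	uint64_t alignment;
};

/* Alignment first, then the 64K padding right after it. */
static struct placement huge_placement(uint64_t size, uint64_t alignment,
				       uint64_t sg_mask)
{
	struct placement p = { size, alignment };
	uint64_t page_alignment = rounddown_pow2(sg_mask | SZ_2M);

	if (page_alignment > p.alignment)
		p.alignment = page_alignment;

	/* 64K and 4K PTEs cannot share a page-table, so pad to 2M. */
	if (sg_mask & SZ_64K)
		p.size = round_up_to(p.size, SZ_2M);

	return p;
}
```

Because 2M is ORed into the mask before rounding down, the alignment is
never below 2M in this branch, and any object carrying 64K pages ends up
owning whole 2M blocks.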

>  
>                         alignment = max(alignment, page_alignment);
>                 }
> -- 
> 2.9.4
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 07/19] drm/i915: pass the vma to insert_entries
  2017-06-21 20:33 ` [PATCH 07/19] drm/i915: pass the vma to insert_entries Matthew Auld
@ 2017-06-21 21:39   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:39 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:33)
> The vma contains most of the information we need for insertion. But also
> in preparation for supporting huge-pages for the ppgtt, it would be
> useful to know the details of vma->page_sizes and the node size, such
> that we can easily determine the page sizes we are allowed to use
> when inserting into the 48b PPGTT.  This is especially true for 64K
> where we can't just arbitrarily use it, since we require
> aligning/padding the vm space to 2M, which sometimes we can't enforce in
> the upper levels.
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

We can apply this immediately, right? (Just resend by itself before you
do.)
-Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 09/19] drm/i915: disable GTT cache for 2M/1G pages
  2017-06-21 20:33 ` [PATCH 09/19] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
@ 2017-06-21 21:41   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:41 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:35)
> When SW enables the use of 2M/1G pages, it must disable the GTT cache.
> 
> v2: don't disable for Cherryview which doesn't even support 48b PPGTT!
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index b5b7372fcddc..3939977dddb8 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -8307,10 +8307,10 @@ static void broadwell_init_clock_gating(struct drm_i915_private *dev_priv)
>  
>         /*
>          * WaGttCachingOffByDefault:bdw
> -        * GTT cache may not work with big pages, so if those
> -        * are ever enabled GTT cache may need to be disabled.
> +        * The GTT cache must be disabled if the system is planning to use
> +        * 2M/1G pages.
>          */
> -       I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> +       I915_WRITE(HSW_GTT_CACHE_EN, 0);

Worth doing HAS_PAGE_SIZES() ? 0 : GTT_CACHE_EN_ALL ?
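
Something along these lines, sketched in userspace since the exact
HAS_PAGE_SIZES() signature is not shown in this part of the thread. The
mask test below is a hypothetical stand-in for it, and the value used for
DEMO_GTT_CACHE_EN_ALL is illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_2M (UINT32_C(1) << 21)
#define PAGE_SIZE_1G (UINT32_C(1) << 30)

/* Illustrative "all enabled" register value, not taken from this thread. */
#define DEMO_GTT_CACHE_EN_ALL UINT32_C(0xF0007FFF)

/*
 * Value to write to the GTT cache enable register: leave the cache on
 * unless the platform advertises 2M or 1G pages.
 */
static uint32_t gtt_cache_enable_value(uint32_t page_size_mask)
{
	bool huge = (page_size_mask & (PAGE_SIZE_2M | PAGE_SIZE_1G)) != 0;

	return huge ? 0 : DEMO_GTT_CACHE_EN_ALL;
}
```

That would keep the cache enabled on configurations that only ever use
4K (or 4K plus 64K) pages.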

>  
>         /* WaKVMNotificationOnConfigChange:bdw */
>         I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
> -- 
> 2.9.4
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT
  2017-06-21 20:33 ` [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT Matthew Auld
@ 2017-06-21 21:49   ` Chris Wilson
  2017-06-21 22:51     ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:49 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:36)
> Support inserting 1G gtt pages into the 48b PPGTT.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 ++++++++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
>  2 files changed, 70 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index de67084d5fcf..6fe10ee7dca8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -922,6 +922,65 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
>                                       cache_level);
>  }
>  
> +static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
> +                                          struct i915_page_directory_pointer **pdps,
> +                                          struct sgt_dma *iter,
> +                                          enum i915_cache_level cache_level)
> +{
> +       const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
> +       u64 start = vma->node.start;
> +
> +       do {
> +               struct gen8_insert_pte idx = gen8_insert_pte(start);
> +               struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
> +               struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
> +               struct i915_page_table *pt = pd->page_table[idx.pde];
> +               dma_addr_t rem = iter->max - iter->dma;
> +               unsigned int page_size;
> +               gen8_pte_t encode = pte_encode;
> +               gen8_pte_t *vaddr;
> +               u16 index, max;
> +
> +               if (unlikely(vma->page_sizes.sg & I915_GTT_PAGE_SIZE_1G) &&
> +                   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_1G) &&
> +                   rem >= I915_GTT_PAGE_SIZE_1G && !(idx.pte | idx.pde)) {
> +                       vaddr = kmap_atomic_px(pdp);
> +                       index = idx.pdpe;
> +                       max = GEN8_PML4ES_PER_PML4;
> +                       page_size = I915_GTT_PAGE_SIZE_1G;
> +                       encode |= GEN8_PDPE_PS_1G;
> +               } else {
> +                       vaddr = kmap_atomic_px(pt);
> +                       index = idx.pte;
> +                       max = GEN8_PTES;
> +                       page_size = I915_GTT_PAGE_SIZE;
> +               }
> +
> +               do {
> +                       vaddr[index++] = encode | iter->dma;
> +
> +                       start += page_size;
> +                       iter->dma += page_size;
> +                       if (iter->dma >= iter->max) {
> +                               iter->sg = __sg_next(iter->sg);
> +                               if (!iter->sg)
> +                                       break;
> +
> +                               iter->dma = sg_dma_address(iter->sg);
> +                               iter->max = iter->dma + iter->sg->length;
> +
> +                               if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
> +                                       break;
> +                       }
> +                       rem = iter->max - iter->dma;
> +
> +               } while (rem >= page_size && index < max);

Where does idx advance?

> +
> +               kunmap_atomic(vaddr);
> +
> +       } while (iter->sg);
> +}

* Re: [PATCH 12/19] drm/i915: support 64K pages for the 48b PPGTT
  2017-06-21 20:33 ` [PATCH 12/19] drm/i915: support 64K " Matthew Auld
@ 2017-06-21 21:55   ` Chris Wilson
  2017-06-22 11:27     ` Matthew Auld
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:55 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:38)
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 26 ++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
>  2 files changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 03c35097ef39..9b89ec10f333 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -937,6 +937,7 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
>                 struct i915_page_table *pt = pd->page_table[idx.pde];
>                 dma_addr_t rem = iter->max - iter->dma;
>                 unsigned int page_size;
> +               bool maybe_64K = false;
>                 gen8_pte_t encode = pte_encode;
>                 gen8_pte_t *vaddr;
>                 u16 index, max;
> @@ -962,9 +963,17 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
>                         index = idx.pte;
>                         max = GEN8_PTES;
>                         page_size = I915_GTT_PAGE_SIZE;
> +
> +                       if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K && !idx.pte)
> +                               maybe_64K = true;
>                 }
>  
>                 do {
> +                       if (maybe_64K && (index % 16 == 0) &&
> +                           (!IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) ||
> +                            rem < I915_GTT_PAGE_SIZE_64K))
> +                               maybe_64K = false;
> +
>                         vaddr[index++] = encode | iter->dma;
>  
>                         start += page_size;
> @@ -986,6 +995,23 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
>  
>                 kunmap_atomic(vaddr);
>  
> +
> +               /* Is it safe to mark the 2M block as 64K? -- Either we have
> +                * filled whole page-table with 64K entries, or filled part of
> +                * it and have reached the end of the sg table and we have
> +                * enough padding.
> +                */
> +               if (maybe_64K) {
> +                       if (index == max ||
> +                           (!iter->sg && IS_ALIGNED(vma->node.start +
> +                                                    vma->node.size,
> +                                                    I915_GTT_PAGE_SIZE_2M))) {
> +                               vaddr = kmap_atomic_px(pd);
> +                               vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
> +                               kunmap_atomic(vaddr);
> +                       }

Hmm. I think you know this at the start. It's a bit hard to see from
this diff why not.
-Chris

* Re: [PATCH 13/19] drm/i915: accurate page size tracking for the ppgtt
  2017-06-21 20:33 ` [PATCH 13/19] drm/i915: accurate page size tracking for the ppgtt Matthew Auld
@ 2017-06-21 21:57   ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 21:57 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:39)
>  static int aliasing_gtt_bind_vma(struct i915_vma *vma,
> @@ -2535,6 +2549,8 @@ static void aliasing_gtt_unbind_vma(struct i915_vma *vma)
>  
>                 vm->clear_range(vm, vma->node.start, vma->size);
>         }
> +
> +       vma->page_sizes.gtt = 0;

One might suggest that where vma->pages gets set to NULL might be a good
place to memclear(vma->page_sizes);
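
A rough sketch of that idea, assuming a simplified vma. The sketch_* structs
and the helper are hypothetical, only the sg and gtt members seen in this
thread are modelled, and memset() stands in for the memclear() named above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down versions of the structures discussed in the thread. */
struct sketch_page_sizes {
	unsigned int sg;	/* page sizes present in the sg table */
	unsigned int gtt;	/* page sizes actually used in the GTT */
};

struct sketch_vma {
	void *pages;
	struct sketch_page_sizes page_sizes;
};

/* Drop the backing pages and reset all page size tracking in one place. */
static void sketch_vma_clear_pages(struct sketch_vma *vma)
{
	assert(vma);

	vma->pages = NULL;
	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
}
```

Clearing the whole struct next to the pages pointer means individual unbind
paths would not each need their own page_sizes.gtt = 0.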
-Chris

* Re: [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT
  2017-06-21 21:49   ` Chris Wilson
@ 2017-06-21 22:51     ` Chris Wilson
  2017-06-22 11:07       ` Matthew Auld
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2017-06-21 22:51 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Chris Wilson (2017-06-21 22:49:07)
> Quoting Matthew Auld (2017-06-21 21:33:36)
> > Support inserting 1G gtt pages into the 48b PPGTT.
> > 
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 ++++++++++++++++++++++++++++++++++---
> >  drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
> >  2 files changed, 70 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index de67084d5fcf..6fe10ee7dca8 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -922,6 +922,65 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
> >                                       cache_level);
> >  }
> >  
> > +static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
> > +                                          struct i915_page_directory_pointer **pdps,
> > +                                          struct sgt_dma *iter,
> > +                                          enum i915_cache_level cache_level)
> > +{
> > +       const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
> > +       u64 start = vma->node.start;
> > +
> > +       do {
> > +               struct gen8_insert_pte idx = gen8_insert_pte(start);
> > +               struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
> > +               struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
> > +               struct i915_page_table *pt = pd->page_table[idx.pde];
> > +               dma_addr_t rem = iter->max - iter->dma;
> > +               unsigned int page_size;
> > +               gen8_pte_t encode = pte_encode;
> > +               gen8_pte_t *vaddr;
> > +               u16 index, max;
> > +
> > +               if (unlikely(vma->page_sizes.sg & I915_GTT_PAGE_SIZE_1G) &&
> > +                   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_1G) &&
> > +                   rem >= I915_GTT_PAGE_SIZE_1G && !(idx.pte | idx.pde)) {
> > +                       vaddr = kmap_atomic_px(pdp);
> > +                       index = idx.pdpe;
> > +                       max = GEN8_PML4ES_PER_PML4;
> > +                       page_size = I915_GTT_PAGE_SIZE_1G;
> > +                       encode |= GEN8_PDPE_PS_1G;
> > +               } else {
> > +                       vaddr = kmap_atomic_px(pt);
> > +                       index = idx.pte;
> > +                       max = GEN8_PTES;
> > +                       page_size = I915_GTT_PAGE_SIZE;
> > +               }
> > +
> > +               do {
> > +                       vaddr[index++] = encode | iter->dma;
> > +
> > +                       start += page_size;
> > +                       iter->dma += page_size;
> > +                       if (iter->dma >= iter->max) {
> > +                               iter->sg = __sg_next(iter->sg);
> > +                               if (!iter->sg)
> > +                                       break;
> > +

GEM_BUG_ON(iter->sg->length < page_size);

> > +                               iter->dma = sg_dma_address(iter->sg);
> > +                               iter->max = iter->dma + iter->sg->length;
> > +
> > +                               if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
> > +                                       break;
> > +                       }
> > +                       rem = iter->max - iter->dma;
> > +
> > +               } while (rem >= page_size && index < max);
> 
> Where does idx advance?

via start.
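
For illustration, a standalone sketch of how the indices fall out of start on
each pass of the outer loop. It assumes the usual 4-level, 9-bits-per-level
split of a 48b address with 4K pages; sketch_insert_pte is a hypothetical
stand-in for gen8_insert_pte(), not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the index tuple used in the quoted patch. */
struct sketch_insert_pte {
	uint16_t pml4e;
	uint16_t pdpe;
	uint16_t pde;
	uint16_t pte;
};

/* Split a 48b GTT offset into its four 9-bit table indices. */
static struct sketch_insert_pte sketch_insert_pte(uint64_t start)
{
	struct sketch_insert_pte idx;

	assert(start < (1ull << 48));

	idx.pml4e = (uint16_t)((start >> 39) & 0x1ff);
	idx.pdpe = (uint16_t)((start >> 30) & 0x1ff);
	idx.pde = (uint16_t)((start >> 21) & 0x1ff);
	idx.pte = (uint16_t)((start >> 12) & 0x1ff);

	return idx;
}
```

Because the inner loop does start += page_size, recomputing the tuple at the
top of the outer loop picks up the next page table, directory or pdp.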
-Chris

* Re: [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT
  2017-06-21 22:51     ` Chris Wilson
@ 2017-06-22 11:07       ` Matthew Auld
  2017-06-22 11:38         ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Auld @ 2017-06-22 11:07 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On 21 June 2017 at 23:51, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Chris Wilson (2017-06-21 22:49:07)
>> Quoting Matthew Auld (2017-06-21 21:33:36)
>> > Support inserting 1G gtt pages into the 48b PPGTT.
>> >
>> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 ++++++++++++++++++++++++++++++++++---
>> >  drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
>> >  2 files changed, 70 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> > index de67084d5fcf..6fe10ee7dca8 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> > @@ -922,6 +922,65 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
>> >                                       cache_level);
>> >  }
>> >
>> > +static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
>> > +                                          struct i915_page_directory_pointer **pdps,
>> > +                                          struct sgt_dma *iter,
>> > +                                          enum i915_cache_level cache_level)
>> > +{
>> > +       const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
>> > +       u64 start = vma->node.start;
>> > +
>> > +       do {
>> > +               struct gen8_insert_pte idx = gen8_insert_pte(start);
>> > +               struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
>> > +               struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
>> > +               struct i915_page_table *pt = pd->page_table[idx.pde];
>> > +               dma_addr_t rem = iter->max - iter->dma;
>> > +               unsigned int page_size;
>> > +               gen8_pte_t encode = pte_encode;
>> > +               gen8_pte_t *vaddr;
>> > +               u16 index, max;
>> > +
>> > +               if (unlikely(vma->page_sizes.sg & I915_GTT_PAGE_SIZE_1G) &&
>> > +                   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_1G) &&
>> > +                   rem >= I915_GTT_PAGE_SIZE_1G && !(idx.pte | idx.pde)) {
>> > +                       vaddr = kmap_atomic_px(pdp);
>> > +                       index = idx.pdpe;
>> > +                       max = GEN8_PML4ES_PER_PML4;
>> > +                       page_size = I915_GTT_PAGE_SIZE_1G;
>> > +                       encode |= GEN8_PDPE_PS_1G;
>> > +               } else {
>> > +                       vaddr = kmap_atomic_px(pt);
>> > +                       index = idx.pte;
>> > +                       max = GEN8_PTES;
>> > +                       page_size = I915_GTT_PAGE_SIZE;
>> > +               }
>> > +
>> > +               do {
>> > +                       vaddr[index++] = encode | iter->dma;
>> > +
>> > +                       start += page_size;
>> > +                       iter->dma += page_size;
>> > +                       if (iter->dma >= iter->max) {
>> > +                               iter->sg = __sg_next(iter->sg);
>> > +                               if (!iter->sg)
>> > +                                       break;
>> > +
>
> GEM_BUG_ON(iter->sg->length < page_size);

That should be expected behaviour, in that we need to downgrade to a
smaller page size on the next iteration.

>
>> > +                               iter->dma = sg_dma_address(iter->sg);
>> > +                               iter->max = iter->dma + iter->sg->length;
>> > +
>> > +                               if (unlikely(!IS_ALIGNED(iter->dma, page_size)))
>> > +                                       break;
>> > +                       }
>> > +                       rem = iter->max - iter->dma;
>> > +
>> > +               } while (rem >= page_size && index < max);
>>
>> Where does idx advance?
>
> via start.
> -Chris

* Re: [PATCH 12/19] drm/i915: support 64K pages for the 48b PPGTT
  2017-06-21 21:55   ` Chris Wilson
@ 2017-06-22 11:27     ` Matthew Auld
  0 siblings, 0 replies; 36+ messages in thread
From: Matthew Auld @ 2017-06-22 11:27 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On 21 June 2017 at 22:55, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Matthew Auld (2017-06-21 21:33:38)
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_gtt.c | 26 ++++++++++++++++++++++++++
>>  drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
>>  2 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 03c35097ef39..9b89ec10f333 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -937,6 +937,7 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
>>                 struct i915_page_table *pt = pd->page_table[idx.pde];
>>                 dma_addr_t rem = iter->max - iter->dma;
>>                 unsigned int page_size;
>> +               bool maybe_64K = false;
>>                 gen8_pte_t encode = pte_encode;
>>                 gen8_pte_t *vaddr;
>>                 u16 index, max;
>> @@ -962,9 +963,17 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
>>                         index = idx.pte;
>>                         max = GEN8_PTES;
>>                         page_size = I915_GTT_PAGE_SIZE;
>> +
>> +                       if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K && !idx.pte)
>> +                               maybe_64K = true;
>>                 }
>>
>>                 do {
>> +                       if (maybe_64K && (index % 16 == 0) &&
>> +                           (!IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) ||
>> +                            rem < I915_GTT_PAGE_SIZE_64K))
>> +                               maybe_64K = false;
>> +
>>                         vaddr[index++] = encode | iter->dma;
>>
>>                         start += page_size;
>> @@ -986,6 +995,23 @@ static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
>>
>>                 kunmap_atomic(vaddr);
>>
>> +
>> +               /* Is it safe to mark the 2M block as 64K? -- Either we have
>> +                * filled whole page-table with 64K entries, or filled part of
>> +                * it and have reached the end of the sg table and we have
>> +                * enough padding.
>> +                */
>> +               if (maybe_64K) {
>> +                       if (index == max ||
>> +                           (!iter->sg && IS_ALIGNED(vma->node.start +
>> +                                                    vma->node.size,
>> +                                                    I915_GTT_PAGE_SIZE_2M))) {
>> +                               vaddr = kmap_atomic_px(pd);
>> +                               vaddr[idx.pde] |= GEN8_PDE_IPS_64K;
>> +                               kunmap_atomic(vaddr);
>> +                       }
>
> Hmm. I think you know this at the start. It's a bit hard to see from
> this diff why not.

Not sure I follow, what do we know from the start?
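
For reference, the end-of-loop condition from the quoted comment can be
written as a standalone predicate. This is only a sketch under assumptions:
the sketch_* names are hypothetical, and the per-16-entry alignment check
from the inner loop is assumed to have already passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_64K	(1ull << 16)
#define SKETCH_SZ_2M	(1ull << 21)

/*
 * Can the page table covering this 2M block be marked as 64K?
 * Either every entry was filled (index == max), or the sg table ran out
 * and the node is padded so that it ends on a 2M boundary.
 */
static bool sketch_can_mark_64k(unsigned int index, unsigned int max,
				bool sg_exhausted,
				uint64_t node_start, uint64_t node_size)
{
	assert(index <= max);

	if (index == max)
		return true;

	return sg_exhausted && ((node_start + node_size) % SKETCH_SZ_2M) == 0;
}
```

Every input except index is available before the inner loop runs, which may
be what the "you know this at the start" remark is pointing at.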

> -Chris

* Re: [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT
  2017-06-22 11:07       ` Matthew Auld
@ 2017-06-22 11:38         ` Chris Wilson
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-22 11:38 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld

Quoting Matthew Auld (2017-06-22 12:07:55)
> On 21 June 2017 at 23:51, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > Quoting Chris Wilson (2017-06-21 22:49:07)
> >> Quoting Matthew Auld (2017-06-21 21:33:36)
> >> > Support inserting 1G gtt pages into the 48b PPGTT.
> >> >
> >> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> >> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 ++++++++++++++++++++++++++++++++++---
> >> >  drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
> >> >  2 files changed, 70 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> > index de67084d5fcf..6fe10ee7dca8 100644
> >> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >> > @@ -922,6 +922,65 @@ static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
> >> >                                       cache_level);
> >> >  }
> >> >
> >> > +static void gen8_ppgtt_insert_huge_entries(struct i915_vma *vma,
> >> > +                                          struct i915_page_directory_pointer **pdps,
> >> > +                                          struct sgt_dma *iter,
> >> > +                                          enum i915_cache_level cache_level)
> >> > +{
> >> > +       const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level);
> >> > +       u64 start = vma->node.start;
> >> > +
> >> > +       do {
> >> > +               struct gen8_insert_pte idx = gen8_insert_pte(start);
> >> > +               struct i915_page_directory_pointer *pdp = pdps[idx.pml4e];
> >> > +               struct i915_page_directory *pd = pdp->page_directory[idx.pdpe];
> >> > +               struct i915_page_table *pt = pd->page_table[idx.pde];
> >> > +               dma_addr_t rem = iter->max - iter->dma;
> >> > +               unsigned int page_size;
> >> > +               gen8_pte_t encode = pte_encode;
> >> > +               gen8_pte_t *vaddr;
> >> > +               u16 index, max;
> >> > +
> >> > +               if (unlikely(vma->page_sizes.sg & I915_GTT_PAGE_SIZE_1G) &&
> >> > +                   IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_1G) &&
> >> > +                   rem >= I915_GTT_PAGE_SIZE_1G && !(idx.pte | idx.pde)) {
> >> > +                       vaddr = kmap_atomic_px(pdp);
> >> > +                       index = idx.pdpe;
> >> > +                       max = GEN8_PML4ES_PER_PML4;
> >> > +                       page_size = I915_GTT_PAGE_SIZE_1G;
> >> > +                       encode |= GEN8_PDPE_PS_1G;
> >> > +               } else {
> >> > +                       vaddr = kmap_atomic_px(pt);
> >> > +                       index = idx.pte;
> >> > +                       max = GEN8_PTES;
> >> > +                       page_size = I915_GTT_PAGE_SIZE;
> >> > +               }
> >> > +
> >> > +               do {
> >> > +                       vaddr[index++] = encode | iter->dma;
> >> > +
> >> > +                       start += page_size;
> >> > +                       iter->dma += page_size;
> >> > +                       if (iter->dma >= iter->max) {
> >> > +                               iter->sg = __sg_next(iter->sg);
> >> > +                               if (!iter->sg)
> >> > +                                       break;
> >> > +
> >
> > GEM_BUG_ON(iter->sg->length < page_size);
> 
> That should be expected behaviour, in that we need to downgrade to a
> smaller page size on the next iteration.

It still applies to just above where we set vaddr[index]. It fails here
because we have yet to decide on our course of action. I still think there
is merit in having a confirmation that sg->length does meet our
criteria, considering that we set the page_sizes a long time ago.
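
A sketch of what such a confirmation could look like once the page size has
been chosen. The sketch_* names are hypothetical, assert() stands in for
GEM_BUG_ON(), and only the 1G/4K choice from this patch is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_4K	(1ull << 12)
#define SKETCH_SZ_1G	(1ull << 30)

/*
 * Pick the page size for the next run of entries. 1G is used only if the
 * sg table advertises it, the dma address and GTT offset are both 1G
 * aligned, and at least 1G remains in the current sg entry. Otherwise we
 * downgrade to 4K.
 */
static uint64_t sketch_pick_page_size(uint64_t sg_mask, uint64_t dma,
				      uint64_t rem, uint64_t gtt_offset)
{
	uint64_t page_size = SKETCH_SZ_4K;

	if ((sg_mask & SKETCH_SZ_1G) &&
	    (dma % SKETCH_SZ_1G) == 0 &&
	    (gtt_offset % SKETCH_SZ_1G) == 0 &&
	    rem >= SKETCH_SZ_1G)
		page_size = SKETCH_SZ_1G;

	/* The confirmation: whatever we chose must fit in what remains. */
	assert(rem >= page_size);

	return page_size;
}
```

Placed after the decision rather than right after __sg_next(), the check no
longer trips on the expected downgrade case.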
-Chris

* Re: [PATCH 15/19] drm/i915/selftests: basic huge page tests
  2017-06-21 20:33 ` [PATCH 15/19] drm/i915/selftests: basic huge page tests Matthew Auld
@ 2017-06-22 14:17   ` Chris Wilson
  2017-06-22 14:21   ` Chris Wilson
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-22 14:17 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:41)
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

From my read through, we are only doing ppgtt operations. I think it will
be useful to try this as a mock test (as well) so that we can exercise
unlikely combinations of device support for page sizes.
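
As a rough illustration of what "combinations" could mean, a mock test might
walk every supported-size mask. This is a sketch only: the sketch_* names are
hypothetical and it assumes 4K is always supported while 64K, 2M and 1G are
each optional.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SZ_4K	(1u << 12)
#define SKETCH_SZ_64K	(1u << 16)
#define SKETCH_SZ_2M	(1u << 21)
#define SKETCH_SZ_1G	(1u << 30)

/*
 * Fill out[] with every page size mask a mock device could advertise:
 * 4K plus each subset of the three optional sizes (8 masks in total).
 * Returns the number of masks written.
 */
static size_t sketch_page_size_combinations(uint32_t *out, size_t cap)
{
	static const uint32_t optional[] = {
		SKETCH_SZ_64K, SKETCH_SZ_2M, SKETCH_SZ_1G,
	};
	size_t n = 0;
	unsigned int bits, i;

	for (bits = 0; bits < 8; bits++) {
		uint32_t mask = SKETCH_SZ_4K;

		for (i = 0; i < 3; i++)
			if (bits & (1u << i))
				mask |= optional[i];

		assert(n < cap);
		out[n++] = mask;
	}

	return n;
}
```

Each mask would then be fed to the mock device before running the same ppgtt
insertion checks, including combinations no current hardware ships with.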

> diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> index 18b174d855ca..6b49cbb49535 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> @@ -15,5 +15,6 @@ selftest(objects, i915_gem_object_live_selftests)
>  selftest(dmabuf, i915_gem_dmabuf_live_selftests)
>  selftest(coherency, i915_gem_coherency_live_selftests)
>  selftest(gtt, i915_gem_gtt_live_selftests)
> +selftest(huge, i915_gem_huge_page_live_selftests)

Call it hugepages. huge is a little too vague.
s/fake_huge_paged_object/fake_huge_pages_object/
-Chris

* Re: [PATCH 15/19] drm/i915/selftests: basic huge page tests
  2017-06-21 20:33 ` [PATCH 15/19] drm/i915/selftests: basic huge page tests Matthew Auld
  2017-06-22 14:17   ` Chris Wilson
@ 2017-06-22 14:21   ` Chris Wilson
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Wilson @ 2017-06-22 14:21 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2017-06-21 21:33:41)
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

I will also suggest that we combine this with some MI_STORE_DWORD tests
to check the hw uses the pages correctly. Something like igt_ctx_exec,
but not quite so insane.
-Chris

end of thread, other threads:[~2017-06-22 14:21 UTC | newest]

Thread overview: 36+ messages
2017-06-21 20:33 [PATCH 00/19] huge gtt pages Matthew Auld
2017-06-21 20:33 ` [PATCH 01/19] drm/i915: introduce simple gemfs Matthew Auld
2017-06-21 21:19   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 02/19] drm/i915/gemfs: enable THP Matthew Auld
2017-06-21 20:33 ` [PATCH 03/19] drm/i915: introduce page_size_mask to dev_info Matthew Auld
2017-06-21 20:33 ` [PATCH 04/19] drm/i915: introduce page_size members Matthew Auld
2017-06-21 21:26   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 05/19] drm/i915: align the vma start to the largest gtt page size Matthew Auld
2017-06-21 21:35   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 06/19] drm/i915: align 64K objects to 2M Matthew Auld
2017-06-21 21:37   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 07/19] drm/i915: pass the vma to insert_entries Matthew Auld
2017-06-21 21:39   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 08/19] drm/i915: enable IPS bit for 64K pages Matthew Auld
2017-06-21 20:33 ` [PATCH 09/19] drm/i915: disable GTT cache for 2M/1G pages Matthew Auld
2017-06-21 21:41   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 10/19] drm/i915: support 1G pages for the 48b PPGTT Matthew Auld
2017-06-21 21:49   ` Chris Wilson
2017-06-21 22:51     ` Chris Wilson
2017-06-22 11:07       ` Matthew Auld
2017-06-22 11:38         ` Chris Wilson
2017-06-21 20:33 ` [PATCH 11/19] drm/i915: support 2M " Matthew Auld
2017-06-21 20:33 ` [PATCH 12/19] drm/i915: support 64K " Matthew Auld
2017-06-21 21:55   ` Chris Wilson
2017-06-22 11:27     ` Matthew Auld
2017-06-21 20:33 ` [PATCH 13/19] drm/i915: accurate page size tracking for the ppgtt Matthew Auld
2017-06-21 21:57   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 14/19] drm/i915/debugfs: include some gtt page size metrics Matthew Auld
2017-06-21 20:33 ` [PATCH 15/19] drm/i915/selftests: basic huge page tests Matthew Auld
2017-06-22 14:17   ` Chris Wilson
2017-06-22 14:21   ` Chris Wilson
2017-06-21 20:33 ` [PATCH 16/19] drm/i915/selftests: mix huge pages Matthew Auld
2017-06-21 20:33 ` [PATCH 17/19] drm/i915: enable platform support for 64K pages Matthew Auld
2017-06-21 20:33 ` [PATCH 18/19] drm/i915: enable platform support for 2M pages Matthew Auld
2017-06-21 20:33 ` [PATCH 19/19] drm/i915: enable platform support for 1G pages Matthew Auld
2017-06-21 21:05 ` ✓ Fi.CI.BAT: success for huge gtt pages (rev2) Patchwork
