* [PATCH v2 00/37] Introduce memory region concept (including device local memory)
@ 2019-06-27 20:55 Matthew Auld
  2019-06-27 20:55 ` [PATCH v2 01/37] drm/i915: buddy allocator Matthew Auld
                   ` (39 more replies)
  0 siblings, 40 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:55 UTC (permalink / raw)
  To: intel-gfx

In preparation for upcoming devices with device local memory, introduce the
concept of different memory regions, and a simple buddy allocator to manage
them.

One of the concerns raised against v1 was that it didn't use enough of TTM,
which is a fair criticism. Getting better alignment there is something we are
investigating, but that work is still WIP. In the meantime, v2 continues to
push more of the low-level details forward, just not yet the TTM interactions.

Abdiel Janulgue (11):
  drm/i915: Add memory region information to device_info
  drm/i915: setup io-mapping for LMEM
  drm/i915/lmem: support kernel mapping
  drm/i915: enumerate and init each supported region
  drm/i915: Allow i915 to manage the vma offset nodes instead of drm
    core
  drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  drm/i915/lmem: add helper to get CPU accessible offset
  drm/i915: Add cpu and lmem fault handlers
  drm/i915: cpu-map based dumb buffers
  drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
  drm/i915/query: Expose memory regions through the query uAPI

Daniele Ceraolo Spurio (5):
  drm/i915: define HAS_MAPPABLE_APERTURE
  drm/i915: do not map aperture if it is not available.
  drm/i915: expose missing map_gtt support to users
  drm/i915: set num_fence_regs to 0 if there is no aperture
  drm/i915: error capture with no ggtt slot

Matthew Auld (20):
  drm/i915: buddy allocator
  drm/i915: introduce intel_memory_region
  drm/i915/region: support basic eviction
  drm/i915/region: support continuous allocations
  drm/i915/region: support volatile objects
  drm/i915: support creating LMEM objects
  drm/i915/blt: support copying objects
  drm/i915/selftests: move gpu-write-dw into utils
  drm/i915/selftests: add write-dword test for LMEM
  drm/i915/selftests: don't just test CACHE_NONE for huge-pages
  drm/i915/selftest: extend coverage to include LMEM huge-pages
  drm/i915/lmem: support CPU relocations
  drm/i915/lmem: support pread
  drm/i915/lmem: support pwrite
  drm/i915: treat shmem as a region
  drm/i915: treat stolen as a region
  drm/i915/selftests: check for missing aperture
  drm/i915: support basic object migration
  HAX drm/i915: add the fake lmem region
  HAX drm/i915/lmem: default userspace allocations to LMEM

Michal Wajdeczko (1):
  drm/i915: Don't try to place HWS in non-existing mappable region

 drivers/gpu/drm/i915/Makefile                 |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  12 +
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   2 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  69 +-
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   4 +
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 370 +++++++-
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 278 ++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  27 +-
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 135 +++
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  44 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  20 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  67 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  66 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 329 +++++---
 .../i915/gem/selftests/i915_gem_coherency.c   |   5 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 134 +--
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  15 +-
 .../i915/gem/selftests/i915_gem_object_blt.c  | 105 +++
 .../drm/i915/gem/selftests/igt_gem_utils.c    | 135 +++
 .../drm/i915/gem/selftests/igt_gem_utils.h    |  16 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   2 +-
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   3 +-
 drivers/gpu/drm/i915/gt/intel_reset.c         |  19 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  14 +-
 drivers/gpu/drm/i915/i915_buddy.c             | 413 +++++++++
 drivers/gpu/drm/i915/i915_buddy.h             | 115 +++
 drivers/gpu/drm/i915/i915_drv.c               |  38 +-
 drivers/gpu/drm/i915/i915_drv.h               |  25 +-
 drivers/gpu/drm/i915/i915_gem.c               |  75 +-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c     |   6 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 119 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  63 +-
 drivers/gpu/drm/i915/i915_params.c            |   3 +
 drivers/gpu/drm/i915/i915_params.h            |   3 +-
 drivers/gpu/drm/i915/i915_pci.c               |  29 +-
 drivers/gpu/drm/i915/i915_query.c             |  57 ++
 drivers/gpu/drm/i915/i915_vma.c               |  21 +-
 drivers/gpu/drm/i915/intel_device_info.h      |   1 +
 drivers/gpu/drm/i915/intel_memory_region.c    | 321 +++++++
 drivers/gpu/drm/i915/intel_memory_region.h    | 121 +++
 drivers/gpu/drm/i915/intel_region_lmem.c      | 404 +++++++++
 drivers/gpu/drm/i915/intel_region_lmem.h      |  30 +
 drivers/gpu/drm/i915/selftests/i915_buddy.c   | 491 +++++++++++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   3 +
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 .../drm/i915/selftests/i915_mock_selftests.h  |   2 +
 .../drm/i915/selftests/intel_memory_region.c  | 792 ++++++++++++++++++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |   9 +-
 drivers/gpu/drm/i915/selftests/mock_region.c  |  59 ++
 drivers/gpu/drm/i915/selftests/mock_region.h  |  16 +
 include/uapi/drm/i915_drm.h                   |  97 +++
 52 files changed, 4780 insertions(+), 416 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 create mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 create mode 100644 drivers/gpu/drm/i915/intel_memory_region.c
 create mode 100644 drivers/gpu/drm/i915/intel_memory_region.h
 create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.c
 create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.h
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c
 create mode 100644 drivers/gpu/drm/i915/selftests/intel_memory_region.c
 create mode 100644 drivers/gpu/drm/i915/selftests/mock_region.c
 create mode 100644 drivers/gpu/drm/i915/selftests/mock_region.h

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v2 01/37] drm/i915: buddy allocator
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
@ 2019-06-27 20:55 ` Matthew Auld
  2019-06-27 22:28   ` Chris Wilson
  2019-06-28  9:35   ` Chris Wilson
  2019-06-27 20:55 ` [PATCH v2 02/37] drm/i915: introduce intel_memory_region Matthew Auld
                   ` (38 subsequent siblings)
  39 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:55 UTC (permalink / raw)
  To: intel-gfx

Simple buddy allocator. We want to allocate properly aligned
power-of-two blocks to promote usage of huge-pages for the GTT, so 64K,
2M and possibly even 1G. While we do support allocating at a specific
offset, that is intended more for preallocating portions of the address
space, say for an initial framebuffer; for other uses drm_mm is probably
a much better fit. Anyway, hopefully this can all be thrown away if we
eventually move to having the core MM manage device memory.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/i915_buddy.c             | 413 +++++++++++++++
 drivers/gpu/drm/i915/i915_buddy.h             | 115 ++++
 drivers/gpu/drm/i915/selftests/i915_buddy.c   | 491 ++++++++++++++++++
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
 5 files changed, 1021 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 create mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3bd8f0349a8a..cb66cf1a5a10 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -117,6 +117,7 @@ gem-y += \
 i915-y += \
 	  $(gem-y) \
 	  i915_active.o \
+	  i915_buddy.o \
 	  i915_cmd_parser.o \
 	  i915_gem_batch_pool.o \
 	  i915_gem_evict.o \
diff --git a/drivers/gpu/drm/i915/i915_buddy.c b/drivers/gpu/drm/i915/i915_buddy.c
new file mode 100644
index 000000000000..c0ac68d71d94
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_buddy.c
@@ -0,0 +1,413 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/slab.h>
+#include <linux/list.h>
+
+#include "i915_buddy.h"
+
+#include "i915_gem.h"
+#include "i915_utils.h"
+
+static void mark_allocated(struct i915_buddy_block *block)
+{
+	block->header &= ~I915_BUDDY_HEADER_STATE;
+	block->header |= I915_BUDDY_ALLOCATED;
+
+	list_del_init(&block->link);
+}
+
+static void mark_free(struct i915_buddy_mm *mm,
+		      struct i915_buddy_block *block)
+{
+	block->header &= ~I915_BUDDY_HEADER_STATE;
+	block->header |= I915_BUDDY_FREE;
+
+	list_add(&block->link,
+		 &mm->free_list[i915_buddy_block_order(block)]);
+}
+
+static void mark_split(struct i915_buddy_block *block)
+{
+	block->header &= ~I915_BUDDY_HEADER_STATE;
+	block->header |= I915_BUDDY_SPLIT;
+
+	list_del_init(&block->link);
+}
+
+int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 min_size)
+{
+	unsigned int i;
+	u64 offset;
+
+	if (size < min_size)
+		return -EINVAL;
+
+	if (min_size < PAGE_SIZE)
+		return -EINVAL;
+
+	if (!is_power_of_2(min_size))
+		return -EINVAL;
+
+	size = round_down(size, min_size);
+
+	mm->size = size;
+	mm->min_size = min_size;
+	mm->max_order = ilog2(rounddown_pow_of_two(size)) - ilog2(min_size);
+
+	GEM_BUG_ON(mm->max_order > I915_BUDDY_MAX_ORDER);
+
+	mm->free_list = kmalloc_array(mm->max_order + 1,
+				      sizeof(struct list_head),
+				      GFP_KERNEL);
+	if (!mm->free_list)
+		return -ENOMEM;
+
+	for (i = 0; i <= mm->max_order; ++i)
+		INIT_LIST_HEAD(&mm->free_list[i]);
+
+	mm->blocks = KMEM_CACHE(i915_buddy_block, SLAB_HWCACHE_ALIGN);
+	if (!mm->blocks)
+		goto out_free_list;
+
+	mm->n_roots = hweight64(size);
+
+	mm->roots = kmalloc_array(mm->n_roots,
+				  sizeof(struct i915_buddy_block *),
+				  GFP_KERNEL);
+	if (!mm->roots)
+		goto out_free_blocks;
+
+	offset = 0;
+	i = 0;
+
+	/*
+	 * Split into power-of-two blocks, in case we are given a size that is
+	 * not itself a power-of-two.
+	 */
+	do {
+		struct i915_buddy_block *root;
+		unsigned int order;
+		u64 root_size;
+
+		root = kmem_cache_zalloc(mm->blocks, GFP_KERNEL);
+		if (!root)
+			goto out_free_roots;
+
+		root_size = rounddown_pow_of_two(size);
+		order = ilog2(root_size) - ilog2(min_size);
+
+		root->header = offset;
+		root->header |= order;
+
+		mark_free(mm, root);
+
+		GEM_BUG_ON(i > mm->max_order);
+		GEM_BUG_ON(i915_buddy_block_size(mm, root) < min_size);
+
+		mm->roots[i] = root;
+
+		offset += root_size;
+		size -= root_size;
+		i++;
+	} while (size);
+
+	return 0;
+
+out_free_roots:
+	while (i--)
+		kmem_cache_free(mm->blocks, mm->roots[i]);
+	kfree(mm->roots);
+out_free_blocks:
+	kmem_cache_destroy(mm->blocks);
+out_free_list:
+	kfree(mm->free_list);
+	return -ENOMEM;
+}
+
+void i915_buddy_fini(struct i915_buddy_mm *mm)
+{
+	int err = 0;
+	int i;
+
+	for (i = 0; i < mm->n_roots; ++i) {
+		if (!i915_buddy_block_free(mm->roots[i])) {
+			err = -EBUSY;
+			continue;
+		}
+
+		kmem_cache_free(mm->blocks, mm->roots[i]);
+	}
+
+	/*
+	 * XXX: Better to leak memory for now than hit a potential use-after-free
+	 */
+	if (WARN_ON(err))
+		return;
+
+	kfree(mm->roots);
+	kfree(mm->free_list);
+	kmem_cache_destroy(mm->blocks);
+}
+
+static int split_block(struct i915_buddy_mm *mm,
+		       struct i915_buddy_block *block)
+{
+	unsigned int order = i915_buddy_block_order(block);
+	u64 offset = i915_buddy_block_offset(block);
+
+	GEM_BUG_ON(!i915_buddy_block_free(block));
+	GEM_BUG_ON(!order);
+
+	block->left = kmem_cache_zalloc(mm->blocks, GFP_KERNEL);
+	if (!block->left)
+		return -ENOMEM;
+
+	block->left->header = offset;
+	block->left->header |= order - 1;
+
+	block->left->parent = block;
+
+	INIT_LIST_HEAD(&block->left->link);
+	INIT_LIST_HEAD(&block->left->tmp_link);
+
+	block->right = kmem_cache_zalloc(mm->blocks, GFP_KERNEL);
+	if (!block->right) {
+		kmem_cache_free(mm->blocks, block->left);
+		return -ENOMEM;
+	}
+
+	block->right->header = offset + (BIT_ULL(order - 1) * mm->min_size);
+	block->right->header |= order - 1;
+
+	block->right->parent = block;
+
+	INIT_LIST_HEAD(&block->right->link);
+	INIT_LIST_HEAD(&block->right->tmp_link);
+
+	mark_free(mm, block->left);
+	mark_free(mm, block->right);
+
+	mark_split(block);
+
+	return 0;
+}
+
+static struct i915_buddy_block *
+get_buddy(struct i915_buddy_block *block)
+{
+	struct i915_buddy_block *parent;
+
+	parent = block->parent;
+	if (!parent)
+		return NULL;
+
+	if (parent->left == block)
+		return parent->right;
+
+	return parent->left;
+}
+
+static void __i915_buddy_free(struct i915_buddy_mm *mm,
+			      struct i915_buddy_block *block)
+{
+	list_del_init(&block->link); /* We have ownership now */
+
+	while (block->parent) {
+		struct i915_buddy_block *buddy;
+
+		buddy = get_buddy(block);
+
+		if (!i915_buddy_block_free(buddy))
+			break;
+
+		list_del(&buddy->link);
+
+		kmem_cache_free(mm->blocks, block);
+		kmem_cache_free(mm->blocks, buddy);
+
+		block = block->parent;
+	}
+
+	mark_free(mm, block);
+}
+
+void i915_buddy_free(struct i915_buddy_mm *mm,
+		     struct i915_buddy_block *block)
+{
+	GEM_BUG_ON(!i915_buddy_block_allocated(block));
+	__i915_buddy_free(mm, block);
+}
+
+void i915_buddy_free_list(struct i915_buddy_mm *mm,
+			      struct list_head *objects)
+{
+	struct i915_buddy_block *block, *on;
+
+	list_for_each_entry_safe(block, on, objects, link)
+		i915_buddy_free(mm, block);
+}
+
+/*
+ * Allocate power-of-two block. The order value here translates to:
+ *
+ *   0 = 2^0 * mm->min_size
+ *   1 = 2^1 * mm->min_size
+ *   2 = 2^2 * mm->min_size
+ *   ...
+ */
+struct i915_buddy_block *
+i915_buddy_alloc(struct i915_buddy_mm *mm, unsigned int order)
+{
+	struct i915_buddy_block *block = NULL;
+	unsigned int i;
+	int err;
+
+	for (i = order; i <= mm->max_order; ++i) {
+		block = list_first_entry_or_null(&mm->free_list[i],
+						 struct i915_buddy_block,
+						 link);
+		if (block)
+			break;
+	}
+
+	if (!block)
+		return ERR_PTR(-ENOSPC);
+
+	GEM_BUG_ON(!i915_buddy_block_free(block));
+
+	while (i != order) {
+		err = split_block(mm, block);
+		if (unlikely(err))
+			goto out_free;
+
+		/* Go low */
+		block = block->left;
+		i--;
+	}
+
+	mark_allocated(block);
+	return block;
+
+out_free:
+	__i915_buddy_free(mm, block);
+	return ERR_PTR(err);
+}
+
+static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+	return s1 <= e2 && e1 >= s2;
+}
+
+static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+	return s1 <= s2 && e1 >= e2;
+}
+
+/*
+ * Allocate range. Note that it's safe to chain together multiple alloc_ranges
+ * with the same blocks list.
+ *
+ * Intended for pre-allocating portions of the address space, for example to
+ * reserve a block for the initial framebuffer or similar, hence the expectation
+ * here is that i915_buddy_alloc() is still the main vehicle for
+ * allocations, so if that's not the case then the drm_mm range allocator is
+ * probably a much better fit, and so you should probably go use that instead.
+ */
+int i915_buddy_alloc_range(struct i915_buddy_mm *mm,
+			   struct list_head *blocks,
+			   u64 start, u64 size)
+{
+	struct i915_buddy_block *block;
+	struct i915_buddy_block *buddy;
+	LIST_HEAD(allocated);
+	LIST_HEAD(dfs);
+	u64 end;
+	int err;
+	int i;
+
+	if (size < mm->min_size)
+		return -EINVAL;
+
+	if (!IS_ALIGNED(start, mm->min_size))
+		return -EINVAL;
+
+	if (!size || !IS_ALIGNED(size, mm->min_size))
+		return -EINVAL;
+
+	if (range_overflows(start, size, mm->size))
+		return -EINVAL;
+
+	for (i = 0; i < mm->n_roots; ++i)
+		list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+	end = start + size - 1;
+
+	do {
+		u64 block_start;
+		u64 block_end;
+
+		block = list_first_entry_or_null(&dfs,
+						 struct i915_buddy_block,
+						 tmp_link);
+		if (!block)
+			break;
+
+		list_del(&block->tmp_link);
+
+		block_start = i915_buddy_block_offset(block);
+		block_end = block_start + i915_buddy_block_size(mm, block) - 1;
+
+		if (!overlaps(start, end, block_start, block_end))
+			continue;
+
+		if (i915_buddy_block_allocated(block)) {
+			err = -ENOSPC;
+			goto err_free;
+		}
+
+		if (contains(start, end, block_start, block_end)) {
+			if (!i915_buddy_block_free(block)) {
+				err = -ENOSPC;
+				goto err_free;
+			}
+
+			mark_allocated(block);
+			list_add_tail(&block->link, &allocated);
+			continue;
+		}
+
+		if (!i915_buddy_block_split(block)) {
+			err = split_block(mm, block);
+			if (unlikely(err))
+				goto err_undo;
+		}
+
+		list_add(&block->right->tmp_link, &dfs);
+		list_add(&block->left->tmp_link, &dfs);
+	} while (1);
+
+	list_splice_tail(&allocated, blocks);
+	return 0;
+
+err_undo:
+	/*
+	 * We really don't want to leave around a bunch of split blocks, since
+	 * bigger is better, so make sure we merge everything back before we
+	 * free the allocated blocks.
+	 */
+	buddy = get_buddy(block);
+	if (buddy && (i915_buddy_block_free(block) &&
+	    i915_buddy_block_free(buddy)))
+		__i915_buddy_free(mm, block);
+
+err_free:
+	i915_buddy_free_list(mm, &allocated);
+	return err;
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftests/i915_buddy.c"
+#endif
diff --git a/drivers/gpu/drm/i915/i915_buddy.h b/drivers/gpu/drm/i915/i915_buddy.h
new file mode 100644
index 000000000000..615eecd7cf4a
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_buddy.h
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_BUDDY_H__
+#define __I915_BUDDY_H__
+
+#include <linux/bitops.h>
+
+struct list_head;
+
+struct i915_buddy_block {
+#define I915_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
+#define I915_BUDDY_HEADER_STATE  GENMASK_ULL(11, 10)
+#define   I915_BUDDY_ALLOCATED (1<<10)
+#define   I915_BUDDY_FREE	   (2<<10)
+#define   I915_BUDDY_SPLIT	   (3<<10)
+#define I915_BUDDY_HEADER_ORDER  GENMASK_ULL(9, 0)
+	u64 header;
+
+	struct i915_buddy_block *left;
+	struct i915_buddy_block *right;
+	struct i915_buddy_block *parent;
+
+	/* XXX: somewhat funky */
+	struct list_head link;
+	struct list_head tmp_link;
+};
+
+#define I915_BUDDY_MAX_ORDER  I915_BUDDY_HEADER_ORDER
+
+/* Binary Buddy System */
+struct i915_buddy_mm {
+	struct kmem_cache *blocks;
+
+	/* Maintain a free list for each order. */
+	struct list_head *free_list;
+
+	/*
+	 * Maintain explicit binary tree(s) to track the allocation of the
+	 * address space. This gives us a simple way of finding a buddy block
+	 * and performing the potentially recursive merge step when freeing a
+	 * block.  Nodes are either allocated or free, in which case they will
+	 * also exist on the respective free list.
+	 */
+	struct i915_buddy_block **roots;
+
+	unsigned int n_roots;
+	unsigned int max_order;
+
+	/* Must be at least PAGE_SIZE */
+	u64 min_size;
+	u64 size;
+};
+
+static inline u64
+i915_buddy_block_offset(struct i915_buddy_block *block)
+{
+	return block->header & I915_BUDDY_HEADER_OFFSET;
+}
+
+static inline unsigned int
+i915_buddy_block_order(struct i915_buddy_block *block)
+{
+	return block->header & I915_BUDDY_HEADER_ORDER;
+}
+
+static inline unsigned int
+i915_buddy_block_state(struct i915_buddy_block *block)
+{
+	return block->header & I915_BUDDY_HEADER_STATE;
+}
+
+static inline bool
+i915_buddy_block_allocated(struct i915_buddy_block *block)
+{
+	return i915_buddy_block_state(block) == I915_BUDDY_ALLOCATED;
+}
+
+static inline bool
+i915_buddy_block_free(struct i915_buddy_block *block)
+{
+	return i915_buddy_block_state(block) == I915_BUDDY_FREE;
+}
+
+static inline bool
+i915_buddy_block_split(struct i915_buddy_block *block)
+{
+	return i915_buddy_block_state(block) == I915_BUDDY_SPLIT;
+}
+
+static inline u64
+i915_buddy_block_size(struct i915_buddy_mm *mm,
+		      struct i915_buddy_block *block)
+{
+	return BIT_ULL(i915_buddy_block_order(block)) * mm->min_size;
+}
+
+int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 min_size);
+
+void i915_buddy_fini(struct i915_buddy_mm *mm);
+
+struct i915_buddy_block *
+i915_buddy_alloc(struct i915_buddy_mm *mm, unsigned int order);
+
+int i915_buddy_alloc_range(struct i915_buddy_mm *mm,
+			   struct list_head *blocks,
+			   u64 start, u64 size);
+
+void i915_buddy_free(struct i915_buddy_mm *mm, struct i915_buddy_block *block);
+
+void i915_buddy_free_list(struct i915_buddy_mm *mm, struct list_head *objects);
+
+#endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_buddy.c b/drivers/gpu/drm/i915/selftests/i915_buddy.c
new file mode 100644
index 000000000000..2159aa9f4867
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/i915_buddy.c
@@ -0,0 +1,491 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/prime_numbers.h>
+
+#include "../i915_selftest.h"
+#include "i915_random.h"
+
+#define SZ_8G (1ULL << 33)
+
+static void __igt_dump_block(struct i915_buddy_mm *mm,
+			     struct i915_buddy_block *block,
+			     bool buddy)
+{
+	pr_err("block info: header=%llx, state=%u, order=%d, offset=%llx size=%llx root=%s buddy=%s\n",
+	       block->header,
+	       i915_buddy_block_state(block),
+	       i915_buddy_block_order(block),
+	       i915_buddy_block_offset(block),
+	       i915_buddy_block_size(mm, block),
+	       yesno(!block->parent),
+	       yesno(buddy));
+}
+
+static void igt_dump_block(struct i915_buddy_mm *mm,
+			   struct i915_buddy_block *block)
+{
+	struct i915_buddy_block *buddy;
+
+	__igt_dump_block(mm, block, false);
+
+	buddy = get_buddy(block);
+	if (buddy)
+		__igt_dump_block(mm, buddy, true);
+}
+
+static int igt_check_block(struct i915_buddy_mm *mm,
+			   struct i915_buddy_block *block)
+{
+	struct i915_buddy_block *buddy;
+	unsigned int block_state;
+	u64 block_size;
+	u64 offset;
+	int err = 0;
+
+	block_state = i915_buddy_block_state(block);
+
+	if (block_state != I915_BUDDY_ALLOCATED &&
+	    block_state != I915_BUDDY_FREE &&
+	    block_state != I915_BUDDY_SPLIT) {
+		pr_err("block state mismatch\n");
+		err = -EINVAL;
+	}
+
+	block_size = i915_buddy_block_size(mm, block);
+	offset = i915_buddy_block_offset(block);
+
+	if (block_size < mm->min_size) {
+		pr_err("block size smaller than min size\n");
+		err = -EINVAL;
+	}
+
+	if (!is_power_of_2(block_size)) {
+		pr_err("block size not power of two\n");
+		err = -EINVAL;
+	}
+
+	if (!IS_ALIGNED(block_size, mm->min_size)) {
+		pr_err("block size not aligned to min size\n");
+		err = -EINVAL;
+	}
+
+	if (!IS_ALIGNED(offset, mm->min_size)) {
+		pr_err("block offset not aligned to min size\n");
+		err = -EINVAL;
+	}
+
+	if (!IS_ALIGNED(offset, block_size)) {
+		pr_err("block offset not aligned to block size\n");
+		err = -EINVAL;
+	}
+
+	buddy = get_buddy(block);
+
+	if (!buddy && block->parent) {
+		pr_err("buddy has gone fishing\n");
+		err = -EINVAL;
+	}
+
+	if (buddy) {
+		if (i915_buddy_block_offset(buddy) != (offset ^ block_size)) {
+			pr_err("buddy has wrong offset\n");
+			err = -EINVAL;
+		}
+
+		if (i915_buddy_block_size(mm, buddy) != block_size) {
+			pr_err("buddy size mismatch\n");
+			err = -EINVAL;
+		}
+
+		if (i915_buddy_block_state(buddy) == block_state &&
+		    block_state == I915_BUDDY_FREE) {
+			pr_err("block and its buddy are free\n");
+			err = -EINVAL;
+		}
+	}
+
+	return err;
+}
+
+static int igt_check_blocks(struct i915_buddy_mm *mm,
+			    struct list_head *blocks,
+			    u64 expected_size,
+			    bool is_contiguous)
+{
+	struct i915_buddy_block *block;
+	struct i915_buddy_block *prev;
+	u64 total;
+	int err = 0;
+
+	block = NULL;
+	prev = NULL;
+	total = 0;
+
+	list_for_each_entry(block, blocks, link) {
+		err = igt_check_block(mm, block);
+
+		if (!i915_buddy_block_allocated(block)) {
+			pr_err("block not allocated\n");
+			err = -EINVAL;
+		}
+
+		if (is_contiguous && prev) {
+			u64 prev_block_size;
+			u64 prev_offset;
+			u64 offset;
+
+			prev_offset = i915_buddy_block_offset(prev);
+			prev_block_size = i915_buddy_block_size(mm, prev);
+			offset = i915_buddy_block_offset(block);
+
+			if (offset != (prev_offset + prev_block_size)) {
+				pr_err("block offset mismatch\n");
+				err = -EINVAL;
+			}
+		}
+
+		if (err)
+			break;
+
+		total += i915_buddy_block_size(mm, block);
+		prev = block;
+	}
+
+	if (!err) {
+		if (total != expected_size) {
+			pr_err("size mismatch, expected=%llx, found=%llx\n",
+			       expected_size, total);
+			err = -EINVAL;
+		}
+		return err;
+	}
+
+	if (prev) {
+		pr_err("prev block, dump:\n");
+		igt_dump_block(mm, prev);
+	}
+
+	if (block) {
+		pr_err("bad block, dump:\n");
+		igt_dump_block(mm, block);
+	}
+
+	return err;
+}
+
+static int igt_check_mm(struct i915_buddy_mm *mm)
+{
+	struct i915_buddy_block *root;
+	struct i915_buddy_block *prev;
+	unsigned int i;
+	u64 total;
+	int err = 0;
+
+	if (!mm->n_roots) {
+		pr_err("n_roots is zero\n");
+		return -EINVAL;
+	}
+
+	if (mm->n_roots != hweight64(mm->size)) {
+		pr_err("n_roots mismatch, n_roots=%u, expected=%lu\n",
+		       mm->n_roots, hweight64(mm->size));
+		return -EINVAL;
+	}
+
+	root = NULL;
+	prev = NULL;
+	total = 0;
+
+	for (i = 0; i < mm->n_roots; ++i) {
+		struct i915_buddy_block *block;
+		unsigned int order;
+
+		root = mm->roots[i];
+		if (!root) {
+			pr_err("root(%u) is NULL\n", i);
+			err = -EINVAL;
+			break;
+		}
+
+		err = igt_check_block(mm, root);
+
+		if (!i915_buddy_block_free(root)) {
+			pr_err("root not free\n");
+			err = -EINVAL;
+		}
+
+		order = i915_buddy_block_order(root);
+
+		if (!i) {
+			if (order != mm->max_order) {
+				pr_err("max order root missing\n");
+				err = -EINVAL;
+			}
+		}
+
+		if (prev) {
+			u64 prev_block_size;
+			u64 prev_offset;
+			u64 offset;
+
+			prev_offset = i915_buddy_block_offset(prev);
+			prev_block_size = i915_buddy_block_size(mm, prev);
+			offset = i915_buddy_block_offset(root);
+
+			if (offset != (prev_offset + prev_block_size)) {
+				pr_err("root offset mismatch\n");
+				err = -EINVAL;
+			}
+		}
+
+		block = list_first_entry_or_null(&mm->free_list[order],
+						 struct i915_buddy_block,
+						 link);
+		if (block != root) {
+			pr_err("root mismatch at order=%u\n", order);
+			err = -EINVAL;
+		}
+
+		if (err)
+			break;
+
+		prev = root;
+		total += i915_buddy_block_size(mm, root);
+	}
+
+	if (!err) {
+		if (total != mm->size) {
+			pr_err("expected mm size=%llx, found=%llx\n", mm->size,
+			       total);
+			err = -EINVAL;
+		}
+		return err;
+	}
+
+	if (prev) {
+		pr_err("prev root(%u), dump:\n", i - 1);
+		igt_dump_block(mm, prev);
+	}
+
+	if (root) {
+		pr_err("bad root(%u), dump:\n", i);
+		igt_dump_block(mm, root);
+	}
+
+	return err;
+}
+
+static void igt_mm_config(u64 *size, u64 *min_size)
+{
+	I915_RND_STATE(prng);
+	u64 s, ms;
+
+	/* Nothing fancy, just try to get an interesting bit pattern */
+
+	prandom_seed_state(&prng, i915_selftest.random_seed);
+
+	s = i915_prandom_u64_state(&prng) & (SZ_8G - 1);
+	ms = BIT_ULL(12 + (prandom_u32_state(&prng) % ilog2(s >> 12)));
+	s = max(s & -ms, ms);
+
+	*min_size = ms;
+	*size = s;
+}
+
+static int igt_buddy_alloc(void *arg)
+{
+	struct i915_buddy_mm mm;
+	int max_order;
+	u64 min_size;
+	u64 mm_size;
+	int err;
+
+	igt_mm_config(&mm_size, &min_size);
+
+	pr_info("buddy_init with size=%llx, min_size=%llx\n", mm_size, min_size);
+
+	err = i915_buddy_init(&mm, mm_size, min_size);
+	if (err) {
+		pr_err("buddy_init failed(%d)\n", err);
+		return err;
+	}
+
+	for (max_order = mm.max_order; max_order >= 0; max_order--) {
+		struct i915_buddy_block *block;
+		int order;
+		LIST_HEAD(blocks);
+		u64 total;
+
+		err = igt_check_mm(&mm);
+		if (err) {
+			pr_err("pre-mm check failed, abort\n");
+			break;
+		}
+
+		pr_info("filling from max_order=%u\n", max_order);
+
+		order = max_order;
+		total = 0;
+
+		do {
+retry:
+			block = i915_buddy_alloc(&mm, order);
+			if (IS_ERR(block)) {
+				err = PTR_ERR(block);
+				if (err == -ENOMEM) {
+					pr_info("buddy_alloc hit -ENOMEM with order=%d\n",
+						order);
+				} else {
+					if (order--) {
+						err = 0;
+						goto retry;
+					}
+
+					pr_err("buddy_alloc with order=%d failed(%d)\n",
+					       order, err);
+				}
+
+				break;
+			}
+
+			list_add_tail(&block->link, &blocks);
+
+			if (i915_buddy_block_order(block) != order) {
+				pr_err("buddy_alloc order mismatch\n");
+				err = -EINVAL;
+				break;
+			}
+
+			total += i915_buddy_block_size(&mm, block);
+		} while (total < mm.size);
+
+		if (!err)
+			err = igt_check_blocks(&mm, &blocks, total, false);
+
+		i915_buddy_free_list(&mm, &blocks);
+
+		if (!err) {
+			err = igt_check_mm(&mm);
+			if (err)
+				pr_err("post-mm check failed\n");
+		}
+
+		if (err)
+			break;
+	}
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	i915_buddy_fini(&mm);
+
+	return err;
+}
+
+static int igt_buddy_alloc_range(void *arg)
+{
+	struct i915_buddy_mm mm;
+	unsigned long page_num;
+	LIST_HEAD(blocks);
+	u64 min_size;
+	u64 offset;
+	u64 size;
+	u64 rem;
+	int err;
+
+	igt_mm_config(&size, &min_size);
+
+	pr_info("buddy_init with size=%llx, min_size=%llx\n", size, min_size);
+
+	err = i915_buddy_init(&mm, size, min_size);
+	if (err) {
+		pr_err("buddy_init failed(%d)\n", err);
+		return err;
+	}
+
+	err = igt_check_mm(&mm);
+	if (err) {
+		pr_err("pre-mm check failed, abort, abort, abort!\n");
+		goto err_fini;
+	}
+
+	rem = mm.size;
+	offset = 0;
+
+	for_each_prime_number_from(page_num, 1, ULONG_MAX - 1) {
+		struct i915_buddy_block *block;
+		LIST_HEAD(tmp);
+
+		size = min(page_num * mm.min_size, rem);
+
+		err = i915_buddy_alloc_range(&mm, &tmp, offset, size);
+		if (err) {
+			if (err == -ENOMEM) {
+				pr_info("alloc_range hit -ENOMEM with size=%llx\n",
+					size);
+			} else {
+				pr_err("alloc_range with offset=%llx, size=%llx failed(%d)\n",
+				       offset, size, err);
+			}
+
+			break;
+		}
+
+		block = list_first_entry_or_null(&tmp,
+						 struct i915_buddy_block,
+						 link);
+		if (!block) {
+			pr_err("alloc_range has no blocks\n");
+			err = -EINVAL;
+			break;
+		}
+
+		if (i915_buddy_block_offset(block) != offset) {
+			pr_err("alloc_range start offset mismatch, found=%llx, expected=%llx\n",
+			       i915_buddy_block_offset(block), offset);
+			err = -EINVAL;
+		}
+
+		if (!err)
+			err = igt_check_blocks(&mm, &tmp, size, true);
+
+		list_splice_tail(&tmp, &blocks);
+
+		if (err)
+			break;
+
+		offset += size;
+
+		rem -= size;
+		if (!rem)
+			break;
+	}
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	i915_buddy_free_list(&mm, &blocks);
+
+	if (!err) {
+		err = igt_check_mm(&mm);
+		if (err)
+			pr_err("post-mm check failed\n");
+	}
+
+err_fini:
+	i915_buddy_fini(&mm);
+
+	return err;
+}
+
+int i915_buddy_mock_selftests(void)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_buddy_alloc),
+		SUBTEST(igt_buddy_alloc_range),
+	};
+
+	return i915_subtests(tests, NULL);
+}
diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
index b55da4d9ccba..b88084fe3269 100644
--- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
@@ -25,3 +25,4 @@ selftest(evict, i915_gem_evict_mock_selftests)
 selftest(gtt, i915_gem_gtt_mock_selftests)
 selftest(hugepages, i915_gem_huge_page_mock_selftests)
 selftest(contexts, i915_gem_context_mock_selftests)
+selftest(buddy, i915_buddy_mock_selftests)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v2 02/37] drm/i915: introduce intel_memory_region
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
  2019-06-27 20:55 ` [PATCH v2 01/37] drm/i915: buddy allocator Matthew Auld
@ 2019-06-27 20:55 ` Matthew Auld
  2019-06-27 22:47   ` Chris Wilson
  2019-06-28  8:09   ` Chris Wilson
  2019-06-27 20:55 ` [PATCH v2 03/37] drm/i915/region: support basic eviction Matthew Auld
                   ` (37 subsequent siblings)
  39 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:55 UTC (permalink / raw)
  To: intel-gfx

Support memory regions, as defined by a given (start, end) range, and
allow creating GEM objects which are backed by said region. The
immediate goal here is to have something to represent our device
memory, but later on we also want to represent every memory domain
with a region, i.e. stolen, shmem, and of course device. At some point
we are probably going to want to use a common struct here, such that
we are better aligned with, say, TTM.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   9 +
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  82 +++++++
 drivers/gpu/drm/i915/i915_drv.h               |   1 +
 drivers/gpu/drm/i915/i915_gem.c               |   1 +
 drivers/gpu/drm/i915/intel_memory_region.c    | 215 ++++++++++++++++++
 drivers/gpu/drm/i915/intel_memory_region.h    | 107 +++++++++
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
 .../drm/i915/selftests/intel_memory_region.c  | 111 +++++++++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |   1 +
 drivers/gpu/drm/i915/selftests/mock_region.c  |  55 +++++
 drivers/gpu/drm/i915/selftests/mock_region.h  |  16 ++
 12 files changed, 600 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/intel_memory_region.c
 create mode 100644 drivers/gpu/drm/i915/intel_memory_region.h
 create mode 100644 drivers/gpu/drm/i915/selftests/intel_memory_region.c
 create mode 100644 drivers/gpu/drm/i915/selftests/mock_region.c
 create mode 100644 drivers/gpu/drm/i915/selftests/mock_region.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index cb66cf1a5a10..28fac19f7b04 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -59,6 +59,7 @@ i915-y += i915_drv.o \
 i915-y += \
 	i915_memcpy.o \
 	i915_mm.o \
+	intel_memory_region.o \
 	i915_sw_fence.o \
 	i915_syncmap.o \
 	i915_user_extensions.o
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 34b51fad02de..8d760e852c4b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -64,6 +64,15 @@ struct drm_i915_gem_object {
 
 	const struct drm_i915_gem_object_ops *ops;
 
+	/**
+	 * Memory region for this object.
+	 */
+	struct intel_memory_region *memory_region;
+	/**
+	 * List of memory region blocks allocated for this object.
+	 */
+	struct list_head blocks;
+
 	struct {
 		/**
 		 * @vma.lock: protect the list/tree of vmas
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 2154cdee4ab3..fd547b98ec69 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -17,6 +17,7 @@
 
 #include "selftests/mock_drm.h"
 #include "selftests/mock_gem_device.h"
+#include "selftests/mock_region.h"
 #include "selftests/i915_random.h"
 
 static const unsigned int page_sizes[] = {
@@ -447,6 +448,86 @@ static int igt_mock_exhaust_device_supported_pages(void *arg)
 	return err;
 }
 
+
+static int igt_mock_memory_region_huge_pages(void *arg)
+{
+	struct i915_ppgtt *ppgtt = arg;
+	struct drm_i915_private *i915 = ppgtt->vm.i915;
+	unsigned long supported = INTEL_INFO(i915)->page_sizes;
+	struct intel_memory_region *mem;
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int bit;
+	int err = 0;
+
+	mem = mock_region_create(i915, 0, SZ_2G,
+				 I915_GTT_PAGE_SIZE_4K, 0);
+	if (IS_ERR(mem)) {
+		pr_err("failed to create memory region\n");
+		return PTR_ERR(mem);
+	}
+
+	for_each_set_bit(bit, &supported, ilog2(I915_GTT_MAX_PAGE_SIZE) + 1) {
+		unsigned int page_size = BIT(bit);
+		resource_size_t phys;
+
+		obj = i915_gem_object_create_region(mem, page_size, 0);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto out_destroy_device;
+		}
+
+		pr_info("memory region start(%pa)\n",
+		        &obj->memory_region->region.start);
+		pr_info("creating object, size=%x\n", page_size);
+
+		vma = i915_vma_instance(obj, &ppgtt->vm, NULL);
+		if (IS_ERR(vma)) {
+			err = PTR_ERR(vma);
+			goto out_put;
+		}
+
+		err = i915_vma_pin(vma, 0, 0, PIN_USER);
+		if (err)
+			goto out_close;
+
+		phys = i915_gem_object_get_dma_address(obj, 0);
+		if (!IS_ALIGNED(phys, page_size)) {
+			pr_err("memory region misaligned(%pa)\n", &phys);
+			err = -EINVAL;
+			goto out_close;
+		}
+
+		if (vma->page_sizes.gtt != page_size) {
+			pr_err("page_sizes.gtt=%u, expected=%u\n",
+			       vma->page_sizes.gtt, page_size);
+			err = -EINVAL;
+			goto out_unpin;
+		}
+
+		i915_vma_unpin(vma);
+		i915_vma_close(vma);
+
+		i915_gem_object_put(obj);
+	}
+
+	goto out_destroy_device;
+
+out_unpin:
+	i915_vma_unpin(vma);
+out_close:
+	i915_vma_close(vma);
+out_put:
+	i915_gem_object_put(obj);
+out_destroy_device:
+	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_drain_freed_objects(i915);
+	mutex_lock(&i915->drm.struct_mutex);
+	intel_memory_region_destroy(mem);
+
+	return err;
+}
+
 static int igt_mock_ppgtt_misaligned_dma(void *arg)
 {
 	struct i915_ppgtt *ppgtt = arg;
@@ -1679,6 +1760,7 @@ int i915_gem_huge_page_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_mock_exhaust_device_supported_pages),
+		SUBTEST(igt_mock_memory_region_huge_pages),
 		SUBTEST(igt_mock_ppgtt_misaligned_dma),
 		SUBTEST(igt_mock_ppgtt_huge_fill),
 		SUBTEST(igt_mock_ppgtt_64K),
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7e981b03face..97d02b32a208 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -77,6 +77,7 @@
 
 #include "intel_device_info.h"
 #include "intel_runtime_pm.h"
+#include "intel_memory_region.h"
 #include "intel_uc.h"
 #include "intel_uncore.h"
 #include "intel_wakeref.h"
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b7f290b77f8f..db3744b0bc80 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1815,4 +1815,5 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_gem_device.c"
 #include "selftests/i915_gem.c"
+#include "selftests/intel_memory_region.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
new file mode 100644
index 000000000000..4c89853a7769
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -0,0 +1,215 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "intel_memory_region.h"
+#include "i915_drv.h"
+
+static void
+memory_region_free_pages(struct drm_i915_gem_object *obj,
+			 struct sg_table *pages)
+{
+
+	struct i915_buddy_block *block, *on;
+
+	lockdep_assert_held(&obj->memory_region->mm_lock);
+
+	list_for_each_entry_safe(block, on, &obj->blocks, link) {
+		list_del_init(&block->link);
+		i915_buddy_free(&obj->memory_region->mm, block);
+	}
+
+	sg_free_table(pages);
+	kfree(pages);
+}
+
+void
+i915_memory_region_put_pages_buddy(struct drm_i915_gem_object *obj,
+				   struct sg_table *pages)
+{
+	mutex_lock(&obj->memory_region->mm_lock);
+	memory_region_free_pages(obj, pages);
+	mutex_unlock(&obj->memory_region->mm_lock);
+
+	obj->mm.dirty = false;
+}
+
+int
+i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
+{
+	struct intel_memory_region *mem = obj->memory_region;
+	resource_size_t size = obj->base.size;
+	struct sg_table *st;
+	struct scatterlist *sg;
+	unsigned int sg_page_sizes;
+	unsigned long n_pages;
+
+	GEM_BUG_ON(!IS_ALIGNED(size, mem->mm.min_size));
+	GEM_BUG_ON(!list_empty(&obj->blocks));
+
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (!st)
+		return -ENOMEM;
+
+	n_pages = size >> ilog2(mem->mm.min_size);
+
+	if (sg_alloc_table(st, n_pages, GFP_KERNEL)) {
+		kfree(st);
+		return -ENOMEM;
+	}
+
+	sg = st->sgl;
+	st->nents = 0;
+	sg_page_sizes = 0;
+
+	mutex_lock(&mem->mm_lock);
+
+	do {
+		struct i915_buddy_block *block;
+		unsigned int order;
+		u64 block_size;
+		u64 offset;
+
+		order = fls(n_pages) - 1;
+		GEM_BUG_ON(order > mem->mm.max_order);
+
+		do {
+			block = i915_buddy_alloc(&mem->mm, order);
+			if (!IS_ERR(block))
+				break;
+
+			/* XXX: some kind of eviction pass, local to the device */
+			if (!order--)
+				goto err_free_blocks;
+		} while (1);
+
+		n_pages -= BIT(order);
+
+		INIT_LIST_HEAD(&block->link);
+		list_add(&block->link, &obj->blocks);
+
+		/*
+		 * TODO: it might be worth checking consecutive blocks here and
+		 * coalesce if we can.
+		 */
+		block_size = i915_buddy_block_size(&mem->mm, block);
+		offset = i915_buddy_block_offset(block);
+
+		sg_dma_address(sg) = mem->region.start + offset;
+		sg_dma_len(sg) = block_size;
+
+		sg->length = block_size;
+		sg_page_sizes |= block_size;
+		st->nents++;
+
+		if (!n_pages) {
+			sg_mark_end(sg);
+			break;
+		}
+
+		sg = __sg_next(sg);
+	} while (1);
+
+	mutex_unlock(&mem->mm_lock);
+
+	i915_sg_trim(st);
+
+	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
+
+	return 0;
+
+err_free_blocks:
+	memory_region_free_pages(obj, st);
+	mutex_unlock(&mem->mm_lock);
+	return -ENXIO;
+}
+
+int i915_memory_region_init_buddy(struct intel_memory_region *mem)
+{
+	return i915_buddy_init(&mem->mm, resource_size(&mem->region),
+			       mem->min_page_size);
+}
+
+void i915_memory_region_release_buddy(struct intel_memory_region *mem)
+{
+	i915_buddy_fini(&mem->mm);
+}
+
+struct drm_i915_gem_object *
+i915_gem_object_create_region(struct intel_memory_region *mem,
+			      resource_size_t size,
+			      unsigned int flags)
+{
+	struct drm_i915_gem_object *obj;
+
+	if (!mem)
+		return ERR_PTR(-ENODEV);
+
+	size = round_up(size, mem->min_page_size);
+
+	GEM_BUG_ON(!size);
+	GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_MIN_ALIGNMENT));
+
+	if (size >> PAGE_SHIFT > INT_MAX)
+		return ERR_PTR(-E2BIG);
+
+	if (overflows_type(size, obj->base.size))
+		return ERR_PTR(-E2BIG);
+
+	obj = mem->ops->create_object(mem, size, flags);
+	if (IS_ERR(obj))
+		return obj;
+
+	INIT_LIST_HEAD(&obj->blocks);
+	obj->memory_region = mem;
+
+	return obj;
+}
+
+struct intel_memory_region *
+intel_memory_region_create(struct drm_i915_private *i915,
+			   resource_size_t start,
+			   resource_size_t size,
+			   resource_size_t min_page_size,
+			   resource_size_t io_start,
+			   const struct intel_memory_region_ops *ops)
+{
+	struct intel_memory_region *mem;
+	int err;
+
+	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+	if (!mem)
+		return ERR_PTR(-ENOMEM);
+
+	mem->i915 = i915;
+	mem->region = (struct resource)DEFINE_RES_MEM(start, size);
+	mem->io_start = io_start;
+	mem->min_page_size = min_page_size;
+	mem->ops = ops;
+
+	mutex_init(&mem->mm_lock);
+
+	if (ops->init) {
+		err = ops->init(mem);
+		if (err) {
+			kfree(mem);
+			mem = ERR_PTR(err);
+		}
+	}
+
+	return mem;
+}
+
+void
+intel_memory_region_destroy(struct intel_memory_region *mem)
+{
+	if (mem->ops->release)
+		mem->ops->release(mem);
+
+	kfree(mem);
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftests/mock_region.c"
+#endif
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
new file mode 100644
index 000000000000..8d4736bdde50
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -0,0 +1,107 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __INTEL_MEMORY_REGION_H__
+#define __INTEL_MEMORY_REGION_H__
+
+#include <linux/ioport.h>
+#include <linux/mutex.h>
+#include <linux/io-mapping.h>
+
+#include "i915_buddy.h"
+
+struct drm_i915_private;
+struct drm_i915_gem_object;
+struct intel_memory_region;
+struct sg_table;
+
+/**
+ *  Base memory type
+ */
+enum intel_memory_type {
+	INTEL_SMEM = 0,
+	INTEL_LMEM,
+	INTEL_STOLEN,
+};
+
+enum intel_region_id {
+	INTEL_MEMORY_SMEM = 0,
+	INTEL_MEMORY_LMEM,
+	INTEL_MEMORY_STOLEN,
+	INTEL_MEMORY_UNKNOWN, /* Should be last */
+};
+
+#define REGION_SMEM     BIT(INTEL_MEMORY_SMEM)
+#define REGION_LMEM     BIT(INTEL_MEMORY_LMEM)
+#define REGION_STOLEN   BIT(INTEL_MEMORY_STOLEN)
+
+#define INTEL_MEMORY_TYPE_SHIFT 16
+
+#define MEMORY_TYPE_FROM_REGION(r) (ilog2((r) >> INTEL_MEMORY_TYPE_SHIFT))
+#define MEMORY_INSTANCE_FROM_REGION(r) (ilog2((r) & 0xffff))
+
+/**
+ * Memory regions encoded as type | instance
+ */
+static const u32 intel_region_map[] = {
+	[INTEL_MEMORY_SMEM] = BIT(INTEL_SMEM + INTEL_MEMORY_TYPE_SHIFT) | BIT(0),
+	[INTEL_MEMORY_LMEM] = BIT(INTEL_LMEM + INTEL_MEMORY_TYPE_SHIFT) | BIT(0),
+	[INTEL_MEMORY_STOLEN] = BIT(INTEL_STOLEN + INTEL_MEMORY_TYPE_SHIFT) | BIT(0),
+};
+
+struct intel_memory_region_ops {
+	unsigned int flags;
+
+	int (*init)(struct intel_memory_region *);
+	void (*release)(struct intel_memory_region *);
+
+	struct drm_i915_gem_object *
+	(*create_object)(struct intel_memory_region *,
+			 resource_size_t,
+			 unsigned int);
+};
+
+struct intel_memory_region {
+	struct drm_i915_private *i915;
+
+	const struct intel_memory_region_ops *ops;
+
+	struct io_mapping iomap;
+	struct resource region;
+
+	struct i915_buddy_mm mm;
+	struct mutex mm_lock;
+
+	resource_size_t io_start;
+	resource_size_t min_page_size;
+
+	unsigned int type;
+	unsigned int instance;
+	unsigned int id;
+};
+
+int i915_memory_region_init_buddy(struct intel_memory_region *mem);
+void i915_memory_region_release_buddy(struct intel_memory_region *mem);
+
+int i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj);
+void i915_memory_region_put_pages_buddy(struct drm_i915_gem_object *obj,
+					struct sg_table *pages);
+
+struct intel_memory_region *
+intel_memory_region_create(struct drm_i915_private *i915,
+			   resource_size_t start,
+			   resource_size_t size,
+			   resource_size_t min_page_size,
+			   resource_size_t io_start,
+			   const struct intel_memory_region_ops *ops);
+void
+intel_memory_region_destroy(struct intel_memory_region *mem);
+
+struct drm_i915_gem_object *
+i915_gem_object_create_region(struct intel_memory_region *mem,
+			      resource_size_t size,
+			      unsigned int flags);
+
+#endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
index b88084fe3269..aa5a0e7f5d9e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
@@ -26,3 +26,4 @@ selftest(gtt, i915_gem_gtt_mock_selftests)
 selftest(hugepages, i915_gem_huge_page_mock_selftests)
 selftest(contexts, i915_gem_context_mock_selftests)
 selftest(buddy, i915_buddy_mock_selftests)
+selftest(memory_region, intel_memory_region_mock_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
new file mode 100644
index 000000000000..c3b160cfd713
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/prime_numbers.h>
+
+#include "../i915_selftest.h"
+
+#include "mock_gem_device.h"
+#include "gem/selftests/mock_context.h"
+#include "mock_drm.h"
+
+static void close_objects(struct list_head *objects)
+{
+	struct drm_i915_gem_object *obj, *on;
+
+	list_for_each_entry_safe(obj, on, objects, st_link) {
+		if (i915_gem_object_has_pinned_pages(obj))
+			i915_gem_object_unpin_pages(obj);
+		/* Avoid polluting the memory region between tests */
+		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+		i915_gem_object_put(obj);
+		list_del(&obj->st_link);
+	}
+}
+
+static int igt_mock_fill(void *arg)
+{
+	struct intel_memory_region *mem = arg;
+	resource_size_t total = resource_size(&mem->region);
+	resource_size_t page_size;
+	resource_size_t rem;
+	unsigned long max_pages;
+	unsigned long page_num;
+	LIST_HEAD(objects);
+	int err = 0;
+
+	page_size = mem->mm.min_size;
+	max_pages = div64_u64(total, page_size);
+	rem = total;
+
+	for_each_prime_number_from(page_num, 1, max_pages) {
+		resource_size_t size = page_num * page_size;
+		struct drm_i915_gem_object *obj;
+
+		obj = i915_gem_object_create_region(mem, size, 0);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			break;
+		}
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err) {
+			i915_gem_object_put(obj);
+			break;
+		}
+
+		list_add(&obj->st_link, &objects);
+		rem -= size;
+	}
+
+	if (err == -ENOMEM)
+		err = 0;
+	if (err == -ENXIO) {
+		if (page_num * page_size <= rem) {
+			pr_err("igt_mock_fill failed, space still left in region\n");
+			err = -EINVAL;
+		} else {
+			err = 0;
+		}
+	}
+
+	close_objects(&objects);
+
+	return err;
+}
+
+int intel_memory_region_mock_selftests(void)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_mock_fill),
+	};
+	struct intel_memory_region *mem;
+	struct drm_i915_private *i915;
+	int err;
+
+	i915 = mock_gem_device();
+	if (!i915)
+		return -ENOMEM;
+
+	mem = mock_region_create(i915, 0, SZ_2G,
+				 I915_GTT_PAGE_SIZE_4K, 0);
+	if (IS_ERR(mem)) {
+		pr_err("failed to create memory region\n");
+		err = PTR_ERR(mem);
+		goto out_unref;
+	}
+
+	mutex_lock(&i915->drm.struct_mutex);
+	err = i915_subtests(tests, mem);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	i915_gem_drain_freed_objects(i915);
+	intel_memory_region_destroy(mem);
+
+out_unref:
+	drm_dev_put(&i915->drm);
+
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 2741805b56c2..86f86c3d38a8 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -32,6 +32,7 @@
 #include "mock_gem_device.h"
 #include "mock_gtt.h"
 #include "mock_uncore.h"
+#include "mock_region.h"
 
 #include "gem/selftests/mock_context.h"
 #include "gem/selftests/mock_gem_object.h"
diff --git a/drivers/gpu/drm/i915/selftests/mock_region.c b/drivers/gpu/drm/i915/selftests/mock_region.c
new file mode 100644
index 000000000000..cb942a461e9d
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/mock_region.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "mock_region.h"
+
+static const struct drm_i915_gem_object_ops mock_region_obj_ops = {
+	.get_pages = i915_memory_region_get_pages_buddy,
+	.put_pages = i915_memory_region_put_pages_buddy,
+};
+
+static struct drm_i915_gem_object *
+mock_object_create(struct intel_memory_region *mem,
+		   resource_size_t size,
+		   unsigned int flags)
+{
+	struct drm_i915_private *i915 = mem->i915;
+	struct drm_i915_gem_object *obj;
+	unsigned int cache_level;
+
+	if (size > BIT(mem->mm.max_order) * mem->mm.min_size)
+		return ERR_PTR(-E2BIG);
+
+	obj = i915_gem_object_alloc();
+	if (!obj)
+		return ERR_PTR(-ENOMEM);
+
+	drm_gem_private_object_init(&i915->drm, &obj->base, size);
+	i915_gem_object_init(obj, &mock_region_obj_ops);
+
+	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
+
+	cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
+	i915_gem_object_set_cache_coherency(obj, cache_level);
+
+	return obj;
+}
+
+static const struct intel_memory_region_ops mock_region_ops = {
+	.init = i915_memory_region_init_buddy,
+	.release = i915_memory_region_release_buddy,
+	.create_object = mock_object_create,
+};
+
+struct intel_memory_region *
+mock_region_create(struct drm_i915_private *i915,
+		   resource_size_t start,
+		   resource_size_t size,
+		   resource_size_t min_page_size,
+		   resource_size_t io_start)
+{
+	return intel_memory_region_create(i915, start, size, min_page_size,
+					  io_start, &mock_region_ops);
+}
diff --git a/drivers/gpu/drm/i915/selftests/mock_region.h b/drivers/gpu/drm/i915/selftests/mock_region.h
new file mode 100644
index 000000000000..24608089d833
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/mock_region.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __MOCK_REGION_H
+#define __MOCK_REGION_H
+
+struct intel_memory_region *
+mock_region_create(struct drm_i915_private *i915,
+		   resource_size_t start,
+		   resource_size_t size,
+		   resource_size_t min_page_size,
+		   resource_size_t io_start);
+
+#endif /* !__MOCK_REGION_H */
-- 
2.20.1


* [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
  2019-06-27 20:55 ` [PATCH v2 01/37] drm/i915: buddy allocator Matthew Auld
  2019-06-27 20:55 ` [PATCH v2 02/37] drm/i915: introduce intel_memory_region Matthew Auld
@ 2019-06-27 20:55 ` Matthew Auld
  2019-06-27 22:59   ` Chris Wilson
  2019-07-30 16:26   ` Daniel Vetter
  2019-06-27 20:56 ` [PATCH v2 04/37] drm/i915/region: support continuous allocations Matthew Auld
                   ` (36 subsequent siblings)
  39 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:55 UTC (permalink / raw)
  To: intel-gfx

Support basic eviction for regions.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  7 ++
 drivers/gpu/drm/i915/i915_gem.c               | 16 ++++
 drivers/gpu/drm/i915/intel_memory_region.c    | 89 ++++++++++++++++++-
 drivers/gpu/drm/i915/intel_memory_region.h    | 10 +++
 .../drm/i915/selftests/intel_memory_region.c  | 73 +++++++++++++++
 drivers/gpu/drm/i915/selftests/mock_region.c  |  1 +
 6 files changed, 192 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 8d760e852c4b..87000fc24ab3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -72,6 +72,13 @@ struct drm_i915_gem_object {
 	 * List of memory region blocks allocated for this object.
 	 */
 	struct list_head blocks;
+	/**
+	 * Element within memory_region->objects or memory_region->purgeable if
+	 * the object is marked as DONTNEED. Access is protected by
+	 * memory_region->obj_lock.
+	 */
+	struct list_head region_link;
+	struct list_head eviction_link;
 
 	struct {
 		/**
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index db3744b0bc80..85677ae89849 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1122,6 +1122,22 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	    !i915_gem_object_has_pages(obj))
 		i915_gem_object_truncate(obj);
 
+	if (obj->memory_region) {
+		mutex_lock(&obj->memory_region->obj_lock);
+
+		switch (obj->mm.madv) {
+		case I915_MADV_WILLNEED:
+			list_move(&obj->region_link, &obj->memory_region->objects);
+			break;
+		default:
+			list_move(&obj->region_link,
+				  &obj->memory_region->purgeable);
+			break;
+		}
+
+		mutex_unlock(&obj->memory_region->obj_lock);
+	}
+
 	args->retained = obj->mm.madv != __I915_MADV_PURGED;
 	mutex_unlock(&obj->mm.lock);
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 4c89853a7769..721b47e46492 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -6,6 +6,56 @@
 #include "intel_memory_region.h"
 #include "i915_drv.h"
 
+int i915_memory_region_evict(struct intel_memory_region *mem,
+			     resource_size_t target)
+{
+	struct drm_i915_gem_object *obj, *on;
+	resource_size_t found;
+	LIST_HEAD(purgeable);
+	int err;
+
+	err = 0;
+	found = 0;
+
+	mutex_lock(&mem->obj_lock);
+
+	list_for_each_entry(obj, &mem->purgeable, region_link) {
+		if (!i915_gem_object_has_pages(obj))
+			continue;
+
+		if (READ_ONCE(obj->pin_global))
+			continue;
+
+		if (atomic_read(&obj->bind_count))
+			continue;
+
+		list_add(&obj->eviction_link, &purgeable);
+
+		found += obj->base.size;
+		if (found >= target)
+			goto found;
+	}
+
+	err = -ENOSPC;
+found:
+	list_for_each_entry_safe(obj, on, &purgeable, eviction_link) {
+		if (!err) {
+			__i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
+
+			mutex_lock_nested(&obj->mm.lock, I915_MM_SHRINKER);
+			if (!i915_gem_object_has_pages(obj))
+				obj->mm.madv = __I915_MADV_PURGED;
+			mutex_unlock(&obj->mm.lock);
+		}
+
+		list_del(&obj->eviction_link);
+	}
+
+	mutex_unlock(&mem->obj_lock);
+
+	return err;
+}
+
 static void
 memory_region_free_pages(struct drm_i915_gem_object *obj,
 			 struct sg_table *pages)
@@ -70,7 +120,8 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
 		unsigned int order;
 		u64 block_size;
 		u64 offset;
-
+		bool retry = true;
+retry:
 		order = fls(n_pages) - 1;
 		GEM_BUG_ON(order > mem->mm.max_order);
 
@@ -79,9 +130,24 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
 			if (!IS_ERR(block))
 				break;
 
-			/* XXX: some kind of eviction pass, local to the device */
-			if (!order--)
-				goto err_free_blocks;
+			if (!order--) {
+				resource_size_t target;
+				int err;
+
+				if (!retry)
+					goto err_free_blocks;
+
+				target = n_pages * mem->mm.min_size;
+
+				mutex_unlock(&mem->mm_lock);
+				err = i915_memory_region_evict(mem, target);
+				mutex_lock(&mem->mm_lock);
+				if (err)
+					goto err_free_blocks;
+
+				retry = false;
+				goto retry;
+			}
 		} while (1);
 
 		n_pages -= BIT(order);
@@ -136,6 +202,13 @@ void i915_memory_region_release_buddy(struct intel_memory_region *mem)
 	i915_buddy_fini(&mem->mm);
 }
 
+void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj)
+{
+	mutex_lock(&obj->memory_region->obj_lock);
+	list_del(&obj->region_link);
+	mutex_unlock(&obj->memory_region->obj_lock);
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_region(struct intel_memory_region *mem,
 			      resource_size_t size,
@@ -164,6 +237,10 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
 	INIT_LIST_HEAD(&obj->blocks);
 	obj->memory_region = mem;
 
+	mutex_lock(&mem->obj_lock);
+	list_add(&obj->region_link, &mem->objects);
+	mutex_unlock(&mem->obj_lock);
+
 	return obj;
 }
 
@@ -188,6 +265,10 @@ intel_memory_region_create(struct drm_i915_private *i915,
 	mem->min_page_size = min_page_size;
 	mem->ops = ops;
 
+	mutex_init(&mem->obj_lock);
+	INIT_LIST_HEAD(&mem->objects);
+	INIT_LIST_HEAD(&mem->purgeable);
+
 	mutex_init(&mem->mm_lock);
 
 	if (ops->init) {
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 8d4736bdde50..bee0c022d295 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -80,8 +80,16 @@ struct intel_memory_region {
 	unsigned int type;
 	unsigned int instance;
 	unsigned int id;
+
+	/* Protects access to objects and purgeable */
+	struct mutex obj_lock;
+	struct list_head objects;
+	struct list_head purgeable;
 };
 
+int i915_memory_region_evict(struct intel_memory_region *mem,
+			     resource_size_t target);
+
 int i915_memory_region_init_buddy(struct intel_memory_region *mem);
 void i915_memory_region_release_buddy(struct intel_memory_region *mem);
 
@@ -89,6 +97,8 @@ int i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj);
 void i915_memory_region_put_pages_buddy(struct drm_i915_gem_object *obj,
 					struct sg_table *pages);
 
+void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj);
+
 struct intel_memory_region *
 intel_memory_region_create(struct drm_i915_private *i915,
 			   resource_size_t start,
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index c3b160cfd713..ece499869747 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -76,10 +76,83 @@ static int igt_mock_fill(void *arg)
 	return err;
 }
 
+static void igt_mark_evictable(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_unpin_pages(obj);
+	obj->mm.madv = I915_MADV_DONTNEED;
+	list_move(&obj->region_link, &obj->memory_region->purgeable);
+}
+
+static int igt_mock_evict(void *arg)
+{
+	struct intel_memory_region *mem = arg;
+	struct drm_i915_gem_object *obj;
+	unsigned long n_objects;
+	LIST_HEAD(objects);
+	resource_size_t target;
+	resource_size_t total;
+	int err = 0;
+
+	target = mem->mm.min_size;
+	total = resource_size(&mem->region);
+	n_objects = total / target;
+
+	while (n_objects--) {
+		obj = i915_gem_object_create_region(mem, target, 0);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_close_objects;
+		}
+
+		list_add(&obj->st_link, &objects);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto err_close_objects;
+
+		/*
+		 * Make half of the region evictable, though do so in a
+		 * horribly fragmented fashion.
+		 */
+		if (n_objects % 2)
+			igt_mark_evictable(obj);
+	}
+
+	while (target <= total / 2) {
+		obj = i915_gem_object_create_region(mem, target, 0);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_close_objects;
+		}
+
+		list_add(&obj->st_link, &objects);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err) {
+			pr_err("failed to evict for target=%pa", &target);
+			goto err_close_objects;
+		}
+
+		/* Again, half of the region should remain evictable */
+		igt_mark_evictable(obj);
+
+		target <<= 1;
+	}
+
+err_close_objects:
+	close_objects(&objects);
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_mock_fill),
+		SUBTEST(igt_mock_evict),
 	};
 	struct intel_memory_region *mem;
 	struct drm_i915_private *i915;
diff --git a/drivers/gpu/drm/i915/selftests/mock_region.c b/drivers/gpu/drm/i915/selftests/mock_region.c
index cb942a461e9d..80eafdc54927 100644
--- a/drivers/gpu/drm/i915/selftests/mock_region.c
+++ b/drivers/gpu/drm/i915/selftests/mock_region.c
@@ -8,6 +8,7 @@
 static const struct drm_i915_gem_object_ops mock_region_obj_ops = {
 	.get_pages = i915_memory_region_get_pages_buddy,
 	.put_pages = i915_memory_region_put_pages_buddy,
+	.release = i915_gem_object_release_memory_region,
 };
 
 static struct drm_i915_gem_object *
-- 
2.20.1


* [PATCH v2 04/37] drm/i915/region: support continuous allocations
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (2 preceding siblings ...)
  2019-06-27 20:55 ` [PATCH v2 03/37] drm/i915/region: support basic eviction Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:01   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 05/37] drm/i915/region: support volatile objects Matthew Auld
                   ` (35 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Some objects may need to be allocated as a contiguous block; looking
ahead, the various kernel io_mapping interfaces seem to expect it.
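
The fallback rule this patch adds to the allocation loop can be sketched
in userspace terms (mock helper, not the i915 API): a normal allocation
retries at ever smaller buddy orders, while a contiguous allocation must
come from a single block, so the loop gives up (and tries eviction) at
the first failure instead of splitting the request.

```c
#include <assert.h>

#define I915_BO_ALLOC_CONTIGUOUS (1 << 0)

/*
 * Mock of the order-fallback rule: returns the next order to try, and
 * sets *out_of_orders when the loop should stop retrying and instead
 * evict or fail (contiguous request, or already at order-0).
 */
static unsigned int next_order_to_try(unsigned int order, unsigned long flags,
				      int *out_of_orders)
{
	if (flags & I915_BO_ALLOC_CONTIGUOUS || order == 0) {
		*out_of_orders = 1;	/* trigger eviction or fail */
		return order;
	}
	*out_of_orders = 0;
	return order - 1;		/* retry with a smaller block */
}
```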

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   4 +
 drivers/gpu/drm/i915/intel_memory_region.c    |   7 +-
 .../drm/i915/selftests/intel_memory_region.c  | 152 +++++++++++++++++-
 drivers/gpu/drm/i915/selftests/mock_region.c  |   3 +
 4 files changed, 160 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 87000fc24ab3..1c4b99e507c3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -133,6 +133,10 @@ struct drm_i915_gem_object {
 	struct list_head batch_pool_link;
 	I915_SELFTEST_DECLARE(struct list_head st_link);
 
+	unsigned long flags;
+#define I915_BO_ALLOC_CONTIGUOUS (1<<0)
+#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS)
+
 	/*
 	 * Is the object to be mapped as read-only to the GPU
 	 * Only honoured if hardware has relevant pte bit
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 721b47e46492..9b6a32bfa20d 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -90,6 +90,7 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
 {
 	struct intel_memory_region *mem = obj->memory_region;
 	resource_size_t size = obj->base.size;
+	unsigned int flags = obj->flags;
 	struct sg_table *st;
 	struct scatterlist *sg;
 	unsigned int sg_page_sizes;
@@ -130,7 +131,7 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
 			if (!IS_ERR(block))
 				break;
 
-			if (!order--) {
+			if (flags & I915_BO_ALLOC_CONTIGUOUS || !order--) {
 				resource_size_t target;
 				int err;
 
@@ -219,6 +220,9 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
 	if (!mem)
 		return ERR_PTR(-ENODEV);
 
+	if (flags & ~I915_BO_ALLOC_FLAGS)
+		return ERR_PTR(-EINVAL);
+
 	size = round_up(size, mem->min_page_size);
 
 	GEM_BUG_ON(!size);
@@ -236,6 +240,7 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
 
 	INIT_LIST_HEAD(&obj->blocks);
 	obj->memory_region = mem;
+	obj->flags = flags;
 
 	mutex_lock(&mem->obj_lock);
 	list_add(&obj->region_link, &mem->objects);
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index ece499869747..c9de8b5039e4 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -78,17 +78,17 @@ static int igt_mock_fill(void *arg)
 
 static void igt_mark_evictable(struct drm_i915_gem_object *obj)
 {
-	i915_gem_object_unpin_pages(obj);
+	if (i915_gem_object_has_pinned_pages(obj))
+		i915_gem_object_unpin_pages(obj);
 	obj->mm.madv = I915_MADV_DONTNEED;
 	list_move(&obj->region_link, &obj->memory_region->purgeable);
 }
 
-static int igt_mock_evict(void *arg)
+static int igt_frag_region(struct intel_memory_region *mem,
+			   struct list_head *objects)
 {
-	struct intel_memory_region *mem = arg;
 	struct drm_i915_gem_object *obj;
 	unsigned long n_objects;
-	LIST_HEAD(objects);
 	resource_size_t target;
 	resource_size_t total;
 	int err = 0;
@@ -104,7 +104,7 @@ static int igt_mock_evict(void *arg)
 			goto err_close_objects;
 		}
 
-		list_add(&obj->st_link, &objects);
+		list_add(&obj->st_link, objects);
 
 		err = i915_gem_object_pin_pages(obj);
 		if (err)
@@ -118,6 +118,39 @@ static int igt_mock_evict(void *arg)
 			igt_mark_evictable(obj);
 	}
 
+	return 0;
+
+err_close_objects:
+	close_objects(objects);
+	return err;
+}
+
+static void igt_defrag_region(struct list_head *objects)
+{
+	struct drm_i915_gem_object *obj;
+
+	list_for_each_entry(obj, objects, st_link) {
+		if (obj->mm.madv == I915_MADV_WILLNEED)
+			igt_mark_evictable(obj);
+	}
+}
+
+static int igt_mock_evict(void *arg)
+{
+	struct intel_memory_region *mem = arg;
+	struct drm_i915_gem_object *obj;
+	LIST_HEAD(objects);
+	resource_size_t target;
+	resource_size_t total;
+	int err;
+
+	err = igt_frag_region(mem, &objects);
+	if (err)
+		return err;
+
+	total = resource_size(&mem->region);
+	target = mem->mm.min_size;
+
 	while (target <= total / 2) {
 		obj = i915_gem_object_create_region(mem, target, 0);
 		if (IS_ERR(obj)) {
@@ -148,11 +181,120 @@ static int igt_mock_evict(void *arg)
 	return err;
 }
 
+static int igt_mock_continuous(void *arg)
+{
+	struct intel_memory_region *mem = arg;
+	struct drm_i915_gem_object *obj;
+	LIST_HEAD(objects);
+	resource_size_t target;
+	resource_size_t total;
+	int err;
+
+	err = igt_frag_region(mem, &objects);
+	if (err)
+		return err;
+
+	total = resource_size(&mem->region);
+	target = total / 2;
+
+	/*
+	 * Sanity check that we can allocate all of the available fragmented
+	 * space.
+	 */
+	obj = i915_gem_object_create_region(mem, target, 0);
+	if (IS_ERR(obj)) {
+		err = PTR_ERR(obj);
+		goto err_close_objects;
+	}
+
+	list_add(&obj->st_link, &objects);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err) {
+		pr_err("failed to allocate available space\n");
+		goto err_close_objects;
+	}
+
+	igt_mark_evictable(obj);
+
+	/* Try the smallest possible size -- should succeed */
+	obj = i915_gem_object_create_region(mem, mem->mm.min_size,
+					    I915_BO_ALLOC_CONTIGUOUS);
+	if (IS_ERR(obj)) {
+		err = PTR_ERR(obj);
+		goto err_close_objects;
+	}
+
+	list_add(&obj->st_link, &objects);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err) {
+		pr_err("failed to allocate smallest possible size\n");
+		goto err_close_objects;
+	}
+
+	igt_mark_evictable(obj);
+
+	if (obj->mm.pages->nents != 1) {
+		pr_err("[1]object spans multiple sg entries\n");
+		err = -EINVAL;
+		goto err_close_objects;
+	}
+
+	/*
+	 * Even though there is enough free space for the allocation, we
+	 * shouldn't be able to allocate it, given that it is fragmented, and
+	 * non-contiguous.
+	 */
+	obj = i915_gem_object_create_region(mem, target, I915_BO_ALLOC_CONTIGUOUS);
+	if (IS_ERR(obj)) {
+		err = PTR_ERR(obj);
+		goto err_close_objects;
+	}
+
+	list_add(&obj->st_link, &objects);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (!err) {
+		pr_err("expected allocation to fail\n");
+		err = -EINVAL;
+		goto err_close_objects;
+	}
+
+	igt_defrag_region(&objects);
+
+	/* Should now succeed */
+	obj = i915_gem_object_create_region(mem, target, I915_BO_ALLOC_CONTIGUOUS);
+	if (IS_ERR(obj)) {
+		err = PTR_ERR(obj);
+		goto err_close_objects;
+	}
+
+	list_add(&obj->st_link, &objects);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err) {
+		pr_err("failed to allocate from defragged area\n");
+		goto err_close_objects;
+	}
+
+	if (obj->mm.pages->nents != 1) {
+		pr_err("object spans multiple sg entries\n");
+		err = -EINVAL;
+	}
+
+err_close_objects:
+	close_objects(&objects);
+
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_mock_fill),
 		SUBTEST(igt_mock_evict),
+		SUBTEST(igt_mock_continuous),
 	};
 	struct intel_memory_region *mem;
 	struct drm_i915_private *i915;
diff --git a/drivers/gpu/drm/i915/selftests/mock_region.c b/drivers/gpu/drm/i915/selftests/mock_region.c
index 80eafdc54927..9eeda8f45f38 100644
--- a/drivers/gpu/drm/i915/selftests/mock_region.c
+++ b/drivers/gpu/drm/i915/selftests/mock_region.c
@@ -20,6 +20,9 @@ mock_object_create(struct intel_memory_region *mem,
 	struct drm_i915_gem_object *obj;
 	unsigned int cache_level;
 
+	if (flags & I915_BO_ALLOC_CONTIGUOUS)
+		size = roundup_pow_of_two(size);
+
 	if (size > BIT(mem->mm.max_order) * mem->mm.min_size)
 		return ERR_PTR(-E2BIG);
 
-- 
2.20.1


* [PATCH v2 05/37] drm/i915/region: support volatile objects
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (3 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 04/37] drm/i915/region: support continuous allocations Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:03   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 06/37] drm/i915: Add memory region information to device_info Matthew Auld
                   ` (34 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang

Volatile objects are marked as DONTNEED while pinned; once unpinned,
the backing store can be discarded.
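
The madv lifecycle the patch wires into the get_pages/put_pages hooks
can be mocked like this (mock types, not the real i915 structures): a
volatile object is DONTNEED while its pages exist, so the eviction path
may discard it, and flips back to WILLNEED when the pages are released.

```c
#include <assert.h>

#define I915_BO_ALLOC_CONTIGUOUS (1 << 0)
#define I915_BO_ALLOC_VOLATILE   (1 << 1)

enum mock_madv { MOCK_MADV_WILLNEED, MOCK_MADV_DONTNEED };

struct mock_obj {
	unsigned long flags;
	enum mock_madv madv;
};

/* Mirrors i915_memory_region_get_pages_buddy() from this patch. */
static void mock_get_pages(struct mock_obj *obj)
{
	if (obj->flags & I915_BO_ALLOC_VOLATILE)
		obj->madv = MOCK_MADV_DONTNEED;
}

/* Mirrors i915_memory_region_put_pages_buddy() from this patch. */
static void mock_put_pages(struct mock_obj *obj)
{
	if (obj->flags & I915_BO_ALLOC_VOLATILE)
		obj->madv = MOCK_MADV_WILLNEED;
}
```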

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 +-
 drivers/gpu/drm/i915/intel_memory_region.c    | 13 ++++-
 .../drm/i915/selftests/intel_memory_region.c  | 56 +++++++++++++++++++
 3 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 1c4b99e507c3..80ff5ad9bc07 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -135,7 +135,8 @@ struct drm_i915_gem_object {
 
 	unsigned long flags;
 #define I915_BO_ALLOC_CONTIGUOUS (1<<0)
-#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS)
+#define I915_BO_ALLOC_VOLATILE   (1<<1)
+#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | I915_BO_ALLOC_VOLATILE)
 
 	/*
 	 * Is the object to be mapped as read-only to the GPU
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 9b6a32bfa20d..cd41c212bc35 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -82,6 +82,9 @@ i915_memory_region_put_pages_buddy(struct drm_i915_gem_object *obj,
 	memory_region_free_pages(obj, pages);
 	mutex_unlock(&obj->memory_region->mm_lock);
 
+	if (obj->flags & I915_BO_ALLOC_VOLATILE)
+		obj->mm.madv = I915_MADV_WILLNEED;
+
 	obj->mm.dirty = false;
 }
 
@@ -182,6 +185,9 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
 
 	i915_sg_trim(st);
 
+	if (flags & I915_BO_ALLOC_VOLATILE)
+		obj->mm.madv = I915_MADV_DONTNEED;
+
 	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
 
 	return 0;
@@ -243,7 +249,12 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
 	obj->flags = flags;
 
 	mutex_lock(&mem->obj_lock);
-	list_add(&obj->region_link, &mem->objects);
+
+	if (flags & I915_BO_ALLOC_VOLATILE)
+		list_add(&obj->region_link, &mem->purgeable);
+	else
+		list_add(&obj->region_link, &mem->objects);
+
 	mutex_unlock(&mem->obj_lock);
 
 	return obj;
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index c9de8b5039e4..bdf044e4781d 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -289,12 +289,68 @@ static int igt_mock_continuous(void *arg)
 	return err;
 }
 
+static int igt_mock_volatile(void *arg)
+{
+	struct intel_memory_region *mem = arg;
+	struct drm_i915_gem_object *obj;
+	int err;
+
+	obj = i915_gem_object_create_region(mem, PAGE_SIZE, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto err_put;
+
+	i915_gem_object_unpin_pages(obj);
+
+	err = i915_memory_region_evict(mem, PAGE_SIZE);
+	if (err != -ENOSPC) {
+		pr_err("expected evict to fail with -ENOSPC, got %d\n", err);
+		goto err_put;
+	}
+
+	i915_gem_object_put(obj);
+
+	obj = i915_gem_object_create_region(mem, PAGE_SIZE, I915_BO_ALLOC_VOLATILE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	if (!(obj->flags & I915_BO_ALLOC_VOLATILE)) {
+		pr_err("missing flags\n");
+		goto err_put;
+	}
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto err_put;
+
+	i915_gem_object_unpin_pages(obj);
+
+	err = i915_memory_region_evict(mem, PAGE_SIZE);
+	if (err) {
+		pr_err("failed to shrink memory\n");
+		goto err_put;
+	}
+
+	if (i915_gem_object_has_pages(obj)) {
+		pr_err("object pages not discarded\n");
+		err = -EINVAL;
+	}
+
+err_put:
+	i915_gem_object_put(obj);
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_mock_fill),
 		SUBTEST(igt_mock_evict),
 		SUBTEST(igt_mock_continuous),
+		SUBTEST(igt_mock_volatile),
 	};
 	struct intel_memory_region *mem;
 	struct drm_i915_private *i915;
-- 
2.20.1


* [PATCH v2 06/37] drm/i915: Add memory region information to device_info
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (4 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 05/37] drm/i915/region: support volatile objects Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:05   ` Chris Wilson
  2019-06-27 23:08   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 07/37] drm/i915: support creating LMEM objects Matthew Auld
                   ` (33 subsequent siblings)
  39 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

Expose the available memory regions for the platform. Shared (system)
memory will always be available.
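
The HAS_REGION()/region-mask scheme added here reduces to a bitmask
check on device_info; a userspace mock (the bit values below are
illustrative, the real definitions live in the memory region headers
added earlier in the series):

```c
#include <assert.h>

/* Hypothetical bit values for illustration only. */
#define REGION_SMEM   (1UL << 0)
#define REGION_LMEM   (1UL << 1)
#define REGION_STOLEN (1UL << 2)

struct mock_device_info {
	unsigned long memory_regions;
};

/* Mock of HAS_REGION(i915, i): a plain bitmask test. */
static int mock_has_region(const struct mock_device_info *info,
			   unsigned long region)
{
	return (info->memory_regions & region) != 0;
}
```

With GEN_DEFAULT_REGIONS every platform carries at least system and
stolen memory, so only LMEM-capable devices set the extra bit.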

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h               |  2 ++
 drivers/gpu/drm/i915/i915_pci.c               | 29 ++++++++++++++-----
 drivers/gpu/drm/i915/intel_device_info.h      |  1 +
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  2 ++
 4 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 97d02b32a208..838a796d9c55 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2288,6 +2288,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 
 #define HAS_IPC(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_ipc)
 
+#define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
+
 /*
  * For now, anything with a GuC requires uCode loading, and then supports
  * command submission once loaded. But these are logically independent
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 94b588e0a1dd..c513532b8da7 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -144,6 +144,9 @@
 #define GEN_DEFAULT_PAGE_SIZES \
 	.page_sizes = I915_GTT_PAGE_SIZE_4K
 
+#define GEN_DEFAULT_REGIONS \
+	.memory_regions = REGION_SMEM | REGION_STOLEN
+
 #define I830_FEATURES \
 	GEN(2), \
 	.is_mobile = 1, \
@@ -161,7 +164,8 @@
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	I9XX_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES
+	GEN_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 #define I845_FEATURES \
 	GEN(2), \
@@ -178,7 +182,8 @@
 	I845_PIPE_OFFSETS, \
 	I845_CURSOR_OFFSETS, \
 	I9XX_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES
+	GEN_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 static const struct intel_device_info intel_i830_info = {
 	I830_FEATURES,
@@ -212,7 +217,8 @@ static const struct intel_device_info intel_i865g_info = {
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	I9XX_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES
+	GEN_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 static const struct intel_device_info intel_i915g_info = {
 	GEN3_FEATURES,
@@ -297,7 +303,8 @@ static const struct intel_device_info intel_pineview_m_info = {
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	I965_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES
+	GEN_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 static const struct intel_device_info intel_i965g_info = {
 	GEN4_FEATURES,
@@ -347,7 +354,8 @@ static const struct intel_device_info intel_gm45_info = {
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	ILK_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES
+	GEN_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 static const struct intel_device_info intel_ironlake_d_info = {
 	GEN5_FEATURES,
@@ -377,7 +385,8 @@ static const struct intel_device_info intel_ironlake_m_info = {
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	ILK_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES
+	GEN_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 #define SNB_D_PLATFORM \
 	GEN6_FEATURES, \
@@ -425,7 +434,8 @@ static const struct intel_device_info intel_sandybridge_m_gt2_info = {
 	IVB_PIPE_OFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	IVB_COLORS, \
-	GEN_DEFAULT_PAGE_SIZES
+	GEN_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 #define IVB_D_PLATFORM \
 	GEN7_FEATURES, \
@@ -486,6 +496,7 @@ static const struct intel_device_info intel_valleyview_info = {
 	I9XX_CURSOR_OFFSETS,
 	I965_COLORS,
 	GEN_DEFAULT_PAGE_SIZES,
+	GEN_DEFAULT_REGIONS,
 };
 
 #define G75_FEATURES  \
@@ -582,6 +593,7 @@ static const struct intel_device_info intel_cherryview_info = {
 	CHV_CURSOR_OFFSETS,
 	CHV_COLORS,
 	GEN_DEFAULT_PAGE_SIZES,
+	GEN_DEFAULT_REGIONS,
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
@@ -657,7 +669,8 @@ static const struct intel_device_info intel_skylake_gt4_info = {
 	HSW_PIPE_OFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	IVB_COLORS, \
-	GEN9_DEFAULT_PAGE_SIZES
+	GEN9_DEFAULT_PAGE_SIZES, \
+	GEN_DEFAULT_REGIONS
 
 static const struct intel_device_info intel_broxton_info = {
 	GEN9_LP_FEATURES,
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index ddafc819bf30..63369b65110e 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -170,6 +170,7 @@ struct intel_device_info {
 	} display;
 
 	u16 ddb_size; /* in blocks */
+	u32 memory_regions;
 
 	/* Register offsets for the various display pipes and transcoders */
 	int pipe_offsets[I915_MAX_TRANSCODERS];
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 86f86c3d38a8..f8b48304fcec 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -179,6 +179,8 @@ struct drm_i915_private *mock_gem_device(void)
 		I915_GTT_PAGE_SIZE_64K |
 		I915_GTT_PAGE_SIZE_2M;
 
+	mkwrite_device_info(i915)->memory_regions = REGION_SMEM;
+
 	mock_uncore_init(&i915->uncore);
 	i915_gem_init__mm(i915);
 	intel_gt_init_early(&i915->gt, i915);
-- 
2.20.1


* [PATCH v2 07/37] drm/i915: support creating LMEM objects
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (5 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 06/37] drm/i915: Add memory region information to device_info Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:11   ` Chris Wilson
  2019-06-27 23:16   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 08/37] drm/i915: setup io-mapping for LMEM Matthew Auld
                   ` (32 subsequent siblings)
  39 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

We currently define LMEM, or local memory, as just another memory
region, like system memory or stolen memory, which we can expose to
userspace and map to the CPU via some BAR.
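
The argument checks that i915_gem_object_create_region() applies before
handing off to the region's create_object hook can be sketched as a
userspace mock (errno value hard-coded for illustration): unknown flags
are rejected and the size is rounded up to the region's minimum page
size.

```c
#include <assert.h>

#define I915_BO_ALLOC_CONTIGUOUS (1 << 0)
#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS)

/*
 * Mock of the create-time validation: returns 0 with the adjusted size,
 * or a negative errno-style value, mirroring the ERR_PTR() paths.
 */
static long mock_create_region_check(unsigned long size,
				     unsigned long min_page_size,
				     unsigned int flags,
				     unsigned long *out_size)
{
	if (flags & ~I915_BO_ALLOC_FLAGS)
		return -22;	/* -EINVAL */

	/* round_up(size, min_page_size), assuming power-of-two pages */
	*out_size = (size + min_page_size - 1) / min_page_size * min_page_size;
	return 0;
}
```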

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |  1 +
 drivers/gpu/drm/i915/i915_drv.h               |  5 ++
 drivers/gpu/drm/i915/intel_region_lmem.c      | 66 +++++++++++++++++++
 drivers/gpu/drm/i915/intel_region_lmem.h      | 16 +++++
 .../drm/i915/selftests/i915_live_selftests.h  |  1 +
 .../drm/i915/selftests/intel_memory_region.c  | 43 ++++++++++++
 6 files changed, 132 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.c
 create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 28fac19f7b04..e782f7d10524 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -132,6 +132,7 @@ i915-y += \
 	  i915_scheduler.o \
 	  i915_trace_points.o \
 	  i915_vma.o \
+	  intel_region_lmem.o \
 	  intel_wopcm.o
 
 # general-purpose microcontroller (GuC) support
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 838a796d9c55..7cbdffe3f129 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -93,6 +93,8 @@
 #include "gt/intel_timeline.h"
 #include "i915_vma.h"
 
+#include "intel_region_lmem.h"
+
 #include "intel_gvt.h"
 
 /* General customization:
@@ -1341,6 +1343,8 @@ struct drm_i915_private {
 	 */
 	resource_size_t stolen_usable_size;	/* Total size minus reserved ranges */
 
+	struct intel_memory_region *regions[ARRAY_SIZE(intel_region_map)];
+
 	struct intel_uncore uncore;
 
 	struct i915_virtual_gpu vgpu;
@@ -2289,6 +2293,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_IPC(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_ipc)
 
 #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
+#define HAS_LMEM(i915) HAS_REGION(i915, REGION_LMEM)
 
 /*
  * For now, anything with a GuC requires uCode loading, and then supports
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
new file mode 100644
index 000000000000..c4b5a88627a3
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_memory_region.h"
+#include "intel_region_lmem.h"
+
+static const struct drm_i915_gem_object_ops region_lmem_obj_ops = {
+	.get_pages = i915_memory_region_get_pages_buddy,
+	.put_pages = i915_memory_region_put_pages_buddy,
+	.release = i915_gem_object_release_memory_region,
+};
+
+static struct drm_i915_gem_object *
+lmem_create_object(struct intel_memory_region *mem,
+		   resource_size_t size,
+		   unsigned int flags)
+{
+	struct drm_i915_private *i915 = mem->i915;
+	struct drm_i915_gem_object *obj;
+	unsigned int cache_level;
+
+	if (flags & I915_BO_ALLOC_CONTIGUOUS)
+		size = roundup_pow_of_two(size);
+
+	if (size > BIT(mem->mm.max_order) * mem->mm.min_size)
+		return ERR_PTR(-E2BIG);
+
+	obj = i915_gem_object_alloc();
+	if (!obj)
+		return ERR_PTR(-ENOMEM);
+
+	drm_gem_private_object_init(&i915->drm, &obj->base, size);
+	i915_gem_object_init(obj, &region_lmem_obj_ops);
+
+	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
+
+	cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
+	i915_gem_object_set_cache_coherency(obj, cache_level);
+
+	return obj;
+}
+
+static const struct intel_memory_region_ops region_lmem_ops = {
+	.init = i915_memory_region_init_buddy,
+	.release = i915_memory_region_release_buddy,
+	.create_object = lmem_create_object,
+};
+
+bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
+{
+	struct intel_memory_region *region = obj->memory_region;
+
+	return region && region->type == INTEL_LMEM;
+}
+
+struct drm_i915_gem_object *
+i915_gem_object_create_lmem(struct drm_i915_private *i915,
+			    resource_size_t size,
+			    unsigned int flags)
+{
+	return i915_gem_object_create_region(i915->regions[INTEL_MEMORY_LMEM],
+					     size, flags);
+}
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.h b/drivers/gpu/drm/i915/intel_region_lmem.h
new file mode 100644
index 000000000000..0f0a6249d5b9
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_region_lmem.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __INTEL_REGION_LMEM_H
+#define __INTEL_REGION_LMEM_H
+
+bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
+
+struct drm_i915_gem_object *
+i915_gem_object_create_lmem(struct drm_i915_private *i915,
+			    resource_size_t size,
+			    unsigned int flags);
+
+#endif /* !__INTEL_REGION_LMEM_H */
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index 2b31a4ee0b4c..1b76c2c12ca9 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -28,6 +28,7 @@ selftest(contexts, i915_gem_context_live_selftests)
 selftest(blt, i915_gem_object_blt_live_selftests)
 selftest(client, i915_gem_client_blt_live_selftests)
 selftest(reset, intel_reset_live_selftests)
+selftest(memory_region, intel_memory_region_live_selftests)
 selftest(hangcheck, intel_hangcheck_live_selftests)
 selftest(execlists, intel_execlists_live_selftests)
 selftest(guc, intel_guc_live_selftest)
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index bdf044e4781d..3ac320b28ef1 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -344,6 +344,27 @@ static int igt_mock_volatile(void *arg)
 	return err;
 }
 
+static int igt_lmem_create(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct drm_i915_gem_object *obj;
+	int err = 0;
+
+	obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto out_put;
+
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
@@ -380,3 +401,25 @@ int intel_memory_region_mock_selftests(void)
 
 	return err;
 }
+
+int intel_memory_region_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_lmem_create),
+	};
+	int err;
+
+	if (!HAS_LMEM(i915)) {
+		pr_info("device lacks LMEM support, skipping\n");
+		return 0;
+	}
+
+	if (i915_terminally_wedged(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	err = i915_subtests(tests, i915);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return err;
+}
-- 
2.20.1


* [PATCH v2 08/37] drm/i915: setup io-mapping for LMEM
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (6 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 07/37] drm/i915: support creating LMEM objects Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 09/37] drm/i915/lmem: support kernel mapping Matthew Auld
                   ` (31 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/intel_region_lmem.c | 28 ++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index c4b5a88627a3..15655cc5013f 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -43,9 +43,33 @@ lmem_create_object(struct intel_memory_region *mem,
 	return obj;
 }
 
+static void
+region_lmem_release(struct intel_memory_region *mem)
+{
+	io_mapping_fini(&mem->iomap);
+	i915_memory_region_release_buddy(mem);
+}
+
+static int
+region_lmem_init(struct intel_memory_region *mem)
+{
+	int ret;
+
+	if (!io_mapping_init_wc(&mem->iomap,
+				mem->io_start,
+				resource_size(&mem->region)))
+		return -EIO;
+
+	ret = i915_memory_region_init_buddy(mem);
+	if (ret)
+		io_mapping_fini(&mem->iomap);
+
+	return ret;
+}
+
 static const struct intel_memory_region_ops region_lmem_ops = {
-	.init = i915_memory_region_init_buddy,
-	.release = i915_memory_region_release_buddy,
+	.init = region_lmem_init,
+	.release = region_lmem_release,
 	.create_object = lmem_create_object,
 };
 
-- 
2.20.1


* [PATCH v2 09/37] drm/i915/lmem: support kernel mapping
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (7 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 08/37] drm/i915: setup io-mapping for LMEM Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:27   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 10/37] drm/i915/blt: support copying objects Matthew Auld
                   ` (30 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

We can create LMEM objects, but we also need to support mapping them
into kernel space for internal use.
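
The map/unmap dispatch this patch threads through the page-management
code has three cases; a small mock of the decision (names are
illustrative, the real calls are io_mapping_unmap(), vunmap() and
kunmap()):

```c
#include <assert.h>

enum mock_unmap { UNMAP_IO, UNMAP_VUNMAP, UNMAP_KUNMAP };

/*
 * Mock of the unmap dispatch: an LMEM mapping came from the region's
 * io_mapping, a multi-page system-memory mapping came from vmap(), and
 * a single WB page came from kmap().
 */
static enum mock_unmap pick_unmap(int is_lmem, int is_vmalloc_ptr)
{
	if (is_lmem)
		return UNMAP_IO;	/* io_mapping_unmap() */
	if (is_vmalloc_ptr)
		return UNMAP_VUNMAP;	/* vunmap() */
	return UNMAP_KUNMAP;		/* kunmap(kmap_to_page(ptr)) */
}
```

Note that on the map side the patch ignores the requested map type for
LMEM and always returns a WC io-mapping, as the XXX comment in the hunk
below admits.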

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 18 ++++-
 drivers/gpu/drm/i915/intel_region_lmem.c      | 24 ++++++
 drivers/gpu/drm/i915/intel_region_lmem.h      |  6 ++
 .../drm/i915/selftests/intel_memory_region.c  | 77 +++++++++++++++++++
 4 files changed, 121 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index b36ad269f4ea..15eaaedffc46 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -176,7 +176,9 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 		void *ptr;
 
 		ptr = page_mask_bits(obj->mm.mapping);
-		if (is_vmalloc_addr(ptr))
+		if (i915_gem_object_is_lmem(obj))
+			io_mapping_unmap(ptr);
+		else if (is_vmalloc_addr(ptr))
 			vunmap(ptr);
 		else
 			kunmap(kmap_to_page(ptr));
@@ -235,7 +237,7 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 }
 
 /* The 'mapping' part of i915_gem_object_pin_map() below */
-static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
+static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
 				 enum i915_map_type type)
 {
 	unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
@@ -248,6 +250,11 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
 	pgprot_t pgprot;
 	void *addr;
 
+	if (i915_gem_object_is_lmem(obj)) {
+		/* XXX: we are ignoring the type here -- this is simply wc */
+		return i915_gem_object_lmem_io_map(obj, 0, obj->base.size);
+	}
+
 	/* A single page can always be kmapped */
 	if (n_pages == 1 && type == I915_MAP_WB)
 		return kmap(sg_page(sgt->sgl));
@@ -293,7 +300,8 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 	void *ptr;
 	int err;
 
-	if (unlikely(!i915_gem_object_has_struct_page(obj)))
+	if (unlikely(!i915_gem_object_has_struct_page(obj) &&
+		     !i915_gem_object_is_lmem(obj)))
 		return ERR_PTR(-ENXIO);
 
 	err = mutex_lock_interruptible(&obj->mm.lock);
@@ -325,7 +333,9 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 			goto err_unpin;
 		}
 
-		if (is_vmalloc_addr(ptr))
+		if (i915_gem_object_is_lmem(obj))
+			io_mapping_unmap(ptr);
+		else if (is_vmalloc_addr(ptr))
 			vunmap(ptr);
 		else
 			kunmap(kmap_to_page(ptr));
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 15655cc5013f..701bcac3479e 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -73,6 +73,30 @@ static const struct intel_memory_region_ops region_lmem_ops = {
 	.create_object = lmem_create_object,
 };
 
+/* XXX: Time to vfunc your life up? */
+void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
+					       unsigned long n)
+{
+	resource_size_t offset;
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+
+	return io_mapping_map_atomic_wc(&obj->memory_region->iomap, offset);
+}
+
+void __iomem *i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+					  unsigned long n,
+					  unsigned long size)
+{
+	resource_size_t offset;
+
+	GEM_BUG_ON(!(obj->flags & I915_BO_ALLOC_CONTIGUOUS));
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+
+	return io_mapping_map_wc(&obj->memory_region->iomap, offset, size);
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
 	struct intel_memory_region *region = obj->memory_region;
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.h b/drivers/gpu/drm/i915/intel_region_lmem.h
index 0f0a6249d5b9..20084f7b4bff 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.h
+++ b/drivers/gpu/drm/i915/intel_region_lmem.h
@@ -6,6 +6,12 @@
 #ifndef __INTEL_REGION_LMEM_H
 #define __INTEL_REGION_LMEM_H
 
+
+void __iomem *i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+					  unsigned long n, unsigned long size);
+void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
+					       unsigned long n);
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 3ac320b28ef1..85d118c10d15 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -9,8 +9,11 @@
 
 #include "mock_gem_device.h"
 #include "gem/selftests/mock_context.h"
+#include "selftests/igt_flush_test.h"
 #include "mock_drm.h"
 
+#include "gem/i915_gem_object_blt.h"
+
 static void close_objects(struct list_head *objects)
 {
 	struct drm_i915_gem_object *obj, *on;
@@ -365,6 +368,79 @@ static int igt_lmem_create(void *arg)
 	return err;
 }
 
+static int igt_lmem_write_cpu(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_context *ce;
+	struct drm_i915_gem_object *obj;
+	struct rnd_state prng;
+	u32 *vaddr;
+	u32 dword;
+	u32 val;
+	u32 sz;
+	int err;
+
+	/* Check the engine exists before dereferencing it */
+	if (!HAS_ENGINE(i915, BCS0))
+		return 0;
+
+	ce = i915->engine[BCS0]->kernel_context;
+
+	prandom_seed_state(&prng, i915_selftest.random_seed);
+	sz = round_up(prandom_u32_state(&prng) % SZ_32M, PAGE_SIZE);
+
+	obj = i915_gem_object_create_lmem(i915, sz, I915_BO_ALLOC_CONTIGUOUS);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	if (IS_ERR(vaddr)) {
+		pr_err("Failed to iomap lmembar; err=%d\n", (int)PTR_ERR(vaddr));
+		err = PTR_ERR(vaddr);
+		goto out_put;
+	}
+
+	val = prandom_u32_state(&prng);
+
+	/* Write from gpu and then read from cpu */
+	err = i915_gem_object_fill_blt(obj, ce, val);
+	if (err)
+		goto out_unpin;
+
+	i915_gem_object_lock(obj);
+	err = i915_gem_object_set_to_wc_domain(obj, true);
+	i915_gem_object_unlock(obj);
+	if (err)
+		goto out_unpin;
+
+	for (dword = 0; dword < sz / sizeof(u32); ++dword) {
+		if (vaddr[dword] != val) {
+			pr_err("vaddr[%u]=%u, val=%u\n", dword, vaddr[dword],
+			       val);
+			err = -EINVAL;
+			break;
+		}
+	}
+
+	/* Write from the cpu and read again from the cpu */
+	memset32(vaddr, val ^ 0xdeadbeaf, sz / sizeof(u32));
+
+	for (dword = 0; dword < sz / sizeof(u32); ++dword) {
+		if (vaddr[dword] != (val ^ 0xdeadbeaf)) {
+			pr_err("vaddr[%u]=%u, val=%u\n", dword, vaddr[dword],
+			       val ^ 0xdeadbeaf);
+			err = -EINVAL;
+			break;
+		}
+	}
+
+out_unpin:
+	i915_gem_object_unpin_map(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
@@ -406,6 +482,7 @@ int intel_memory_region_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_lmem_create),
+		SUBTEST(igt_lmem_write_cpu),
 	};
 	int err;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v2 10/37] drm/i915/blt: support copying objects
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (8 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 09/37] drm/i915/lmem: support kernel mapping Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:35   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 11/37] drm/i915/selftests: move gpu-write-dw into utils Matthew Auld
                   ` (29 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

We can already clear an object with the blt, so try to do the same to
support copying from one object backing store to another. Really this is
just object -> object, which is not that useful yet. What we really want
is copying between two backing stores of the same object, but that will
require some vma rework first; until then we are stuck with "tmp"
objects.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 135 ++++++++++++++++++
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |   8 ++
 .../i915/gem/selftests/i915_gem_object_blt.c  | 105 ++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   3 +-
 4 files changed, 250 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
index cb42e3a312e2..c2b28e06c379 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
@@ -102,6 +102,141 @@ int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
 	return err;
 }
 
+int intel_emit_vma_copy_blt(struct i915_request *rq,
+			    struct i915_vma *src,
+			    struct i915_vma *dst)
+{
+	const int gen = INTEL_GEN(rq->i915);
+	u32 *cs;
+
+	GEM_BUG_ON(src->size != dst->size);
+
+	cs = intel_ring_begin(rq, 10);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	if (gen >= 9) {
+		*cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10-2);
+		*cs++ = BLT_DEPTH_32 | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = src->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = lower_32_bits(dst->node.start);
+		*cs++ = upper_32_bits(dst->node.start);
+		*cs++ = 0;
+		*cs++ = PAGE_SIZE;
+		*cs++ = lower_32_bits(src->node.start);
+		*cs++ = upper_32_bits(src->node.start);
+	} else if (gen >= 8) {
+		*cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10-2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = src->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = lower_32_bits(dst->node.start);
+		*cs++ = upper_32_bits(dst->node.start);
+		*cs++ = 0;
+		*cs++ = PAGE_SIZE;
+		*cs++ = lower_32_bits(src->node.start);
+		*cs++ = upper_32_bits(src->node.start);
+	} else {
+		*cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8-2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = src->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = dst->node.start;
+		*cs++ = 0;
+		*cs++ = PAGE_SIZE;
+		*cs++ = src->node.start;
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+	}
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
+int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
+			     struct drm_i915_gem_object *dst,
+			     struct intel_context *ce)
+{
+	struct drm_i915_private *i915 = to_i915(src->base.dev);
+	struct i915_gem_context *ctx = ce->gem_context;
+	struct i915_address_space *vm = ctx->vm ?: &i915->ggtt.vm;
+	struct drm_gem_object *objs[] = { &src->base, &dst->base };
+	struct ww_acquire_ctx acquire;
+	struct i915_vma *vma_src, *vma_dst;
+	struct i915_request *rq;
+	int err;
+
+	vma_src = i915_vma_instance(src, vm, NULL);
+	if (IS_ERR(vma_src))
+		return PTR_ERR(vma_src);
+
+	err = i915_vma_pin(vma_src, 0, 0, PIN_USER);
+	if (unlikely(err))
+		return err;
+
+	vma_dst = i915_vma_instance(dst, vm, NULL);
+	if (IS_ERR(vma_dst)) {
+		err = PTR_ERR(vma_dst);
+		goto out_unpin_src;
+	}
+
+	err = i915_vma_pin(vma_dst, 0, 0, PIN_USER);
+	if (unlikely(err))
+		goto out_unpin_src;
+
+	rq = i915_request_create(ce);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto out_unpin_dst;
+	}
+
+	err = drm_gem_lock_reservations(objs, ARRAY_SIZE(objs), &acquire);
+	if (unlikely(err))
+		goto out_request;
+
+	if (src->cache_dirty & ~src->cache_coherent)
+		i915_gem_clflush_object(src, 0);
+
+	if (dst->cache_dirty & ~dst->cache_coherent)
+		i915_gem_clflush_object(dst, 0);
+
+	err = i915_request_await_object(rq, src, false);
+	if (unlikely(err))
+		goto out_unlock;
+
+	err = i915_vma_move_to_active(vma_src, rq, 0);
+	if (unlikely(err))
+		goto out_unlock;
+
+	err = i915_request_await_object(rq, dst, true);
+	if (unlikely(err))
+		goto out_unlock;
+
+	err = i915_vma_move_to_active(vma_dst, rq, EXEC_OBJECT_WRITE);
+	if (unlikely(err))
+		goto out_unlock;
+
+	if (ce->engine->emit_init_breadcrumb) {
+		err = ce->engine->emit_init_breadcrumb(rq);
+		if (unlikely(err))
+			goto out_unlock;
+	}
+
+	err = intel_emit_vma_copy_blt(rq, vma_src, vma_dst);
+out_unlock:
+	drm_gem_unlock_reservations(objs, ARRAY_SIZE(objs), &acquire);
+out_request:
+	if (unlikely(err))
+		i915_request_skip(rq, err);
+
+	i915_request_add(rq);
+out_unpin_dst:
+	i915_vma_unpin(vma_dst);
+out_unpin_src:
+	i915_vma_unpin(vma_src);
+	return err;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/i915_gem_object_blt.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
index 7ec7de6ac0c0..17fac835f391 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
@@ -21,4 +21,12 @@ int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
 			     struct intel_context *ce,
 			     u32 value);
 
+int intel_emit_vma_copy_blt(struct i915_request *rq,
+			    struct i915_vma *src,
+			    struct i915_vma *dst);
+
+int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
+			     struct drm_i915_gem_object *dst,
+			     struct intel_context *ce);
+
 #endif
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
index e23d8c9e9298..1f28a12f7bb4 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
@@ -94,10 +94,115 @@ static int igt_fill_blt(void *arg)
 	return err;
 }
 
+static int igt_copy_blt(void *arg)
+{
+	struct intel_context *ce = arg;
+	struct drm_i915_private *i915 = ce->gem_context->i915;
+	struct drm_i915_gem_object *src, *dst;
+	struct rnd_state prng;
+	IGT_TIMEOUT(end);
+	u32 *vaddr;
+	int err = 0;
+
+	prandom_seed_state(&prng, i915_selftest.random_seed);
+
+	do {
+		u32 sz = prandom_u32_state(&prng) % SZ_32M;
+		u32 val = prandom_u32_state(&prng);
+		u32 i;
+
+		sz = round_up(sz, PAGE_SIZE);
+
+		pr_debug("%s with sz=%x, val=%x\n", __func__, sz, val);
+
+		src = i915_gem_object_create_internal(i915, sz);
+		if (IS_ERR(src)) {
+			err = PTR_ERR(src);
+			goto err_flush;
+		}
+
+		vaddr = i915_gem_object_pin_map(src, I915_MAP_WB);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto err_put_src;
+		}
+
+		memset32(vaddr, val, src->base.size / sizeof(u32));
+
+		i915_gem_object_unpin_map(src);
+
+		if (!(src->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
+			src->cache_dirty = true;
+
+		dst = i915_gem_object_create_internal(i915, sz);
+		if (IS_ERR(dst)) {
+			err = PTR_ERR(dst);
+			goto err_put_src;
+		}
+
+		vaddr = i915_gem_object_pin_map(dst, I915_MAP_WB);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto err_put_dst;
+		}
+
+		memset32(vaddr, val ^ 0xdeadbeaf, dst->base.size / sizeof(u32));
+
+		if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
+			dst->cache_dirty = true;
+
+		mutex_lock(&i915->drm.struct_mutex);
+		err = i915_gem_object_copy_blt(src, dst, ce);
+		mutex_unlock(&i915->drm.struct_mutex);
+		if (err)
+			goto err_unpin;
+
+		i915_gem_object_lock(dst);
+		err = i915_gem_object_set_to_cpu_domain(dst, false);
+		i915_gem_object_unlock(dst);
+		if (err)
+			goto err_unpin;
+
+		for (i = 0; i < dst->base.size / sizeof(u32); ++i) {
+			if (vaddr[i] != val) {
+				pr_err("vaddr[%u]=%x, expected=%x\n", i,
+				       vaddr[i], val);
+				err = -EINVAL;
+				goto err_unpin;
+			}
+		}
+
+		i915_gem_object_unpin_map(dst);
+
+		i915_gem_object_put(src);
+		i915_gem_object_put(dst);
+	} while (!time_after(jiffies, end));
+
+	goto err_flush;
+
+err_unpin:
+	i915_gem_object_unpin_map(dst);
+err_put_dst:
+	i915_gem_object_put(dst);
+err_put_src:
+	i915_gem_object_put(src);
+err_flush:
+	mutex_lock(&i915->drm.struct_mutex);
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	return err;
+}
+
 int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_fill_blt),
+		SUBTEST(igt_copy_blt),
 	};
 
 	if (i915_terminally_wedged(i915))
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index eec31e36aca7..e3b23351669c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -182,7 +182,8 @@
 #define COLOR_BLT_CMD			(2<<29 | 0x40<<22 | (5-2))
 #define XY_COLOR_BLT_CMD		(2 << 29 | 0x50 << 22)
 #define SRC_COPY_BLT_CMD		((2<<29)|(0x43<<22)|4)
-#define XY_SRC_COPY_BLT_CMD		((2<<29)|(0x53<<22)|6)
+#define GEN9_XY_FAST_COPY_BLT_CMD	((2<<29)|(0x42<<22))
+#define XY_SRC_COPY_BLT_CMD		((2<<29)|(0x53<<22))
 #define XY_MONO_SRC_COPY_IMM_BLT	((2<<29)|(0x71<<22)|5)
 #define   BLT_WRITE_A			(2<<20)
 #define   BLT_WRITE_RGB			(1<<20)
-- 
2.20.1


* [PATCH v2 11/37] drm/i915/selftests: move gpu-write-dw into utils
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (9 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 10/37] drm/i915/blt: support copying objects Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 12/37] drm/i915/selftests: add write-dword test for LMEM Matthew Auld
                   ` (28 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Using the GPU to write a dword across a number of pages is rather
useful, and we already carry two copies of such a helper; we don't want
a third, so move it into the shared selftest utils. There is probably
other code that could be consolidated in the same way.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 120 ++--------------
 .../drm/i915/gem/selftests/i915_gem_context.c | 134 ++---------------
 .../drm/i915/gem/selftests/igt_gem_utils.c    | 135 ++++++++++++++++++
 .../drm/i915/gem/selftests/igt_gem_utils.h    |  16 +++
 4 files changed, 169 insertions(+), 236 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index fd547b98ec69..1cdf98b7535e 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -960,126 +960,22 @@ static int igt_mock_ppgtt_64K(void *arg)
 	return err;
 }
 
-static struct i915_vma *
-gpu_write_dw(struct i915_vma *vma, u64 offset, u32 val)
-{
-	struct drm_i915_private *i915 = vma->vm->i915;
-	const int gen = INTEL_GEN(i915);
-	unsigned int count = vma->size >> PAGE_SHIFT;
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *batch;
-	unsigned int size;
-	u32 *cmd;
-	int n;
-	int err;
-
-	size = (1 + 4 * count) * sizeof(u32);
-	size = round_up(size, PAGE_SIZE);
-	obj = i915_gem_object_create_internal(i915, size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	cmd = i915_gem_object_pin_map(obj, I915_MAP_WC);
-	if (IS_ERR(cmd)) {
-		err = PTR_ERR(cmd);
-		goto err;
-	}
-
-	offset += vma->node.start;
-
-	for (n = 0; n < count; n++) {
-		if (gen >= 8) {
-			*cmd++ = MI_STORE_DWORD_IMM_GEN4;
-			*cmd++ = lower_32_bits(offset);
-			*cmd++ = upper_32_bits(offset);
-			*cmd++ = val;
-		} else if (gen >= 4) {
-			*cmd++ = MI_STORE_DWORD_IMM_GEN4 |
-				(gen < 6 ? MI_USE_GGTT : 0);
-			*cmd++ = 0;
-			*cmd++ = offset;
-			*cmd++ = val;
-		} else {
-			*cmd++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
-			*cmd++ = offset;
-			*cmd++ = val;
-		}
-
-		offset += PAGE_SIZE;
-	}
-
-	*cmd = MI_BATCH_BUFFER_END;
-	intel_gt_chipset_flush(vma->vm->gt);
-
-	i915_gem_object_unpin_map(obj);
-
-	batch = i915_vma_instance(obj, vma->vm, NULL);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto err;
-	}
-
-	err = i915_vma_pin(batch, 0, 0, PIN_USER);
-	if (err)
-		goto err;
-
-	return batch;
-
-err:
-	i915_gem_object_put(obj);
-
-	return ERR_PTR(err);
-}
-
 static int gpu_write(struct i915_vma *vma,
 		     struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine,
-		     u32 dword,
-		     u32 value)
+		     u32 dw,
+		     u32 val)
 {
-	struct i915_request *rq;
-	struct i915_vma *batch;
 	int err;
 
-	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
-
-	batch = gpu_write_dw(vma, dword * sizeof(u32), value);
-	if (IS_ERR(batch))
-		return PTR_ERR(batch);
-
-	rq = igt_request_alloc(ctx, engine);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto err_batch;
-	}
-
-	i915_vma_lock(batch);
-	err = i915_vma_move_to_active(batch, rq, 0);
-	i915_vma_unlock(batch);
-	if (err)
-		goto err_request;
-
-	i915_vma_lock(vma);
-	err = i915_gem_object_set_to_gtt_domain(vma->obj, false);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
-	i915_vma_unlock(vma);
+	i915_gem_object_lock(vma->obj);
+	err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
+	i915_gem_object_unlock(vma->obj);
 	if (err)
-		goto err_request;
-
-	err = engine->emit_bb_start(rq,
-				    batch->node.start, batch->node.size,
-				    0);
-err_request:
-	if (err)
-		i915_request_skip(rq, err);
-	i915_request_add(rq);
-err_batch:
-	i915_vma_unpin(batch);
-	i915_vma_close(batch);
-	i915_vma_put(batch);
+		return err;
 
-	return err;
+	return igt_gpu_fill_dw(vma, ctx, engine, dw * sizeof(u32),
+			       vma->size >> PAGE_SHIFT, val);
 }
 
 static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 53c81b5dfd69..db666106c244 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -155,70 +155,6 @@ static int live_nop_switch(void *arg)
 	return err;
 }
 
-static struct i915_vma *
-gpu_fill_dw(struct i915_vma *vma, u64 offset, unsigned long count, u32 value)
-{
-	struct drm_i915_gem_object *obj;
-	const int gen = INTEL_GEN(vma->vm->i915);
-	unsigned long n, size;
-	u32 *cmd;
-	int err;
-
-	size = (4 * count + 1) * sizeof(u32);
-	size = round_up(size, PAGE_SIZE);
-	obj = i915_gem_object_create_internal(vma->vm->i915, size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	cmd = i915_gem_object_pin_map(obj, I915_MAP_WB);
-	if (IS_ERR(cmd)) {
-		err = PTR_ERR(cmd);
-		goto err;
-	}
-
-	GEM_BUG_ON(offset + (count - 1) * PAGE_SIZE > vma->node.size);
-	offset += vma->node.start;
-
-	for (n = 0; n < count; n++) {
-		if (gen >= 8) {
-			*cmd++ = MI_STORE_DWORD_IMM_GEN4;
-			*cmd++ = lower_32_bits(offset);
-			*cmd++ = upper_32_bits(offset);
-			*cmd++ = value;
-		} else if (gen >= 4) {
-			*cmd++ = MI_STORE_DWORD_IMM_GEN4 |
-				(gen < 6 ? MI_USE_GGTT : 0);
-			*cmd++ = 0;
-			*cmd++ = offset;
-			*cmd++ = value;
-		} else {
-			*cmd++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
-			*cmd++ = offset;
-			*cmd++ = value;
-		}
-		offset += PAGE_SIZE;
-	}
-	*cmd = MI_BATCH_BUFFER_END;
-	i915_gem_object_flush_map(obj);
-	i915_gem_object_unpin_map(obj);
-
-	vma = i915_vma_instance(obj, vma->vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_USER);
-	if (err)
-		goto err;
-
-	return vma;
-
-err:
-	i915_gem_object_put(obj);
-	return ERR_PTR(err);
-}
-
 static unsigned long real_page_count(struct drm_i915_gem_object *obj)
 {
 	return huge_gem_object_phys_size(obj) >> PAGE_SHIFT;
@@ -235,10 +171,7 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 		    unsigned int dw)
 {
 	struct i915_address_space *vm = ctx->vm ?: &engine->gt->ggtt->vm;
-	struct i915_request *rq;
 	struct i915_vma *vma;
-	struct i915_vma *batch;
-	unsigned int flags;
 	int err;
 
 	GEM_BUG_ON(obj->base.size > vm->total);
@@ -249,7 +182,7 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 		return PTR_ERR(vma);
 
 	i915_gem_object_lock(obj);
-	err = i915_gem_object_set_to_gtt_domain(obj, false);
+	err = i915_gem_object_set_to_gtt_domain(obj, true);
 	i915_gem_object_unlock(obj);
 	if (err)
 		return err;
@@ -258,70 +191,23 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 	if (err)
 		return err;
 
-	/* Within the GTT the huge objects maps every page onto
+	/*
+	 * Within the GTT the huge objects maps every page onto
 	 * its 1024 real pages (using phys_pfn = dma_pfn % 1024).
 	 * We set the nth dword within the page using the nth
 	 * mapping via the GTT - this should exercise the GTT mapping
 	 * whilst checking that each context provides a unique view
 	 * into the object.
 	 */
-	batch = gpu_fill_dw(vma,
-			    (dw * real_page_count(obj)) << PAGE_SHIFT |
-			    (dw * sizeof(u32)),
-			    real_page_count(obj),
-			    dw);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto err_vma;
-	}
-
-	rq = igt_request_alloc(ctx, engine);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto err_batch;
-	}
-
-	flags = 0;
-	if (INTEL_GEN(vm->i915) <= 5)
-		flags |= I915_DISPATCH_SECURE;
-
-	err = engine->emit_bb_start(rq,
-				    batch->node.start, batch->node.size,
-				    flags);
-	if (err)
-		goto err_request;
-
-	i915_vma_lock(batch);
-	err = i915_vma_move_to_active(batch, rq, 0);
-	i915_vma_unlock(batch);
-	if (err)
-		goto skip_request;
-
-	i915_vma_lock(vma);
-	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
-	i915_vma_unlock(vma);
-	if (err)
-		goto skip_request;
-
-	i915_request_add(rq);
-
-	i915_vma_unpin(batch);
-	i915_vma_close(batch);
-	i915_vma_put(batch);
-
+	err = igt_gpu_fill_dw(vma,
+			      ctx,
+			      engine,
+			      (dw * real_page_count(obj)) << PAGE_SHIFT |
+			      (dw * sizeof(u32)),
+			      real_page_count(obj),
+			      dw);
 	i915_vma_unpin(vma);
 
-	return 0;
-
-skip_request:
-	i915_request_skip(rq, err);
-err_request:
-	i915_request_add(rq);
-err_batch:
-	i915_vma_unpin(batch);
-	i915_vma_put(batch);
-err_vma:
-	i915_vma_unpin(vma);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
index b232e6d2cd92..dc47a6e2586c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
+++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
@@ -9,6 +9,8 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_pm.h"
 #include "gt/intel_context.h"
+#include "i915_vma.h"
+#include "i915_drv.h"
 
 #include "i915_request.h"
 
@@ -32,3 +34,136 @@ igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 
 	return rq;
 }
+
+struct i915_vma *
+igt_emit_store_dw(struct i915_vma *vma,
+		  u64 offset,
+		  unsigned long count,
+		  u32 val)
+{
+	struct drm_i915_gem_object *obj;
+	const int gen = INTEL_GEN(vma->vm->i915);
+	unsigned long n, size;
+	u32 *cmd;
+	int err;
+
+	size = (4 * count + 1) * sizeof(u32);
+	size = round_up(size, PAGE_SIZE);
+	obj = i915_gem_object_create_internal(vma->vm->i915, size);
+	if (IS_ERR(obj))
+		return ERR_CAST(obj);
+
+	cmd = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	if (IS_ERR(cmd)) {
+		err = PTR_ERR(cmd);
+		goto err;
+	}
+
+	GEM_BUG_ON(offset + (count - 1) * PAGE_SIZE > vma->node.size);
+	offset += vma->node.start;
+
+	for (n = 0; n < count; n++) {
+		if (gen >= 8) {
+			*cmd++ = MI_STORE_DWORD_IMM_GEN4;
+			*cmd++ = lower_32_bits(offset);
+			*cmd++ = upper_32_bits(offset);
+			*cmd++ = val;
+		} else if (gen >= 4) {
+			*cmd++ = MI_STORE_DWORD_IMM_GEN4 |
+				(gen < 6 ? MI_USE_GGTT : 0);
+			*cmd++ = 0;
+			*cmd++ = offset;
+			*cmd++ = val;
+		} else {
+			*cmd++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
+			*cmd++ = offset;
+			*cmd++ = val;
+		}
+		offset += PAGE_SIZE;
+	}
+	*cmd = MI_BATCH_BUFFER_END;
+	i915_gem_object_unpin_map(obj);
+
+	vma = i915_vma_instance(obj, vma->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (err)
+		goto err;
+
+	return vma;
+
+err:
+	i915_gem_object_put(obj);
+	return ERR_PTR(err);
+}
+
+int igt_gpu_fill_dw(struct i915_vma *vma,
+		    struct i915_gem_context *ctx,
+		    struct intel_engine_cs *engine,
+		    u64 offset,
+		    unsigned long count,
+		    u32 val)
+{
+	struct i915_address_space *vm = ctx->vm ?: &engine->gt->ggtt->vm;
+	struct i915_request *rq;
+	struct i915_vma *batch;
+	unsigned int flags;
+	int err;
+
+	GEM_BUG_ON(vma->size > vm->total);
+	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
+	GEM_BUG_ON(!i915_vma_is_pinned(vma));
+
+	batch = igt_emit_store_dw(vma, offset, count, val);
+	if (IS_ERR(batch))
+		return PTR_ERR(batch);
+
+	rq = igt_request_alloc(ctx, engine);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_batch;
+	}
+
+	flags = 0;
+	if (INTEL_GEN(vm->i915) <= 5)
+		flags |= I915_DISPATCH_SECURE;
+
+	err = engine->emit_bb_start(rq,
+				    batch->node.start, batch->node.size,
+				    flags);
+	if (err)
+		goto err_request;
+
+	i915_vma_lock(batch);
+	err = i915_vma_move_to_active(batch, rq, 0);
+	i915_vma_unlock(batch);
+	if (err)
+		goto skip_request;
+
+	i915_vma_lock(vma);
+	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+	i915_vma_unlock(vma);
+	if (err)
+		goto skip_request;
+
+	i915_request_add(rq);
+
+	i915_vma_unpin(batch);
+	i915_vma_close(batch);
+	i915_vma_put(batch);
+
+	return 0;
+
+skip_request:
+	i915_request_skip(rq, err);
+err_request:
+	i915_request_add(rq);
+err_batch:
+	i915_vma_unpin(batch);
+	i915_vma_put(batch);
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
index 0f17251cf75d..361a7ef866b0 100644
--- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
+++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h
@@ -7,11 +7,27 @@
 #ifndef __IGT_GEM_UTILS_H__
 #define __IGT_GEM_UTILS_H__
 
+#include <linux/types.h>
+
 struct i915_request;
 struct i915_gem_context;
 struct intel_engine_cs;
+struct i915_vma;
 
 struct i915_request *
 igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
 
+struct i915_vma *
+igt_emit_store_dw(struct i915_vma *vma,
+		  u64 offset,
+		  unsigned long count,
+		  u32 val);
+
+int igt_gpu_fill_dw(struct i915_vma *vma,
+		    struct i915_gem_context *ctx,
+		    struct intel_engine_cs *engine,
+		    u64 offset,
+		    unsigned long count,
+		    u32 val);
+
 #endif /* __IGT_GEM_UTILS_H__ */
-- 
2.20.1


* [PATCH v2 12/37] drm/i915/selftests: add write-dword test for LMEM
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (10 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 11/37] drm/i915/selftests: move gpu-write-dw into utils Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 13/37] drm/i915/selftests: don't just test CACHE_NONE for huge-pages Matthew Auld
                   ` (27 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Add a simple test that writes dwords across an object, using the
available engines in a randomized order, and then checks from the CPU
that our writes landed.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 .../drm/i915/selftests/intel_memory_region.c  | 161 ++++++++++++++++++
 1 file changed, 161 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 85d118c10d15..23c466a1b800 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -10,6 +10,7 @@
 #include "mock_gem_device.h"
 #include "gem/selftests/mock_context.h"
 #include "selftests/igt_flush_test.h"
+#include "selftests/i915_random.h"
 #include "mock_drm.h"
 
 #include "gem/i915_gem_object_blt.h"
@@ -347,6 +348,128 @@ static int igt_mock_volatile(void *arg)
 	return err;
 }
 
+static int igt_gpu_write_dw(struct i915_vma *vma,
+			    struct i915_gem_context *ctx,
+			    struct intel_engine_cs *engine,
+			    u32 dword,
+			    u32 value)
+{
+	int err;
+
+	i915_gem_object_lock(vma->obj);
+	err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
+	i915_gem_object_unlock(vma->obj);
+	if (err)
+		return err;
+
+	return igt_gpu_fill_dw(vma, ctx, engine, dword * sizeof(u32),
+			       vma->size >> PAGE_SHIFT, value);
+}
+
+static int igt_cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
+{
+	unsigned long n;
+	int err;
+
+	i915_gem_object_lock(obj);
+	err = i915_gem_object_set_to_wc_domain(obj, false);
+	i915_gem_object_unlock(obj);
+	if (err)
+		return err;
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		return err;
+
+	for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
+		u32 __iomem *base;
+		u32 read_val;
+
+		base = i915_gem_object_lmem_io_map_page(obj, n);
+
+		read_val = ioread32(base + dword);
+		io_mapping_unmap_atomic(base);
+		if (read_val != val) {
+			pr_err("n=%lu base[%u]=%u, val=%u\n",
+			       n, dword, read_val, val);
+			err = -EINVAL;
+			break;
+		}
+	}
+
+	i915_gem_object_unpin_pages(obj);
+	return err;
+}
+
+static int igt_gpu_write(struct i915_gem_context *ctx,
+			 struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = ctx->i915;
+	struct i915_address_space *vm = ctx->vm ?: &i915->ggtt.vm;
+	static struct intel_engine_cs *engines[I915_NUM_ENGINES];
+	struct intel_engine_cs *engine;
+	IGT_TIMEOUT(end_time);
+	I915_RND_STATE(prng);
+	struct i915_vma *vma;
+	unsigned int id;
+	int *order;
+	int i, n;
+	int err;
+
+	n = 0;
+	for_each_engine(engine, i915, id) {
+		if (!intel_engine_can_store_dword(engine)) {
+			pr_info("store-dword-imm not supported on engine=%u\n",
+				id);
+			continue;
+		}
+		engines[n++] = engine;
+	}
+
+	if (!n)
+		return 0;
+
+	order = i915_random_order(n * I915_NUM_ENGINES, &prng);
+	if (!order)
+		return -ENOMEM;
+
+	vma = i915_vma_instance(obj, vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto out_free;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (err)
+		goto out_free;
+
+	i = 0;
+	do {
+		u32 rng = prandom_u32_state(&prng);
+		u32 dword = offset_in_page(rng) / 4;
+
+		engine = engines[order[i] % n];
+		i = (i + 1) % (n * I915_NUM_ENGINES);
+
+		err = igt_gpu_write_dw(vma, ctx, engine, dword, rng);
+		if (err)
+			break;
+
+		err = igt_cpu_check(obj, dword, rng);
+		if (err)
+			break;
+	} while (!__igt_timeout(end_time, NULL));
+
+	i915_vma_unpin(vma);
+out_free:
+	kfree(order);
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	return err;
+}
+
 static int igt_lmem_create(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -368,6 +491,43 @@ static int igt_lmem_create(void *arg)
 	return err;
 }
 
+static int igt_lmem_write_gpu(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct drm_i915_gem_object *obj;
+	I915_RND_STATE(prng);
+	u32 sz;
+	int err;
+
+	/*
+	 * XXX: Nothing too big for now; we don't want to upset CI. What we
+	 * really want is the huge dma stuff for device memory, then we can go
+	 * to town...
+	 */
+	sz = round_up(prandom_u32_state(&prng) % SZ_32M, PAGE_SIZE);
+
+	obj = i915_gem_object_create_lmem(i915, sz, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto out_put;
+
+	err = igt_gpu_write(i915->kernel_context, obj);
+	if (err)
+		pr_err("igt_gpu_write failed(%d)\n", err);
+
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	return err;
+}
+
 static int igt_lmem_write_cpu(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -483,6 +643,7 @@ int intel_memory_region_live_selftests(struct drm_i915_private *i915)
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_lmem_create),
 		SUBTEST(igt_lmem_write_cpu),
+		SUBTEST(igt_lmem_write_gpu),
 	};
 	int err;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 88+ messages in thread
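The test's write/readback convention — the GPU stores a value at one dword slot of every page, and the CPU then checks that slot through a per-page io-mapping — can be sketched with an ordinary buffer standing in for LMEM. The buffer and helper names below are illustrative stand-ins, not driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define DWORDS_PER_PAGE (PAGE_SIZE / sizeof(uint32_t))

/* Write 'val' at dword slot 'dword' of every page, as the GPU fill would. */
static void fill_dword(uint32_t *buf, size_t npages, uint32_t dword, uint32_t val)
{
	for (size_t n = 0; n < npages; n++)
		buf[n * DWORDS_PER_PAGE + dword] = val;
}

/* CPU check: 0 if every page holds 'val' at slot 'dword', else -1.
 * In the selftest the per-page access goes through an atomic io-mapping
 * of LMEM rather than a plain array index. */
static int cpu_check(const uint32_t *buf, size_t npages, uint32_t dword, uint32_t val)
{
	for (size_t n = 0; n < npages; n++)
		if (buf[n * DWORDS_PER_PAGE + dword] != val)
			return -1;
	return 0;
}
```

In the selftest the slot is derived from a PRNG (`offset_in_page(rng) / 4`), so `dword` is always below `DWORDS_PER_PAGE`, and the written value is the PRNG output itself.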

* [PATCH v2 13/37] drm/i915/selftests: don't just test CACHE_NONE for huge-pages
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (11 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 12/37] drm/i915/selftests: add write-dword test for LMEM Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:40   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 14/37] drm/i915/selftest: extend coverage to include LMEM huge-pages Matthew Auld
                   ` (26 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

We also want to test the LLC cache-level, so select the cache-level with
HAS_LLC() instead of hardcoding I915_CACHE_NONE.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 1cdf98b7535e..1862bf06a20f 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -152,6 +152,7 @@ huge_pages_object(struct drm_i915_private *i915,
 		  unsigned int page_mask)
 {
 	struct drm_i915_gem_object *obj;
+	unsigned int cache_level;
 
 	GEM_BUG_ON(!size);
 	GEM_BUG_ON(!IS_ALIGNED(size, BIT(__ffs(page_mask))));
@@ -171,7 +172,9 @@ huge_pages_object(struct drm_i915_private *i915,
 
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
-	obj->cache_level = I915_CACHE_NONE;
+
+	cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
+	i915_gem_object_set_cache_coherency(obj, cache_level);
 
 	obj->mm.page_mask = page_mask;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v2 14/37] drm/i915/selftest: extend coverage to include LMEM huge-pages
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (12 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 13/37] drm/i915/selftests: don't just test CACHE_NONE for huge-pages Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:42   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 15/37] drm/i915/lmem: support CPU relocations Matthew Auld
                   ` (25 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 122 +++++++++++++++++-
 1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 1862bf06a20f..c81ea9ce289b 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -981,7 +981,7 @@ static int gpu_write(struct i915_vma *vma,
 			       vma->size >> PAGE_SHIFT, val);
 }
 
-static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
+static int __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
 {
 	unsigned int needs_flush;
 	unsigned long n;
@@ -1013,6 +1013,53 @@ static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
 	return err;
 }
 
+static int __cpu_check_lmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
+{
+	unsigned long n;
+	int err;
+
+	i915_gem_object_lock(obj);
+	err = i915_gem_object_set_to_wc_domain(obj, false);
+	i915_gem_object_unlock(obj);
+	if (err)
+		return err;
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		return err;
+
+	for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
+		u32 __iomem *base;
+		u32 read_val;
+
+		base = i915_gem_object_lmem_io_map_page(obj, n);
+
+		read_val = ioread32(base + dword);
+		io_mapping_unmap_atomic(base);
+		if (read_val != val) {
+			pr_err("n=%lu base[%u]=%u, val=%u\n",
+			       n, dword, read_val, val);
+			err = -EINVAL;
+			break;
+		}
+	}
+
+	i915_gem_object_unpin_pages(obj);
+	return err;
+}
+
+static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+	if (i915_gem_object_has_struct_page(obj))
+		return __cpu_check_shmem(obj, dword, val);
+	else if (HAS_LMEM(i915) && obj->memory_region)
+		return __cpu_check_lmem(obj, dword, val);
+
+	return -ENODEV;
+}
+
 static int __igt_write_huge(struct i915_gem_context *ctx,
 			    struct intel_engine_cs *engine,
 			    struct drm_i915_gem_object *obj,
@@ -1393,6 +1440,78 @@ static int igt_ppgtt_gemfs_huge(void *arg)
 	return err;
 }
 
+static int igt_ppgtt_lmem_huge(void *arg)
+{
+	struct i915_gem_context *ctx = arg;
+	struct drm_i915_private *i915 = ctx->i915;
+	struct drm_i915_gem_object *obj;
+	static const unsigned int sizes[] = {
+		SZ_64K,
+		SZ_512K,
+		SZ_1M,
+		SZ_2M,
+	};
+	int i;
+	int err;
+
+	if (!HAS_LMEM(i915)) {
+		pr_info("device lacks LMEM support, skipping\n");
+		return 0;
+	}
+
+	/*
+	 * Sanity check that the HW uses huge pages correctly through LMEM
+	 * -- ensure that our writes land in the right place.
+	 */
+
+	for (i = 0; i < ARRAY_SIZE(sizes); ++i) {
+		unsigned int size = sizes[i];
+
+		obj = i915_gem_object_create_lmem(i915, size, I915_BO_ALLOC_CONTIGUOUS);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			if (err == -E2BIG) {
+				pr_info("object too big for region!\n");
+				return 0;
+			}
+
+			return err;
+		}
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto out_put;
+
+		if (obj->mm.page_sizes.phys < I915_GTT_PAGE_SIZE_64K) {
+			pr_info("LMEM unable to allocate huge-page(s) with size=%u\n",
+				size);
+			goto out_unpin;
+		}
+
+		err = igt_write_huge(ctx, obj);
+		if (err) {
+			pr_err("LMEM write-huge failed with size=%u\n", size);
+			goto out_unpin;
+		}
+
+		i915_gem_object_unpin_pages(obj);
+		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+		i915_gem_object_put(obj);
+	}
+
+	return 0;
+
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	return err;
+}
+
 static int igt_ppgtt_pin_update(void *arg)
 {
 	struct i915_gem_context *ctx = arg;
@@ -1717,6 +1836,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
 		SUBTEST(igt_ppgtt_exhaust_huge),
 		SUBTEST(igt_ppgtt_gemfs_huge),
 		SUBTEST(igt_ppgtt_internal_huge),
+		SUBTEST(igt_ppgtt_lmem_huge),
 	};
 	struct drm_file *file;
 	struct i915_gem_context *ctx;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 88+ messages in thread
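A note on the bookkeeping the huge-page selftests rely on: supported page sizes are tracked as a bitmask, and an object's size must be aligned to the smallest page size in its mask (the `GEM_BUG_ON(!IS_ALIGNED(size, BIT(__ffs(page_mask))))` check quoted in the previous patch). The lowest-set-bit trick behind `BIT(__ffs(mask))` can be sketched in plain C; these helpers are illustrative, not driver code:

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K (64u << 10)
#define SZ_2M  (2u << 20)

/* Smallest page size present in a page-size bitmask, i.e. BIT(__ffs(mask)):
 * negating an unsigned value isolates the lowest set bit. */
static uint64_t min_page_size(uint64_t page_mask)
{
	return page_mask & -page_mask;
}

/* Object sizes must be a multiple of the smallest page size in the mask. */
static int is_aligned_to_mask(uint64_t size, uint64_t page_mask)
{
	return (size & (min_page_size(page_mask) - 1)) == 0;
}
```

The LMEM selftest above then treats an allocation as "huge" only if the largest physical page size actually reached is at least `I915_GTT_PAGE_SIZE_64K`.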

* [PATCH v2 15/37] drm/i915/lmem: support CPU relocations
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (13 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 14/37] drm/i915/selftest: extend coverage to include LMEM huge-pages Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:46   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 16/37] drm/i915/lmem: support pread Matthew Auld
                   ` (24 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Add LMEM support for the CPU reloc path. When doing relocations we have
both a GPU and a CPU reloc path, as well as some debugging options to
force a particular path. The GPU reloc path is preferred when the object
is not currently idle; otherwise we use the CPU reloc path. Since we
can't kmap the object, and the mappable aperture might not be available,
add support for mapping it through LMEMBAR.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 69 ++++++++++++++++---
 1 file changed, 58 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1c5dfbfad71b..b724143e88d2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -251,6 +251,7 @@ struct i915_execbuffer {
 		bool has_llc : 1;
 		bool has_fence : 1;
 		bool needs_unfenced : 1;
+		bool is_lmem : 1;
 
 		struct i915_request *rq;
 		u32 *rq_cmd;
@@ -963,6 +964,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->use_64bit_reloc = HAS_64BIT_RELOC(i915);
 	cache->has_fence = cache->gen < 4;
 	cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
+	cache->is_lmem = false;
 	cache->node.allocated = false;
 	cache->rq = NULL;
 	cache->rq_size = 0;
@@ -1020,16 +1022,23 @@ static void reloc_cache_reset(struct reloc_cache *cache)
 		i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm);
 	} else {
 		wmb();
-		io_mapping_unmap_atomic((void __iomem *)vaddr);
-		if (cache->node.allocated) {
-			struct i915_ggtt *ggtt = cache_to_ggtt(cache);
-
-			ggtt->vm.clear_range(&ggtt->vm,
-					     cache->node.start,
-					     cache->node.size);
-			drm_mm_remove_node(&cache->node);
+
+		if (cache->is_lmem) {
+			io_mapping_unmap_atomic((void __iomem *)vaddr);
+			i915_gem_object_unpin_pages((struct drm_i915_gem_object *)cache->node.mm);
+			cache->is_lmem = false;
 		} else {
-			i915_vma_unpin((struct i915_vma *)cache->node.mm);
+			io_mapping_unmap_atomic((void __iomem *)vaddr);
+			if (cache->node.allocated) {
+				struct i915_ggtt *ggtt = cache_to_ggtt(cache);
+
+				ggtt->vm.clear_range(&ggtt->vm,
+						     cache->node.start,
+						     cache->node.size);
+				drm_mm_remove_node(&cache->node);
+			} else {
+				i915_vma_unpin((struct i915_vma *)cache->node.mm);
+			}
 		}
 	}
 
@@ -1069,6 +1078,40 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
 	return vaddr;
 }
 
+static void *reloc_lmem(struct drm_i915_gem_object *obj,
+			struct reloc_cache *cache,
+			unsigned long page)
+{
+	void *vaddr;
+	int err;
+
+	GEM_BUG_ON(use_cpu_reloc(cache, obj));
+
+	if (cache->vaddr) {
+		io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr));
+	} else {
+		i915_gem_object_lock(obj);
+		err = i915_gem_object_set_to_wc_domain(obj, true);
+		i915_gem_object_unlock(obj);
+		if (err)
+			return ERR_PTR(err);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			return ERR_PTR(err);
+
+		cache->node.mm = (void *)obj;
+		cache->is_lmem = true;
+	}
+
+	vaddr = i915_gem_object_lmem_io_map_page(obj, page);
+
+	cache->vaddr = (unsigned long)vaddr;
+	cache->page = page;
+
+	return vaddr;
+}
+
 static void *reloc_iomap(struct drm_i915_gem_object *obj,
 			 struct reloc_cache *cache,
 			 unsigned long page)
@@ -1145,8 +1188,12 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj,
 		vaddr = unmask_page(cache->vaddr);
 	} else {
 		vaddr = NULL;
-		if ((cache->vaddr & KMAP) == 0)
-			vaddr = reloc_iomap(obj, cache, page);
+		if ((cache->vaddr & KMAP) == 0) {
+			if (i915_gem_object_is_lmem(obj))
+				vaddr = reloc_lmem(obj, cache, page);
+			else
+				vaddr = reloc_iomap(obj, cache, page);
+		}
 		if (!vaddr)
 			vaddr = reloc_kmap(obj, cache, page);
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 88+ messages in thread
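The reloc cache this patch extends tracks the current mapping by stashing flag bits (such as `KMAP`) in the low bits of the page-aligned `cache->vaddr`, which is why the code calls `unmask_page()` before unmapping. A minimal sketch of that tagged-pointer idea — the masks and names here are illustrative, not the driver's exact macros:

```c
#include <assert.h>
#include <stdint.h>

/* Page-aligned pointers leave bits [11:0] free for flags. */
#define FLAG_MASK 0xfffUL
#define KMAP      0x4UL /* "this mapping came from kmap, not io-remap" */

/* Pack a page-aligned pointer together with flag bits. */
static unsigned long mask_page(void *vaddr, unsigned long flags)
{
	return (unsigned long)vaddr | flags;
}

/* Recover the plain pointer by stripping the flag bits. */
static void *unmask_page(unsigned long entry)
{
	return (void *)(entry & ~FLAG_MASK);
}

/* Recover just the flag bits. */
static unsigned long unmask_flags(unsigned long entry)
{
	return entry & FLAG_MASK;
}
```

This is why the dispatch in `reloc_vaddr()` can test `(cache->vaddr & KMAP) == 0` to decide whether the cached mapping is an io-mapping before trying `reloc_lmem()`/`reloc_iomap()`.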

* [PATCH v2 16/37] drm/i915/lmem: support pread
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (14 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 15/37] drm/i915/lmem: support CPU relocations Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:50   ` Chris Wilson
  2019-07-30  8:58   ` Daniel Vetter
  2019-06-27 20:56 ` [PATCH v2 17/37] drm/i915/lmem: support pwrite Matthew Auld
                   ` (23 subsequent siblings)
  39 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

We need to add support for pread'ing an LMEM object.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  2 +
 drivers/gpu/drm/i915/i915_gem.c               |  6 ++
 drivers/gpu/drm/i915/intel_region_lmem.c      | 76 +++++++++++++++++++
 3 files changed, 84 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 80ff5ad9bc07..8cdee185251a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -52,6 +52,8 @@ struct drm_i915_gem_object_ops {
 	void (*truncate)(struct drm_i915_gem_object *obj);
 	void (*writeback)(struct drm_i915_gem_object *obj);
 
+	int (*pread)(struct drm_i915_gem_object *,
+		     const struct drm_i915_gem_pread *arg);
 	int (*pwrite)(struct drm_i915_gem_object *obj,
 		      const struct drm_i915_gem_pwrite *arg);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 85677ae89849..4ba386ab35e7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -463,6 +463,12 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 
 	trace_i915_gem_object_pread(obj, args->offset, args->size);
 
+	ret = -ENODEV;
+	if (obj->ops->pread)
+		ret = obj->ops->pread(obj, args);
+	if (ret != -ENODEV)
+		goto out;
+
 	ret = i915_gem_object_wait(obj,
 				   I915_WAIT_INTERRUPTIBLE,
 				   MAX_SCHEDULE_TIMEOUT);
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 701bcac3479e..54b2c7bf177d 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -7,10 +7,86 @@
 #include "intel_memory_region.h"
 #include "intel_region_lmem.h"
 
+static int lmem_pread(struct drm_i915_gem_object *obj,
+		      const struct drm_i915_gem_pread *arg)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct intel_runtime_pm *rpm = &i915->runtime_pm;
+	intel_wakeref_t wakeref;
+	struct dma_fence *fence;
+	char __user *user_data;
+	unsigned int offset;
+	unsigned long idx;
+	u64 remain;
+	int ret;
+
+	ret = i915_gem_object_pin_pages(obj);
+	if (ret)
+		return ret;
+
+	i915_gem_object_lock(obj);
+	ret = i915_gem_object_set_to_wc_domain(obj, false);
+	if (ret) {
+		i915_gem_object_unlock(obj);
+		goto out_unpin;
+	}
+
+	fence = i915_gem_object_lock_fence(obj);
+	i915_gem_object_unlock(obj);
+	if (!fence) {
+		ret = -ENOMEM;
+		goto out_unpin;
+	}
+
+	wakeref = intel_runtime_pm_get(rpm);
+
+	remain = arg->size;
+	user_data = u64_to_user_ptr(arg->data_ptr);
+	offset = offset_in_page(arg->offset);
+	for (idx = arg->offset >> PAGE_SHIFT; remain; idx++) {
+		unsigned long unwritten;
+		void __iomem *vaddr;
+		int length;
+
+		length = remain;
+		if (offset + length > PAGE_SIZE)
+			length = PAGE_SIZE - offset;
+
+		vaddr = i915_gem_object_lmem_io_map_page(obj, idx);
+		if (!vaddr) {
+			ret = -ENOMEM;
+			goto out_put;
+		}
+
+		unwritten = copy_to_user(user_data,
+					 (void __force *)vaddr + offset,
+					 length);
+		io_mapping_unmap_atomic(vaddr);
+		if (unwritten) {
+			ret = -EFAULT;
+			goto out_put;
+		}
+
+		remain -= length;
+		user_data += length;
+		offset = 0;
+	}
+
+out_put:
+	intel_runtime_pm_put(rpm, wakeref);
+	i915_gem_object_unlock_fence(obj, fence);
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+
+	return ret;
+}
+
 static const struct drm_i915_gem_object_ops region_lmem_obj_ops = {
 	.get_pages = i915_memory_region_get_pages_buddy,
 	.put_pages = i915_memory_region_put_pages_buddy,
 	.release = i915_gem_object_release_memory_region,
+
+	.pread = lmem_pread,
 };
 
 static struct drm_i915_gem_object *
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 88+ messages in thread
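The pread loop above copies at most one page per iteration, clamping the first chunk so it never crosses a page boundary and resetting the intra-page offset to zero afterwards. The same clamping logic, with `memcpy` standing in for the atomic io-mapping plus `copy_to_user`, can be sketched as:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Copy 'size' bytes starting at byte 'offset' of 'src' into 'dst', one
 * page-bounded chunk at a time, mirroring the lmem_pread() loop. */
static void chunked_copy(char *dst, const char *src, size_t offset, size_t size)
{
	size_t remain = size;
	size_t off = offset % PAGE_SIZE;  /* offset_in_page(arg->offset) */
	size_t idx = offset / PAGE_SIZE;  /* arg->offset >> PAGE_SHIFT */

	while (remain) {
		size_t length = remain;

		/* Clamp so a chunk never straddles a page boundary. */
		if (off + length > PAGE_SIZE)
			length = PAGE_SIZE - off;

		/* In the driver: map page 'idx' via the LMEM io-mapping,
		 * copy_to_user(), then unmap. */
		memcpy(dst, src + idx * PAGE_SIZE + off, length);

		dst += length;
		remain -= length;
		off = 0; /* only the first chunk has a non-zero offset */
		idx++;
	}
}
```

The pwrite path in the next patch is the mirror image of this loop, with `copy_from_user()` in place of `copy_to_user()`.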

* [PATCH v2 17/37] drm/i915/lmem: support pwrite
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (15 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 16/37] drm/i915/lmem: support pread Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 18/37] drm/i915: enumerate and init each supported region Matthew Auld
                   ` (22 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

We need to add support for pwrite'ing an LMEM object.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_region_lmem.c | 75 ++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 54b2c7bf177d..d0a5311cf235 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -81,12 +81,87 @@ static int lmem_pread(struct drm_i915_gem_object *obj,
 	return ret;
 }
 
+static int lmem_pwrite(struct drm_i915_gem_object *obj,
+		       const struct drm_i915_gem_pwrite *arg)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct intel_runtime_pm *rpm = &i915->runtime_pm;
+	intel_wakeref_t wakeref;
+	struct dma_fence *fence;
+	char __user *user_data;
+	unsigned int offset;
+	unsigned long idx;
+	u64 remain;
+	int ret;
+
+	ret = i915_gem_object_pin_pages(obj);
+	if (ret)
+		return ret;
+
+	i915_gem_object_lock(obj);
+	ret = i915_gem_object_set_to_wc_domain(obj, true);
+	if (ret) {
+		i915_gem_object_unlock(obj);
+		goto out_unpin;
+	}
+
+	fence = i915_gem_object_lock_fence(obj);
+	i915_gem_object_unlock(obj);
+	if (!fence) {
+		ret = -ENOMEM;
+		goto out_unpin;
+	}
+
+	wakeref = intel_runtime_pm_get(rpm);
+
+	remain = arg->size;
+	user_data = u64_to_user_ptr(arg->data_ptr);
+	offset = offset_in_page(arg->offset);
+	for (idx = arg->offset >> PAGE_SHIFT; remain; idx++) {
+		unsigned long unwritten;
+		void __iomem *vaddr;
+		int length;
+
+		length = remain;
+		if (offset + length > PAGE_SIZE)
+			length = PAGE_SIZE - offset;
+
+		vaddr = i915_gem_object_lmem_io_map_page(obj, idx);
+		if (!vaddr) {
+			ret = -ENOMEM;
+			goto out_put;
+		}
+
+		unwritten = copy_from_user((void __force*)vaddr + offset,
+					   user_data,
+					   length);
+		io_mapping_unmap_atomic(vaddr);
+		if (unwritten) {
+			ret = -EFAULT;
+			goto out_put;
+		}
+
+		remain -= length;
+		user_data += length;
+		offset = 0;
+	}
+
+out_put:
+	intel_runtime_pm_put(rpm, wakeref);
+	i915_gem_object_unlock_fence(obj, fence);
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+
+	return ret;
+}
+
 static const struct drm_i915_gem_object_ops region_lmem_obj_ops = {
 	.get_pages = i915_memory_region_get_pages_buddy,
 	.put_pages = i915_memory_region_put_pages_buddy,
 	.release = i915_gem_object_release_memory_region,
 
 	.pread = lmem_pread,
+	.pwrite = lmem_pwrite,
 };
 
 static struct drm_i915_gem_object *
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v2 18/37] drm/i915: enumerate and init each supported region
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (16 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 17/37] drm/i915/lmem: support pwrite Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 19/37] drm/i915: treat shmem as a region Matthew Auld
                   ` (21 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

Nothing to enumerate yet...

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h               |  3 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 73 +++++++++++++++++--
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  6 ++
 3 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7cbdffe3f129..42674a41e469 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2607,6 +2607,9 @@ int __must_check i915_gem_evict_for_node(struct i915_address_space *vm,
 					 unsigned int flags);
 int i915_gem_evict_vm(struct i915_address_space *vm);
 
+void i915_gem_cleanup_memory_regions(struct drm_i915_private *i915);
+int i915_gem_init_memory_regions(struct drm_i915_private *i915);
+
 /* i915_gem_stolen.c */
 int i915_gem_stolen_insert_node(struct drm_i915_private *dev_priv,
 				struct drm_mm_node *node, u64 size,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ff1d5008a256..e4f811fdaedc 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2902,6 +2902,71 @@ int i915_init_ggtt(struct drm_i915_private *i915)
 	return 0;
 }
 
+void i915_gem_cleanup_memory_regions(struct drm_i915_private *i915)
+{
+	int i;
+
+	i915_gem_cleanup_stolen(i915);
+
+	for (i = 0; i < ARRAY_SIZE(i915->regions); ++i)	{
+		struct intel_memory_region *region = i915->regions[i];
+
+		if (region)
+			intel_memory_region_destroy(region);
+	}
+}
+
+int i915_gem_init_memory_regions(struct drm_i915_private *i915)
+{
+	int err, i;
+
+	/* All platforms currently have system memory */
+	GEM_BUG_ON(!HAS_REGION(i915, REGION_SMEM));
+
+	/*
+	 * Initialise stolen early so that we may reserve preallocated
+	 * objects for the BIOS to KMS transition.
+	 */
+	/* XXX: stolen will become a region at some point */
+	err = i915_gem_init_stolen(i915);
+	if (err)
+		return err;
+
+	for (i = 0; i < ARRAY_SIZE(intel_region_map); i++) {
+		struct intel_memory_region *mem = NULL;
+		u32 type;
+
+		if (!HAS_REGION(i915, BIT(i)))
+			continue;
+
+		type = MEMORY_TYPE_FROM_REGION(intel_region_map[i]);
+		switch (type) {
+		default:
+			break;
+		}
+
+		if (IS_ERR(mem)) {
+			err = PTR_ERR(mem);
+			DRM_ERROR("Failed to setup region(%d) type=%d\n", err, type);
+			goto out_cleanup;
+		}
+
+		if (mem) {
+			mem->id = intel_region_map[i];
+			mem->type = type;
+			mem->instance = MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);
+		}
+
+		i915->regions[i] = mem;
+	}
+
+	return 0;
+
+out_cleanup:
+	i915_gem_cleanup_memory_regions(i915);
+	return err;
+}
+
 static void ggtt_cleanup_hw(struct i915_ggtt *ggtt)
 {
 	struct drm_i915_private *i915 = ggtt->vm.i915;
@@ -2950,7 +3015,7 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private *i915)
 		__pagevec_release(pvec);
 	}
 
-	i915_gem_cleanup_stolen(i915);
+	i915_gem_cleanup_memory_regions(i915);
 }
 
 static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
@@ -3600,11 +3665,7 @@ int i915_ggtt_init_hw(struct drm_i915_private *dev_priv)
 	if (ret)
 		return ret;
 
-	/*
-	 * Initialise stolen early so that we may reserve preallocated
-	 * objects for the BIOS to KMS transition.
-	 */
-	ret = i915_gem_init_stolen(dev_priv);
+	ret = i915_gem_init_memory_regions(dev_priv);
 	if (ret)
 		goto out_gtt_cleanup;
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index f8b48304fcec..df07a0bd089d 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -82,6 +82,8 @@ static void mock_device_release(struct drm_device *dev)
 
 	i915_gemfs_fini(i915);
 
+	i915_gem_cleanup_memory_regions(i915);
+
 	drm_mode_config_cleanup(&i915->drm);
 
 	drm_dev_fini(&i915->drm);
@@ -225,6 +227,10 @@ struct drm_i915_private *mock_gem_device(void)
 
 	WARN_ON(i915_gemfs_init(i915));
 
+	err = i915_gem_init_memory_regions(i915);
+	if (err)
+		goto err_context;
+
 	return i915;
 
 err_context:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 88+ messages in thread
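Each entry of `intel_region_map` packs a memory type and an instance into a single region id, which `MEMORY_TYPE_FROM_REGION()` and `MEMORY_INSTANCE_FROM_REGION()` then unpack in the loop above. A sketch of one possible packing — the shift and field widths here are assumptions for illustration, not the series' actual definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: type in the high half, instance in the low half. */
#define MEM_TYPE_SHIFT 16
#define MAKE_REGION(type, inst) \
	(((uint32_t)(type) << MEM_TYPE_SHIFT) | (uint32_t)(inst))
#define MEMORY_TYPE_FROM_REGION(r)     ((r) >> MEM_TYPE_SHIFT)
#define MEMORY_INSTANCE_FROM_REGION(r) ((r) & 0xffffu)
```

With a packing like this, the init loop can recover `mem->type` and `mem->instance` from the single `intel_region_map[i]` value it stores in `mem->id`.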

* [PATCH v2 19/37] drm/i915: treat shmem as a region
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (17 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 18/37] drm/i915: enumerate and init each supported region Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:55   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 20/37] drm/i915: treat stolen " Matthew Auld
                   ` (20 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     | 67 ++++++++++++++-----
 drivers/gpu/drm/i915/i915_drv.c               |  5 +-
 drivers/gpu/drm/i915/i915_drv.h               |  4 +-
 drivers/gpu/drm/i915/i915_gem.c               | 13 +---
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 11 ++-
 drivers/gpu/drm/i915/intel_memory_region.c    |  9 +++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  4 --
 7 files changed, 68 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 19d9ecdb2894..11c7ebda9e0e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -8,6 +8,7 @@
 #include <linux/swap.h>
 
 #include "i915_drv.h"
+#include "i915_gemfs.h"
 #include "i915_gem_object.h"
 #include "i915_scatterlist.h"
 
@@ -25,6 +26,7 @@ static void check_release_pagevec(struct pagevec *pvec)
 static int shmem_get_pages(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct intel_memory_region *mem = obj->memory_region;
 	const unsigned long page_count = obj->base.size / PAGE_SIZE;
 	unsigned long i;
 	struct address_space *mapping;
@@ -51,7 +53,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	 * If there's no chance of allocating enough pages for the whole
 	 * object, bail early.
 	 */
-	if (page_count > totalram_pages())
+	if (obj->base.size > resource_size(&mem->region))
 		return -ENOMEM;
 
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
@@ -424,9 +426,11 @@ const struct drm_i915_gem_object_ops i915_gem_shmem_ops = {
 	.writeback = shmem_writeback,
 
 	.pwrite = shmem_pwrite,
+
+	.release = i915_gem_object_release_memory_region,
 };
 
-static int create_shmem(struct drm_i915_private *i915,
+static int __create_shmem(struct drm_i915_private *i915,
 			struct drm_gem_object *obj,
 			size_t size)
 {
@@ -447,31 +451,23 @@ static int create_shmem(struct drm_i915_private *i915,
 	return 0;
 }
 
-struct drm_i915_gem_object *
-i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
+static struct drm_i915_gem_object *
+create_shmem(struct intel_memory_region *mem,
+	     resource_size_t size,
+	     unsigned flags)
 {
+	struct drm_i915_private *i915 = mem->i915;
 	struct drm_i915_gem_object *obj;
 	struct address_space *mapping;
 	unsigned int cache_level;
 	gfp_t mask;
 	int ret;
 
-	/* There is a prevalence of the assumption that we fit the object's
-	 * page count inside a 32bit _signed_ variable. Let's document this and
-	 * catch if we ever need to fix it. In the meantime, if you do spot
-	 * such a local variable, please consider fixing!
-	 */
-	if (size >> PAGE_SHIFT > INT_MAX)
-		return ERR_PTR(-E2BIG);
-
-	if (overflows_type(size, obj->base.size))
-		return ERR_PTR(-E2BIG);
-
 	obj = i915_gem_object_alloc();
 	if (!obj)
 		return ERR_PTR(-ENOMEM);
 
-	ret = create_shmem(i915, &obj->base, size);
+	ret = __create_shmem(i915, &obj->base, size);
 	if (ret)
 		goto fail;
 
@@ -510,8 +506,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
 
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 
-	trace_i915_gem_object_create(obj);
-
 	return obj;
 
 fail:
@@ -519,6 +513,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
 	return ERR_PTR(ret);
 }
 
+struct drm_i915_gem_object *
+i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size)
+{
+	return i915_gem_object_create_region(i915->regions[INTEL_MEMORY_SMEM],
+					     size, 0);
+}
+
 /* Allocate a new GEM object and fill it with the supplied data */
 struct drm_i915_gem_object *
 i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
@@ -569,3 +570,33 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
 	i915_gem_object_put(obj);
 	return ERR_PTR(err);
 }
+
+static int init_shmem(struct intel_memory_region *mem)
+{
+	int err;
+
+	err = i915_gemfs_init(mem->i915);
+	if (err)
+		 DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err);
+
+	return 0; /* Don't error, we can simply fallback to the kernel mnt */
+}
+
+static void release_shmem(struct intel_memory_region *mem)
+{
+	i915_gemfs_fini(mem->i915);
+}
+
+static const struct intel_memory_region_ops shmem_region_ops = {
+	.init = init_shmem,
+	.release = release_shmem,
+	.create_object = create_shmem,
+};
+
+struct intel_memory_region *i915_gem_shmem_setup(struct drm_i915_private *i915)
+{
+	return intel_memory_region_create(i915, 0,
+					  totalram_pages() << PAGE_SHIFT,
+					  I915_GTT_PAGE_SIZE_4K, 0,
+					  &shmem_region_ops);
+}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 794c6814a6d0..ac8fbada0406 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -926,9 +926,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
 
 	intel_gt_init_early(&dev_priv->gt, dev_priv);
 
-	ret = i915_gem_init_early(dev_priv);
-	if (ret < 0)
-		goto err_workqueues;
+	i915_gem_init_early(dev_priv);
 
 	/* This must be called before any calls to HAS_PCH_* */
 	intel_detect_pch(dev_priv);
@@ -954,7 +952,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
 err_uc:
 	intel_uc_cleanup_early(dev_priv);
 	i915_gem_cleanup_early(dev_priv);
-err_workqueues:
 	i915_workqueues_cleanup(dev_priv);
 err_engines:
 	i915_engines_cleanup(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 42674a41e469..e8f41c235e70 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2447,11 +2447,13 @@ static inline bool intel_vgpu_active(struct drm_i915_private *dev_priv)
 int i915_gem_init_userptr(struct drm_i915_private *dev_priv);
 void i915_gem_cleanup_userptr(struct drm_i915_private *dev_priv);
 void i915_gem_sanitize(struct drm_i915_private *i915);
-int i915_gem_init_early(struct drm_i915_private *dev_priv);
+void i915_gem_init_early(struct drm_i915_private *dev_priv);
 void i915_gem_cleanup_early(struct drm_i915_private *dev_priv);
 int i915_gem_freeze(struct drm_i915_private *dev_priv);
 int i915_gem_freeze_late(struct drm_i915_private *dev_priv);
 
+struct intel_memory_region *i915_gem_shmem_setup(struct drm_i915_private *i915);
+
 static inline void i915_gem_drain_freed_objects(struct drm_i915_private *i915)
 {
 	if (!atomic_read(&i915->mm.free_count))
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4ba386ab35e7..009e7199bea6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -45,7 +45,6 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
 #include "gem/i915_gem_pm.h"
-#include "gem/i915_gemfs.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_mocs.h"
@@ -1677,10 +1676,8 @@ static void i915_gem_init__mm(struct drm_i915_private *i915)
 	i915_gem_init__objects(i915);
 }
 
-int i915_gem_init_early(struct drm_i915_private *dev_priv)
+void i915_gem_init_early(struct drm_i915_private *dev_priv)
 {
-	int err;
-
 	i915_gem_init__mm(dev_priv);
 	i915_gem_init__pm(dev_priv);
 
@@ -1692,12 +1689,6 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 	atomic_set(&dev_priv->mm.bsd_engine_dispatch_index, 0);
 
 	spin_lock_init(&dev_priv->fb_tracking.lock);
-
-	err = i915_gemfs_init(dev_priv);
-	if (err)
-		DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err);
-
-	return 0;
 }
 
 void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
@@ -1708,8 +1699,6 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
 	WARN_ON(dev_priv->mm.shrink_count);
 
 	cleanup_srcu_struct(&dev_priv->gpu_error.reset_backoff_srcu);
-
-	i915_gemfs_fini(dev_priv);
 }
 
 int i915_gem_freeze(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e4f811fdaedc..958c61e88200 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2941,7 +2941,8 @@ int i915_gem_init_memory_regions(struct drm_i915_private *i915)
 
 		type = MEMORY_TYPE_FROM_REGION(intel_region_map[i]);
 		switch (type) {
-		default:
+		case INTEL_SMEM:
+			mem = i915_gem_shmem_setup(i915);
 			break;
 		}
 
@@ -2951,11 +2952,9 @@ int i915_gem_init_memory_regions(struct drm_i915_private *i915)
 			goto out_cleanup;
 		}
 
-		if (mem) {
-			mem->id = intel_region_map[i];
-			mem->type = type;
-			mem->instance = MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);
-		}
+		mem->id = intel_region_map[i];
+		mem->type = type;
+		mem->instance = MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);
 
 		i915->regions[i] = mem;
 	}
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index cd41c212bc35..e6bf4bc122bd 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -234,6 +234,13 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
 	GEM_BUG_ON(!size);
 	GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_MIN_ALIGNMENT));
 
+	/*
+	 * There is a prevalence of the assumption that we fit the object's
+	 * page count inside a 32bit _signed_ variable. Let's document this and
+	 * catch if we ever need to fix it. In the meantime, if you do spot
+	 * such a local variable, please consider fixing!
+	 */
+
 	if (size >> PAGE_SHIFT > INT_MAX)
 		return ERR_PTR(-E2BIG);
 
@@ -257,6 +264,8 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
 
 	mutex_unlock(&mem->obj_lock);
 
+	trace_i915_gem_object_create(obj);
+
 	return obj;
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index df07a0bd089d..007108d26acb 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -80,8 +80,6 @@ static void mock_device_release(struct drm_device *dev)
 
 	destroy_workqueue(i915->wq);
 
-	i915_gemfs_fini(i915);
-
 	i915_gem_cleanup_memory_regions(i915);
 
 	drm_mode_config_cleanup(&i915->drm);
@@ -225,8 +223,6 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	WARN_ON(i915_gemfs_init(i915));
-
 	err = i915_gem_init_memory_regions(i915);
 	if (err)
 		goto err_context;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v2 20/37] drm/i915: treat stolen as a region
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (18 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 19/37] drm/i915: treat shmem as a region Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 21/37] drm/i915: define HAS_MAPPABLE_APERTURE Matthew Auld
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Convert stolen memory over to a region object. Still leaves open the
question of what to do with pre-allocated objects...

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 66 +++++++++++++++++++---
 drivers/gpu/drm/i915/i915_drv.h            |  3 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 14 +----
 drivers/gpu/drm/i915/intel_memory_region.c |  2 +-
 4 files changed, 62 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index de1fab2058ec..d0b894854921 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -149,7 +149,7 @@ static int i915_adjust_stolen(struct drm_i915_private *dev_priv,
 	return 0;
 }
 
-void i915_gem_cleanup_stolen(struct drm_i915_private *dev_priv)
+static void i915_gem_cleanup_stolen(struct drm_i915_private *dev_priv)
 {
 	if (!drm_mm_initialized(&dev_priv->mm.stolen))
 		return;
@@ -354,7 +354,7 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
 	}
 }
 
-int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
+static int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
 {
 	resource_size_t reserved_base, stolen_top;
 	resource_size_t reserved_total, reserved_size;
@@ -533,6 +533,9 @@ i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
 
 	i915_gem_stolen_remove_node(dev_priv, stolen);
 	kfree(stolen);
+
+	if (obj->memory_region)
+		i915_gem_object_release_memory_region(obj);
 }
 
 static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
@@ -542,8 +545,8 @@ static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
 };
 
 static struct drm_i915_gem_object *
-_i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
-			       struct drm_mm_node *stolen)
+__i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
+				struct drm_mm_node *stolen)
 {
 	struct drm_i915_gem_object *obj;
 	unsigned int cache_level;
@@ -570,10 +573,12 @@ _i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 	return NULL;
 }
 
-struct drm_i915_gem_object *
-i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
-			      resource_size_t size)
+static struct drm_i915_gem_object *
+_i915_gem_object_create_stolen(struct intel_memory_region *mem,
+			       resource_size_t size,
+			       unsigned int flags)
 {
+	struct drm_i915_private *dev_priv = mem->i915;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	int ret;
@@ -594,7 +599,7 @@ i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 		return NULL;
 	}
 
-	obj = _i915_gem_object_create_stolen(dev_priv, stolen);
+	obj = __i915_gem_object_create_stolen(dev_priv, stolen);
 	if (obj)
 		return obj;
 
@@ -603,6 +608,49 @@ i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 	return NULL;
 }
 
+struct drm_i915_gem_object *
+i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
+			      resource_size_t size)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = i915_gem_object_create_region(dev_priv->regions[INTEL_MEMORY_STOLEN],
+					    size, I915_BO_ALLOC_CONTIGUOUS);
+	if (IS_ERR(obj))
+		return NULL;
+
+	return obj;
+}
+
+static int init_stolen(struct intel_memory_region *mem)
+{
+	/*
+	 * Initialise stolen early so that we may reserve preallocated
+	 * objects for the BIOS to KMS transition.
+	 */
+	return i915_gem_init_stolen(mem->i915);
+}
+
+static void release_stolen(struct intel_memory_region *mem)
+{
+	i915_gem_cleanup_stolen(mem->i915);
+}
+
+static const struct intel_memory_region_ops i915_region_stolen_ops = {
+	.init = init_stolen,
+	.release = release_stolen,
+	.create_object = _i915_gem_object_create_stolen,
+};
+
+struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
+{
+	return intel_memory_region_create(i915,
+					  intel_graphics_stolen_res.start,
+					  resource_size(&intel_graphics_stolen_res),
+					  I915_GTT_PAGE_SIZE_4K, 0,
+					  &i915_region_stolen_ops);
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv,
 					       resource_size_t stolen_offset,
@@ -644,7 +692,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
 		return NULL;
 	}
 
-	obj = _i915_gem_object_create_stolen(dev_priv, stolen);
+	obj = __i915_gem_object_create_stolen(dev_priv, stolen);
 	if (obj == NULL) {
 		DRM_DEBUG_DRIVER("failed to allocate stolen object\n");
 		i915_gem_stolen_remove_node(dev_priv, stolen);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e8f41c235e70..a124d1b17773 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2622,8 +2622,7 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
 					 u64 end);
 void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
 				 struct drm_mm_node *node);
-int i915_gem_init_stolen(struct drm_i915_private *dev_priv);
-void i915_gem_cleanup_stolen(struct drm_i915_private *dev_priv);
+struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 			      resource_size_t size);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 958c61e88200..5b7e46e487bf 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2906,8 +2906,6 @@ void i915_gem_cleanup_memory_regions(struct drm_i915_private *i915)
 {
 	int i;
 
-	i915_gem_cleanup_stolen(i915);
-
 	for (i = 0; i < ARRAY_SIZE(i915->regions); ++i)	{
 		struct intel_memory_region *region = i915->regions[i];
 
@@ -2923,15 +2921,6 @@ int i915_gem_init_memory_regions(struct drm_i915_private *i915)
 	/* All platforms currently have system memory */
 	GEM_BUG_ON(!HAS_REGION(i915, REGION_SMEM));
 
-	/*
-	 * Initialise stolen early so that we may reserve preallocated
-	 * objects for the BIOS to KMS transition.
-	 */
-	/* XXX: stolen will become a region at some point */
-	err = i915_gem_init_stolen(i915);
-	if (err)
-		return err;
-
 	for (i = 0; i < ARRAY_SIZE(intel_region_map); i++) {
 		struct intel_memory_region *mem = NULL;
 		u32 type;
@@ -2944,6 +2933,9 @@ int i915_gem_init_memory_regions(struct drm_i915_private *i915)
 		case INTEL_SMEM:
 			mem = i915_gem_shmem_setup(i915);
 			break;
+		case INTEL_STOLEN:
+			mem = i915_gem_stolen_setup(i915);
+			break;
 		}
 
 		if (IS_ERR(mem)) {
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index e6bf4bc122bd..ab57b94b27a9 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -248,7 +248,7 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
 		return ERR_PTR(-E2BIG);
 
 	obj = mem->ops->create_object(mem, size, flags);
-	if (IS_ERR(obj))
+	if (IS_ERR_OR_NULL(obj))
 		return obj;
 
 	INIT_LIST_HEAD(&obj->blocks);
-- 
2.20.1


* [PATCH v2 21/37] drm/i915: define HAS_MAPPABLE_APERTURE
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (19 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 20/37] drm/i915: treat stolen " Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 22/37] drm/i915: do not map aperture if it is not available Matthew Auld
                   ` (18 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

The following patches in the series will use it to skip
aperture-dependent operations when the aperture is not available in HW.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a124d1b17773..4d24f9dc1193 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2244,6 +2244,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
 		(INTEL_INFO(dev_priv)->display.overlay_needs_physical)
 
+#define HAS_MAPPABLE_APERTURE(dev_priv) (dev_priv->ggtt.mappable_end > 0)
+
 /* Early gen2 have a totally busted CS tlb and require pinned batches. */
 #define HAS_BROKEN_CS_TLB(dev_priv)	(IS_I830(dev_priv) || IS_I845G(dev_priv))
 
-- 
2.20.1


* [PATCH v2 22/37] drm/i915: do not map aperture if it is not available.
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (20 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 21/37] drm/i915: define HAS_MAPPABLE_APERTURE Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 23/37] drm/i915: expose missing map_gtt support to users Matthew Auld
                   ` (17 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Skip both setup and cleanup of the aperture mapping if the HW doesn't
have an aperture bar.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 36 ++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5b7e46e487bf..43b99136a3ae 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2984,8 +2984,10 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt)
 
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	arch_phys_wc_del(ggtt->mtrr);
-	io_mapping_fini(&ggtt->iomap);
+	if (HAS_MAPPABLE_APERTURE(i915)) {
+		arch_phys_wc_del(ggtt->mtrr);
+		io_mapping_fini(&ggtt->iomap);
+	}
 }
 
 /**
@@ -3386,10 +3388,13 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 	int err;
 
 	/* TODO: We're not aware of mappable constraints on gen8 yet */
-	ggtt->gmadr =
-		(struct resource) DEFINE_RES_MEM(pci_resource_start(pdev, 2),
-						 pci_resource_len(pdev, 2));
-	ggtt->mappable_end = resource_size(&ggtt->gmadr);
+	/* FIXME: We probably need to add this to device_info or runtime_info */
+	if (!HAS_LMEM(dev_priv)) {
+		ggtt->gmadr =
+			(struct resource) DEFINE_RES_MEM(pci_resource_start(pdev, 2),
+							 pci_resource_len(pdev, 2));
+		ggtt->mappable_end = resource_size(&ggtt->gmadr);
+	}
 
 	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(39));
 	if (!err)
@@ -3619,15 +3624,18 @@ static int ggtt_init_hw(struct i915_ggtt *ggtt)
 	if (!HAS_LLC(i915) && !HAS_PPGTT(i915))
 		ggtt->vm.mm.color_adjust = i915_gtt_color_adjust;
 
-	if (!io_mapping_init_wc(&ggtt->iomap,
-				ggtt->gmadr.start,
-				ggtt->mappable_end)) {
-		ggtt->vm.cleanup(&ggtt->vm);
-		ret = -EIO;
-		goto out;
-	}
+	if (HAS_MAPPABLE_APERTURE(i915)) {
+		if (!io_mapping_init_wc(&ggtt->iomap,
+					ggtt->gmadr.start,
+					ggtt->mappable_end)) {
+			ggtt->vm.cleanup(&ggtt->vm);
+			ret = -EIO;
+			goto out;
+		}
 
-	ggtt->mtrr = arch_phys_wc_add(ggtt->gmadr.start, ggtt->mappable_end);
+		ggtt->mtrr = arch_phys_wc_add(ggtt->gmadr.start,
+					      ggtt->mappable_end);
+	}
 
 	i915_ggtt_init_fences(ggtt);
 
-- 
2.20.1


* [PATCH v2 23/37] drm/i915: expose missing map_gtt support to users
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (21 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 22/37] drm/i915: do not map aperture if it is not available Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 23:59   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 24/37] drm/i915: set num_fence_regs to 0 if there is no aperture Matthew Auld
                   ` (16 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Done by returning -ENODEV from the I915_PARAM_MMAP_GTT_VERSION getparam
when there is no mappable aperture.

Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ac8fbada0406..34edc0302691 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -425,6 +425,8 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 			return value;
 		break;
 	case I915_PARAM_MMAP_GTT_VERSION:
+		if (!HAS_MAPPABLE_APERTURE(dev_priv))
+			return -ENODEV;
 		/* Though we've started our numbering from 1, and so class all
 		 * earlier versions as 0, in effect their value is undefined as
 		 * the ioctl will report EINVAL for the unknown param!
-- 
2.20.1


* [PATCH v2 24/37] drm/i915: set num_fence_regs to 0 if there is no aperture
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (22 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 23/37] drm/i915: expose missing map_gtt support to users Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-28  0:00   ` Chris Wilson
  2019-06-27 20:56 ` [PATCH v2 25/37] drm/i915/selftests: check for missing aperture Matthew Auld
                   ` (15 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

We can't fence anything without an aperture.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_fence_reg.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index bcac359ec661..bb7d9321cadf 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -808,8 +808,10 @@ void i915_ggtt_init_fences(struct i915_ggtt *ggtt)
 
 	detect_bit_6_swizzle(i915);
 
-	if (INTEL_GEN(i915) >= 7 &&
-	    !(IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)))
+	if (!HAS_MAPPABLE_APERTURE(i915))
+		num_fences = 0;
+	else if (INTEL_GEN(i915) >= 7 &&
+		 !(IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)))
 		num_fences = 32;
 	else if (INTEL_GEN(i915) >= 4 ||
 		 IS_I945G(i915) || IS_I945GM(i915) ||
-- 
2.20.1


* [PATCH v2 25/37] drm/i915/selftests: check for missing aperture
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (23 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 24/37] drm/i915: set num_fence_regs to 0 if there is no aperture Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 26/37] drm/i915: error capture with no ggtt slot Matthew Auld
                   ` (14 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

We may be missing support for the mappable aperture on some platforms.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 .../drm/i915/gem/selftests/i915_gem_coherency.c    |  5 ++++-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |  3 +++
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c       | 14 ++++++++++----
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c      |  3 +++
 4 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 8f22d3f18422..7f80fdc94e40 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -242,7 +242,10 @@ static bool always_valid(struct drm_i915_private *i915)
 
 static bool needs_fence_registers(struct drm_i915_private *i915)
 {
-	return !i915_terminally_wedged(i915);
+	if (i915_terminally_wedged(i915))
+		return false;
+
+	return i915->ggtt.num_fences;
 }
 
 static bool needs_mi_store_dword(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index a1f0b235f56b..6949df0f963f 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -184,6 +184,9 @@ static int igt_partial_tiling(void *arg)
 	int tiling;
 	int err;
 
+	if (!HAS_MAPPABLE_APERTURE(i915))
+		return 0;
+
 	/* We want to check the page mapping and fencing of a large object
 	 * mmapped through the GTT. The object we create is larger than can
 	 * possibly be mmaped as a whole, and so we must use partial GGTT vma.
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index cf592a049a71..d9712149aa1a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -1182,8 +1182,12 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 	struct i915_request *rq;
 	struct evict_vma arg;
 	struct hang h;
+	unsigned int pin_flags;
 	int err;
 
+	if (!i915->ggtt.num_fences && flags & EXEC_OBJECT_NEEDS_FENCE)
+		return 0;
+
 	if (!intel_engine_can_store_dword(i915->engine[RCS0]))
 		return 0;
 
@@ -1220,10 +1224,12 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 		goto out_obj;
 	}
 
-	err = i915_vma_pin(arg.vma, 0, 0,
-			   i915_vma_is_ggtt(arg.vma) ?
-			   PIN_GLOBAL | PIN_MAPPABLE :
-			   PIN_USER);
+	pin_flags = i915_vma_is_ggtt(arg.vma) ? PIN_GLOBAL : PIN_USER;
+
+	if (flags & EXEC_OBJECT_NEEDS_FENCE)
+		pin_flags |= PIN_MAPPABLE;
+
+	err = i915_vma_pin(arg.vma, 0, 0, pin_flags);
 	if (err) {
 		i915_request_add(rq);
 		goto out_obj;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 31a51ca1ddcb..d049b7f32233 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -1148,6 +1148,9 @@ static int igt_ggtt_page(void *arg)
 	unsigned int *order, n;
 	int err;
 
+	if (!HAS_MAPPABLE_APERTURE(i915))
+		return 0;
+
 	mutex_lock(&i915->drm.struct_mutex);
 
 	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
-- 
2.20.1


* [PATCH v2 26/37] drm/i915: error capture with no ggtt slot
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (24 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 25/37] drm/i915/selftests: check for missing aperture Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 27/37] drm/i915: Don't try to place HWS in non-existing mappable region Matthew Auld
                   ` (13 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

If the aperture is not available in HW we can't use a ggtt slot and wc
copy, so fall back to regular kmap.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 16 ++++---
 drivers/gpu/drm/i915/i915_gpu_error.c | 63 ++++++++++++++++++++-------
 2 files changed, 57 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 43b99136a3ae..3a8965048a06 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2850,13 +2850,15 @@ static int init_ggtt(struct i915_ggtt *ggtt)
 	if (ret)
 		return ret;
 
-	/* Reserve a mappable slot for our lockless error capture */
-	ret = drm_mm_insert_node_in_range(&ggtt->vm.mm, &ggtt->error_capture,
-					  PAGE_SIZE, 0, I915_COLOR_UNEVICTABLE,
-					  0, ggtt->mappable_end,
-					  DRM_MM_INSERT_LOW);
-	if (ret)
-		return ret;
+	if (HAS_MAPPABLE_APERTURE(ggtt->vm.i915)) {
+		/* Reserve a mappable slot for our lockless error capture */
+		ret = drm_mm_insert_node_in_range(&ggtt->vm.mm, &ggtt->error_capture,
+						  PAGE_SIZE, 0, I915_COLOR_UNEVICTABLE,
+						  0, ggtt->mappable_end,
+						  DRM_MM_INSERT_LOW);
+		if (ret)
+			return ret;
+	}
 
 	/*
 	 * The upper portion of the GuC address space has a sizeable hole
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 5489cd879315..be920deb7ed7 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -218,7 +218,7 @@ struct compress {
 	void *tmp;
 };
 
-static bool compress_init(struct compress *c)
+static bool compress_init(struct compress *c, bool wc)
 {
 	struct z_stream_s *zstream = memset(&c->zstream, 0, sizeof(c->zstream));
 
@@ -234,7 +234,7 @@ static bool compress_init(struct compress *c)
 	}
 
 	c->tmp = NULL;
-	if (i915_has_memcpy_from_wc())
+	if (wc && i915_has_memcpy_from_wc())
 		c->tmp = (void *)__get_free_page(GFP_ATOMIC | __GFP_NOWARN);
 
 	return true;
@@ -335,10 +335,12 @@ static void err_compression_marker(struct drm_i915_error_state_buf *m)
 #else
 
 struct compress {
+	bool wc;
 };
 
-static bool compress_init(struct compress *c)
+static bool compress_init(struct compress *c, bool wc)
 {
+	c->wc = wc;
 	return true;
 }
 
@@ -354,7 +356,7 @@ static int compress_page(struct compress *c,
 		return -ENOMEM;
 
 	ptr = (void *)page;
-	if (!i915_memcpy_from_wc(ptr, src, PAGE_SIZE))
+	if (!(c->wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE)))
 		memcpy(ptr, src, PAGE_SIZE);
 	dst->pages[dst->page_count++] = ptr;
 
@@ -998,7 +1000,6 @@ i915_error_object_create(struct drm_i915_private *i915,
 	struct compress compress;
 	unsigned long num_pages;
 	struct sgt_iter iter;
-	dma_addr_t dma;
 	int ret;
 
 	if (!vma || !vma->pages)
@@ -1017,22 +1018,52 @@ i915_error_object_create(struct drm_i915_private *i915,
 	dst->page_count = 0;
 	dst->unused = 0;
 
-	if (!compress_init(&compress)) {
+	if (!compress_init(&compress, drm_mm_node_allocated(&ggtt->error_capture))) {
 		kfree(dst);
 		return NULL;
 	}
 
 	ret = -EINVAL;
-	for_each_sgt_dma(dma, iter, vma->pages) {
+	if (drm_mm_node_allocated(&ggtt->error_capture)) {
 		void __iomem *s;
+		dma_addr_t dma;
 
-		ggtt->vm.insert_page(&ggtt->vm, dma, slot, I915_CACHE_NONE, 0);
+		for_each_sgt_dma(dma, iter, vma->pages) {
+			ggtt->vm.insert_page(&ggtt->vm, dma, slot,
+					     I915_CACHE_NONE, 0);
 
-		s = io_mapping_map_atomic_wc(&ggtt->iomap, slot);
-		ret = compress_page(&compress, (void  __force *)s, dst);
-		io_mapping_unmap_atomic(s);
-		if (ret)
-			break;
+			s = io_mapping_map_atomic_wc(&ggtt->iomap, slot);
+			ret = compress_page(&compress, (void  __force *)s, dst);
+			io_mapping_unmap_atomic(s);
+
+			if (ret)
+				break;
+		}
+	} else if (i915_gem_object_is_lmem(vma->obj)) {
+		void *s;
+		dma_addr_t dma;
+		struct intel_memory_region *mem = vma->obj->memory_region;
+
+		for_each_sgt_dma(dma, iter, vma->pages) {
+			s = io_mapping_map_atomic_wc(&mem->iomap, dma);
+			ret = compress_page(&compress, s, dst);
+			io_mapping_unmap_atomic(s);
+
+			if (ret)
+				break;
+		}
+	} else {
+		void *s;
+		struct page *page;
+
+		for_each_sgt_page(page, iter, vma->pages) {
+			s = kmap_atomic(page);
+			ret = compress_page(&compress, s, dst);
+			kunmap_atomic(s);
+
+			if (ret)
+				break;
+		}
 	}
 
 	if (ret || compress_flush(&compress, dst)) {
@@ -1745,9 +1776,11 @@ static unsigned long capture_find_epoch(const struct i915_gpu_state *error)
 static void capture_finish(struct i915_gpu_state *error)
 {
 	struct i915_ggtt *ggtt = &error->i915->ggtt;
-	const u64 slot = ggtt->error_capture.start;
 
-	ggtt->vm.clear_range(&ggtt->vm, slot, PAGE_SIZE);
+	if (drm_mm_node_allocated(&ggtt->error_capture)) {
+		const u64 slot = ggtt->error_capture.start;
+		ggtt->vm.clear_range(&ggtt->vm, slot, PAGE_SIZE);
+	}
 }
 
 static int capture(void *data)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v2 27/37] drm/i915: Don't try to place HWS in non-existing mappable region
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (25 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 26/37] drm/i915: error capture with no ggtt slot Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core Matthew Auld
                   ` (12 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Michal Wajdeczko <michal.wajdeczko@intel.com>

HWS placement restrictions can't rely on the HAS_LLC flag alone: on
platforms without a mappable aperture there is no low mappable arena to
restrict the pinning to.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index d1508f0b4c84..a4aedf1d7f2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -543,7 +543,7 @@ static int pin_ggtt_status_page(struct intel_engine_cs *engine,
 	unsigned int flags;
 
 	flags = PIN_GLOBAL;
-	if (!HAS_LLC(engine->i915))
+	if (!HAS_LLC(engine->i915) && HAS_MAPPABLE_APERTURE(engine->i915))
 		/*
 		 * On g33, we cannot place HWS above 256MiB, so
 		 * restrict its pinning to the low mappable arena.
-- 
2.20.1


* [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (26 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 27/37] drm/i915: Don't try to place HWS in non-existing mappable region Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-28  0:05   ` Chris Wilson
                     ` (3 more replies)
  2019-06-27 20:56 ` [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET Matthew Auld
                   ` (11 subsequent siblings)
  39 siblings, 4 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

This enables us to store extra data within vma->vm_private_data and assign
the pagefault ops for each mmap instance.

We replace the core drm_gem_mmap implementation to overcome the limitation
of having only a single offset node per gem object, allowing us to have
multiple offsets per object. This enables each mapping instance to use a
unique fault handler per object.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 179 ++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  32 ++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  17 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  17 ++
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  12 +-
 drivers/gpu/drm/i915/gt/intel_reset.c         |  19 +-
 drivers/gpu/drm/i915/i915_drv.c               |   9 +-
 drivers/gpu/drm/i915/i915_drv.h               |   1 +
 drivers/gpu/drm/i915/i915_vma.c               |  21 +-
 9 files changed, 268 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 391621ee3cbb..7b46f44d9c20 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -219,7 +219,8 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 {
 #define MIN_CHUNK_PAGES (SZ_1M >> PAGE_SHIFT)
 	struct vm_area_struct *area = vmf->vma;
-	struct drm_i915_gem_object *obj = to_intel_bo(area->vm_private_data);
+	struct i915_mmap_offset *priv = area->vm_private_data;
+	struct drm_i915_gem_object *obj = priv->obj;
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct intel_runtime_pm *rpm = &i915->runtime_pm;
@@ -371,13 +372,15 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
 {
 	struct i915_vma *vma;
+	struct i915_mmap_offset *mmo;
 
 	GEM_BUG_ON(!obj->userfault_count);
 
 	obj->userfault_count = 0;
 	list_del(&obj->userfault_link);
-	drm_vma_node_unmap(&obj->base.vma_node,
-			   obj->base.dev->anon_inode->i_mapping);
+	list_for_each_entry(mmo, &obj->mmap_offsets, offset)
+		drm_vma_node_unmap(&mmo->vma_node,
+				   obj->base.dev->anon_inode->i_mapping);
 
 	for_each_ggtt_vma(vma, obj)
 		i915_vma_unset_userfault(vma);
@@ -431,14 +434,31 @@ void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
 	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
 }
 
-static int create_mmap_offset(struct drm_i915_gem_object *obj)
+static void init_mmap_offset(struct drm_i915_gem_object *obj,
+			     struct i915_mmap_offset *mmo)
+{
+	mutex_lock(&obj->mmo_lock);
+	kref_init(&mmo->ref);
+	list_add(&mmo->offset, &obj->mmap_offsets);
+	mutex_unlock(&obj->mmo_lock);
+}
+
+static int create_mmap_offset(struct drm_i915_gem_object *obj,
+			      struct i915_mmap_offset *mmo)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_device *dev = obj->base.dev;
 	int err;
 
-	err = drm_gem_create_mmap_offset(&obj->base);
-	if (likely(!err))
+	drm_vma_node_reset(&mmo->vma_node);
+	if (mmo->file)
+		drm_vma_node_allow(&mmo->vma_node, mmo->file);
+	err = drm_vma_offset_add(dev->vma_offset_manager, &mmo->vma_node,
+				 obj->base.size / PAGE_SIZE);
+	if (likely(!err)) {
+		init_mmap_offset(obj, mmo);
 		return 0;
+	}
 
 	/* Attempt to reap some mmap space from dead objects */
 	do {
@@ -449,32 +469,49 @@ static int create_mmap_offset(struct drm_i915_gem_object *obj)
 			break;
 
 		i915_gem_drain_freed_objects(i915);
-		err = drm_gem_create_mmap_offset(&obj->base);
-		if (!err)
+		err = drm_vma_offset_add(dev->vma_offset_manager, &mmo->vma_node,
+					 obj->base.size / PAGE_SIZE);
+		if (!err) {
+			init_mmap_offset(obj, mmo);
 			break;
+		}
 
 	} while (flush_delayed_work(&i915->gem.retire_work));
 
 	return err;
 }
 
-int
-i915_gem_mmap_gtt(struct drm_file *file,
-		  struct drm_device *dev,
-		  u32 handle,
-		  u64 *offset)
+static int
+__assign_gem_object_mmap_data(struct drm_file *file,
+			      u32 handle,
+			      enum i915_mmap_type mmap_type,
+			      u64 *offset)
 {
 	struct drm_i915_gem_object *obj;
+	struct i915_mmap_offset *mmo;
 	int ret;
 
 	obj = i915_gem_object_lookup(file, handle);
 	if (!obj)
 		return -ENOENT;
 
-	ret = create_mmap_offset(obj);
-	if (ret == 0)
-		*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
+	mmo = kzalloc(sizeof(*mmo), GFP_KERNEL);
+	if (!mmo) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	mmo->file = file;
+	ret = create_mmap_offset(obj, mmo);
+	if (ret) {
+		kfree(mmo);
+		goto err;
+	}
 
+	mmo->mmap_type = mmap_type;
+	mmo->obj = obj;
+	*offset = drm_vma_node_offset_addr(&mmo->vma_node);
+err:
 	i915_gem_object_put(obj);
 	return ret;
 }
@@ -498,9 +535,115 @@ int
 i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file)
 {
-	struct drm_i915_gem_mmap_gtt *args = data;
+	struct drm_i915_gem_mmap_offset *args = data;
 
-	return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset);
+	return __assign_gem_object_mmap_data(file, args->handle,
+					     I915_MMAP_TYPE_GTT,
+					     &args->offset);
+}
+
+void i915_mmap_offset_object_release(struct kref *ref)
+{
+	struct i915_mmap_offset *mmo = container_of(ref,
+						    struct i915_mmap_offset,
+						    ref);
+	struct drm_i915_gem_object *obj = mmo->obj;
+	struct drm_device *dev = obj->base.dev;
+
+	if (mmo->file)
+		drm_vma_node_revoke(&mmo->vma_node, mmo->file);
+	drm_vma_offset_remove(dev->vma_offset_manager, &mmo->vma_node);
+	list_del(&mmo->offset);
+
+	kfree(mmo);
+}
+
+static void i915_gem_vm_open(struct vm_area_struct *vma)
+{
+	struct i915_mmap_offset *priv = vma->vm_private_data;
+	struct drm_i915_gem_object *obj = priv->obj;
+
+	drm_gem_object_get(&obj->base);
+}
+
+static void i915_gem_vm_close(struct vm_area_struct *vma)
+{
+	struct i915_mmap_offset *priv = vma->vm_private_data;
+	struct drm_i915_gem_object *obj = priv->obj;
+
+	drm_gem_object_put_unlocked(&obj->base);
+	kref_put(&priv->ref, i915_mmap_offset_object_release);
+}
+
+static const struct vm_operations_struct i915_gem_gtt_vm_ops = {
+	.fault = i915_gem_fault,
+	.open = i915_gem_vm_open,
+	.close = i915_gem_vm_close,
+};
+
+/* This overcomes the limitation in drm_gem_mmap's assignment of a
+ * drm_gem_object as the vma->vm_private_data, since we need to
+ * be able to resolve multiple mmap offsets which could be tied
+ * to a single gem object.
+ */
+int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	struct drm_vma_offset_node *node;
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+	struct i915_mmap_offset *mmo;
+	struct drm_gem_object *obj = NULL;
+
+	if (drm_dev_is_unplugged(dev))
+		return -ENODEV;
+
+	drm_vma_offset_lock_lookup(dev->vma_offset_manager);
+	node = drm_vma_offset_exact_lookup_locked(dev->vma_offset_manager,
+						  vma->vm_pgoff,
+						  vma_pages(vma));
+	if (likely(node)) {
+		mmo = container_of(node, struct i915_mmap_offset,
+				   vma_node);
+
+		/* Take a ref for our mmap_offset and gem objects. The reference is cleaned
+		 * up when the vma is closed.
+		 *
+		 * Skip 0-refcnted objects as they are in the process of being destroyed
+		 * and will be invalid when the vma manager lock is released.
+		 */
+		if (kref_get_unless_zero(&mmo->ref)) {
+			obj = &mmo->obj->base;
+			if (!kref_get_unless_zero(&obj->refcount))
+				obj = NULL;
+		}
+	}
+	drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
+
+	if (!obj)
+		return -EINVAL;
+
+	if (!drm_vma_node_is_allowed(node, priv)) {
+		drm_gem_object_put_unlocked(obj);
+		return -EACCES;
+	}
+
+	if (node->readonly) {
+		if (vma->vm_flags & VM_WRITE) {
+			drm_gem_object_put_unlocked(obj);
+			return -EINVAL;
+		}
+
+		vma->vm_flags &= ~VM_MAYWRITE;
+	}
+
+	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
+	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+	vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
+	vma->vm_private_data = mmo;
+
+	vma->vm_ops = &i915_gem_gtt_vm_ops;
+
+	return 0;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 43194fbcbc2e..343162bc8181 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -55,6 +55,27 @@ frontbuffer_retire(struct i915_active_request *active,
 	intel_fb_obj_flush(obj, ORIGIN_CS);
 }
 
+static void i915_gem_object_funcs_free(struct drm_gem_object *obj)
+{
+	/* Unused for now. Mandatory callback */
+}
+
+static void i915_gem_object_funcs_close(struct drm_gem_object *gem_obj, struct drm_file *file)
+{
+	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
+	struct i915_mmap_offset *mmo, *on;
+
+	mutex_lock(&obj->mmo_lock);
+	list_for_each_entry_safe(mmo, on, &obj->mmap_offsets, offset)
+		kref_put(&mmo->ref, i915_mmap_offset_object_release);
+	mutex_unlock(&obj->mmo_lock);
+}
+
+static const struct drm_gem_object_funcs i915_gem_object_funcs = {
+	.free = i915_gem_object_funcs_free,
+	.close = i915_gem_object_funcs_close,
+};
+
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops)
 {
@@ -66,9 +87,13 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->lut_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
 
+	mutex_init(&obj->mmo_lock);
+	INIT_LIST_HEAD(&obj->mmap_offsets);
+
 	init_rcu_head(&obj->rcu);
 
 	obj->ops = ops;
+	obj->base.funcs = &i915_gem_object_funcs;
 
 	obj->frontbuffer_ggtt_origin = ORIGIN_GTT;
 	i915_active_request_init(&obj->frontbuffer_write,
@@ -155,6 +180,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
 	llist_for_each_entry_safe(obj, on, freed, freed) {
 		struct i915_vma *vma, *vn;
+		struct i915_mmap_offset *mmo, *on;
 
 		trace_i915_gem_object_destroy(obj);
 
@@ -184,6 +210,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 			spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 		}
 
+		i915_gem_object_release_mmap(obj);
 		mutex_unlock(&i915->drm.struct_mutex);
 
 		GEM_BUG_ON(atomic_read(&obj->bind_count));
@@ -203,6 +230,11 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 
 		drm_gem_object_release(&obj->base);
 
+		mutex_lock(&obj->mmo_lock);
+		list_for_each_entry_safe(mmo, on, &obj->mmap_offsets, offset)
+			kref_put(&mmo->ref, i915_mmap_offset_object_release);
+		mutex_unlock(&obj->mmo_lock);
+
 		bitmap_free(obj->bit_17);
 		i915_gem_object_free(obj);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 20754c15412a..42b46bb46580 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -125,13 +125,23 @@ void i915_gem_object_unlock_fence(struct drm_i915_gem_object *obj,
 static inline void
 i915_gem_object_set_readonly(struct drm_i915_gem_object *obj)
 {
-	obj->base.vma_node.readonly = true;
+	struct i915_mmap_offset *mmo;
+
+	list_for_each_entry(mmo, &obj->mmap_offsets, offset)
+	        mmo->vma_node.readonly = true;
 }
 
 static inline bool
 i915_gem_object_is_readonly(const struct drm_i915_gem_object *obj)
 {
-	return obj->base.vma_node.readonly;
+	struct i915_mmap_offset *mmo;
+
+	list_for_each_entry(mmo, &obj->mmap_offsets, offset) {
+		if (mmo->vma_node.readonly)
+			return true;
+	}
+
+	return false;
 }
 
 static inline bool
@@ -419,6 +429,9 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
 int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 				  unsigned int flags,
 				  const struct i915_sched_attr *attr);
+
+void i915_mmap_offset_object_release(struct kref *ref);
+
 #define I915_PRIORITY_DISPLAY I915_USER_PRIORITY(I915_PRIORITY_MAX)
 
 #endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 8cdee185251a..86f358da8085 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -61,6 +61,19 @@ struct drm_i915_gem_object_ops {
 	void (*release)(struct drm_i915_gem_object *obj);
 };
 
+enum i915_mmap_type {
+	I915_MMAP_TYPE_GTT = 0,
+};
+
+struct i915_mmap_offset {
+	struct drm_vma_offset_node vma_node;
+	struct drm_i915_gem_object *obj;
+	struct drm_file *file;
+	enum i915_mmap_type mmap_type;
+	struct kref ref;
+	struct list_head offset;
+};
+
 struct drm_i915_gem_object {
 	struct drm_gem_object base;
 
@@ -132,6 +145,10 @@ struct drm_i915_gem_object {
 	unsigned int userfault_count;
 	struct list_head userfault_link;
 
+	/* Protects access to mmap offsets */
+	struct mutex mmo_lock;
+	struct list_head mmap_offsets;
+
 	struct list_head batch_pool_link;
 	I915_SELFTEST_DECLARE(struct list_head st_link);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 6949df0f963f..ea90ba9fd34c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -366,15 +366,20 @@ static bool assert_mmap_offset(struct drm_i915_private *i915,
 			       int expected)
 {
 	struct drm_i915_gem_object *obj;
+	/* refcounted in create_mmap_offset */
+	struct i915_mmap_offset *mmo = kzalloc(sizeof(*mmo), GFP_KERNEL);
 	int err;
 
 	obj = i915_gem_object_create_internal(i915, size);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	err = create_mmap_offset(obj);
+	err = create_mmap_offset(obj, mmo);
 	i915_gem_object_put(obj);
 
+	if (err)
+		kfree(mmo);
+
 	return err == expected;
 }
 
@@ -405,6 +410,8 @@ static int igt_mmap_offset_exhaustion(void *arg)
 	struct drm_mm *mm = &i915->drm.vma_offset_manager->vm_addr_space_mm;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node resv, *hole;
+	/* refcounted in create_mmap_offset */
+	struct i915_mmap_offset *mmo = kzalloc(sizeof(*mmo), GFP_KERNEL);
 	u64 hole_start, hole_end;
 	int loop, err;
 
@@ -446,9 +453,10 @@ static int igt_mmap_offset_exhaustion(void *arg)
 		goto out;
 	}
 
-	err = create_mmap_offset(obj);
+	err = create_mmap_offset(obj, mmo);
 	if (err) {
 		pr_err("Unable to insert object into reclaimed hole\n");
+		kfree(mmo);
 		goto err_obj;
 	}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index adfdb908587f..1ca98b0c4f82 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -688,6 +688,7 @@ static void revoke_mmaps(struct drm_i915_private *i915)
 
 	for (i = 0; i < i915->ggtt.num_fences; i++) {
 		struct drm_vma_offset_node *node;
+		struct i915_mmap_offset *mmo;
 		struct i915_vma *vma;
 		u64 vma_offset;
 
@@ -701,10 +702,20 @@ static void revoke_mmaps(struct drm_i915_private *i915)
 		GEM_BUG_ON(vma->fence != &i915->ggtt.fence_regs[i]);
 		node = &vma->obj->base.vma_node;
 		vma_offset = vma->ggtt_view.partial.offset << PAGE_SHIFT;
-		unmap_mapping_range(i915->drm.anon_inode->i_mapping,
-				    drm_vma_node_offset_addr(node) + vma_offset,
-				    vma->size,
-				    1);
+
+		mutex_lock(&vma->obj->mmo_lock);
+		list_for_each_entry(mmo, &vma->obj->mmap_offsets, offset) {
+			node = &mmo->vma_node;
+			if (!drm_mm_node_allocated(&node->vm_node) ||
+			    mmo->mmap_type != I915_MMAP_TYPE_GTT)
+				continue;
+
+			unmap_mapping_range(i915->drm.anon_inode->i_mapping,
+					    drm_vma_node_offset_addr(node) + vma_offset,
+					    vma->size,
+					    1);
+		}
+		mutex_unlock(&vma->obj->mmo_lock);
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 34edc0302691..0f1f3b7f3029 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3124,18 +3124,12 @@ const struct dev_pm_ops i915_pm_ops = {
 	.runtime_resume = intel_runtime_resume,
 };
 
-static const struct vm_operations_struct i915_gem_vm_ops = {
-	.fault = i915_gem_fault,
-	.open = drm_gem_vm_open,
-	.close = drm_gem_vm_close,
-};
-
 static const struct file_operations i915_driver_fops = {
 	.owner = THIS_MODULE,
 	.open = drm_open,
 	.release = drm_release,
 	.unlocked_ioctl = drm_ioctl,
-	.mmap = drm_gem_mmap,
+	.mmap = i915_gem_mmap,
 	.poll = drm_poll,
 	.read = drm_read,
 	.compat_ioctl = i915_compat_ioctl,
@@ -3224,7 +3218,6 @@ static struct drm_driver driver = {
 
 	.gem_close_object = i915_gem_close_object,
 	.gem_free_object_unlocked = i915_gem_free_object,
-	.gem_vm_ops = &i915_gem_vm_ops,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4d24f9dc1193..dc2bf48165f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2557,6 +2557,7 @@ int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv,
 void i915_gem_suspend(struct drm_i915_private *dev_priv);
 void i915_gem_suspend_late(struct drm_i915_private *dev_priv);
 void i915_gem_resume(struct drm_i915_private *dev_priv);
+int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma);
 vm_fault_t i915_gem_fault(struct vm_fault *vmf);
 
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index c20a3022cd80..e0e07818efe0 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -856,7 +856,8 @@ static void __i915_vma_iounmap(struct i915_vma *vma)
 
 void i915_vma_revoke_mmap(struct i915_vma *vma)
 {
-	struct drm_vma_offset_node *node = &vma->obj->base.vma_node;
+	struct drm_vma_offset_node *node;
+	struct i915_mmap_offset *mmo;
 	u64 vma_offset;
 
 	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
@@ -868,10 +869,20 @@ void i915_vma_revoke_mmap(struct i915_vma *vma)
 	GEM_BUG_ON(!vma->obj->userfault_count);
 
 	vma_offset = vma->ggtt_view.partial.offset << PAGE_SHIFT;
-	unmap_mapping_range(vma->vm->i915->drm.anon_inode->i_mapping,
-			    drm_vma_node_offset_addr(node) + vma_offset,
-			    vma->size,
-			    1);
+
+	mutex_lock(&vma->obj->mmo_lock);
+	list_for_each_entry(mmo, &vma->obj->mmap_offsets, offset) {
+		node = &mmo->vma_node;
+		if (!drm_mm_node_allocated(&node->vm_node) ||
+		    mmo->mmap_type != I915_MMAP_TYPE_GTT)
+			continue;
+
+		unmap_mapping_range(vma->vm->i915->drm.anon_inode->i_mapping,
+				    drm_vma_node_offset_addr(node) + vma_offset,
+				    vma->size,
+				    1);
+	}
+	mutex_unlock(&vma->obj->mmo_lock);
 
 	i915_vma_unset_userfault(vma);
 	if (!--vma->obj->userfault_count)
-- 
2.20.1


* [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (27 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-28  0:12   ` Chris Wilson
  2019-07-30  9:49   ` Daniel Vetter
  2019-06-27 20:56 ` [PATCH v2 30/37] drm/i915/lmem: add helper to get CPU accessible offset Matthew Auld
                   ` (10 subsequent siblings)
  39 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

Add a new CPU mmap implementation that allows multiple fault handlers
that depend on the object's backing pages.

Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
and use the zero-extending behaviour of drm to differentiate between
them when we inspect the flags.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |  2 ++
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 30 ++++++++++++++++++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
 drivers/gpu/drm/i915/i915_drv.c               |  3 +-
 include/uapi/drm/i915_drm.h                   | 31 +++++++++++++++++++
 5 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
index ddc7f2a52b3e..5abd5b2172f2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
@@ -30,6 +30,8 @@ int i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file);
 int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file);
+int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
+			       struct drm_file *file_priv);
 int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
 int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 7b46f44d9c20..cbf89e80a97b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -536,12 +536,42 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file)
 {
 	struct drm_i915_gem_mmap_offset *args = data;
+	struct drm_i915_private *i915 = to_i915(dev);
+
+	if (args->flags & I915_MMAP_OFFSET_FLAGS)
+		return i915_gem_mmap_offset_ioctl(dev, data, file);
+
+	if (!HAS_MAPPABLE_APERTURE(i915)) {
+		DRM_ERROR("No aperture, cannot mmap via legacy GTT\n");
+		return -ENODEV;
+	}
 
 	return __assign_gem_object_mmap_data(file, args->handle,
 					     I915_MMAP_TYPE_GTT,
 					     &args->offset);
 }
 
+int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
+			       struct drm_file *file)
+{
+	struct drm_i915_gem_mmap_offset *args = data;
+	enum i915_mmap_type type;
+
+	if ((args->flags & (I915_MMAP_OFFSET_WC | I915_MMAP_OFFSET_WB)) &&
+	    !boot_cpu_has(X86_FEATURE_PAT))
+		return -ENODEV;
+
+	if (args->flags & I915_MMAP_OFFSET_WC)
+		type = I915_MMAP_TYPE_OFFSET_WC;
+	else if (args->flags & I915_MMAP_OFFSET_WB)
+		type = I915_MMAP_TYPE_OFFSET_WB;
+	else if (args->flags & I915_MMAP_OFFSET_UC)
+		type = I915_MMAP_TYPE_OFFSET_UC;
+
+	return __assign_gem_object_mmap_data(file, args->handle, type,
+					     &args->offset);
+}
+
 void i915_mmap_offset_object_release(struct kref *ref)
 {
 	struct i915_mmap_offset *mmo = container_of(ref,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 86f358da8085..f95e54a25426 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -63,6 +63,9 @@ struct drm_i915_gem_object_ops {
 
 enum i915_mmap_type {
 	I915_MMAP_TYPE_GTT = 0,
+	I915_MMAP_TYPE_OFFSET_WC,
+	I915_MMAP_TYPE_OFFSET_WB,
+	I915_MMAP_TYPE_OFFSET_UC,
 };
 
 struct i915_mmap_offset {
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0f1f3b7f3029..8dadd6b9a0a9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -459,6 +459,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
 	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
 	case I915_PARAM_HAS_EXEC_SUBMIT_FENCE:
+	case I915_PARAM_MMAP_OFFSET_VERSION:
 		/* For the time being all of these are always true;
 		 * if some supported hardware does not have one of these
 		 * features this value needs to be provided from
@@ -3176,7 +3177,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_PREAD, i915_gem_pread_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_PWRITE, i915_gem_pwrite_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP, i915_gem_mmap_ioctl, DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP_GTT, i915_gem_mmap_gtt_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP_OFFSET, i915_gem_mmap_gtt_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_SET_DOMAIN, i915_gem_set_domain_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_SW_FINISH, i915_gem_sw_finish_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_SET_TILING, i915_gem_set_tiling_ioctl, DRM_RENDER_ALLOW),
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 328d05e77d9f..729e729e2282 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -359,6 +359,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_QUERY			0x39
 #define DRM_I915_GEM_VM_CREATE		0x3a
 #define DRM_I915_GEM_VM_DESTROY		0x3b
+#define DRM_I915_GEM_MMAP_OFFSET   	DRM_I915_GEM_MMAP_GTT
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -421,6 +422,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
 #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_OFFSET, struct drm_i915_gem_mmap_offset)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -610,6 +612,10 @@ typedef struct drm_i915_irq_wait {
  * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
  */
 #define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
+
+/* Mmap offset ioctl */
+#define I915_PARAM_MMAP_OFFSET_VERSION	54
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -785,6 +791,31 @@ struct drm_i915_gem_mmap_gtt {
 	__u64 offset;
 };
 
+struct drm_i915_gem_mmap_offset {
+	/** Handle for the object being mapped. */
+	__u32 handle;
+	__u32 pad;
+	/**
+	 * Fake offset to use for subsequent mmap call
+	 *
+	 * This is a fixed-size type for 32/64 compatibility.
+	 */
+	__u64 offset;
+
+	/**
+	 * Flags for extended behaviour.
+	 *
+	 * It is mandatory that one of the _WC/_WB/_UC flags
+	 * is passed here.
+	 */
+	__u64 flags;
+#define I915_MMAP_OFFSET_WC (1 << 0)
+#define I915_MMAP_OFFSET_WB (1 << 1)
+#define I915_MMAP_OFFSET_UC (1 << 2)
+#define I915_MMAP_OFFSET_FLAGS \
+	(I915_MMAP_OFFSET_WC | I915_MMAP_OFFSET_WB | I915_MMAP_OFFSET_UC)
+};
+
 struct drm_i915_gem_set_domain {
 	/** Handle for the object */
 	__u32 handle;
-- 
2.20.1


* [PATCH v2 30/37] drm/i915/lmem: add helper to get CPU accessible offset
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (28 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 31/37] drm/i915: Add cpu and lmem fault handlers Matthew Auld
                   ` (9 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

LMEM can be accessed by the CPU through a BAR. The mapping itself should
be 1:1.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_region_lmem.c | 16 ++++++++++++++++
 drivers/gpu/drm/i915/intel_region_lmem.h |  3 +++
 2 files changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index d0a5311cf235..ceec2bff465f 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -248,6 +248,22 @@ void __iomem *i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
 	return io_mapping_map_wc(&obj->memory_region->iomap, offset, size);
 }
 
+resource_size_t i915_gem_object_lmem_io_offset(struct drm_i915_gem_object *obj,
+					       unsigned long n)
+{
+	struct intel_memory_region *mem = obj->memory_region;
+	dma_addr_t daddr;
+
+	/*
+	 * XXX: It's not a dma address, more a device address or physical
+	 * offset, so we are clearly abusing the semantics of the sg_table
+	 * here, and elsewhere like in the gtt paths.
+	 */
+	daddr = i915_gem_object_get_dma_address(obj, n);
+
+	return mem->io_start + daddr;
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
 	struct intel_memory_region *region = obj->memory_region;
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.h b/drivers/gpu/drm/i915/intel_region_lmem.h
index 20084f7b4bff..609de692489d 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.h
+++ b/drivers/gpu/drm/i915/intel_region_lmem.h
@@ -12,6 +12,9 @@ void __iomem *i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
 void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
 					       unsigned long n);
 
+resource_size_t i915_gem_object_lmem_io_offset(struct drm_i915_gem_object *obj,
+					       unsigned long n);
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
-- 
2.20.1


* [PATCH v2 31/37] drm/i915: Add cpu and lmem fault handlers
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (29 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 30/37] drm/i915/lmem: add helper to get CPU accessible offset Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 32/37] drm/i915: cpu-map based dumb buffers Matthew Auld
                   ` (8 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

Add fault handlers that fill in missing pages according to an object's
backing storage. Also handle the changes needed to revoke and refault
mappings, depending on which fault handler is in use.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   | 154 +++++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_object.h |   2 +-
 drivers/gpu/drm/i915/i915_gem.c            |   2 +-
 drivers/gpu/drm/i915/intel_region_lmem.c   |  47 +++++++
 drivers/gpu/drm/i915/intel_region_lmem.h   |   2 +
 5 files changed, 192 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index cbf89e80a97b..5941648ee0ce 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -5,6 +5,7 @@
  */
 
 #include <linux/mman.h>
+#include <linux/pfn_t.h>
 #include <linux/sizes.h>
 
 #include "i915_drv.h"
@@ -369,7 +370,61 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 	}
 }
 
-void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
+static vm_fault_t i915_gem_fault_cpu(struct vm_fault *vmf)
+{
+	struct vm_area_struct *area = vmf->vma;
+	struct i915_mmap_offset *priv = area->vm_private_data;
+	struct drm_i915_gem_object *obj = priv->obj;
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	vm_fault_t vmf_ret;
+	unsigned long size = area->vm_end - area->vm_start;
+	bool write = area->vm_flags & VM_WRITE;
+	int i, ret;
+
+	/* Sanity check that we allow writing into this object */
+	if (i915_gem_object_is_readonly(obj) && write)
+		return VM_FAULT_SIGBUS;
+
+	ret = i915_gem_object_pin_pages(obj);
+	if (ret)
+		goto err;
+
+	for (i = 0; i < size >> PAGE_SHIFT; i++) {
+		struct page *page = i915_gem_object_get_page(obj, i);
+		vmf_ret = vmf_insert_pfn(area,
+					 (unsigned long)area->vm_start + i * PAGE_SIZE,
+					 page_to_pfn(page));
+		if (vmf_ret & VM_FAULT_ERROR) {
+			ret = vm_fault_to_errno(vmf_ret, 0);
+			break;
+		}
+	}
+
+	i915_gem_object_unpin_pages(obj);
+err:
+	switch (ret) {
+	case -EIO:
+		if (!i915_terminally_wedged(dev_priv))
+			return VM_FAULT_SIGBUS;
+	case -EAGAIN:
+	case 0:
+	case -ERESTARTSYS:
+	case -EINTR:
+	case -EBUSY:
+		return VM_FAULT_NOPAGE;
+	case -ENOMEM:
+		return VM_FAULT_OOM;
+	case -ENOSPC:
+	case -EFAULT:
+		return VM_FAULT_SIGBUS;
+	default:
+		WARN_ONCE(ret, "unhandled error in %s: %i\n", __func__, ret);
+		return VM_FAULT_SIGBUS;
+	}
+}
+
+void __i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj)
 {
 	struct i915_vma *vma;
 	struct i915_mmap_offset *mmo;
@@ -378,21 +433,20 @@ void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
 
 	obj->userfault_count = 0;
 	list_del(&obj->userfault_link);
-	list_for_each_entry(mmo, &obj->mmap_offsets, offset)
-		drm_vma_node_unmap(&mmo->vma_node,
-				   obj->base.dev->anon_inode->i_mapping);
+
+	mutex_lock(&obj->mmo_lock);
+	list_for_each_entry(mmo, &obj->mmap_offsets, offset) {
+		if (mmo->mmap_type == I915_MMAP_TYPE_GTT)
+			drm_vma_node_unmap(&mmo->vma_node,
+					   obj->base.dev->anon_inode->i_mapping);
+	}
+	mutex_unlock(&obj->mmo_lock);
 
 	for_each_ggtt_vma(vma, obj)
 		i915_vma_unset_userfault(vma);
 }
 
 /**
- * i915_gem_object_release_mmap - remove physical page mappings
- * @obj: obj in question
- *
- * Preserve the reservation of the mmapping with the DRM core code, but
- * relinquish ownership of the pages back to the system.
- *
  * It is vital that we remove the page mapping if we have mapped a tiled
  * object through the GTT and then lose the fence register due to
  * resource pressure. Similarly if the object has been moved out of the
@@ -400,7 +454,7 @@ void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
  * mapping will then trigger a page fault on the next user access, allowing
  * fixup by i915_gem_fault().
  */
-void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
+static void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	intel_wakeref_t wakeref;
@@ -419,7 +473,7 @@ void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
 	if (!obj->userfault_count)
 		goto out;
 
-	__i915_gem_object_release_mmap(obj);
+	__i915_gem_object_release_mmap_gtt(obj);
 
 	/* Ensure that the CPU's PTE are revoked and there are not outstanding
 	 * memory transactions from userspace before we return. The TLB
@@ -434,6 +488,35 @@ void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
 	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
 }
 
+static void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
+{
+	struct i915_mmap_offset *mmo;
+
+	mutex_lock(&obj->mmo_lock);
+	list_for_each_entry(mmo, &obj->mmap_offsets, offset) {
+		if (mmo->mmap_type == I915_MMAP_TYPE_OFFSET_WC ||
+		    mmo->mmap_type == I915_MMAP_TYPE_OFFSET_WB ||
+		    mmo->mmap_type == I915_MMAP_TYPE_OFFSET_UC ||
+		    mmo->mmap_type == I915_MMAP_TYPE_DUMB_WC)
+			drm_vma_node_unmap(&mmo->vma_node,
+					   obj->base.dev->anon_inode->i_mapping);
+	}
+	mutex_unlock(&obj->mmo_lock);
+}
+
+/**
+ * i915_gem_object_release_mmap - remove physical page mappings
+ * @obj: obj in question
+ *
+ * Preserve the reservation of the mmapping with the DRM core code, but
+ * relinquish ownership of the pages back to the system.
+ */
+void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_release_mmap_gtt(obj);
+	i915_gem_object_release_mmap_offset(obj);
+}
+
 static void init_mmap_offset(struct drm_i915_gem_object *obj,
 			     struct i915_mmap_offset *mmo)
 {
@@ -611,6 +694,42 @@ static const struct vm_operations_struct i915_gem_gtt_vm_ops = {
 	.close = i915_gem_vm_close,
 };
 
+static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
+	.fault = i915_gem_fault_cpu,
+	.open = i915_gem_vm_open,
+	.close = i915_gem_vm_close,
+};
+
+static const struct vm_operations_struct i915_gem_lmem_vm_ops = {
+	.fault = i915_gem_fault_lmem,
+	.open = i915_gem_vm_open,
+	.close = i915_gem_vm_close,
+};
+
+static void set_vmdata_mmap_offset(struct i915_mmap_offset *mmo, struct vm_area_struct *vma)
+{
+	switch (mmo->mmap_type) {
+	case I915_MMAP_TYPE_OFFSET_WC:
+		vma->vm_page_prot =
+			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+		break;
+	case I915_MMAP_TYPE_OFFSET_WB:
+		vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+		break;
+	case I915_MMAP_TYPE_OFFSET_UC:
+		vma->vm_page_prot =
+			pgprot_noncached(vm_get_page_prot(vma->vm_flags));
+		break;
+	default:
+		break;
+	}
+
+	if (i915_gem_object_is_lmem(mmo->obj))
+		vma->vm_ops = &i915_gem_lmem_vm_ops;
+	else
+		vma->vm_ops = &i915_gem_cpu_vm_ops;
+}
+
 /* This overcomes the limitation in drm_gem_mmap's assignment of a
  * drm_gem_object as the vma->vm_private_data. Since we need to
  * be able to resolve multiple mmap offsets which could be tied
@@ -671,7 +790,16 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 	vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
 	vma->vm_private_data = mmo;
 
-	vma->vm_ops = &i915_gem_gtt_vm_ops;
+	switch (mmo->mmap_type) {
+	case I915_MMAP_TYPE_OFFSET_WC:
+	case I915_MMAP_TYPE_OFFSET_WB:
+	case I915_MMAP_TYPE_OFFSET_UC:
+		set_vmdata_mmap_offset(mmo, vma);
+		break;
+	case I915_MMAP_TYPE_GTT:
+		vma->vm_ops = &i915_gem_gtt_vm_ops;
+		break;
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 42b46bb46580..a7bfe79015ee 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -348,7 +348,7 @@ static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj)
 	i915_gem_object_unpin_pages(obj);
 }
 
-void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj);
+void __i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj);
 
 void
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 009e7199bea6..ecdaca437797 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -865,7 +865,7 @@ void i915_gem_runtime_suspend(struct drm_i915_private *i915)
 
 	list_for_each_entry_safe(obj, on,
 				 &i915->ggtt.userfault_list, userfault_link)
-		__i915_gem_object_release_mmap(obj);
+		__i915_gem_object_release_mmap_gtt(obj);
 
 	/*
 	 * The fence will be lost when the device powers down. If any were
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index ceec2bff465f..7003e7bff90c 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -155,6 +155,53 @@ static int lmem_pwrite(struct drm_i915_gem_object *obj,
 	return ret;
 }
 
+vm_fault_t i915_gem_fault_lmem(struct vm_fault *vmf)
+{
+	struct vm_area_struct *area = vmf->vma;
+	struct i915_mmap_offset *priv = area->vm_private_data;
+	struct drm_i915_gem_object *obj = priv->obj;
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *i915 = to_i915(dev);
+	unsigned long size = area->vm_end - area->vm_start;
+	bool write = area->vm_flags & VM_WRITE;
+	vm_fault_t vmf_ret;
+	int i, ret = 0;
+
+	/* Sanity check that we allow writing into this object */
+	if (i915_gem_object_is_readonly(obj) && write)
+		return VM_FAULT_SIGBUS;
+
+	for (i = 0; i < size >> PAGE_SHIFT; i++) {
+		vmf_ret = vmf_insert_pfn(area,
+					 (unsigned long)area->vm_start + i * PAGE_SIZE,
+					 i915_gem_object_lmem_io_offset(obj, i) >> PAGE_SHIFT);
+		if (vmf_ret & VM_FAULT_ERROR) {
+			ret = vm_fault_to_errno(vmf_ret, 0);
+			goto err;
+		}
+	}
+err:
+	switch (ret) {
+	case -EIO:
+		if (!i915_terminally_wedged(i915))
+			return VM_FAULT_SIGBUS;
+	case -EAGAIN:
+	case 0:
+	case -ERESTARTSYS:
+	case -EINTR:
+	case -EBUSY:
+		return VM_FAULT_NOPAGE;
+	case -ENOMEM:
+		return VM_FAULT_OOM;
+	case -ENOSPC:
+	case -EFAULT:
+		return VM_FAULT_SIGBUS;
+	default:
+		WARN_ONCE(ret, "unhandled error in %s: %i\n", __func__, ret);
+		return VM_FAULT_SIGBUS;
+	}
+}
+
 static const struct drm_i915_gem_object_ops region_lmem_obj_ops = {
 	.get_pages = i915_memory_region_get_pages_buddy,
 	.put_pages = i915_memory_region_put_pages_buddy,
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.h b/drivers/gpu/drm/i915/intel_region_lmem.h
index 609de692489d..68232615a874 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.h
+++ b/drivers/gpu/drm/i915/intel_region_lmem.h
@@ -17,6 +17,8 @@ resource_size_t i915_gem_object_lmem_io_offset(struct drm_i915_gem_object *obj,
 
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
+vm_fault_t i915_gem_fault_lmem(struct vm_fault *vmf);
+
 struct drm_i915_gem_object *
 i915_gem_object_create_lmem(struct drm_i915_private *i915,
 			    resource_size_t size,
-- 
2.20.1


* [PATCH v2 32/37] drm/i915: cpu-map based dumb buffers
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (30 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 31/37] drm/i915: Add cpu and lmem fault handlers Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 33/37] drm/i915: support basic object migration Matthew Auld
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

If there is no aperture we can't use map_gtt to map dumb buffers, so we
need a cpu-map based path to do it. We prefer map_gtt on platforms that
do have an aperture.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c         | 15 +++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object_types.h |  1 +
 drivers/gpu/drm/i915/i915_drv.c                  |  2 +-
 drivers/gpu/drm/i915/i915_drv.h                  |  2 +-
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 5941648ee0ce..ce95d3f2b819 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -599,6 +599,19 @@ __assign_gem_object_mmap_data(struct drm_file *file,
 	return ret;
 }
 
+int
+i915_gem_mmap_dumb(struct drm_file *file,
+		  struct drm_device *dev,
+		  u32 handle,
+		  u64 *offset)
+{
+	struct drm_i915_private *i915 = dev->dev_private;
+	enum i915_mmap_type mmap_type = HAS_MAPPABLE_APERTURE(i915) ?
+		I915_MMAP_TYPE_GTT : I915_MMAP_TYPE_DUMB_WC;
+
+	return __assign_gem_object_mmap_data(file, handle, mmap_type, offset);
+}
+
 /**
  * i915_gem_mmap_gtt_ioctl - prepare an object for GTT mmap'ing
  * @dev: DRM device
@@ -710,6 +723,7 @@ static void set_vmdata_mmap_offset(struct i915_mmap_offset *mmo, struct vm_area_
 {
 	switch (mmo->mmap_type) {
 	case I915_MMAP_TYPE_OFFSET_WC:
+	case I915_MMAP_TYPE_DUMB_WC:
 		vma->vm_page_prot =
 			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
 		break;
@@ -794,6 +808,7 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 	case I915_MMAP_TYPE_OFFSET_WC:
 	case I915_MMAP_TYPE_OFFSET_WB:
 	case I915_MMAP_TYPE_OFFSET_UC:
+	case I915_MMAP_TYPE_DUMB_WC:
 		set_vmdata_mmap_offset(mmo, vma);
 		break;
 	case I915_MMAP_TYPE_GTT:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index f95e54a25426..a888ca64cc3f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -66,6 +66,7 @@ enum i915_mmap_type {
 	I915_MMAP_TYPE_OFFSET_WC,
 	I915_MMAP_TYPE_OFFSET_WB,
 	I915_MMAP_TYPE_OFFSET_UC,
+	I915_MMAP_TYPE_DUMB_WC,
 };
 
 struct i915_mmap_offset {
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8dadd6b9a0a9..1c3d5cb2893c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3229,7 +3229,7 @@ static struct drm_driver driver = {
 	.get_scanout_position = i915_get_crtc_scanoutpos,
 
 	.dumb_create = i915_gem_dumb_create,
-	.dumb_map_offset = i915_gem_mmap_gtt,
+	.dumb_map_offset = i915_gem_mmap_dumb,
 	.ioctls = i915_ioctls,
 	.num_ioctls = ARRAY_SIZE(i915_ioctls),
 	.fops = &i915_driver_fops,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index dc2bf48165f0..715e630a872d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2513,7 +2513,7 @@ i915_mutex_lock_interruptible(struct drm_device *dev)
 int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
-int i915_gem_mmap_gtt(struct drm_file *file_priv, struct drm_device *dev,
+int i915_gem_mmap_dumb(struct drm_file *file_priv, struct drm_device *dev,
 		      u32 handle, u64 *offset);
 int i915_gem_mmap_gtt_version(void);
 
-- 
2.20.1


* [PATCH v2 33/37] drm/i915: support basic object migration
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (31 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 32/37] drm/i915: cpu-map based dumb buffers Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION Matthew Auld
                   ` (6 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

We are going to want to be able to move objects between different regions,
like system memory and local memory. In the future everything should
just be another region.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 129 ++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 ++
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |   2 +-
 .../drm/i915/selftests/intel_memory_region.c  | 129 ++++++++++++++++++
 4 files changed, 267 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 343162bc8181..691af388e4e7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -28,6 +28,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_object.h"
+#include "i915_gem_object_blt.h"
 #include "i915_globals.h"
 
 static struct i915_global_object {
@@ -171,6 +172,134 @@ void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
 	}
 }
 
+int i915_gem_object_prepare_move(struct drm_i915_gem_object *obj)
+{
+	int err;
+
+	lockdep_assert_held(&obj->base.dev->struct_mutex);
+
+	if (obj->mm.madv != I915_MADV_WILLNEED)
+		return -EINVAL;
+
+	if (i915_gem_object_needs_bit17_swizzle(obj))
+		return -EINVAL;
+
+	if (atomic_read(&obj->mm.pages_pin_count) >
+	    atomic_read(&obj->bind_count))
+		return -EBUSY;
+
+	if (obj->pin_global)
+		return -EBUSY;
+
+	i915_gem_object_release_mmap(obj);
+
+	GEM_BUG_ON(obj->mm.mapping);
+	GEM_BUG_ON(obj->base.filp && mapping_mapped(obj->base.filp->f_mapping));
+
+	err = i915_gem_object_wait(obj,
+				   I915_WAIT_INTERRUPTIBLE |
+				   I915_WAIT_LOCKED |
+				   I915_WAIT_ALL,
+				   MAX_SCHEDULE_TIMEOUT);
+	if (err)
+		return err;
+
+	return i915_gem_object_unbind(obj);
+}
+
+int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
+			    struct intel_context *ce,
+			    enum intel_region_id id)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_i915_gem_object *donor;
+	struct intel_memory_region *mem;
+	int err = 0;
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	GEM_BUG_ON(id >= INTEL_MEMORY_UKNOWN);
+	GEM_BUG_ON(obj->memory_region->id == id);
+	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
+
+	mem = i915->regions[id];
+
+	donor = i915_gem_object_create_region(mem, obj->base.size, 0);
+	if (IS_ERR(donor))
+		return PTR_ERR(donor);
+
+	/* Copy backing-pages if we have to */
+	if (i915_gem_object_has_pages(obj)) {
+		struct sg_table *pages;
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto err_put_donor;
+
+		err = i915_gem_object_copy_blt(obj, donor, ce);
+		if (err)
+			goto err_put_donor;
+
+		i915_gem_object_lock(donor);
+		err = i915_gem_object_set_to_cpu_domain(donor, false);
+		i915_gem_object_unlock(donor);
+		if (err)
+			goto err_put_donor;
+
+		i915_retire_requests(i915);
+
+		i915_gem_object_unbind(donor);
+		err = i915_gem_object_unbind(obj);
+		if (err)
+			goto err_put_donor;
+
+		mutex_lock(&obj->mm.lock);
+
+		pages = fetch_and_zero(&obj->mm.pages);
+		obj->ops->put_pages(obj, pages);
+
+		memcpy(&obj->mm.page_sizes, &donor->mm.page_sizes,
+		       sizeof(struct i915_page_sizes));
+		obj->mm.pages = __i915_gem_object_unset_pages(donor);
+
+		obj->mm.get_page.sg_pos = obj->mm.pages->sgl;
+		obj->mm.get_page.sg_idx = 0;
+		__i915_gem_object_reset_page_iter(obj);
+
+		mutex_unlock(&obj->mm.lock);
+	}
+
+	if (obj->ops->release)
+		obj->ops->release(obj);
+
+	/* We still need a little special casing for shmem */
+	if (obj->base.filp)
+		fput(fetch_and_zero(&obj->base.filp));
+	else
+		obj->base.filp = fetch_and_zero(&donor->base.filp);
+
+	obj->base.size = donor->base.size;
+	obj->memory_region = mem;
+	obj->flags = donor->flags;
+	obj->ops = donor->ops;
+
+	list_replace_init(&donor->blocks, &obj->blocks);
+
+	mutex_lock(&mem->obj_lock);
+	list_add(&obj->region_link, &mem->objects);
+	mutex_unlock(&mem->obj_lock);
+
+	GEM_BUG_ON(i915_gem_object_has_pages(donor));
+	GEM_BUG_ON(i915_gem_object_has_pinned_pages(donor));
+
+err_put_donor:
+	i915_gem_object_put(donor);
+	if (i915_gem_object_has_pinned_pages(obj))
+		i915_gem_object_unpin_pages(obj);
+
+	return err;
+}
+
 static void __i915_gem_free_objects(struct drm_i915_private *i915,
 				    struct llist_node *freed)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index a7bfe79015ee..11afb4dea215 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -40,8 +40,16 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align);
 void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file);
 void i915_gem_free_object(struct drm_gem_object *obj);
 
+enum intel_region_id;
+int i915_gem_object_prepare_move(struct drm_i915_gem_object *obj);
+int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
+			    struct intel_context *ce,
+			    enum intel_region_id id);
+
 void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
+void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj);
+
 struct sg_table *
 __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj);
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 15eaaedffc46..c1bc047d5fc4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -139,7 +139,7 @@ void i915_gem_object_writeback(struct drm_i915_gem_object *obj)
 		obj->ops->writeback(obj);
 }
 
-static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj)
+void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj)
 {
 	struct radix_tree_iter iter;
 	void __rcu **slot;
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 23c466a1b800..ccfdc4cbd174 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -491,6 +491,59 @@ static int igt_lmem_create(void *arg)
 	return err;
 }
 
+static int igt_smem_create_migrate(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_context *ce = i915->engine[BCS0]->kernel_context;
+	struct drm_i915_gem_object *obj;
+	int err;
+
+	/* Switch object backing-store on create */
+	obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	err = i915_gem_object_migrate(obj, ce, INTEL_MEMORY_SMEM);
+	if (err)
+		goto out_put;
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto out_put;
+
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
+static int igt_lmem_create_migrate(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_context *ce = i915->engine[BCS0]->kernel_context;
+	struct drm_i915_gem_object *obj;
+	int err;
+
+	/* Switch object backing-store on create */
+	obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	err = i915_gem_object_migrate(obj, ce, INTEL_MEMORY_LMEM);
+	if (err)
+		goto out_put;
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto out_put;
+
+	i915_gem_object_unpin_pages(obj);
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
 static int igt_lmem_write_gpu(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -601,6 +654,79 @@ static int igt_lmem_write_cpu(void *arg)
 	return err;
 }
 
+static int igt_lmem_pages_migrate(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_context *ce = i915->engine[BCS0]->kernel_context;
+	struct drm_i915_gem_object *obj;
+	IGT_TIMEOUT(end_time);
+	I915_RND_STATE(prng);
+	u32 sz;
+	int err;
+
+	sz = round_up(prandom_u32_state(&prng) % SZ_32M, PAGE_SIZE);
+
+	obj = i915_gem_object_create_lmem(i915, sz, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	err = i915_gem_object_fill_blt(obj, ce, 0);
+	if (err)
+		goto out_put;
+
+	do {
+		err = i915_gem_object_prepare_move(obj);
+		if (err)
+			goto out_put;
+
+		if (i915_gem_object_is_lmem(obj)) {
+			err = i915_gem_object_migrate(obj, ce, INTEL_MEMORY_SMEM);
+			if (err)
+				goto out_put;
+
+			if (i915_gem_object_is_lmem(obj)) {
+				pr_err("object still backed by lmem\n");
+				err = -EINVAL;
+			}
+
+			if (!list_empty(&obj->blocks)) {
+				pr_err("object leaking memory region\n");
+				err = -EINVAL;
+			}
+
+			if (!i915_gem_object_has_struct_page(obj)) {
+				pr_err("object not backed by struct page\n");
+				err = -EINVAL;
+			}
+
+		} else {
+			err = i915_gem_object_migrate(obj, ce, INTEL_MEMORY_LMEM);
+			if (err)
+				goto out_put;
+
+			if (i915_gem_object_has_struct_page(obj)) {
+				pr_err("object still backed by struct page\n");
+				err = -EINVAL;
+			}
+
+			if (!i915_gem_object_is_lmem(obj)) {
+				pr_err("object not backed by lmem\n");
+				err = -EINVAL;
+			}
+		}
+
+		if (!err)
+			err = i915_gem_object_fill_blt(obj, ce, 0xdeadbeaf);
+		if (err)
+			break;
+	} while (!__igt_timeout(end_time, NULL));
+
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
@@ -644,6 +770,9 @@ int intel_memory_region_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(igt_lmem_create),
 		SUBTEST(igt_lmem_write_cpu),
 		SUBTEST(igt_lmem_write_gpu),
+		SUBTEST(igt_smem_create_migrate),
+		SUBTEST(igt_lmem_create_migrate),
+		SUBTEST(igt_lmem_pages_migrate),
 	};
 	int err;
 
-- 
2.20.1


* [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (32 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 33/37] drm/i915: support basic object migration Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-28  0:22   ` Chris Wilson
                     ` (2 more replies)
  2019-06-27 20:56 ` [PATCH v2 35/37] drm/i915/query: Expose memory regions through the query uAPI Matthew Auld
                   ` (5 subsequent siblings)
  39 siblings, 3 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

This call specifies which memory region an object should be placed in.

Note that changing the object's backing storage should be done immediately
after the object is created, or while it is not yet in use; otherwise
this will fail on a busy object.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  12 ++
 drivers/gpu/drm/i915/gem/i915_gem_context.h |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c  | 117 ++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.c             |   2 +-
 include/uapi/drm/i915_drm.h                 |  27 +++++
 6 files changed, 161 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 8a9787cf0cd0..157ca8247752 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -75,6 +75,7 @@
 #include "i915_globals.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
+#include "i915_gem_ioctls.h"
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
@@ -2357,6 +2358,17 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file)
+{
+	struct drm_i915_gem_context_param *args = data;
+
+	if (args->param <= I915_CONTEXT_PARAM_MAX)
+		return i915_gem_context_setparam_ioctl(dev, data, file);
+
+	return i915_gem_object_setparam_ioctl(dev, data, file);
+}
+
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 				       void *data, struct drm_file *file)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 9691dd062f72..d5a9a63bb34c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -157,6 +157,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
+int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file);
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
index 5abd5b2172f2..af7465bceebd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
@@ -32,6 +32,8 @@ int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file);
 int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file_priv);
+int i915_gem_object_setparam_ioctl(struct drm_device *dev, void *data,
+				   struct drm_file *file_priv);
 int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
 int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 691af388e4e7..bc95f449de50 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -551,6 +551,123 @@ int __init i915_global_objects_init(void)
 	return 0;
 }
 
+static enum intel_region_id
+__region_id(u32 region)
+{
+	enum intel_region_id id;
+
+	for (id = 0; id < ARRAY_SIZE(intel_region_map); ++id) {
+		if (intel_region_map[id] == region)
+			return id;
+	}
+
+	return INTEL_MEMORY_UKNOWN;
+}
+
+static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
+					 struct drm_i915_gem_object_param *args,
+					 struct drm_file *file,
+					 struct drm_i915_gem_object *obj)
+{
+	struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context;
+	u32 __user *uregions = u64_to_user_ptr(args->data);
+	u32 uregions_copy[INTEL_MEMORY_UKNOWN];
+	int i, ret;
+
+	if (args->size > ARRAY_SIZE(intel_region_map))
+		return -EINVAL;
+
+	memset(uregions_copy, 0, sizeof(uregions_copy));
+	for (i = 0; i < args->size; i++) {
+		u32 region;
+
+		ret = get_user(region, uregions);
+		if (ret)
+			return ret;
+
+		uregions_copy[i] = region;
+		++uregions;
+	}
+
+	mutex_lock(&dev_priv->drm.struct_mutex);
+	ret = i915_gem_object_prepare_move(obj);
+	if (ret) {
+		DRM_ERROR("Cannot set memory region, object in use\n");
+		goto err;
+	}
+
+	for (i = 0; i < args->size; i++) {
+		u32 region = uregions_copy[i];
+		enum intel_region_id id = __region_id(region);
+
+		if (id == INTEL_MEMORY_UKNOWN) {
+			ret = -EINVAL;
+			goto err;
+		}
+
+		ret = i915_gem_object_migrate(obj, ce, id);
+		if (!ret) {
+			if (MEMORY_TYPE_FROM_REGION(region) ==
+			    INTEL_LMEM) {
+				/*
+				 * TODO: this should be part of get_pages(),
+				 * when async get_pages arrives
+				 */
+				ret = i915_gem_object_fill_blt(obj, ce, 0);
+				if (ret) {
+					DRM_ERROR("Failed clearing the object\n");
+					goto err;
+				}
+
+				i915_gem_object_lock(obj);
+				ret = i915_gem_object_set_to_cpu_domain(obj, false);
+				i915_gem_object_unlock(obj);
+				if (ret)
+					goto err;
+			}
+			break;
+		}
+	}
+err:
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+	return ret;
+}
+
+int i915_gem_object_setparam_ioctl(struct drm_device *dev, void *data,
+				   struct drm_file *file)
+{
+	struct drm_i915_gem_object_param *args = data;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct drm_i915_gem_object *obj;
+	int ret;
+
+	obj = i915_gem_object_lookup(file, args->handle);
+	if (!obj)
+		return -ENOENT;
+
+	switch (args->param) {
+	case I915_PARAM_MEMORY_REGION:
+		ret = i915_gem_object_region_select(dev_priv, args, file, obj);
+		if (ret) {
+			DRM_ERROR("Cannot set memory region, migration failed\n");
+			goto err;
+		}
+
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+err:
+	i915_gem_object_put(obj);
+	return ret;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/huge_gem_object.c"
 #include "selftests/huge_pages.c"
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1c3d5cb2893c..3d6fe993f26e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3196,7 +3196,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_setparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_RENDER_ALLOW),
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 729e729e2282..5cf976e7608a 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -360,6 +360,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_VM_CREATE		0x3a
 #define DRM_I915_GEM_VM_DESTROY		0x3b
 #define DRM_I915_GEM_MMAP_OFFSET   	DRM_I915_GEM_MMAP_GTT
+#define DRM_I915_GEM_OBJECT_SETPARAM	DRM_I915_GEM_CONTEXT_SETPARAM
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -423,6 +424,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_OFFSET, struct drm_i915_gem_mmap_offset)
+#define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1595,11 +1597,36 @@ struct drm_i915_gem_context_param {
  *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
  */
 #define I915_CONTEXT_PARAM_ENGINES	0xa
+
+#define I915_CONTEXT_PARAM_MAX	        0xffffffff
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
 };
 
+struct drm_i915_gem_object_param {
+	/** Handle for the object */
+	__u32 handle;
+
+	__u32 size;
+
+	/* Must be 1 */
+	__u32 object_class;
+
+	/** Set the memory region for the object listed in preference order
+	 *  as an array of region ids within data. To force an object
+	 *  to a particular memory region, set the region as the sole entry.
+	 *
+	 *  Valid region ids are derived from the id field of
+	 *  struct drm_i915_memory_region_info.
+	 *  See struct drm_i915_query_memory_region_info.
+	 */
+#define I915_PARAM_MEMORY_REGION 0x1
+	__u32 param;
+
+	__u64 data;
+};
+
 /**
  * Context SSEU programming
  *
-- 
2.20.1


* [PATCH v2 35/37] drm/i915/query: Expose memory regions through the query uAPI
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (33 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-28  5:59   ` Tvrtko Ursulin
  2019-06-27 20:56 ` [PATCH v2 36/37] HAX drm/i915: add the fake lmem region Matthew Auld
                   ` (4 subsequent siblings)
  39 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

Returns the available memory regions supported by the HW.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_query.c | 57 +++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h       | 39 +++++++++++++++++++++
 2 files changed, 96 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index 7b7016171057..21c4c2592d6c 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -143,10 +143,67 @@ query_engine_info(struct drm_i915_private *i915,
 	return len;
 }
 
+static int query_memregion_info(struct drm_i915_private *dev_priv,
+				struct drm_i915_query_item *query_item)
+{
+	struct drm_i915_query_memory_region_info __user *query_ptr =
+		u64_to_user_ptr(query_item->data_ptr);
+	struct drm_i915_memory_region_info __user *info_ptr =
+		&query_ptr->regions[0];
+	struct drm_i915_memory_region_info info = { };
+	struct drm_i915_query_memory_region_info query;
+	u32 total_length;
+	int ret, i;
+
+	if (query_item->flags != 0)
+		return -EINVAL;
+
+	total_length = sizeof(struct drm_i915_query_memory_region_info);
+	for (i = 0; i < ARRAY_SIZE(dev_priv->regions); ++i) {
+		struct intel_memory_region *region = dev_priv->regions[i];
+
+		if (!region)
+			continue;
+
+		total_length += sizeof(struct drm_i915_memory_region_info);
+	}
+
+	ret = copy_query_item(&query, sizeof(query), total_length,
+			      query_item);
+	if (ret != 0)
+		return ret;
+
+	if (query.num_regions || query.rsvd[0] || query.rsvd[1] ||
+	    query.rsvd[2])
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(dev_priv->regions); ++i) {
+		struct intel_memory_region *region = dev_priv->regions[i];
+
+		if (!region)
+			continue;
+
+		info.id = region->id;
+		info.size = resource_size(&region->region);
+
+		if (__copy_to_user(info_ptr, &info, sizeof(info)))
+			return -EFAULT;
+
+		query.num_regions++;
+		info_ptr++;
+	}
+
+	if (__copy_to_user(query_ptr, &query, sizeof(query)))
+		return -EFAULT;
+
+	return total_length;
+}
+
 static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
 					struct drm_i915_query_item *query_item) = {
 	query_topology_info,
 	query_engine_info,
+	query_memregion_info,
 };
 
 int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 5cf976e7608a..9b77d8af9877 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2041,6 +2041,7 @@ struct drm_i915_query_item {
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
 #define DRM_I915_QUERY_ENGINE_INFO	2
+#define DRM_I915_QUERY_MEMREGION_INFO   3
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -2180,6 +2181,44 @@ struct drm_i915_query_engine_info {
 	struct drm_i915_engine_info engines[];
 };
 
+struct drm_i915_memory_region_info {
+
+	/** Base type of a region
+	 */
+#define I915_SYSTEM_MEMORY         0
+#define I915_DEVICE_MEMORY         1
+
+	/** The region id is encoded in a layout which makes it possible to
+	 *  retrieve the following information:
+	 *
+	 *  Base type: log2(ID >> 16)
+	 *  Instance:  log2(ID & 0xffff)
+	 */
+	__u32 id;
+
+	/** Reserved field. MBZ */
+	__u32 rsvd0;
+
+	/** Unused for now. MBZ */
+	__u64 flags;
+
+	__u64 size;
+
+	/** Reserved fields must be cleared to zero. */
+	__u64 rsvd1[4];
+};
+
+struct drm_i915_query_memory_region_info {
+
+	/** Number of struct drm_i915_memory_region_info structs */
+	__u32 num_regions;
+
+	/** MBZ */
+	__u32 rsvd[3];
+
+	struct drm_i915_memory_region_info regions[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.20.1


* [PATCH v2 36/37] HAX drm/i915: add the fake lmem region
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (34 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 35/37] drm/i915/query: Expose memory regions through the query uAPI Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 20:56 ` [PATCH v2 37/37] HAX drm/i915/lmem: default userspace allocations to LMEM Matthew Auld
                   ` (3 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Intended for upstream testing so that we can still exercise the LMEM
plumbing and !HAS_MAPPABLE_APERTURE paths. Smoke tested on Skull Canyon
device. This works by allocating an intel_memory_region for a reserved
portion of system memory, which we treat like LMEM. For the LMEMBAR we
steal the mappable aperture and 1:1 map it to the reserved region.

To enable, simply set i915.fake_lmem_start= on the kernel cmdline to the
start of the reserved region (see memmap=). The size of the region we can
use is determined by the size of the mappable aperture, so the size of
reserved region should be >= mappable_end.

e.g. memmap=2G$16G i915.fake_lmem_start=0x400000000

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            | 15 +++++
 drivers/gpu/drm/i915/i915_drv.h            |  2 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  3 +
 drivers/gpu/drm/i915/i915_params.c         |  3 +
 drivers/gpu/drm/i915/i915_params.h         |  3 +-
 drivers/gpu/drm/i915/intel_memory_region.h |  4 ++
 drivers/gpu/drm/i915/intel_region_lmem.c   | 76 ++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_region_lmem.h   |  3 +
 8 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 3d6fe993f26e..891937de6a2b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1869,6 +1869,9 @@ static void i915_driver_destroy(struct drm_i915_private *i915)
 	pci_set_drvdata(pdev, NULL);
 }
 
+struct resource intel_graphics_fake_lmem_res __ro_after_init = DEFINE_RES_MEM(0, 0);
+EXPORT_SYMBOL(intel_graphics_fake_lmem_res);
+
 /**
  * i915_driver_load - setup chip and create an initial config
  * @pdev: PCI device
@@ -1895,6 +1898,18 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (!i915_modparams.nuclear_pageflip && match_info->gen < 5)
 		dev_priv->drm.driver_features &= ~DRIVER_ATOMIC;
 
+	/* Check if we need fake LMEM */
+	if (INTEL_GEN(dev_priv) >= 9 && i915_modparams.fake_lmem_start > 0) {
+		intel_graphics_fake_lmem_res.start = i915_modparams.fake_lmem_start;
+		/* Placeholder size; capped later by the mappable aperture */
+		intel_graphics_fake_lmem_res.end =
+			i915_modparams.fake_lmem_start + SZ_2G - 1;
+
+		mkwrite_device_info(dev_priv)->memory_regions =
+			REGION_SMEM | REGION_LMEM;
+		GEM_BUG_ON(!HAS_LMEM(dev_priv));
+
+		pr_info("Intel graphics fake LMEM starts at %pa\n", &intel_graphics_fake_lmem_res.start);
+	}
+
 	ret = pci_enable_device(pdev);
 	if (ret)
 		goto out_fini;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 715e630a872d..9a2c79fa8088 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2815,4 +2815,6 @@ static inline void add_taint_for_CI(unsigned int taint)
 	add_taint(taint, LOCKDEP_STILL_OK);
 }
 
+extern struct resource intel_graphics_fake_lmem_res;
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3a8965048a06..df4928c8b10a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2938,6 +2938,9 @@ int i915_gem_init_memory_regions(struct drm_i915_private *i915)
 		case INTEL_STOLEN:
 			mem = i915_gem_stolen_setup(i915);
 			break;
+		case INTEL_LMEM:
+			mem = i915_gem_setup_fake_lmem(i915);
+			break;
 		}
 
 		if (IS_ERR(mem)) {
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 296452f9efe4..59a6ad6261b9 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -164,6 +164,9 @@ i915_param_named_unsafe(dmc_firmware_path, charp, 0400,
 i915_param_named_unsafe(enable_dp_mst, bool, 0600,
 	"Enable multi-stream transport (MST) for new DisplayPort sinks. (default: true)");
 
+i915_param_named_unsafe(fake_lmem_start, ulong, 0600,
+	"Fake LMEM start offset (default: 0)");
+
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
 i915_param_named_unsafe(inject_load_failure, uint, 0400,
 	"Force an error after a number of failure check points (0:disabled (default), N:force failure at the Nth failure check point)");
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index d29ade3b7de6..b9698722c957 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -77,7 +77,8 @@ struct drm_printer;
 	param(bool, verbose_state_checks, true) \
 	param(bool, nuclear_pageflip, false) \
 	param(bool, enable_dp_mst, true) \
-	param(bool, enable_gvt, false)
+	param(bool, enable_gvt, false) \
+	param(unsigned long, fake_lmem_start, 0)
 
 #define MEMBER(T, member, ...) T member;
 struct i915_params {
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index bee0c022d295..4960096ec30f 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -9,6 +9,7 @@
 #include <linux/ioport.h>
 #include <linux/mutex.h>
 #include <linux/io-mapping.h>
+#include <drm/drm_mm.h>
 
 #include "i915_buddy.h"
 
@@ -71,6 +72,9 @@ struct intel_memory_region {
 	struct io_mapping iomap;
 	struct resource region;
 
+	/* For faking for lmem */
+	struct drm_mm_node fake_mappable;
+
 	struct i915_buddy_mm mm;
 	struct mutex mm_lock;
 
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 7003e7bff90c..2028261f4e80 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -241,9 +241,41 @@ lmem_create_object(struct intel_memory_region *mem,
 	return obj;
 }
 
+static int i915_gem_init_fake_lmem_bar(struct intel_memory_region *mem)
+{
+	struct drm_i915_private *i915 = mem->i915;
+	struct i915_ggtt *ggtt = &i915->ggtt;
+	unsigned long n;
+	int ret;
+
+	mem->fake_mappable.start = 0;
+	mem->fake_mappable.size = resource_size(&mem->region);
+	mem->fake_mappable.color = I915_COLOR_UNEVICTABLE;
+
+	ret = drm_mm_reserve_node(&ggtt->vm.mm, &mem->fake_mappable);
+	if (ret)
+		return ret;
+
+	/* 1:1 map the mappable aperture to our reserved region */
+	for (n = 0; n < mem->fake_mappable.size >> PAGE_SHIFT; ++n) {
+		ggtt->vm.insert_page(&ggtt->vm,
+				     mem->region.start + (n << PAGE_SHIFT),
+				     n << PAGE_SHIFT, I915_CACHE_NONE, 0);
+	}
+
+	return 0;
+}
+
+static void i915_gem_release_fake_lmem_bar(struct intel_memory_region *mem)
+{
+	if (drm_mm_node_allocated(&mem->fake_mappable))
+		drm_mm_remove_node(&mem->fake_mappable);
+}
+
 static void
 region_lmem_release(struct intel_memory_region *mem)
 {
+	i915_gem_release_fake_lmem_bar(mem);
 	io_mapping_fini(&mem->iomap);
 	i915_memory_region_release_buddy(mem);
 }
@@ -253,6 +285,14 @@ region_lmem_init(struct intel_memory_region *mem)
 {
 	int ret;
 
+	if (intel_graphics_fake_lmem_res.start) {
+		ret = i915_gem_init_fake_lmem_bar(mem);
+		if (ret) {
+			GEM_BUG_ON(1);
+			return ret;
+		}
+	}
+
 	if (!io_mapping_init_wc(&mem->iomap,
 				mem->io_start,
 				resource_size(&mem->region)))
@@ -278,6 +318,7 @@ void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
 	resource_size_t offset;
 
 	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= intel_graphics_fake_lmem_res.start;
 
 	return io_mapping_map_atomic_wc(&obj->memory_region->iomap, offset);
 }
@@ -291,6 +332,7 @@ void __iomem *i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(!(obj->flags & I915_BO_ALLOC_CONTIGUOUS));
 
 	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= intel_graphics_fake_lmem_res.start;
 
 	return io_mapping_map_wc(&obj->memory_region->iomap, offset, size);
 }
@@ -307,6 +349,7 @@ resource_size_t i915_gem_object_lmem_io_offset(struct drm_i915_gem_object *obj,
 	 * here, and elsewhere like in the gtt paths.
 	 */
 	daddr = i915_gem_object_get_dma_address(obj, n);
+	daddr -= intel_graphics_fake_lmem_res.start;
 
 	return mem->io_start + daddr;
 }
@@ -326,3 +369,36 @@ i915_gem_object_create_lmem(struct drm_i915_private *i915,
 	return i915_gem_object_create_region(i915->regions[INTEL_MEMORY_LMEM],
 					     size, flags);
 }
+
+struct intel_memory_region *
+i915_gem_setup_fake_lmem(struct drm_i915_private *i915)
+{
+	struct pci_dev *pdev = i915->drm.pdev;
+	struct intel_memory_region *mem;
+	resource_size_t mappable_end;
+	resource_size_t io_start;
+	resource_size_t start;
+
+	GEM_BUG_ON(HAS_MAPPABLE_APERTURE(i915));
+	GEM_BUG_ON(!intel_graphics_fake_lmem_res.start);
+
+	/* Your mappable aperture belongs to me now! */
+	mappable_end = pci_resource_len(pdev, 2);
+	io_start = pci_resource_start(pdev, 2);
+	start = intel_graphics_fake_lmem_res.start;
+
+	mem = intel_memory_region_create(i915,
+					 start,
+					 mappable_end,
+					 I915_GTT_PAGE_SIZE_4K,
+					 io_start,
+					 &region_lmem_ops);
+	if (!IS_ERR(mem)) {
+		DRM_INFO("Intel graphics fake LMEM: %pR\n", &mem->region);
+		DRM_INFO("Intel graphics fake LMEM IO start: %llx\n",
+			 (u64)mem->io_start);
+	}
+
+	return mem;
+}
+
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.h b/drivers/gpu/drm/i915/intel_region_lmem.h
index 68232615a874..41bc411068de 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.h
+++ b/drivers/gpu/drm/i915/intel_region_lmem.h
@@ -24,4 +24,7 @@ i915_gem_object_create_lmem(struct drm_i915_private *i915,
 			    resource_size_t size,
 			    unsigned int flags);
 
+struct intel_memory_region *
+i915_gem_setup_fake_lmem(struct drm_i915_private *i915);
+
 #endif /* !__INTEL_REGION_LMEM_H */
-- 
2.20.1


* [PATCH v2 37/37] HAX drm/i915/lmem: default userspace allocations to LMEM
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (35 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 36/37] HAX drm/i915: add the fake lmem region Matthew Auld
@ 2019-06-27 20:56 ` Matthew Auld
  2019-06-27 21:36 ` ✗ Fi.CI.CHECKPATCH: warning for Introduce memory region concept (including device local memory) (rev2) Patchwork
                   ` (2 subsequent siblings)
  39 siblings, 0 replies; 88+ messages in thread
From: Matthew Auld @ 2019-06-27 20:56 UTC (permalink / raw)
  To: intel-gfx

Hack patch to default all userspace allocations to LMEM. Useful for
testing purposes.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 37 +++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ecdaca437797..a6f29acfb300 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -44,6 +44,7 @@
 #include "gem/i915_gem_clflush.h"
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
+#include "gem/i915_gem_object_blt.h"
 #include "gem/i915_gem_pm.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
@@ -164,9 +165,45 @@ i915_gem_create(struct drm_file *file,
 
 	/* Allocate the new object */
-	obj = i915_gem_object_create_shmem(dev_priv, size);
+	if (HAS_LMEM(dev_priv))
+		obj = i915_gem_object_create_lmem(dev_priv, size, 0);
+	else
+		obj = i915_gem_object_create_shmem(dev_priv, size);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
+	if (i915_gem_object_is_lmem(obj)) {
+		struct intel_context *ce =
+			dev_priv->engine[BCS0]->kernel_context;
+
+		/*
+		 * XXX: We really want to move this to get_pages(), but we
+		 * require grabbing the BKL for the blitting operation which is
+		 * annoying. In the pipeline is support for async get_pages()
+		 * which should fit nicely for this. Also note that the actual
+		 * clear should be done async(we currently do an object_wait
+		 * which is pure garbage), we just need to take care if
+		 * userspace opts of implicit sync for the execbuf, to avoid any
+		 * potential info leak.
+		 */
+
+		mutex_lock(&dev_priv->drm.struct_mutex);
+		ret = i915_gem_object_fill_blt(obj, ce, 0);
+		mutex_unlock(&dev_priv->drm.struct_mutex);
+		if (ret) {
+			i915_gem_object_put(obj);
+			return ret;
+		}
+
+		i915_gem_object_lock(obj);
+		ret = i915_gem_object_set_to_cpu_domain(obj, false);
+		i915_gem_object_unlock(obj);
+		if (ret) {
+			i915_gem_object_put(obj);
+			return ret;
+		}
+	}
+
 	ret = drm_gem_handle_create(file, &obj->base, &handle);
 	/* drop reference from allocate - handle holds it now */
 	i915_gem_object_put(obj);
-- 
2.20.1


* ✗ Fi.CI.CHECKPATCH: warning for Introduce memory region concept (including device local memory) (rev2)
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (36 preceding siblings ...)
  2019-06-27 20:56 ` [PATCH v2 37/37] HAX drm/i915/lmem: default userspace allocations to LMEM Matthew Auld
@ 2019-06-27 21:36 ` Patchwork
  2019-06-27 21:50 ` ✗ Fi.CI.SPARSE: " Patchwork
  2019-06-28  9:59 ` ✗ Fi.CI.BAT: failure " Patchwork
  39 siblings, 0 replies; 88+ messages in thread
From: Patchwork @ 2019-06-27 21:36 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: Introduce memory region concept (including device local memory) (rev2)
URL   : https://patchwork.freedesktop.org/series/56683/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
6f6eb8a214a1 drm/i915: buddy allocator
-:29: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#29: 
new file mode 100644

-:278: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#278: FILE: drivers/gpu/drm/i915/i915_buddy.c:245:
+void i915_buddy_free_list(struct i915_buddy_mm *mm,
+			      struct list_head *objects)

-:436: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#436: FILE: drivers/gpu/drm/i915/i915_buddy.c:403:
+	if (buddy && (i915_buddy_block_free(block) &&
+	    i915_buddy_block_free(buddy)))

-:468: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#468: FILE: drivers/gpu/drm/i915/i915_buddy.h:16:
+#define   I915_BUDDY_ALLOCATED (1<<10)
                                  ^

-:469: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#469: FILE: drivers/gpu/drm/i915/i915_buddy.h:17:
+#define   I915_BUDDY_FREE	   (2<<10)
                          	     ^

-:470: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#470: FILE: drivers/gpu/drm/i915/i915_buddy.h:18:
+#define   I915_BUDDY_SPLIT	   (3<<10)
                           	     ^

total: 0 errors, 1 warnings, 5 checks, 1030 lines checked
9dd43433710a drm/i915: introduce intel_memory_region
-:65: CHECK:LINE_SPACING: Please don't use multiple blank lines
#65: FILE: drivers/gpu/drm/i915/gem/selftests/huge_pages.c:451:
 
+

-:95: ERROR:CODE_INDENT: code indent should use tabs where possible
#95: FILE: drivers/gpu/drm/i915/gem/selftests/huge_pages.c:481:
+^I^I        &obj->memory_region->region.start);$

-:179: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#179: 
new file mode 100644

-:196: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#196: FILE: drivers/gpu/drm/i915/intel_memory_region.c:13:
+{
+

-:278: WARNING:BLOCK_COMMENT_STYLE: Block comments should align the * on each line
#278: FILE: drivers/gpu/drm/i915/intel_memory_region.c:95:
+		 * coalesce if we can.
+		*/

-:437: WARNING:TYPO_SPELLING: 'UKNOWN' may be misspelled - perhaps 'UNKNOWN'?
#437: FILE: drivers/gpu/drm/i915/intel_memory_region.h:33:
+	INTEL_MEMORY_UKNOWN, /* Should be last */

-:446: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'r' may be better as '(r)' to avoid precedence issues
#446: FILE: drivers/gpu/drm/i915/intel_memory_region.h:42:
+#define MEMORY_TYPE_FROM_REGION(r) (ilog2(r >> INTEL_MEMORY_TYPE_SHIFT))

-:447: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'r' may be better as '(r)' to avoid precedence issues
#447: FILE: drivers/gpu/drm/i915/intel_memory_region.h:43:
+#define MEMORY_INSTANCE_FROM_REGION(r) (ilog2(r & 0xffff))

-:461: WARNING:FUNCTION_ARGUMENTS: function definition argument 'struct intel_memory_region *' should also have an identifier name
#461: FILE: drivers/gpu/drm/i915/intel_memory_region.h:57:
+	int (*init)(struct intel_memory_region *);

-:462: WARNING:FUNCTION_ARGUMENTS: function definition argument 'struct intel_memory_region *' should also have an identifier name
#462: FILE: drivers/gpu/drm/i915/intel_memory_region.h:58:
+	void (*release)(struct intel_memory_region *);

-:464: WARNING:FUNCTION_ARGUMENTS: function definition argument 'struct intel_memory_region *' should also have an identifier name
#464: FILE: drivers/gpu/drm/i915/intel_memory_region.h:60:
+	struct drm_i915_gem_object *

-:464: WARNING:FUNCTION_ARGUMENTS: function definition argument 'resource_size_t' should also have an identifier name
#464: FILE: drivers/gpu/drm/i915/intel_memory_region.h:60:
+	struct drm_i915_gem_object *

-:464: WARNING:FUNCTION_ARGUMENTS: function definition argument 'unsigned int' should also have an identifier name
#464: FILE: drivers/gpu/drm/i915/intel_memory_region.h:60:
+	struct drm_i915_gem_object *

-:479: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#479: FILE: drivers/gpu/drm/i915/intel_memory_region.h:75:
+	struct mutex mm_lock;

-:593: WARNING:EMBEDDED_FUNCTION_NAME: Prefer using '"%s...", __func__' to using 'igt_mock_fill', this function's name, in a string
#593: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:67:
+			pr_err("igt_mock_fill failed, space still left in region\n");

total: 1 errors, 9 warnings, 5 checks, 649 lines checked
b199239505e9 drm/i915/region: support basic eviction
97c56199279a drm/i915/region: support continuous allocations
-:22: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#22: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:137:
+#define I915_BO_ALLOC_CONTIGUOUS (1<<0)
                                    ^

total: 0 errors, 0 warnings, 1 checks, 238 lines checked
896ab6f953bb drm/i915/region: support volatile objects
-:23: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#23: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:138:
+#define I915_BO_ALLOC_VOLATILE   (1<<1)
                                    ^

total: 0 errors, 0 warnings, 1 checks, 108 lines checked
158dcac99d23 drm/i915: Add memory region information to device_info
07335c6c968c drm/i915: support creating LMEM objects
-:57: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#57: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 171 lines checked
894c2da949ad drm/i915: setup io-mapping for LMEM
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 35 lines checked
8784beb8d0cd drm/i915/lmem: support kernel mapping
-:113: CHECK:LINE_SPACING: Please don't use multiple blank lines
#113: FILE: drivers/gpu/drm/i915/intel_region_lmem.h:9:
 
+

-:186: ERROR:CODE_INDENT: code indent should use tabs where possible
#186: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:415:
+^I^I^I        val);$

-:186: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#186: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:415:
+			pr_err("vaddr[%u]=%u, val=%u\n", dword, vaddr[dword],
+			        val);

-:198: ERROR:CODE_INDENT: code indent should use tabs where possible
#198: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:427:
+^I^I^I        val ^ 0xdeadbeaf);$

-:198: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#198: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:427:
+			pr_err("vaddr[%u]=%u, val=%u\n", dword, vaddr[dword],
+			        val ^ 0xdeadbeaf);

total: 2 errors, 0 warnings, 3 checks, 187 lines checked
cac31ed0edda drm/i915/blt: support copying objects
-:14: ERROR:BAD_SIGN_OFF: Unrecognized email address: 'Abdiel Janulgue <abdiel.janulgue@linux.intel.com'
#14: 
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com

-:38: CHECK:SPACING: spaces preferred around that '-' (ctx:VxV)
#38: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_blt.c:119:
+		*cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10-2);
 		                                       ^

-:49: CHECK:SPACING: spaces preferred around that '-' (ctx:VxV)
#49: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_blt.c:130:
+		*cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10-2);
 		                                                  ^

-:60: CHECK:SPACING: spaces preferred around that '-' (ctx:VxV)
#60: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_blt.c:141:
+		*cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8-2);
 		                                                 ^

-:193: WARNING:LINE_SPACING: Missing a blank line after declarations
#193: FILE: drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c:103:
+	struct rnd_state prng;
+	IGT_TIMEOUT(end);

-:308: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#308: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:185:
+#define GEN9_XY_FAST_COPY_BLT_CMD	((2<<29)|(0x42<<22))
                                  	   ^

-:308: CHECK:SPACING: spaces preferred around that '|' (ctx:VxV)
#308: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:185:
+#define GEN9_XY_FAST_COPY_BLT_CMD	((2<<29)|(0x42<<22))
                                  	        ^

-:308: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#308: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:185:
+#define GEN9_XY_FAST_COPY_BLT_CMD	((2<<29)|(0x42<<22))
                                  	              ^

-:309: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#309: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:186:
+#define XY_SRC_COPY_BLT_CMD		((2<<29)|(0x53<<22))
                            		   ^

-:309: CHECK:SPACING: spaces preferred around that '|' (ctx:VxV)
#309: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:186:
+#define XY_SRC_COPY_BLT_CMD		((2<<29)|(0x53<<22))
                            		        ^

-:309: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#309: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:186:
+#define XY_SRC_COPY_BLT_CMD		((2<<29)|(0x53<<22))
                            		              ^

total: 1 errors, 1 warnings, 9 checks, 277 lines checked
4de58ea97434 drm/i915/selftests: move gpu-write-dw into utils
cc31698abc26 drm/i915/selftests: add write-dword test for LMEM
-:87: WARNING:LINE_SPACING: Missing a blank line after declarations
#87: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:411:
+	struct intel_engine_cs *engine;
+	IGT_TIMEOUT(end_time);

-:160: WARNING:LINE_SPACING: Missing a blank line after declarations
#160: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:498:
+	struct drm_i915_gem_object *obj;
+	I915_RND_STATE(prng);

total: 0 errors, 2 warnings, 0 checks, 185 lines checked
88e9fcca4054 drm/i915/selftests: don't just test CACHE_NONE for huge-pages
7e21a904a5ac drm/i915/selftest: extend coverage to include LMEM huge-pages
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 146 lines checked
6bef737f4655 drm/i915/lmem: support CPU relocations
-:85: CHECK:SPACING: No space is necessary after a cast
#85: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1091:
+		io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr));

total: 0 errors, 0 warnings, 1 checks, 100 lines checked
1c5dce2c18de drm/i915/lmem: support pread
-:20: WARNING:FUNCTION_ARGUMENTS: function definition argument 'struct drm_i915_gem_object *' should also have an identifier name
#20: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:55:
+	int (*pread)(struct drm_i915_gem_object *,

total: 0 errors, 1 warnings, 0 checks, 106 lines checked
282c4317ef1a drm/i915/lmem: support pwrite
-:71: ERROR:POINTER_LOCATION: "(foo*)" should be "(foo *)"
#71: FILE: drivers/gpu/drm/i915/intel_region_lmem.c:135:
+		unwritten = copy_from_user((void __force*)vaddr + offset,

total: 1 errors, 0 warnings, 0 checks, 87 lines checked
c0332db20e13 drm/i915: enumerate and init each supported region
eb9166edbaaa drm/i915: treat shmem as a region
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

-:49: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#49: FILE: drivers/gpu/drm/i915/gem/i915_gem_shmem.c:434:
+static int __create_shmem(struct drm_i915_private *i915,
 			struct drm_gem_object *obj,

-:61: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#61: FILE: drivers/gpu/drm/i915/gem/i915_gem_shmem.c:457:
+	     unsigned flags)

-:123: WARNING:SUSPECT_CODE_INDENT: suspect code indent for conditional statements (8, 17)
#123: FILE: drivers/gpu/drm/i915/gem/i915_gem_shmem.c:579:
+	if (err)
+		 DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err);

-:124: WARNING:LONG_LINE: line over 100 characters
#124: FILE: drivers/gpu/drm/i915/gem/i915_gem_shmem.c:580:
+		 DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err);

total: 0 errors, 4 warnings, 1 checks, 254 lines checked
3fff0096edab drm/i915: treat stolen as a region
089ac99864c6 drm/i915: define HAS_MAPPABLE_APERTURE
-:20: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'dev_priv' may be better as '(dev_priv)' to avoid precedence issues
#20: FILE: drivers/gpu/drm/i915/i915_drv.h:2247:
+#define HAS_MAPPABLE_APERTURE(dev_priv) (dev_priv->ggtt.mappable_end > 0)

total: 0 errors, 0 warnings, 1 checks, 8 lines checked
50a3eeb8189f drm/i915: do not map aperture if it is not available.
-:40: CHECK:SPACING: No space is necessary after a cast
#40: FILE: drivers/gpu/drm/i915/i915_gem_gtt.c:3394:
+			(struct resource) DEFINE_RES_MEM(pci_resource_start(pdev, 2),

total: 0 errors, 0 warnings, 1 checks, 55 lines checked
3fb004e577d6 drm/i915: expose missing map_gtt support to users
5681ea496f31 drm/i915: set num_fence_regs to 0 if there is no aperture
c3ebef177d7d drm/i915/selftests: check for missing aperture
6cef423853ea drm/i915: error capture with no ggtt slot
-:162: WARNING:LINE_SPACING: Missing a blank line after declarations
#162: FILE: drivers/gpu/drm/i915/i915_gpu_error.c:1782:
+		const u64 slot = ggtt->error_capture.start;
+		ggtt->vm.clear_range(&ggtt->vm, slot, PAGE_SIZE);

total: 0 errors, 1 warnings, 0 checks, 139 lines checked
bf7a75b46b28 drm/i915: Don't try to place HWS in non-existing mappable region
c5129ddc06ad drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
-:218: ERROR:CODE_INDENT: code indent should use tabs where possible
#218: FILE: drivers/gpu/drm/i915/gem/i915_gem_mman.c:605:
+^I        mmo = container_of(node, struct i915_mmap_offset,$

-:349: ERROR:CODE_INDENT: code indent should use tabs where possible
#349: FILE: drivers/gpu/drm/i915/gem/i915_gem_object.h:131:
+^I        mmo->vma_node.readonly = true;$

-:391: ERROR:POINTER_LOCATION: "foo* bar" should be "foo *bar"
#391: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:70:
+	struct drm_i915_gem_object* obj;

total: 3 errors, 0 warnings, 0 checks, 501 lines checked
4981ff369abe drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
-:120: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#120: FILE: include/uapi/drm/i915_drm.h:362:
+#define DRM_I915_GEM_MMAP_OFFSET   ^IDRM_I915_GEM_MMAP_GTT$

-:128: WARNING:LONG_LINE: line over 100 characters
#128: FILE: include/uapi/drm/i915_drm.h:425:
+#define DRM_IOCTL_I915_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_OFFSET, struct drm_i915_gem_mmap_offset)

total: 0 errors, 2 warnings, 0 checks, 129 lines checked
5cd5c23c7ae8 drm/i915/lmem: add helper to get CPU accessible offset
3a94fa78f237 drm/i915: Add cpu and lmem fault handlers
-:53: WARNING:LINE_SPACING: Missing a blank line after declarations
#53: FILE: drivers/gpu/drm/i915/gem/i915_gem_mman.c:395:
+		struct page *page = i915_gem_object_get_page(obj, i);
+		vmf_ret = vmf_insert_pfn(area,

total: 0 errors, 1 warnings, 0 checks, 285 lines checked
8c21fd6a50e8 drm/i915: cpu-map based dumb buffers
-:25: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#25: FILE: drivers/gpu/drm/i915/gem/i915_gem_mman.c:604:
+i915_gem_mmap_dumb(struct drm_file *file,
+		  struct drm_device *dev,

-:90: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#90: FILE: drivers/gpu/drm/i915/i915_drv.h:2517:
+int i915_gem_mmap_dumb(struct drm_file *file_priv, struct drm_device *dev,
 		      u32 handle, u64 *offset);

total: 0 errors, 0 warnings, 2 checks, 56 lines checked
80c18bff8a6f drm/i915: support basic object migration
-:77: WARNING:TYPO_SPELLING: 'UKNOWN' may be misspelled - perhaps 'UNKNOWN'?
#77: FILE: drivers/gpu/drm/i915/gem/i915_gem_object.c:221:
+	GEM_BUG_ON(id >= INTEL_MEMORY_UKNOWN);

-:269: WARNING:LINE_SPACING: Missing a blank line after declarations
#269: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:662:
+	struct drm_i915_gem_object *obj;
+	IGT_TIMEOUT(end_time);

total: 0 errors, 2 warnings, 0 checks, 312 lines checked
d3e6c6c9f2a5 drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
-:90: WARNING:TYPO_SPELLING: 'UKNOWN' may be misspelled - perhaps 'UNKNOWN'?
#90: FILE: drivers/gpu/drm/i915/gem/i915_gem_object.c:564:
+	return INTEL_MEMORY_UKNOWN;

-:100: WARNING:TYPO_SPELLING: 'UKNOWN' may be misspelled - perhaps 'UNKNOWN'?
#100: FILE: drivers/gpu/drm/i915/gem/i915_gem_object.c:574:
+	u32 uregions_copy[INTEL_MEMORY_UKNOWN];

-:122: ERROR:CODE_INDENT: code indent should use tabs where possible
#122: FILE: drivers/gpu/drm/i915/gem/i915_gem_object.c:596:
+^I        goto err;$

-:132: WARNING:TYPO_SPELLING: 'UKNOWN' may be misspelled - perhaps 'UNKNOWN'?
#132: FILE: drivers/gpu/drm/i915/gem/i915_gem_object.c:606:
+		if (id == INTEL_MEMORY_UKNOWN) {

-:168: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#168: FILE: drivers/gpu/drm/i915/gem/i915_gem_object.c:642:
+{
+

-:229: WARNING:LONG_LINE: line over 100 characters
#229: FILE: include/uapi/drm/i915_drm.h:427:
+#define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param)

total: 1 errors, 4 warnings, 1 checks, 221 lines checked
45761cc62ad8 drm/i915/query: Expose memory regions through the query uAPI
-:100: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#100: FILE: include/uapi/drm/i915_drm.h:2185:
+struct drm_i915_memory_region_info {
+

-:127: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#127: FILE: include/uapi/drm/i915_drm.h:2212:
+struct drm_i915_query_memory_region_info {
+

total: 0 errors, 0 warnings, 2 checks, 118 lines checked
f5de7bb8e2d3 HAX drm/i915: add the fake lmem region
-:44: WARNING:LONG_LINE_COMMENT: line over 100 characters
#44: FILE: drivers/gpu/drm/i915/i915_drv.c:1904:
+		intel_graphics_fake_lmem_res.end = SZ_2G; /* Placeholder; depends on aperture size */

-:50: WARNING:LONG_LINE: line over 100 characters
#50: FILE: drivers/gpu/drm/i915/i915_drv.c:1910:
+		pr_info("Intel graphics fake LMEM starts at %pa\n", &intel_graphics_fake_lmem_res.start);

-:90: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#90: FILE: drivers/gpu/drm/i915/i915_params.c:168:
+i915_param_named_unsafe(fake_lmem_start, ulong, 0600,
+	"Fake LMEM start offset (default: 0)");

total: 0 errors, 2 warnings, 1 checks, 195 lines checked
2c530c39178d HAX drm/i915/lmem: default userspace allocations to LMEM

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* ✗ Fi.CI.SPARSE: warning for Introduce memory region concept (including device local memory) (rev2)
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (37 preceding siblings ...)
  2019-06-27 21:36 ` ✗ Fi.CI.CHECKPATCH: warning for Introduce memory region concept (including device local memory) (rev2) Patchwork
@ 2019-06-27 21:50 ` Patchwork
  2019-06-28  9:59 ` ✗ Fi.CI.BAT: failure " Patchwork
  39 siblings, 0 replies; 88+ messages in thread
From: Patchwork @ 2019-06-27 21:50 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: Introduce memory region concept (including device local memory) (rev2)
URL   : https://patchwork.freedesktop.org/series/56683/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: buddy allocator
+drivers/gpu/drm/i915/selftests/i915_buddy.c:292:13: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_buddy.c:292:13: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_buddy.c:422:24: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_buddy.c:422:24: warning: expression using sizeof(void)
+./include/linux/slab.h:666:13: error: not a function <noident>
+./include/linux/slab.h:666:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/slab.h:666:13: warning: call with no type!

Commit: drm/i915: introduce intel_memory_region
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915/region: support basic eviction
Okay!

Commit: drm/i915/region: support continuous allocations
Okay!

Commit: drm/i915/region: support volatile objects
Okay!

Commit: drm/i915: Add memory region information to device_info
Okay!

Commit: drm/i915: support creating LMEM objects
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: setup io-mapping for LMEM
Okay!

Commit: drm/i915/lmem: support kernel mapping
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:180:42:    expected void [noderef] <asn:2>*vaddr
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:180:42:    got void *[assigned] ptr
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:180:42: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:255:51:    expected void *
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:255:51:    got void [noderef] <asn:2>*
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:255:51: warning: incorrect type in return expression (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:337:42:    expected void [noderef] <asn:2>*vaddr
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:337:42:    got void *[assigned] ptr
+drivers/gpu/drm/i915/gem/i915_gem_pages.c:337:42: warning: incorrect type in argument 1 (different address spaces)

Commit: drm/i915/blt: support copying objects
Okay!

Commit: drm/i915/selftests: move gpu-write-dw into utils
Okay!

Commit: drm/i915/selftests: add write-dword test for LMEM
Okay!

Commit: drm/i915/selftests: don't just test CACHE_NONE for huge-pages
Okay!

Commit: drm/i915/selftest: extend coverage to include LMEM huge-pages
Okay!

Commit: drm/i915/lmem: support CPU relocations
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1107:15:    expected void *vaddr
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1107:15:    got void [noderef] <asn:2>*
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1107:15: warning: incorrect type in assignment (different address spaces)

Commit: drm/i915/lmem: support pread
Okay!

Commit: drm/i915/lmem: support pwrite
Okay!

Commit: drm/i915: enumerate and init each supported region
Okay!

Commit: drm/i915: treat shmem as a region
Okay!

Commit: drm/i915: treat stolen as a region
Okay!

Commit: drm/i915: define HAS_MAPPABLE_APERTURE
Okay!

Commit: drm/i915: do not map aperture if it is not available.
Okay!

Commit: drm/i915: expose missing map_gtt support to users
Okay!

Commit: drm/i915: set num_fence_regs to 0 if there is no aperture
Okay!

Commit: drm/i915/selftests: check for missing aperture
Okay!

Commit: drm/i915: error capture with no ggtt slot
+drivers/gpu/drm/i915/i915_gpu_error.c:1048:27:    expected void *s
+drivers/gpu/drm/i915/i915_gpu_error.c:1048:27:    got void [noderef] <asn:2>*
+drivers/gpu/drm/i915/i915_gpu_error.c:1048:27: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/i915/i915_gpu_error.c:1050:49:    expected void [noderef] <asn:2>*vaddr
+drivers/gpu/drm/i915/i915_gpu_error.c:1050:49:    got void *s
+drivers/gpu/drm/i915/i915_gpu_error.c:1050:49: warning: incorrect type in argument 1 (different address spaces)

Commit: drm/i915: Don't try to place HWS in non-existing mappable region
Okay!

Commit: drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
+ ^
+                                                 ^~
+ }
-drivers/gpu/drm/i915/display/icl_dsi.c:135:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/icl_dsi.c:1425:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/icl_dsi.c:1425:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/icl_dsi.c:1426:26: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/icl_dsi.c:1426:26: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_audio.c:306:15: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_audio.c:306:15: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_audio.c:482:15: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_audio.c:601:15: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_audio.c:971:34: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_audio.c:971:34: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:129:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:129:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:169:19: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:169:19: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:171:20: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:171:20: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:191:30: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:191:30: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:195:44: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:195:44: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_bw.c:244:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2251:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2254:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2263:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2271:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2280:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2312:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2312:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2348:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2348:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2541:17: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2541:17: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2575:17: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_cdclk.c:2575:17: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:121:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:121:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:121:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:121:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:121:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:121:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:121:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:227:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:227:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:227:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:227:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:227:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:227:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:227:29: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:237:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:237:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:237:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:237:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:237:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:237:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:237:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:240:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:240:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:240:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:240:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:240:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:240:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:240:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:243:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:243:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:243:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:243:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:243:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:243:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:243:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:245:38: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:245:38: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:245:38: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:245:38: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:245:38: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:245:38: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:245:38: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:248:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:248:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:248:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:248:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:248:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:248:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:248:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:251:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:251:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:251:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:251:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:251:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:251:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:251:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:341:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:341:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:341:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:341:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:341:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:341:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_color.c:341:37: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_ddi.c:671:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_ddi.c:673:24: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_display.c:1202:22: error: Expected constant expression in case statement
-drivers/gpu/drm/i915/display/intel_display.c:1205:22: error: Expected constant expression in case statement
-drivers/gpu/drm/i915/display/intel_display.c:1208:22: error: Expected constant expression in case statement
-drivers/gpu/drm/i915/display/intel_display.c:1211:22: error: Expected constant expression in case statement
-drivers/gpu/drm/i915/display/intel_display.c:14391:21: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_display.c:2420:13: error: undefined identifier '__builtin_add_overflow_p'
-drivers/gpu/drm/i915/display/intel_display.c:2792:28: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_display.c:7372:26: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_display.c:883:21: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c:158:21: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:1442:39: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:1806:23: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:1939:23: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:1959:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:1981:58: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:255:16: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:300:30: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:394:28: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:4371:26: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:4414:27: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:5941:30: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:6645:31: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/display/intel_dp.c:6674:9: warning: expression using sizeof(void)
-driv

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 01/37] drm/i915: buddy allocator
  2019-06-27 20:55 ` [PATCH v2 01/37] drm/i915: buddy allocator Matthew Auld
@ 2019-06-27 22:28   ` Chris Wilson
  2019-06-28  9:35   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 22:28 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:55:57)
> Simple buddy allocator. We want to allocate properly aligned
> power-of-two blocks to promote usage of huge-pages for the GTT, so 64K,
> 2M and possibly even 1G. While we do support allocating stuff at a
> specific offset, it is more intended for preallocating portions of the
> address space, say for an initial framebuffer; for other uses drm_mm is
> probably a much better fit. Anyway, hopefully this can all be thrown
> away if we eventually move to having the core MM manage device memory.

Yeah, I still have no idea why drm_mm doesn't suffice.

The advantage of drm_mm_node being embedded and non-allocating is still
present, but drm_mm_node has the disadvantage of having grown to
accommodate multiple primary keys.

A bit more detailed discussion of the shortcomings would help, along
with measurements for why we should use an intrinsically aligned
allocator over an allocator that allows arbitrary alignments. The
obvious advantage being allocation speed for >chunk_size allocations,
but numbers :)
And just how frequently does it matter?

The major downside is the fragmentation penalty. That also needs
discussion. (Think of the joys of kcompactd and kmigrated.)

That discussion should also be mirrored in a theory of operation
documentation.
 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  drivers/gpu/drm/i915/i915_buddy.c             | 413 +++++++++++++++
>  drivers/gpu/drm/i915/i915_buddy.h             | 115 ++++
>  drivers/gpu/drm/i915/selftests/i915_buddy.c   | 491 ++++++++++++++++++
>  .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
>  5 files changed, 1021 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/i915_buddy.c
>  create mode 100644 drivers/gpu/drm/i915/i915_buddy.h
>  create mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 3bd8f0349a8a..cb66cf1a5a10 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -117,6 +117,7 @@ gem-y += \
>  i915-y += \
>           $(gem-y) \
>           i915_active.o \
> +         i915_buddy.o \
>           i915_cmd_parser.o \
>           i915_gem_batch_pool.o \
>           i915_gem_evict.o \
> diff --git a/drivers/gpu/drm/i915/i915_buddy.c b/drivers/gpu/drm/i915/i915_buddy.c
> new file mode 100644
> index 000000000000..c0ac68d71d94
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_buddy.c
> @@ -0,0 +1,413 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include <linux/slab.h>
> +#include <linux/list.h>
> +
> +#include "i915_buddy.h"
> +
> +#include "i915_gem.h"
> +#include "i915_utils.h"
> +
> +static void mark_allocated(struct i915_buddy_block *block)
> +{
> +       block->header &= ~I915_BUDDY_HEADER_STATE;
> +       block->header |= I915_BUDDY_ALLOCATED;
> +
> +       list_del_init(&block->link);
> +}
> +
> +static void mark_free(struct i915_buddy_mm *mm,
> +                     struct i915_buddy_block *block)
> +{
> +       block->header &= ~I915_BUDDY_HEADER_STATE;
> +       block->header |= I915_BUDDY_FREE;
> +
> +       list_add(&block->link,
> +                &mm->free_list[i915_buddy_block_order(block)]);
> +}
> +
> +static void mark_split(struct i915_buddy_block *block)
> +{
> +       block->header &= ~I915_BUDDY_HEADER_STATE;
> +       block->header |= I915_BUDDY_SPLIT;
> +
> +       list_del_init(&block->link);
> +}
> +
> +int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 min_size)
> +{
> +       unsigned int i;
> +       u64 offset;
> +
> +       if (size < min_size)
> +               return -EINVAL;
> +
> +       if (min_size < PAGE_SIZE)
> +               return -EINVAL;
> +
> +       if (!is_power_of_2(min_size))
> +               return -EINVAL;
> +
> +       size = round_down(size, min_size);
> +
> +       mm->size = size;
> +       mm->min_size = min_size;
> +       mm->max_order = ilog2(rounddown_pow_of_two(size)) - ilog2(min_size);

ilog2(rounddown_pow_of_two(size)) is ilog2(size)

rounddown_pow_of_two(size): 1 << ilog2(size)
so you are saying ilog2(1 << ilog2(size))

fwiw, I still think of min_size as min_chunk_size (or just chunk_size).

> +
> +       GEM_BUG_ON(mm->max_order > I915_BUDDY_MAX_ORDER);
> +
> +       mm->free_list = kmalloc_array(mm->max_order + 1,
> +                                     sizeof(struct list_head),
> +                                     GFP_KERNEL);
> +       if (!mm->free_list)
> +               return -ENOMEM;
> +
> +       for (i = 0; i <= mm->max_order; ++i)
> +               INIT_LIST_HEAD(&mm->free_list[i]);
> +
> +       mm->blocks = KMEM_CACHE(i915_buddy_block, SLAB_HWCACHE_ALIGN);
> +       if (!mm->blocks)
> +               goto out_free_list;

Would not one global slab cache suffice? We would have better
reuse, if the kernel hasn't decided to merge slab caches.

> +       mm->n_roots = hweight64(size);
> +
> +       mm->roots = kmalloc_array(mm->n_roots,
> +                                 sizeof(struct i915_buddy_block *),
> +                                 GFP_KERNEL);
> +       if (!mm->roots)
> +               goto out_free_blocks;
> +
> +       offset = 0;
> +       i = 0;
> +
> +       /*
> +        * Split into power-of-two blocks, in case we are given a size that is
> +        * not itself a power-of-two.
> +        */
> +       do {
> +               struct i915_buddy_block *root;
> +               unsigned int order;
> +               u64 root_size;
> +
> +               root = kmem_cache_zalloc(mm->blocks, GFP_KERNEL);
> +               if (!root)
> +                       goto out_free_roots;
> +
> +               root_size = rounddown_pow_of_two(size);
> +               order = ilog2(root_size) - ilog2(min_size);
> +
> +               root->header = offset;
> +               root->header |= order;
> +
> +               mark_free(mm, root);
> +
> +               GEM_BUG_ON(i > mm->max_order);
> +               GEM_BUG_ON(i915_buddy_block_size(mm, root) < min_size);
> +
> +               mm->roots[i] = root;
> +
> +               offset += root_size;
> +               size -= root_size;
> +               i++;
> +       } while (size);
> +
> +       return 0;
> +
> +out_free_roots:
> +       while (i--)
> +               kmem_cache_free(mm->blocks, mm->roots[i]);
> +       kfree(mm->roots);
> +out_free_blocks:
> +       kmem_cache_destroy(mm->blocks);
> +out_free_list:
> +       kfree(mm->free_list);
> +       return -ENOMEM;
> +}
> +
> +void i915_buddy_fini(struct i915_buddy_mm *mm)
> +{
> +       int err = 0;
> +       int i;
> +
> +       for (i = 0; i < mm->n_roots; ++i) {
> +               if (!i915_buddy_block_free(mm->roots[i])) {
> +                       err = -EBUSY;
> +                       continue;
> +               }
> +
> +               kmem_cache_free(mm->blocks, mm->roots[i]);
> +       }
> +
> +       /*
> +        * XXX: Rather leak memory for now than hit a potential use-after-free
> +        */

Bug bug bug est est.

> +       if (WARN_ON(err))
> +               return;
> +
> +       kfree(mm->roots);
> +       kfree(mm->free_list);
> +       kmem_cache_destroy(mm->blocks);
> +}
> +
> +static int split_block(struct i915_buddy_mm *mm,
> +                      struct i915_buddy_block *block)
> +{
> +       unsigned int order = i915_buddy_block_order(block);
> +       u64 offset = i915_buddy_block_offset(block);
> +
> +       GEM_BUG_ON(!i915_buddy_block_free(block));
> +       GEM_BUG_ON(!order);

uint order = block_order(block) - 1;

> +
> +       block->left = kmem_cache_zalloc(mm->blocks, GFP_KERNEL);
> +       if (!block->left)
> +               return -ENOMEM;
> +
> +       block->left->header = offset;
> +       block->left->header |= order - 1;
> +
> +       block->left->parent = block;
> +
> +       INIT_LIST_HEAD(&block->left->link);
> +       INIT_LIST_HEAD(&block->left->tmp_link);
> +
> +       block->right = kmem_cache_zalloc(mm->blocks, GFP_KERNEL);
> +       if (!block->right) {
> +               kmem_cache_free(mm->blocks, block->left);
> +               return -ENOMEM;
> +       }
> +
> +       block->right->header = offset + (BIT_ULL(order - 1) * mm->min_size);
> +       block->right->header |= order - 1;
> +
> +       block->right->parent = block;
> +
> +       INIT_LIST_HEAD(&block->right->link);
> +       INIT_LIST_HEAD(&block->right->tmp_link);

Duplicate code?

block->left = alloc_block(block, order, offset);
block->right = alloc_block(block, order, offset + BIT(order) * mm->min_size);

alloc_block:
	split->parent = block;
	split->header = offset;
	split->header |= order - 1;

	/* XXX having to init this pair is worrisome? */
	INIT_LIST_HEAD(&split->link);
	INIT_LIST_HEAD(&split->tmp_link);

	mark_free(mm, split);
	return split;

However, isn't one of the joys of buddy allocators that you allocate
in pairs, so finding the buddy is just pointer arithmetic?

> +       mark_free(mm, block->left);
> +       mark_free(mm, block->right);
> +
> +       mark_split(block);
> +
> +       return 0;
> +}
> +
> +static struct i915_buddy_block *
> +get_buddy(struct i915_buddy_block *block)
> +{
> +       struct i915_buddy_block *parent;
> +
> +       parent = block->parent;
> +       if (!parent)
> +               return NULL;
> +
> +       if (parent->left == block)
> +               return parent->right;
> +
> +       return parent->left;
> +}
> +
> +static void __i915_buddy_free(struct i915_buddy_mm *mm,
> +                             struct i915_buddy_block *block)
> +{
> +       list_del_init(&block->link); /* We have ownership now */
> +
> +       while (block->parent) {
> +               struct i915_buddy_block *buddy;
> +
> +               buddy = get_buddy(block);
> +
> +               if (!i915_buddy_block_free(buddy))
> +                       break;
> +
> +               list_del(&buddy->link);
> +
> +               kmem_cache_free(mm->blocks, block);
> +               kmem_cache_free(mm->blocks, buddy);
> +
> +               block = block->parent;
> +       }
> +
> +       mark_free(mm, block);
> +}
> +
> +void i915_buddy_free(struct i915_buddy_mm *mm,
> +                    struct i915_buddy_block *block)
> +{
> +       GEM_BUG_ON(!i915_buddy_block_allocated(block));
> +       __i915_buddy_free(mm, block);
> +}
> +
> +void i915_buddy_free_list(struct i915_buddy_mm *mm,
> +                             struct list_head *objects)
> +{
> +       struct i915_buddy_block *block, *on;
> +
> +       list_for_each_entry_safe(block, on, objects, link)
> +               i915_buddy_free(mm, block);
> +}
> +
> +/*
> + * Allocate power-of-two block. The order value here translates to:
> + *
> + *   0 = 2^0 * mm->min_size
> + *   1 = 2^1 * mm->min_size
> + *   2 = 2^2 * mm->min_size
> + *   ...
> + */
> +struct i915_buddy_block *
> +i915_buddy_alloc(struct i915_buddy_mm *mm, unsigned int order)
> +{
> +       struct i915_buddy_block *block = NULL;
> +       unsigned int i;
> +       int err;
> +
> +       for (i = order; i <= mm->max_order; ++i) {
> +               block = list_first_entry_or_null(&mm->free_list[i],
> +                                                struct i915_buddy_block,
> +                                                link);
> +               if (block)
> +                       break;
> +       }
> +
> +       if (!block)
> +               return ERR_PTR(-ENOSPC);
> +
> +       GEM_BUG_ON(!i915_buddy_block_free(block));
> +
> +       while (i != order) {
> +               err = split_block(mm, block);
> +               if (unlikely(err))
> +                       goto out_free;
> +
> +               /* Go low */
> +               block = block->left;
> +               i--;
> +       }
> +
> +       mark_allocated(block);
> +       return block;
> +
> +out_free:
> +       __i915_buddy_free(mm, block);
> +       return ERR_PTR(err);
> +}
> +
> +static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
> +{
> +       return s1 <= e2 && e1 >= s2;
> +}
> +
> +static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
> +{
> +       return s1 <= s2 && e1 >= e2;
> +}
> +
> +/*
> + * Allocate range. Note that it's safe to chain together multiple alloc_ranges
> + * with the same blocks list.
> + *
> + * Intended for pre-allocating portions of the address space, for example to
> + * reserve a block for the initial framebuffer or similar, hence the expectation
> + * here is that i915_buddy_alloc() is still the main vehicle for
> + * allocations, so if that's not the case then the drm_mm range allocator is
> + * probably a much better fit, and so you should probably go use that instead.
> + */
> +int i915_buddy_alloc_range(struct i915_buddy_mm *mm,
> +                          struct list_head *blocks,
> +                          u64 start, u64 size)
> +{
> +       struct i915_buddy_block *block;
> +       struct i915_buddy_block *buddy;
> +       LIST_HEAD(allocated);
> +       LIST_HEAD(dfs);
> +       u64 end;
> +       int err;
> +       int i;
> +
> +       if (size < mm->min_size)
> +               return -EINVAL;
> +
> +       if (!IS_ALIGNED(start, mm->min_size))
> +               return -EINVAL;
> +
> +       if (!size || !IS_ALIGNED(size, mm->min_size))
> +               return -EINVAL;
> +
> +       if (range_overflows(start, size, mm->size))
> +               return -EINVAL;
> +
> +       for (i = 0; i < mm->n_roots; ++i)
> +               list_add_tail(&mm->roots[i]->tmp_link, &dfs);
> +
> +       end = start + size - 1;
> +
> +       do {
> +               u64 block_start;
> +               u64 block_end;
> +
> +               block = list_first_entry_or_null(&dfs,
> +                                                struct i915_buddy_block,
> +                                                tmp_link);
> +               if (!block)
> +                       break;
> +
> +               list_del(&block->tmp_link);
> +
> +               block_start = i915_buddy_block_offset(block);
> +               block_end = block_start + i915_buddy_block_size(mm, block) - 1;
> +
> +               if (!overlaps(start, end, block_start, block_end))
> +                       continue;
> +
> +               if (i915_buddy_block_allocated(block)) {
> +                       err = -ENOSPC;
> +                       goto err_free;
> +               }
> +
> +               if (contains(start, end, block_start, block_end)) {
> +                       if (!i915_buddy_block_free(block)) {
> +                               err = -ENOSPC;
> +                               goto err_free;
> +                       }
> +
> +                       mark_allocated(block);
> +                       list_add_tail(&block->link, &allocated);
> +                       continue;
> +               }
> +
> +               if (!i915_buddy_block_split(block)) {
> +                       err = split_block(mm, block);
> +                       if (unlikely(err))
> +                               goto err_undo;
> +               }
> +
> +               list_add(&block->right->tmp_link, &dfs);
> +               list_add(&block->left->tmp_link, &dfs);
> +       } while (1);
> +
> +       list_splice_tail(&allocated, blocks);
> +       return 0;
> +
> +err_undo:
> +       /*
> +        * We really don't want to leave around a bunch of split blocks, since
> +        * bigger is better, so make sure we merge everything back before we
> +        * free the allocated blocks.
> +        */
> +       buddy = get_buddy(block);
> +       if (buddy && (i915_buddy_block_free(block) &&
> +           i915_buddy_block_free(buddy)))
> +               __i915_buddy_free(mm, block);
> +
> +err_free:
> +       i915_buddy_free_list(mm, &allocated);
> +       return err;
> +}
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> +#include "selftests/i915_buddy.c"
> +#endif
> diff --git a/drivers/gpu/drm/i915/i915_buddy.h b/drivers/gpu/drm/i915/i915_buddy.h
> new file mode 100644
> index 000000000000..615eecd7cf4a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_buddy.h
> @@ -0,0 +1,115 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __I915_BUDDY_H__
> +#define __I915_BUDDY_H__
> +
> +#include <linux/bitops.h>
> +
> +struct list_head;

You need the actual include for embedding the struct.
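i.e. the forward declaration is not enough once a `struct list_head` is embedded by value; something like:

```diff
-struct list_head;
+#include <linux/list.h>
```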

> +
> +struct i915_buddy_block {
> +#define I915_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
> +#define I915_BUDDY_HEADER_STATE  GENMASK_ULL(11, 10)
> +#define   I915_BUDDY_ALLOCATED (1<<10)
> +#define   I915_BUDDY_FREE         (2<<10)
> +#define   I915_BUDDY_SPLIT        (3<<10)
> +#define I915_BUDDY_HEADER_ORDER  GENMASK_ULL(9, 0)
> +       u64 header;
> +
> +       struct i915_buddy_block *left;
> +       struct i915_buddy_block *right;
> +       struct i915_buddy_block *parent;
> +
> +       /* XXX: somewhat funky */
> +       struct list_head link;
> +       struct list_head tmp_link;
> +};
> +
> +#define I915_BUDDY_MAX_ORDER  I915_BUDDY_HEADER_ORDER
> +
> +/* Binary Buddy System */
> +struct i915_buddy_mm {
> +       struct kmem_cache *blocks;
> +
> +       /* Maintain a free list for each order. */
> +       struct list_head *free_list;
> +
> +       /*
> +        * Maintain explicit binary tree(s) to track the allocation of the
> +        * address space. This gives us a simple way of finding a buddy block
> +        * and performing the potentially recursive merge step when freeing a
> +        * block.  Nodes are either allocated or free, in which case they will
> +        * also exist on the respective free list.
> +        */
> +       struct i915_buddy_block **roots;
> +
> +       unsigned int n_roots;
> +       unsigned int max_order;
> +
> +       /* Must be at least PAGE_SIZE */
> +       u64 min_size;
> +       u64 size;
> +};

Not one mention that the owner/caller is responsible for locking.

> +
> +static inline u64
> +i915_buddy_block_offset(struct i915_buddy_block *block)
> +{
> +       return block->header & I915_BUDDY_HEADER_OFFSET;
> +}
> +
> +static inline unsigned int
> +i915_buddy_block_order(struct i915_buddy_block *block)
> +{
> +       return block->header & I915_BUDDY_HEADER_ORDER;
> +}
> +
> +static inline unsigned int
> +i915_buddy_block_state(struct i915_buddy_block *block)
> +{
> +       return block->header & I915_BUDDY_HEADER_STATE;
> +}
> +
> +static inline bool
> +i915_buddy_block_allocated(struct i915_buddy_block *block)
> +{
> +       return i915_buddy_block_state(block) == I915_BUDDY_ALLOCATED;
> +}
> +
> +static inline bool
> +i915_buddy_block_free(struct i915_buddy_block *block)

This needs an 'is' as free() is a rather common verb and this is very,
very confusing.

So is_allocated, is_free, is_split.

> +{
> +       return i915_buddy_block_state(block) == I915_BUDDY_FREE;
> +}
> +
> +static inline bool
> +i915_buddy_block_split(struct i915_buddy_block *block)
> +{
> +       return i915_buddy_block_state(block) == I915_BUDDY_SPLIT;
> +}
> +
> +static inline u64
> +i915_buddy_block_size(struct i915_buddy_mm *mm,
> +                     struct i915_buddy_block *block)
> +{
> +       return BIT(i915_buddy_block_order(block)) * mm->min_size;
> +}
> +
> +int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 min_size);
> +
> +void i915_buddy_fini(struct i915_buddy_mm *mm);
> +
> +struct i915_buddy_block *
> +i915_buddy_alloc(struct i915_buddy_mm *mm, unsigned int order);
> +
> +int i915_buddy_alloc_range(struct i915_buddy_mm *mm,
> +                          struct list_head *blocks,
> +                          u64 start, u64 size);
> +
> +void i915_buddy_free(struct i915_buddy_mm *mm, struct i915_buddy_block *block);
> +
> +void i915_buddy_free_list(struct i915_buddy_mm *mm, struct list_head *objects);

> +
> +#endif
> diff --git a/drivers/gpu/drm/i915/selftests/i915_buddy.c b/drivers/gpu/drm/i915/selftests/i915_buddy.c
> new file mode 100644
> index 000000000000..2159aa9f4867
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/selftests/i915_buddy.c
> @@ -0,0 +1,491 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include <linux/prime_numbers.h>
> +
> +#include "../i915_selftest.h"
> +#include "i915_random.h"
> +
> +#define SZ_8G (1ULL << 33)
> +
> +static void __igt_dump_block(struct i915_buddy_mm *mm,
> +                            struct i915_buddy_block *block,
> +                            bool buddy)
> +{
> +       pr_err("block info: header=%llx, state=%u, order=%d, offset=%llx size=%llx root=%s buddy=%s\n",
> +              block->header,
> +              i915_buddy_block_state(block),
> +              i915_buddy_block_order(block),
> +              i915_buddy_block_offset(block),
> +              i915_buddy_block_size(mm, block),
> +              yesno(!block->parent),
> +              yesno(buddy));
> +}
> +
> +static void igt_dump_block(struct i915_buddy_mm *mm,
> +                          struct i915_buddy_block *block)
> +{
> +       struct i915_buddy_block *buddy;
> +
> +       __igt_dump_block(mm, block, false);
> +
> +       buddy = get_buddy(block);
> +       if (buddy)
> +               __igt_dump_block(mm, buddy, true);
> +}
> +
> +static int igt_check_block(struct i915_buddy_mm *mm,
> +                          struct i915_buddy_block *block)
> +{
> +       struct i915_buddy_block *buddy;
> +       unsigned int block_state;
> +       u64 block_size;
> +       u64 offset;
> +       int err = 0;
> +
> +       block_state = i915_buddy_block_state(block);
> +
> +       if (block_state != I915_BUDDY_ALLOCATED &&
> +           block_state != I915_BUDDY_FREE &&
> +           block_state != I915_BUDDY_SPLIT) {
> +               pr_err("block state mismatch\n");
> +               err = -EINVAL;
> +       }
> +
> +       block_size = i915_buddy_block_size(mm, block);
> +       offset = i915_buddy_block_offset(block);
> +
> +       if (block_size < mm->min_size) {
> +               pr_err("block size smaller than min size\n");
> +               err = -EINVAL;
> +       }
> +
> +       if (!is_power_of_2(block_size)) {
> +               pr_err("block size not power of two\n");
> +               err = -EINVAL;
> +       }
> +
> +       if (!IS_ALIGNED(block_size, mm->min_size)) {
> +               pr_err("block size not aligned to min size\n");
> +               err = -EINVAL;
> +       }
> +
> +       if (!IS_ALIGNED(offset, mm->min_size)) {
> +               pr_err("block offset not aligned to min size\n");
> +               err = -EINVAL;
> +       }
> +
> +       if (!IS_ALIGNED(offset, block_size)) {
> +               pr_err("block offset not aligned to block size\n");
> +               err = -EINVAL;
> +       }
> +
> +       buddy = get_buddy(block);
> +
> +       if (!buddy && block->parent) {
> +               pr_err("buddy has gone fishing\n");
> +               err = -EINVAL;
> +       }
> +
> +       if (buddy) {
> +               if (i915_buddy_block_offset(buddy) != (offset ^ block_size)) {
> +                       pr_err("buddy has wrong offset\n");
> +                       err = -EINVAL;
> +               }
> +
> +               if (i915_buddy_block_size(mm, buddy) != block_size) {
> +                       pr_err("buddy size mismatch\n");
> +                       err = -EINVAL;
> +               }
> +
> +               if (i915_buddy_block_state(buddy) == block_state &&
> +                   block_state == I915_BUDDY_FREE) {
> +                       pr_err("block and its buddy are free\n");
> +                       err = -EINVAL;
> +               }
> +       }
> +
> +       return err;
> +}
> +
> +static int igt_check_blocks(struct i915_buddy_mm *mm,
> +                           struct list_head *blocks,
> +                           u64 expected_size,
> +                           bool is_contiguous)
> +{
> +       struct i915_buddy_block *block;
> +       struct i915_buddy_block *prev;
> +       u64 total;
> +       int err = 0;
> +
> +       block = NULL;
> +       prev = NULL;
> +       total = 0;
> +
> +       list_for_each_entry(block, blocks, link) {
> +               err = igt_check_block(mm, block);
> +
> +               if (!i915_buddy_block_allocated(block)) {
> +                       pr_err("block not allocated\n");
> +                       err = -EINVAL;
> +               }
> +
> +               if (is_contiguous && prev) {
> +                       u64 prev_block_size;
> +                       u64 prev_offset;
> +                       u64 offset;
> +
> +                       prev_offset = i915_buddy_block_offset(prev);
> +                       prev_block_size = i915_buddy_block_size(mm, prev);
> +                       offset = i915_buddy_block_offset(block);
> +
> +                       if (offset != (prev_offset + prev_block_size)) {
> +                               pr_err("block offset mismatch\n");
> +                               err = -EINVAL;
> +                       }
> +               }
> +
> +               if (err)
> +                       break;
> +
> +               total += i915_buddy_block_size(mm, block);
> +               prev = block;
> +       }
> +
> +       if (!err) {
> +               if (total != expected_size) {
> +                       pr_err("size mismatch, expected=%llx, found=%llx\n",
> +                              expected_size, total);
> +                       err = -EINVAL;
> +               }
> +               return err;
> +       }
> +
> +       if (prev) {
> +               pr_err("prev block, dump:\n");
> +               igt_dump_block(mm, prev);
> +       }
> +
> +       if (block) {
> +               pr_err("bad block, dump:\n");
> +               igt_dump_block(mm, block);
> +       }
> +
> +       return err;
> +}
> +
> +static int igt_check_mm(struct i915_buddy_mm *mm)
> +{
> +       struct i915_buddy_block *root;
> +       struct i915_buddy_block *prev;
> +       unsigned int i;
> +       u64 total;
> +       int err = 0;
> +
> +       if (!mm->n_roots) {
> +               pr_err("n_roots is zero\n");
> +               return -EINVAL;
> +       }
> +
> +       if (mm->n_roots != hweight64(mm->size)) {
> +               pr_err("n_roots mismatch, n_roots=%u, expected=%lu\n",
> +                      mm->n_roots, hweight64(mm->size));
> +               return -EINVAL;
> +       }
> +
> +       root = NULL;
> +       prev = NULL;
> +       total = 0;
> +
> +       for (i = 0; i < mm->n_roots; ++i) {
> +               struct i915_buddy_block *block;
> +               unsigned int order;
> +
> +               root = mm->roots[i];
> +               if (!root) {
> +                       pr_err("root(%u) is NULL\n", i);
> +                       err = -EINVAL;
> +                       break;
> +               }
> +
> +               err = igt_check_block(mm, root);
> +
> +               if (!i915_buddy_block_free(root)) {
> +                       pr_err("root not free\n");
> +                       err = -EINVAL;
> +               }
> +
> +               order = i915_buddy_block_order(root);
> +
> +               if (!i) {
> +                       if (order != mm->max_order) {
> +                               pr_err("max order root missing\n");
> +                               err = -EINVAL;
> +                       }
> +               }
> +
> +               if (prev) {
> +                       u64 prev_block_size;
> +                       u64 prev_offset;
> +                       u64 offset;
> +
> +                       prev_offset = i915_buddy_block_offset(prev);
> +                       prev_block_size = i915_buddy_block_size(mm, prev);
> +                       offset = i915_buddy_block_offset(root);
> +
> +                       if (offset != (prev_offset + prev_block_size)) {
> +                               pr_err("root offset mismatch\n");
> +                               err = -EINVAL;
> +                       }
> +               }
> +
> +               block = list_first_entry_or_null(&mm->free_list[order],
> +                                                struct i915_buddy_block,
> +                                                link);
> +               if (block != root) {
> +                       pr_err("root mismatch at order=%u\n", order);
> +                       err = -EINVAL;
> +               }
> +
> +               if (err)
> +                       break;
> +
> +               prev = root;
> +               total += i915_buddy_block_size(mm, root);
> +       }
> +
> +       if (!err) {
> +               if (total != mm->size) {
> +                       pr_err("expected mm size=%llx, found=%llx\n", mm->size,
> +                              total);
> +                       err = -EINVAL;
> +               }
> +               return err;
> +       }
> +
> +       if (prev) {
> +               pr_err("prev root(%u), dump:\n", i - 1);
> +               igt_dump_block(mm, prev);
> +       }
> +
> +       if (root) {
> +               pr_err("bad root(%u), dump:\n", i);
> +               igt_dump_block(mm, root);
> +       }
> +
> +       return err;
> +}
> +
> +static void igt_mm_config(u64 *size, u64 *min_size)
> +{
> +       I915_RND_STATE(prng);
> +       u64 s, ms;
> +
> +       /* Nothing fancy, just try to get an interesting bit pattern */
> +
> +       prandom_seed_state(&prng, i915_selftest.random_seed);
> +
> +       s = i915_prandom_u64_state(&prng) & (SZ_8G - 1);
> +       ms = BIT_ULL(12 + (prandom_u32_state(&prng) % ilog2(s >> 12)));
> +       s = max(s & -ms, ms);
> +
> +       *min_size = ms;
> +       *size = s;
> +}
> +
> +static int igt_buddy_alloc(void *arg)
> +{
> +       struct i915_buddy_mm mm;
> +       int max_order;
> +       u64 min_size;
> +       u64 mm_size;
> +       int err;
> +
> +       igt_mm_config(&mm_size, &min_size);
> +
> +       pr_info("buddy_init with size=%llx, min_size=%llx\n", mm_size, min_size);
> +
> +       err = i915_buddy_init(&mm, mm_size, min_size);
> +       if (err) {
> +               pr_err("buddy_init failed(%d)\n", err);
> +               return err;
> +       }
> +
> +       for (max_order = mm.max_order; max_order >= 0; max_order--) {
> +               struct i915_buddy_block *block;
> +               int order;
> +               LIST_HEAD(blocks);
> +               u64 total;
> +
> +               err = igt_check_mm(&mm);
> +               if (err) {
> +                       pr_err("pre-mm check failed, abort\n");
> +                       break;
> +               }
> +
> +               pr_info("filling from max_order=%u\n", max_order);
> +
> +               order = max_order;
> +               total = 0;
> +
> +               do {
> +retry:
> +                       block = i915_buddy_alloc(&mm, order);
> +                       if (IS_ERR(block)) {
> +                               err = PTR_ERR(block);
> +                               if (err == -ENOMEM) {
> +                                       pr_info("buddy_alloc hit -ENOMEM with order=%d\n",
> +                                               order);
> +                               } else {
> +                                       if (order--) {
> +                                               err = 0;
> +                                               goto retry;
> +                                       }
> +
> +                                       pr_err("buddy_alloc with order=%d failed(%d)\n",
> +                                              order, err);
> +                               }
> +
> +                               break;
> +                       }
> +
> +                       list_add_tail(&block->link, &blocks);
> +
> +                       if (i915_buddy_block_order(block) != order) {
> +                               pr_err("buddy_alloc order mismatch\n");
> +                               err = -EINVAL;
> +                               break;
> +                       }
> +
> +                       total += i915_buddy_block_size(&mm, block);
> +               } while (total < mm.size);
> +
> +               if (!err)
> +                       err = igt_check_blocks(&mm, &blocks, total, false);
> +
> +               i915_buddy_free_list(&mm, &blocks);
> +
> +               if (!err) {
> +                       err = igt_check_mm(&mm);
> +                       if (err)
> +                               pr_err("post-mm check failed\n");
> +               }
> +
> +               if (err)
> +                       break;
> +       }
> +
> +       if (err == -ENOMEM)
> +               err = 0;
> +
> +       i915_buddy_fini(&mm);
> +
> +       return err;
> +}
> +
> +static int igt_buddy_alloc_range(void *arg)
> +{
> +       struct i915_buddy_mm mm;
> +       unsigned long page_num;
> +       LIST_HEAD(blocks);
> +       u64 min_size;
> +       u64 offset;
> +       u64 size;
> +       u64 rem;
> +       int err;
> +
> +       igt_mm_config(&size, &min_size);
> +
> +       pr_info("buddy_init with size=%llx, min_size=%llx\n", size, min_size);
> +
> +       err = i915_buddy_init(&mm, size, min_size);
> +       if (err) {
> +               pr_err("buddy_init failed(%d)\n", err);
> +               return err;
> +       }
> +
> +       err = igt_check_mm(&mm);
> +       if (err) {
> +               pr_err("pre-mm check failed, abort, abort, abort!\n");
> +               goto err_fini;
> +       }
> +
> +       rem = mm.size;
> +       offset = 0;
> +
> +       for_each_prime_number_from(page_num, 1, ULONG_MAX - 1) {
> +               struct i915_buddy_block *block;
> +               LIST_HEAD(tmp);
> +
> +               size = min(page_num * mm.min_size, rem);
> +
> +               err = i915_buddy_alloc_range(&mm, &tmp, offset, size);
> +               if (err) {
> +                       if (err == -ENOMEM) {
> +                               pr_info("alloc_range hit -ENOMEM with size=%llx\n",
> +                                       size);
> +                       } else {
> +                               pr_err("alloc_range with offset=%llx, size=%llx failed(%d)\n",
> +                                      offset, size, err);
> +                       }
> +
> +                       break;
> +               }
> +
> +               block = list_first_entry_or_null(&tmp,
> +                                                struct i915_buddy_block,
> +                                                link);
> +               if (!block) {
> +                       pr_err("alloc_range has no blocks\n");
> +                       err = -EINVAL;
> +               }
> +
> +               if (i915_buddy_block_offset(block) != offset) {
> +                       pr_err("alloc_range start offset mismatch, found=%llx, expected=%llx\n",
> +                              i915_buddy_block_offset(block), offset);
> +                       err = -EINVAL;
> +               }
> +
> +               if (!err)
> +                       err = igt_check_blocks(&mm, &tmp, size, true);
> +
> +               list_splice_tail(&tmp, &blocks);
> +
> +               if (err)
> +                       break;
> +
> +               offset += size;
> +
> +               rem -= size;
> +               if (!rem)
> +                       break;
> +       }
> +
> +       if (err == -ENOMEM)
> +               err = 0;
> +
> +       i915_buddy_free_list(&mm, &blocks);
> +
> +       if (!err) {
> +               err = igt_check_mm(&mm);
> +               if (err)
> +                       pr_err("post-mm check failed\n");
> +       }
> +
> +err_fini:
> +       i915_buddy_fini(&mm);
> +
> +       return err;
> +}
> +
> +int i915_buddy_mock_selftests(void)
> +{
> +       static const struct i915_subtest tests[] = {
> +               SUBTEST(igt_buddy_alloc),
> +               SUBTEST(igt_buddy_alloc_range),
> +       };
> +
> +       return i915_subtests(tests, NULL);
> +}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> index b55da4d9ccba..b88084fe3269 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
> @@ -25,3 +25,4 @@ selftest(evict, i915_gem_evict_mock_selftests)
>  selftest(gtt, i915_gem_gtt_mock_selftests)
>  selftest(hugepages, i915_gem_huge_page_mock_selftests)
>  selftest(contexts, i915_gem_context_mock_selftests)
> +selftest(buddy, i915_buddy_mock_selftests)
> -- 
> 2.20.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 02/37] drm/i915: introduce intel_memory_region
  2019-06-27 20:55 ` [PATCH v2 02/37] drm/i915: introduce intel_memory_region Matthew Auld
@ 2019-06-27 22:47   ` Chris Wilson
  2019-06-28  8:09   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 22:47 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:55:58)
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> new file mode 100644
> index 000000000000..4c89853a7769
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -0,0 +1,215 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "intel_memory_region.h"
> +#include "i915_drv.h"
> +
> +static void
> +memory_region_free_pages(struct drm_i915_gem_object *obj,
> +                        struct sg_table *pages)
> +{
> +
> +       struct i915_buddy_block *block, *on;
> +
> +       lockdep_assert_held(&obj->memory_region->mm_lock);
> +
> +       list_for_each_entry_safe(block, on, &obj->blocks, link) {
> +               list_del_init(&block->link);

Block is freed, link is dead already.

> +               i915_buddy_free(&obj->memory_region->mm, block);
> +       }

So instead of deleting every link, you can just reinitialise the list.

> +       sg_free_table(pages);
> +       kfree(pages);
> +}
> +
> +void
> +i915_memory_region_put_pages_buddy(struct drm_i915_gem_object *obj,
> +                                  struct sg_table *pages)
> +{
> +       mutex_lock(&obj->memory_region->mm_lock);
> +       memory_region_free_pages(obj, pages);
> +       mutex_unlock(&obj->memory_region->mm_lock);
> +
> +       obj->mm.dirty = false;
> +}
> +
> +int
> +i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)

This is not operating on an intel_memory_region now, is it?

> +{
> +       struct intel_memory_region *mem = obj->memory_region;
> +       resource_size_t size = obj->base.size;
> +       struct sg_table *st;
> +       struct scatterlist *sg;
> +       unsigned int sg_page_sizes;
> +       unsigned long n_pages;
> +
> +       GEM_BUG_ON(!IS_ALIGNED(size, mem->mm.min_size));
> +       GEM_BUG_ON(!list_empty(&obj->blocks));
> +
> +       st = kmalloc(sizeof(*st), GFP_KERNEL);
> +       if (!st)
> +               return -ENOMEM;
> +
> +       n_pages = size >> ilog2(mem->mm.min_size);
> +
> +       if (sg_alloc_table(st, n_pages, GFP_KERNEL)) {
> +               kfree(st);
> +               return -ENOMEM;
> +       }
> +
> +       sg = st->sgl;
> +       st->nents = 0;
> +       sg_page_sizes = 0;
> +
> +       mutex_lock(&mem->mm_lock);
> +
> +       do {
> +               struct i915_buddy_block *block;
> +               unsigned int order;
> +               u64 block_size;
> +               u64 offset;
> +
> +               order = fls(n_pages) - 1;
> +               GEM_BUG_ON(order > mem->mm.max_order);
> +
> +               do {
> +                       block = i915_buddy_alloc(&mem->mm, order);
> +                       if (!IS_ERR(block))
> +                               break;
> +
> +                       /* XXX: some kind of eviction pass, local to the device */
> +                       if (!order--)
> +                               goto err_free_blocks;
> +               } while (1);
> +
> +               n_pages -= BIT(order);
> +
> +               INIT_LIST_HEAD(&block->link);

No need, list_add works on the uninitialised.

> +               list_add(&block->link, &obj->blocks);
> +
> +               /*
> +                * TODO: it might be worth checking consecutive blocks here and
> +                * coalesce if we can.
> +               */
Hah.
> +               block_size = i915_buddy_block_size(&mem->mm, block);
> +               offset = i915_buddy_block_offset(block);
> +
> +               sg_dma_address(sg) = mem->region.start + offset;
> +               sg_dma_len(sg) = block_size;
> +
> +               sg->length = block_size;
> +               sg_page_sizes |= block_size;
> +               st->nents++;
> +
> +               if (!n_pages) {
> +                       sg_mark_end(sg);
> +                       break;
> +               }
> +
> +               sg = __sg_next(sg);
> +       } while (1);
> +

Ok, nothing else strayed under the lock.

> +       mutex_unlock(&mem->mm_lock);
> +
> +       i915_sg_trim(st);
> +
> +       __i915_gem_object_set_pages(obj, st, sg_page_sizes);
> +
> +       return 0;
> +
> +err_free_blocks:
> +       memory_region_free_pages(obj, st);
> +       mutex_unlock(&mem->mm_lock);
> +       return -ENXIO;
> +}
> +
> +int i915_memory_region_init_buddy(struct intel_memory_region *mem)
> +{
> +       return i915_buddy_init(&mem->mm, resource_size(&mem->region),
> +                              mem->min_page_size);
> +}
> +
> +void i915_memory_region_release_buddy(struct intel_memory_region *mem)

Exporting these, with the wrong prefix even?

> +{
> +       i915_buddy_fini(&mem->mm);
> +}
> +
> +struct drm_i915_gem_object *
> +i915_gem_object_create_region(struct intel_memory_region *mem,
> +                             resource_size_t size,
> +                             unsigned int flags)
> +{
> +       struct drm_i915_gem_object *obj;
> +
> +       if (!mem)
> +               return ERR_PTR(-ENODEV);
> +
> +       size = round_up(size, mem->min_page_size);
> +
> +       GEM_BUG_ON(!size);
> +       GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_MIN_ALIGNMENT));
> +
> +       if (size >> PAGE_SHIFT > INT_MAX)
> +               return ERR_PTR(-E2BIG);
> +
> +       if (overflows_type(size, obj->base.size))
> +               return ERR_PTR(-E2BIG);
> +
> +       obj = mem->ops->create_object(mem, size, flags);
> +       if (IS_ERR(obj))
> +               return obj;
> +
> +       INIT_LIST_HEAD(&obj->blocks);
> +       obj->memory_region = mem;

This strikes me as odd, the pattern would be that the ->create_object()
called a common init. That is the way of the pipelined interface, this
is the way of the midlayer.

i915_gem_object_(init|set)_memory_region(obj, mem) {
	obj->memory_region = mem;
	INIT_LIST_HEAD(&obj->blocks);
}

> +       return obj;
> +}
> +
> +struct intel_memory_region *
> +intel_memory_region_create(struct drm_i915_private *i915,
> +                          resource_size_t start,
> +                          resource_size_t size,
> +                          resource_size_t min_page_size,
> +                          resource_size_t io_start,
> +                          const struct intel_memory_region_ops *ops)
> +{
> +       struct intel_memory_region *mem;
> +       int err;
> +
> +       mem = kzalloc(sizeof(*mem), GFP_KERNEL);
> +       if (!mem)
> +               return ERR_PTR(-ENOMEM);
> +
> +       mem->i915 = i915;
> +       mem->region = (struct resource)DEFINE_RES_MEM(start, size);
> +       mem->io_start = io_start;
> +       mem->min_page_size = min_page_size;
> +       mem->ops = ops;
> +
> +       mutex_init(&mem->mm_lock);

Hmm, why do I expect this lock to be nested? Would it make more sense to
have a lock_class per type?

> +       if (ops->init) {
> +               err = ops->init(mem);
> +               if (err) {
> +                       kfree(mem);
> +                       mem = ERR_PTR(err);
> +               }
> +       }
> +
> +       return mem;
> +}


* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-06-27 20:55 ` [PATCH v2 03/37] drm/i915/region: support basic eviction Matthew Auld
@ 2019-06-27 22:59   ` Chris Wilson
  2019-07-30 16:26   ` Daniel Vetter
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 22:59 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:55:59)
> +int i915_memory_region_evict(struct intel_memory_region *mem,

What type is this again?

> +                            resource_size_t target)
> +{
> +       struct drm_i915_gem_object *obj, *on;
> +       resource_size_t found;
> +       LIST_HEAD(purgeable);
> +       int err;
> +
> +       err = 0;
> +       found = 0;
> +
> +       mutex_lock(&mem->obj_lock);
> +
> +       list_for_each_entry(obj, &mem->purgeable, region_link) {
> +               if (!i915_gem_object_has_pages(obj))
> +                       continue;
> +
> +               if (READ_ONCE(obj->pin_global))
> +                       continue;
> +
> +               if (atomic_read(&obj->bind_count))
> +                       continue;
> +
> +               list_add(&obj->eviction_link, &purgeable);
> +
> +               found += obj->base.size;
> +               if (found >= target)
> +                       goto found;
> +       }
> +
> +       err = -ENOSPC;
> +found:
> +       list_for_each_entry_safe(obj, on, &purgeable, eviction_link) {
> +               if (!err) {
> +                       __i915_gem_object_put_pages(obj, I915_MM_SHRINKER);

How come put_pages is not taking mm->obj_lock to remove the
obj->region_link?

I'm getting fishy vibes.

> +
> +                       mutex_lock_nested(&obj->mm.lock, I915_MM_SHRINKER);

> +                       if (!i915_gem_object_has_pages(obj))
> +                               obj->mm.madv = __I915_MADV_PURGED;

That should be pushed to put_pages() as reason. The unlock/lock is just
asking for trouble.

> +                       mutex_unlock(&obj->mm.lock);
> +               }
> +
> +               list_del(&obj->eviction_link);
> +       }

You will have noticed that a separate eviction_link is superfluous? If
both region_link and eviction_link are only valid underneath obj_lock,
you can list_move(&obj->region_link, &purgeable) in the first pass, and
unwind on error.

However, I'm going hmm.

So you keep all objects on the shrink lists even when not allocated. Ho
hum. With a bit more creative locking (read: careful acquisition of the
resources, then dropping the lock before actually evicting), it should
work out.
-Chris


* Re: [PATCH v2 04/37] drm/i915/region: support continuous allocations
  2019-06-27 20:56 ` [PATCH v2 04/37] drm/i915/region: support continuous allocations Matthew Auld
@ 2019-06-27 23:01   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:01 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:00)
> Some objects may need to be allocated as a continuous block, thinking
> ahead the various kernel io_mapping interfaces seem to expect it.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   4 +
>  drivers/gpu/drm/i915/intel_memory_region.c    |   7 +-
>  .../drm/i915/selftests/intel_memory_region.c  | 152 +++++++++++++++++-
>  drivers/gpu/drm/i915/selftests/mock_region.c  |   3 +
>  4 files changed, 160 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 87000fc24ab3..1c4b99e507c3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -133,6 +133,10 @@ struct drm_i915_gem_object {
>         struct list_head batch_pool_link;
>         I915_SELFTEST_DECLARE(struct list_head st_link);
>  
> +       unsigned long flags;
> +#define I915_BO_ALLOC_CONTIGUOUS (1<<0)
BIT(0)
> +#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS)
> +
>         /*
>          * Is the object to be mapped as read-only to the GPU
>          * Only honoured if hardware has relevant pte bit
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index 721b47e46492..9b6a32bfa20d 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -90,6 +90,7 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
>  {
>         struct intel_memory_region *mem = obj->memory_region;
>         resource_size_t size = obj->base.size;
> +       unsigned int flags = obj->flags;

Was unsigned long.

>         struct sg_table *st;
>         struct scatterlist *sg;
>         unsigned int sg_page_sizes;
> @@ -130,7 +131,7 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
>                         if (!IS_ERR(block))
>                                 break;
>  
> -                       if (!order--) {
> +                       if (flags & I915_BO_ALLOC_CONTIGUOUS || !order--) {
>                                 resource_size_t target;
>                                 int err;
>  
> @@ -219,6 +220,9 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
>         if (!mem)
>                 return ERR_PTR(-ENODEV);
>  
> +       if (flags & ~I915_BO_ALLOC_FLAGS)
> +               return ERR_PTR(-EINVAL);
> +
>         size = round_up(size, mem->min_page_size);
>  
>         GEM_BUG_ON(!size);
> @@ -236,6 +240,7 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
>  
>         INIT_LIST_HEAD(&obj->blocks);
>         obj->memory_region = mem;
> +       obj->flags = flags;
>  
>         mutex_lock(&mem->obj_lock);
>         list_add(&obj->region_link, &mem->objects);
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index ece499869747..c9de8b5039e4 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -78,17 +78,17 @@ static int igt_mock_fill(void *arg)
>  
>  static void igt_mark_evictable(struct drm_i915_gem_object *obj)
>  {
> -       i915_gem_object_unpin_pages(obj);
> +       if (i915_gem_object_has_pinned_pages(obj))
> +               i915_gem_object_unpin_pages(obj);
>         obj->mm.madv = I915_MADV_DONTNEED;
>         list_move(&obj->region_link, &obj->memory_region->purgeable);
>  }
>  
> -static int igt_mock_evict(void *arg)
> +static int igt_frag_region(struct intel_memory_region *mem,
> +                          struct list_head *objects)
>  {
> -       struct intel_memory_region *mem = arg;
>         struct drm_i915_gem_object *obj;
>         unsigned long n_objects;
> -       LIST_HEAD(objects);
>         resource_size_t target;
>         resource_size_t total;
>         int err = 0;
> @@ -104,7 +104,7 @@ static int igt_mock_evict(void *arg)
>                         goto err_close_objects;
>                 }
>  
> -               list_add(&obj->st_link, &objects);
> +               list_add(&obj->st_link, objects);
>  
>                 err = i915_gem_object_pin_pages(obj);
>                 if (err)
> @@ -118,6 +118,39 @@ static int igt_mock_evict(void *arg)
>                         igt_mark_evictable(obj);
>         }
>  
> +       return 0;
> +
> +err_close_objects:
> +       close_objects(objects);
> +       return err;
> +}
> +
> +static void igt_defrag_region(struct list_head *objects)
> +{
> +       struct drm_i915_gem_object *obj;
> +
> +       list_for_each_entry(obj, objects, st_link) {
> +               if (obj->mm.madv == I915_MADV_WILLNEED)
> +                       igt_mark_evictable(obj);
> +       }
> +}
> +
> +static int igt_mock_evict(void *arg)
> +{
> +       struct intel_memory_region *mem = arg;
> +       struct drm_i915_gem_object *obj;
> +       LIST_HEAD(objects);
> +       resource_size_t target;
> +       resource_size_t total;
> +       int err;
> +
> +       err = igt_frag_region(mem, &objects);
> +       if (err)
> +               return err;
> +
> +       total = resource_size(&mem->region);
> +       target = mem->mm.min_size;
> +
>         while (target <= total / 2) {
>                 obj = i915_gem_object_create_region(mem, target, 0);
>                 if (IS_ERR(obj)) {
> @@ -148,11 +181,120 @@ static int igt_mock_evict(void *arg)
>         return err;
>  }
>  
> +static int igt_mock_continuous(void *arg)
> +{
> +       struct intel_memory_region *mem = arg;
> +       struct drm_i915_gem_object *obj;
> +       LIST_HEAD(objects);
> +       resource_size_t target;
> +       resource_size_t total;
> +       int err;
> +
> +       err = igt_frag_region(mem, &objects);
> +       if (err)
> +               return err;
> +
> +       total = resource_size(&mem->region);
> +       target = total / 2;
> +
> +       /*
> +        * Sanity check that we can allocate all of the available fragmented
> +        * space.
> +        */
> +       obj = i915_gem_object_create_region(mem, target, 0);
> +       if (IS_ERR(obj)) {
> +               err = PTR_ERR(obj);
> +               goto err_close_objects;
> +       }
> +
> +       list_add(&obj->st_link, &objects);
> +
> +       err = i915_gem_object_pin_pages(obj);
> +       if (err) {
> +               pr_err("failed to allocate available space\n");
> +               goto err_close_objects;
> +       }
> +
> +       igt_mark_evictable(obj);
> +
> +       /* Try the smallest possible size -- should succeed */
> +       obj = i915_gem_object_create_region(mem, mem->mm.min_size,
> +                                           I915_BO_ALLOC_CONTIGUOUS);
> +       if (IS_ERR(obj)) {
> +               err = PTR_ERR(obj);
> +               goto err_close_objects;
> +       }
> +
> +       list_add(&obj->st_link, &objects);
> +
> +       err = i915_gem_object_pin_pages(obj);
> +       if (err) {
> +               pr_err("failed to allocate smallest possible size\n");
> +               goto err_close_objects;
> +       }
> +
> +       igt_mark_evictable(obj);
> +
> +       if (obj->mm.pages->nents != 1) {
> +               pr_err("[1]object spans multiple sg entries\n");
> +               err = -EINVAL;
> +               goto err_close_objects;
> +       }
> +
> +       /*
> +        * Even though there is enough free space for the allocation, we
> +        * shouldn't be able to allocate it, given that it is fragmented, and
> +        * non-continuous.
> +        */
> +       obj = i915_gem_object_create_region(mem, target, I915_BO_ALLOC_CONTIGUOUS);
> +       if (IS_ERR(obj)) {
> +               err = PTR_ERR(obj);
> +               goto err_close_objects;
> +       }
> +
> +       list_add(&obj->st_link, &objects);
> +
> +       err = i915_gem_object_pin_pages(obj);
> +       if (!err) {
> +               pr_err("expected allocation to fail\n");
> +               err = -EINVAL;
> +               goto err_close_objects;
> +       }
> +
> +       igt_defrag_region(&objects);
> +
> +       /* Should now succeed */
> +       obj = i915_gem_object_create_region(mem, target, I915_BO_ALLOC_CONTIGUOUS);
> +       if (IS_ERR(obj)) {
> +               err = PTR_ERR(obj);
> +               goto err_close_objects;
> +       }
> +
> +       list_add(&obj->st_link, &objects);
> +
> +       err = i915_gem_object_pin_pages(obj);
> +       if (err) {
> +               pr_err("failed to allocate from defraged area\n");
> +               goto err_close_objects;
> +       }
> +
> +       if (obj->mm.pages->nents != 1) {
> +               pr_err("object spans multiple sg entries\n");
> +               err = -EINVAL;
> +       }
> +
> +err_close_objects:
> +       close_objects(&objects);
> +
> +       return err;
> +}
> +
>  int intel_memory_region_mock_selftests(void)
>  {
>         static const struct i915_subtest tests[] = {
>                 SUBTEST(igt_mock_fill),
>                 SUBTEST(igt_mock_evict),
> +               SUBTEST(igt_mock_continuous),
>         };
>         struct intel_memory_region *mem;
>         struct drm_i915_private *i915;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_region.c b/drivers/gpu/drm/i915/selftests/mock_region.c
> index 80eafdc54927..9eeda8f45f38 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_region.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_region.c
> @@ -20,6 +20,9 @@ mock_object_create(struct intel_memory_region *mem,
>         struct drm_i915_gem_object *obj;
>         unsigned int cache_level;
>  
> +       if (flags & I915_BO_ALLOC_CONTIGUOUS)
> +               size = roundup_pow_of_two(size);
> +
>         if (size > BIT(mem->mm.max_order) * mem->mm.min_size)
>                 return ERR_PTR(-E2BIG);
>  
> -- 
> 2.20.1
> 


* Re: [PATCH v2 05/37] drm/i915/region: support volatile objects
  2019-06-27 20:56 ` [PATCH v2 05/37] drm/i915/region: support volatile objects Matthew Auld
@ 2019-06-27 23:03   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:03 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: CQ Tang

Quoting Matthew Auld (2019-06-27 21:56:01)
> Volatile objects are marked as DONTNEED while pinned, therefore once
> unpinned the backing store can be discarded.

Apply to existing code...
-Chris


* Re: [PATCH v2 06/37] drm/i915: Add memory region information to device_info
  2019-06-27 20:56 ` [PATCH v2 06/37] drm/i915: Add memory region information to device_info Matthew Auld
@ 2019-06-27 23:05   ` Chris Wilson
  2019-06-27 23:08   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:05 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:02)
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> Exposes available regions for the platform. Shared memory will
> always be available.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h               |  2 ++
>  drivers/gpu/drm/i915/i915_pci.c               | 29 ++++++++++++++-----
>  drivers/gpu/drm/i915/intel_device_info.h      |  1 +
>  .../gpu/drm/i915/selftests/mock_gem_device.c  |  2 ++
>  4 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 97d02b32a208..838a796d9c55 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2288,6 +2288,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>  
>  #define HAS_IPC(dev_priv)               (INTEL_INFO(dev_priv)->display.has_ipc)
>  
> +#define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
> +
>  /*
>   * For now, anything with a GuC requires uCode loading, and then supports
>   * command submission once loaded. But these are logically independent
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 94b588e0a1dd..c513532b8da7 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -144,6 +144,9 @@
>  #define GEN_DEFAULT_PAGE_SIZES \
>         .page_sizes = I915_GTT_PAGE_SIZE_4K
>  
> +#define GEN_DEFAULT_REGIONS \
> +       .memory_regions = REGION_SMEM | REGION_STOLEN
> +
>  #define I830_FEATURES \
>         GEN(2), \
>         .is_mobile = 1, \
> @@ -161,7 +164,8 @@
>         I9XX_PIPE_OFFSETS, \
>         I9XX_CURSOR_OFFSETS, \
>         I9XX_COLORS, \
> -       GEN_DEFAULT_PAGE_SIZES
> +       GEN_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  #define I845_FEATURES \
>         GEN(2), \
> @@ -178,7 +182,8 @@
>         I845_PIPE_OFFSETS, \
>         I845_CURSOR_OFFSETS, \
>         I9XX_COLORS, \
> -       GEN_DEFAULT_PAGE_SIZES
> +       GEN_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  static const struct intel_device_info intel_i830_info = {
>         I830_FEATURES,
> @@ -212,7 +217,8 @@ static const struct intel_device_info intel_i865g_info = {
>         I9XX_PIPE_OFFSETS, \
>         I9XX_CURSOR_OFFSETS, \
>         I9XX_COLORS, \
> -       GEN_DEFAULT_PAGE_SIZES
> +       GEN_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  static const struct intel_device_info intel_i915g_info = {
>         GEN3_FEATURES,
> @@ -297,7 +303,8 @@ static const struct intel_device_info intel_pineview_m_info = {
>         I9XX_PIPE_OFFSETS, \
>         I9XX_CURSOR_OFFSETS, \
>         I965_COLORS, \
> -       GEN_DEFAULT_PAGE_SIZES
> +       GEN_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  static const struct intel_device_info intel_i965g_info = {
>         GEN4_FEATURES,
> @@ -347,7 +354,8 @@ static const struct intel_device_info intel_gm45_info = {
>         I9XX_PIPE_OFFSETS, \
>         I9XX_CURSOR_OFFSETS, \
>         ILK_COLORS, \
> -       GEN_DEFAULT_PAGE_SIZES
> +       GEN_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  static const struct intel_device_info intel_ironlake_d_info = {
>         GEN5_FEATURES,
> @@ -377,7 +385,8 @@ static const struct intel_device_info intel_ironlake_m_info = {
>         I9XX_PIPE_OFFSETS, \
>         I9XX_CURSOR_OFFSETS, \
>         ILK_COLORS, \
> -       GEN_DEFAULT_PAGE_SIZES
> +       GEN_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  #define SNB_D_PLATFORM \
>         GEN6_FEATURES, \
> @@ -425,7 +434,8 @@ static const struct intel_device_info intel_sandybridge_m_gt2_info = {
>         IVB_PIPE_OFFSETS, \
>         IVB_CURSOR_OFFSETS, \
>         IVB_COLORS, \
> -       GEN_DEFAULT_PAGE_SIZES
> +       GEN_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  #define IVB_D_PLATFORM \
>         GEN7_FEATURES, \
> @@ -486,6 +496,7 @@ static const struct intel_device_info intel_valleyview_info = {
>         I9XX_CURSOR_OFFSETS,
>         I965_COLORS,
>         GEN_DEFAULT_PAGE_SIZES,
> +       GEN_DEFAULT_REGIONS,
>  };
>  
>  #define G75_FEATURES  \
> @@ -582,6 +593,7 @@ static const struct intel_device_info intel_cherryview_info = {
>         CHV_CURSOR_OFFSETS,
>         CHV_COLORS,
>         GEN_DEFAULT_PAGE_SIZES,
> +       GEN_DEFAULT_REGIONS,
>  };
>  
>  #define GEN9_DEFAULT_PAGE_SIZES \
> @@ -657,7 +669,8 @@ static const struct intel_device_info intel_skylake_gt4_info = {
>         HSW_PIPE_OFFSETS, \
>         IVB_CURSOR_OFFSETS, \
>         IVB_COLORS, \
> -       GEN9_DEFAULT_PAGE_SIZES
> +       GEN9_DEFAULT_PAGE_SIZES, \
> +       GEN_DEFAULT_REGIONS
>  
>  static const struct intel_device_info intel_broxton_info = {
>         GEN9_LP_FEATURES,
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index ddafc819bf30..63369b65110e 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -170,6 +170,7 @@ struct intel_device_info {
>         } display;
>  
>         u16 ddb_size; /* in blocks */
> +       u32 memory_regions;

Why here? You are in between various display entities, just a few lines
above you have the ppgtt and older page sizes.
-Chris

* Re: [PATCH v2 06/37] drm/i915: Add memory region information to device_info
  2019-06-27 20:56 ` [PATCH v2 06/37] drm/i915: Add memory region information to device_info Matthew Auld
  2019-06-27 23:05   ` Chris Wilson
@ 2019-06-27 23:08   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:08 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:02)
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> Exposes available regions for the platform. Shared memory will
> always be available.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h               |  2 ++
>  drivers/gpu/drm/i915/i915_pci.c               | 29 ++++++++++++++-----
>  drivers/gpu/drm/i915/intel_device_info.h      |  1 +
>  .../gpu/drm/i915/selftests/mock_gem_device.c  |  2 ++
>  4 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 97d02b32a208..838a796d9c55 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2288,6 +2288,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>  
>  #define HAS_IPC(dev_priv)               (INTEL_INFO(dev_priv)->display.has_ipc)
>  
> +#define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
> +
>  /*
>   * For now, anything with a GuC requires uCode loading, and then supports
>   * command submission once loaded. But these are logically independent
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 94b588e0a1dd..c513532b8da7 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -144,6 +144,9 @@
>  #define GEN_DEFAULT_PAGE_SIZES \
>         .page_sizes = I915_GTT_PAGE_SIZE_4K
>  
> +#define GEN_DEFAULT_REGIONS \
> +       .memory_regions = REGION_SMEM | REGION_STOLEN

But you didn't add a stolen memory_region and use the new interface for
allocating the current stolen objects?
-Chris

* Re: [PATCH v2 07/37] drm/i915: support creating LMEM objects
  2019-06-27 20:56 ` [PATCH v2 07/37] drm/i915: support creating LMEM objects Matthew Auld
@ 2019-06-27 23:11   ` Chris Wilson
  2019-06-27 23:16   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:11 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:03)
> We currently define LMEM, or local memory, as just another memory
> region, like system memory or stolen, which we can expose to userspace
> and can be mapped to the CPU via some BAR.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |  1 +
>  drivers/gpu/drm/i915/i915_drv.h               |  5 ++
>  drivers/gpu/drm/i915/intel_region_lmem.c      | 66 +++++++++++++++++++
>  drivers/gpu/drm/i915/intel_region_lmem.h      | 16 +++++

You missed the mm/ vibes I was trying to send? ;)
-Chris

* Re: [PATCH v2 07/37] drm/i915: support creating LMEM objects
  2019-06-27 20:56 ` [PATCH v2 07/37] drm/i915: support creating LMEM objects Matthew Auld
  2019-06-27 23:11   ` Chris Wilson
@ 2019-06-27 23:16   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:16 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:03)
> We currently define LMEM, or local memory, as just another memory
> region, like system memory or stolen, which we can expose to userspace
> and can be mapped to the CPU via some BAR.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |  1 +
>  drivers/gpu/drm/i915/i915_drv.h               |  5 ++
>  drivers/gpu/drm/i915/intel_region_lmem.c      | 66 +++++++++++++++++++
>  drivers/gpu/drm/i915/intel_region_lmem.h      | 16 +++++
>  .../drm/i915/selftests/i915_live_selftests.h  |  1 +
>  .../drm/i915/selftests/intel_memory_region.c  | 43 ++++++++++++
>  6 files changed, 132 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.c
>  create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 28fac19f7b04..e782f7d10524 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -132,6 +132,7 @@ i915-y += \
>           i915_scheduler.o \
>           i915_trace_points.o \
>           i915_vma.o \
> +         intel_region_lmem.o \
>           intel_wopcm.o
>  
>  # general-purpose microcontroller (GuC) support
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 838a796d9c55..7cbdffe3f129 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -93,6 +93,8 @@
>  #include "gt/intel_timeline.h"
>  #include "i915_vma.h"
>  
> +#include "intel_region_lmem.h"
> +
>  #include "intel_gvt.h"
>  
>  /* General customization:
> @@ -1341,6 +1343,8 @@ struct drm_i915_private {
>          */
>         resource_size_t stolen_usable_size;     /* Total size minus reserved ranges */
>  
> +       struct intel_memory_region *regions[ARRAY_SIZE(intel_region_map)];
> +
>         struct intel_uncore uncore;
>  
>         struct i915_virtual_gpu vgpu;
> @@ -2289,6 +2293,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>  #define HAS_IPC(dev_priv)               (INTEL_INFO(dev_priv)->display.has_ipc)
>  
>  #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
> +#define HAS_LMEM(i915) HAS_REGION(i915, REGION_LMEM)
>  
>  /*
>   * For now, anything with a GuC requires uCode loading, and then supports
> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
> new file mode 100644
> index 000000000000..c4b5a88627a3
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
> @@ -0,0 +1,66 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "intel_memory_region.h"
> +#include "intel_region_lmem.h"
> +
> +static const struct drm_i915_gem_object_ops region_lmem_obj_ops = {
> +       .get_pages = i915_memory_region_get_pages_buddy,
> +       .put_pages = i915_memory_region_put_pages_buddy,
> +       .release = i915_gem_object_release_memory_region,
> +};
> +
> +static struct drm_i915_gem_object *
> +lmem_create_object(struct intel_memory_region *mem,
> +                  resource_size_t size,
> +                  unsigned int flags)
> +{
> +       struct drm_i915_private *i915 = mem->i915;
> +       struct drm_i915_gem_object *obj;
> +       unsigned int cache_level;
> +
> +       if (flags & I915_BO_ALLOC_CONTIGUOUS)
> +               size = roundup_pow_of_two(size);

That should not be required. Seems like a missed opportunity to
pass the flag down to the allocator.

> +       if (size > BIT(mem->mm.max_order) * mem->mm.min_size)
> +               return ERR_PTR(-E2BIG);
> +
> +       obj = i915_gem_object_alloc();
> +       if (!obj)
> +               return ERR_PTR(-ENOMEM);
> +
> +       drm_gem_private_object_init(&i915->drm, &obj->base, size);
> +       i915_gem_object_init(obj, &region_lmem_obj_ops);


> +       obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
> +       cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
> +       i915_gem_object_set_cache_coherency(obj, cache_level);

That seems a little optimistic. I would strongly suggest pulling that
information from the intel_memory_region.
-Chris

* Re: [PATCH v2 09/37] drm/i915/lmem: support kernel mapping
  2019-06-27 20:56 ` [PATCH v2 09/37] drm/i915/lmem: support kernel mapping Matthew Auld
@ 2019-06-27 23:27   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:27 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:05)
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> We can create LMEM objects, but we also need to support mapping them
> into kernel space for internal use.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 18 ++++-
>  drivers/gpu/drm/i915/intel_region_lmem.c      | 24 ++++++
>  drivers/gpu/drm/i915/intel_region_lmem.h      |  6 ++
>  .../drm/i915/selftests/intel_memory_region.c  | 77 +++++++++++++++++++
>  4 files changed, 121 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> index b36ad269f4ea..15eaaedffc46 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> @@ -176,7 +176,9 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
>                 void *ptr;
>  
>                 ptr = page_mask_bits(obj->mm.mapping);
> -               if (is_vmalloc_addr(ptr))
> +               if (i915_gem_object_is_lmem(obj))
> +                       io_mapping_unmap(ptr);
> +               else if (is_vmalloc_addr(ptr))
>                         vunmap(ptr);
>                 else
>                         kunmap(kmap_to_page(ptr));
> @@ -235,7 +237,7 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
>  }
>  
>  /* The 'mapping' part of i915_gem_object_pin_map() below */
> -static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
> +static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
>                                  enum i915_map_type type)
>  {
>         unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
> @@ -248,6 +250,11 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
>         pgprot_t pgprot;
>         void *addr;
>  
> +       if (i915_gem_object_is_lmem(obj)) {
> +               /* XXX: we are ignoring the type here -- this is simply wc */

Yeah, don't. The callers certainly do not expect that.

> +               return i915_gem_object_lmem_io_map(obj, 0, obj->base.size);
> +       }
> +
>         /* A single page can always be kmapped */
>         if (n_pages == 1 && type == I915_MAP_WB)
>                 return kmap(sg_page(sgt->sgl));
> @@ -293,7 +300,8 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
>         void *ptr;
>         int err;
>  
> -       if (unlikely(!i915_gem_object_has_struct_page(obj)))
> +       if (unlikely(!i915_gem_object_has_struct_page(obj) &&
> +                    !i915_gem_object_is_lmem(obj)))

Redefine the feature bit in the obj->ops->flags.

>                 return ERR_PTR(-ENXIO);
>  
>         err = mutex_lock_interruptible(&obj->mm.lock);
> @@ -325,7 +333,9 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
>                         goto err_unpin;
>                 }
>  
> -               if (is_vmalloc_addr(ptr))
> +               if (i915_gem_object_is_lmem(obj))
> +                       io_mapping_unmap(ptr);
> +               else if (is_vmalloc_addr(ptr))
>                         vunmap(ptr);
>                 else
>                         kunmap(kmap_to_page(ptr));
> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
> index 15655cc5013f..701bcac3479e 100644
> --- a/drivers/gpu/drm/i915/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
> @@ -73,6 +73,30 @@ static const struct intel_memory_region_ops region_lmem_ops = {
>         .create_object = lmem_create_object,
>  };
>  
> +/* XXX: Time to vfunc your life up? */
> +void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
> +                                              unsigned long n)
> +{
> +       resource_size_t offset;
> +
> +       offset = i915_gem_object_get_dma_address(obj, n);

That seems dubious. So dubious that I again say do not mix terms.

> +       return io_mapping_map_atomic_wc(&obj->memory_region->iomap, offset);

Ahem. The caller may not be ready to abide by the terms of the atomic
contract.
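One way out, sketched as kernel-style code (names assumed from the patch; not compiled here): expose both contracts explicitly and let each caller pick the one it can honour, pairing the sleepable variant with io_mapping_unmap() and the atomic one with io_mapping_unmap_atomic().

```c
/* Illustrative sketch only -- not the patch as posted. */
void __iomem *
i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
				 unsigned long n)
{
	/* sleepable: caller may fault or block while mapped */
	return io_mapping_map_wc(&obj->memory_region->iomap,
				 i915_gem_object_get_dma_address(obj, n),
				 PAGE_SIZE);
}

void __iomem *
i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
					unsigned long n)
{
	/* atomic: no sleeping until io_mapping_unmap_atomic() */
	return io_mapping_map_atomic_wc(&obj->memory_region->iomap,
					i915_gem_object_get_dma_address(obj, n));
}
```

With the `_atomic` suffix in the name, the contract is at least visible at every call site.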

> +}

* Re: [PATCH v2 10/37] drm/i915/blt: support copying objects
  2019-06-27 20:56 ` [PATCH v2 10/37] drm/i915/blt: support copying objects Matthew Auld
@ 2019-06-27 23:35   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:35 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:06)
> We can already clear an object with the blt, so try to do the same to
> support copying from one object backing store to another. Really this is
> just object -> object, which is not that useful yet, what we really want
> is two backing stores, but that will require some vma rework first,
> otherwise we are stuck with "tmp" objects.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 135 ++++++++++++++++++
>  .../gpu/drm/i915/gem/i915_gem_object_blt.h    |   8 ++
>  .../i915/gem/selftests/i915_gem_object_blt.c  | 105 ++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   3 +-
>  4 files changed, 250 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> index cb42e3a312e2..c2b28e06c379 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> @@ -102,6 +102,141 @@ int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
>         return err;
>  }
>  
> +int intel_emit_vma_copy_blt(struct i915_request *rq,
> +                           struct i915_vma *src,
> +                           struct i915_vma *dst)
> +{
> +       const int gen = INTEL_GEN(rq->i915);
> +       u32 *cs;
> +
> +       GEM_BUG_ON(src->size != dst->size);

For a low level interface, I would suggest a little over engineering and
take src_offset, dst_offset, length. For bonus points, 2D -- but I
accept that may be too much over-engineering without a user.

> +       cs = intel_ring_begin(rq, 10);
> +       if (IS_ERR(cs))
> +               return PTR_ERR(cs);
> +
> +       if (gen >= 9) {
> +               *cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10-2);
> +               *cs++ = BLT_DEPTH_32 | PAGE_SIZE;
> +               *cs++ = 0;
> +               *cs++ = src->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> +               *cs++ = lower_32_bits(dst->node.start);
> +               *cs++ = upper_32_bits(dst->node.start);
> +               *cs++ = 0;
> +               *cs++ = PAGE_SIZE;
> +               *cs++ = lower_32_bits(src->node.start);
> +               *cs++ = upper_32_bits(src->node.start);

Reminds me that we didn't fix the earlier routines to handle more than
32k pages either. Please add a test case :)
-Chris

* Re: [PATCH v2 13/37] drm/i915/selftests: don't just test CACHE_NONE for huge-pages
  2019-06-27 20:56 ` [PATCH v2 13/37] drm/i915/selftests: don't just test CACHE_NONE for huge-pages Matthew Auld
@ 2019-06-27 23:40   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:40 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:09)
> We also want to test LLC.

Then add a test for llc/snoop. It should be fine if it is physically
tagged...
-Chris

* Re: [PATCH v2 14/37] drm/i915/selftest: extend coverage to include LMEM huge-pages
  2019-06-27 20:56 ` [PATCH v2 14/37] drm/i915/selftest: extend coverage to include LMEM huge-pages Matthew Auld
@ 2019-06-27 23:42   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:42 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:10)
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  .../gpu/drm/i915/gem/selftests/huge_pages.c   | 122 +++++++++++++++++-
>  1 file changed, 121 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index 1862bf06a20f..c81ea9ce289b 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -981,7 +981,7 @@ static int gpu_write(struct i915_vma *vma,
>                                vma->size >> PAGE_SHIFT, val);
>  }
>  
> -static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
> +static int __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
>  {
>         unsigned int needs_flush;
>         unsigned long n;
> @@ -1013,6 +1013,53 @@ static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
>         return err;
>  }
>  
> +static int __cpu_check_lmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
> +{
> +       unsigned long n;
> +       int err;
> +
> +       i915_gem_object_lock(obj);
> +       err = i915_gem_object_set_to_wc_domain(obj, false);
> +       i915_gem_object_unlock(obj);
> +       if (err)
> +               return err;
> +
> +       err = i915_gem_object_pin_pages(obj);
> +       if (err)
> +               return err;
> +
> +       for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
> +               u32 __iomem *base;
> +               u32 read_val;
> +
> +               base = i915_gem_object_lmem_io_map_page(obj, n);
> +
> +               read_val = ioread32(base + dword);
> +               io_mapping_unmap_atomic(base);
> +               if (read_val != val) {
> +                       pr_err("n=%lu base[%u]=%u, val=%u\n",
> +                              n, dword, read_val, val);
> +                       err = -EINVAL;
> +                       break;
> +               }
> +       }
> +
> +       i915_gem_object_unpin_pages(obj);
> +       return err;
> +}
> +
> +static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)

We have different meanings of cpu :-p
-Chris

* Re: [PATCH v2 15/37] drm/i915/lmem: support CPU relocations
  2019-06-27 20:56 ` [PATCH v2 15/37] drm/i915/lmem: support CPU relocations Matthew Auld
@ 2019-06-27 23:46   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:46 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:11)
> @@ -1020,16 +1022,23 @@ static void reloc_cache_reset(struct reloc_cache *cache)
>                 i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm);
>         } else {
>                 wmb();
> -               io_mapping_unmap_atomic((void __iomem *)vaddr);
> -               if (cache->node.allocated) {
> -                       struct i915_ggtt *ggtt = cache_to_ggtt(cache);
> -
> -                       ggtt->vm.clear_range(&ggtt->vm,
> -                                            cache->node.start,
> -                                            cache->node.size);
> -                       drm_mm_remove_node(&cache->node);
> +
> +               if (cache->is_lmem) {
> +                       io_mapping_unmap_atomic((void __iomem *)vaddr);
> +                       i915_gem_object_unpin_pages((struct drm_i915_gem_object *)cache->node.mm);
> +                       cache->is_lmem = false;
>                 } else {
> -                       i915_vma_unpin((struct i915_vma *)cache->node.mm);
> +                       io_mapping_unmap_atomic((void __iomem *)vaddr);

The first step of each branch is the same. What am I missing?


> +                       if (cache->node.allocated) {
> +                               struct i915_ggtt *ggtt = cache_to_ggtt(cache);
> +
> +                               ggtt->vm.clear_range(&ggtt->vm,
> +                                                    cache->node.start,
> +                                                    cache->node.size);
> +                               drm_mm_remove_node(&cache->node);
> +                       } else {
> +                               i915_vma_unpin((struct i915_vma *)cache->node.mm);
> +                       }
>                 }
>         }
>  
> @@ -1069,6 +1078,40 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
>         return vaddr;
>  }
>  
> +static void *reloc_lmem(struct drm_i915_gem_object *obj,
> +                       struct reloc_cache *cache,
> +                       unsigned long page)
> +{
> +       void *vaddr;
> +       int err;
> +
> +       GEM_BUG_ON(use_cpu_reloc(cache, obj));
> +
> +       if (cache->vaddr) {
> +               io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr));
> +       } else {
> +               i915_gem_object_lock(obj);
> +               err = i915_gem_object_set_to_wc_domain(obj, true);
> +               i915_gem_object_unlock(obj);
> +               if (err)
> +                       return ERR_PTR(err);
> +
> +               err = i915_gem_object_pin_pages(obj);
> +               if (err)
> +                       return ERR_PTR(err);
> +
> +               cache->node.mm = (void *)obj;
> +               cache->is_lmem = true;
> +       }
> +
> +       vaddr = i915_gem_object_lmem_io_map_page(obj, page);

Secret atomic. Notice the asymmetric release.

> +       cache->vaddr = (unsigned long)vaddr;
> +       cache->page = page;
> +
> +       return vaddr;
> +}
> +
>  static void *reloc_iomap(struct drm_i915_gem_object *obj,
>                          struct reloc_cache *cache,
>                          unsigned long page)
> @@ -1145,8 +1188,12 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj,
>                 vaddr = unmask_page(cache->vaddr);
>         } else {
>                 vaddr = NULL;
> -               if ((cache->vaddr & KMAP) == 0)
> -                       vaddr = reloc_iomap(obj, cache, page);
> +               if ((cache->vaddr & KMAP) == 0) {
> +                       if (i915_gem_object_is_lmem(obj))
> +                               vaddr = reloc_lmem(obj, cache, page);
> +                       else
> +                               vaddr = reloc_iomap(obj, cache, page);
> +               }
>                 if (!vaddr)
>                         vaddr = reloc_kmap(obj, cache, page);
>         }
> -- 
> 2.20.1
> 

* Re: [PATCH v2 16/37] drm/i915/lmem: support pread
  2019-06-27 20:56 ` [PATCH v2 16/37] drm/i915/lmem: support pread Matthew Auld
@ 2019-06-27 23:50   ` Chris Wilson
  2019-07-30  8:58   ` Daniel Vetter
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:50 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:12)
> We need to add support for pread'ing an LMEM object.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  2 +
>  drivers/gpu/drm/i915/i915_gem.c               |  6 ++
>  drivers/gpu/drm/i915/intel_region_lmem.c      | 76 +++++++++++++++++++
>  3 files changed, 84 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 80ff5ad9bc07..8cdee185251a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -52,6 +52,8 @@ struct drm_i915_gem_object_ops {
>         void (*truncate)(struct drm_i915_gem_object *obj);
>         void (*writeback)(struct drm_i915_gem_object *obj);
>  
> +       int (*pread)(struct drm_i915_gem_object *,
> +                    const struct drm_i915_gem_pread *arg);
>         int (*pwrite)(struct drm_i915_gem_object *obj,
>                       const struct drm_i915_gem_pwrite *arg);
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 85677ae89849..4ba386ab35e7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -463,6 +463,12 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>  
>         trace_i915_gem_object_pread(obj, args->offset, args->size);
>  
> +       ret = -ENODEV;
> +       if (obj->ops->pread)
> +               ret = obj->ops->pread(obj, args);
> +       if (ret != -ENODEV)
> +               goto out;
> +
>         ret = i915_gem_object_wait(obj,
>                                    I915_WAIT_INTERRUPTIBLE,
>                                    MAX_SCHEDULE_TIMEOUT);
> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
> index 701bcac3479e..54b2c7bf177d 100644
> --- a/drivers/gpu/drm/i915/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
> @@ -7,10 +7,86 @@
>  #include "intel_memory_region.h"
>  #include "intel_region_lmem.h"
>  
> +static int lmem_pread(struct drm_i915_gem_object *obj,
> +                     const struct drm_i915_gem_pread *arg)
> +{
> +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +       struct intel_runtime_pm *rpm = &i915->runtime_pm;
> +       intel_wakeref_t wakeref;
> +       struct dma_fence *fence;
> +       char __user *user_data;
> +       unsigned int offset;
> +       unsigned long idx;
> +       u64 remain;
> +       int ret;
> +
> +       ret = i915_gem_object_pin_pages(obj);
> +       if (ret)
> +               return ret;
> +
> +       i915_gem_object_lock(obj);
> +       ret = i915_gem_object_set_to_wc_domain(obj, false);

You chose to opt out of the unlocked wait before the locked wait?


> +       if (ret) {
> +               i915_gem_object_unlock(obj);
> +               goto out_unpin;
> +       }
> +
> +       fence = i915_gem_object_lock_fence(obj);
> +       i915_gem_object_unlock(obj);
> +       if (!fence) {
> +               ret = -ENOMEM;
> +               goto out_unpin;
> +       }
> +
> +       wakeref = intel_runtime_pm_get(rpm);

Something not mentioned so far is the story for mm->rpm.

> +       remain = arg->size;
> +       user_data = u64_to_user_ptr(arg->data_ptr);
> +       offset = offset_in_page(arg->offset);
> +       for (idx = arg->offset >> PAGE_SHIFT; remain; idx++) {
> +               unsigned long unwritten;
> +               void __iomem *vaddr;
> +               int length;
> +
> +               length = remain;
> +               if (offset + length > PAGE_SIZE)
> +                       length = PAGE_SIZE - offset;
> +
> +               vaddr = i915_gem_object_lmem_io_map_page(obj, idx);
> +               if (!vaddr) {
> +                       ret = -ENOMEM;
> +                       goto out_put;
> +               }
> +
> +               unwritten = copy_to_user(user_data,

Except this is a secret atomic section!!!

> +                                        (void __force *)vaddr + offset,
> +                                        length);
> +               io_mapping_unmap_atomic(vaddr);
> +               if (unwritten) {
> +                       ret = -EFAULT;
> +                       goto out_put;
> +               }
> +
> +               remain -= length;
> +               user_data += length;
> +               offset = 0;
> +       }
> +
> +out_put:
> +       intel_runtime_pm_put(rpm, wakeref);
> +       i915_gem_object_unlock_fence(obj, fence);
> +out_unpin:
> +       i915_gem_object_unpin_pages(obj);
> +
> +       return ret;
> +}
> +
>  static const struct drm_i915_gem_object_ops region_lmem_obj_ops = {
>         .get_pages = i915_memory_region_get_pages_buddy,
>         .put_pages = i915_memory_region_put_pages_buddy,
>         .release = i915_gem_object_release_memory_region,
> +
> +       .pread = lmem_pread,
>  };
>  
>  static struct drm_i915_gem_object *
> -- 
> 2.20.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
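[Editor's note: the pread implementation quoted above walks an arbitrary (offset, size) request one page-sized chunk at a time. Below is a minimal userspace sketch of that chunking arithmetic only; the function name, the flat `pages` buffer, and the hard-coded 4 KiB page size are assumptions for illustration, not part of the patch.]

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096UL
#define PAGE_SHIFT 12

/* Copy `size` bytes starting at byte `offset` out of a buffer backed by
 * fixed-size pages, one page-sized chunk at a time -- the same walk the
 * lmem_pread() loop above performs over io-mapped pages. */
static void chunked_read(const char *pages, unsigned long offset,
			 unsigned long size, char *out)
{
	unsigned long remain = size;
	unsigned long off = offset & (PAGE_SIZE - 1);	/* offset_in_page() */
	unsigned long idx;

	for (idx = offset >> PAGE_SHIFT; remain; idx++) {
		unsigned long length = remain;

		/* clamp the chunk so it never crosses a page boundary */
		if (off + length > PAGE_SIZE)
			length = PAGE_SIZE - off;

		memcpy(out, pages + idx * PAGE_SIZE + off, length);

		remain -= length;
		out += length;
		off = 0;	/* only the first chunk starts mid-page */
	}
}
```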

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 19/37] drm/i915: treat shmem as a region
  2019-06-27 20:56 ` [PATCH v2 19/37] drm/i915: treat shmem as a region Matthew Auld
@ 2019-06-27 23:55   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:55 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:15)
>  int i915_gem_freeze(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index e4f811fdaedc..958c61e88200 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2941,7 +2941,8 @@ int i915_gem_init_memory_regions(struct drm_i915_private *i915)
>  
>                 type = MEMORY_TYPE_FROM_REGION(intel_region_map[i]);
>                 switch (type) {
> -               default:
> +               case INTEL_SMEM:
> +                       mem = i915_gem_shmem_setup(i915);
>                         break;
>                 }
>  
> @@ -2951,11 +2952,9 @@ int i915_gem_init_memory_regions(struct drm_i915_private *i915)
>                         goto out_cleanup;
>                 }
>  
> -               if (mem) {
> -                       mem->id = intel_region_map[i];
> -                       mem->type = type;
> -                       mem->instance = MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);
> -               }
> +               mem->id = intel_region_map[i];
> +               mem->type = type;
> +               mem->instance = MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);

Go back and adjust the stub function you just introduced to avoid
self-inflicted churn.

Meanwhile I'm left with magic that isn't even defined in this patch,
trying to figure out whether this is equivalent to the code you just
removed.
-Chris
-Chris

* Re: [PATCH v2 23/37] drm/i915: expose missing map_gtt support to users
  2019-06-27 20:56 ` [PATCH v2 23/37] drm/i915: expose missing map_gtt support to users Matthew Auld
@ 2019-06-27 23:59   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-27 23:59 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:19)
> From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> Done by returning -ENODEV from the map_gtt version ioctl.
> 
> Cc: Antonio Argenziano <antonio.argenziano@intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index ac8fbada0406..34edc0302691 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -425,6 +425,8 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>                         return value;
>                 break;
>         case I915_PARAM_MMAP_GTT_VERSION:
> +               if (!HAS_MAPPABLE_APERTURE(dev_priv))
> +                       return -ENODEV;

The ioctl version is still going to be there, since we just extend it to
report offsets for the many alternative mappings, with the different
fences and everything. Right?

If we don't support a ggtt mmap via the extended mmap_offset ioctl, we
report the flags as being invalid.
-Chris

* Re: [PATCH v2 24/37] drm/i915: set num_fence_regs to 0 if there is no aperture
  2019-06-27 20:56 ` [PATCH v2 24/37] drm/i915: set num_fence_regs to 0 if there is no aperture Matthew Auld
@ 2019-06-28  0:00   ` Chris Wilson
  0 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  0:00 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:20)
> From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> We can't fence anything without aperture.

s/aperture/fence registers/
-Chris

* Re: [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
  2019-06-27 20:56 ` [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core Matthew Auld
@ 2019-06-28  0:05   ` Chris Wilson
  2019-06-28  0:08   ` Chris Wilson
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  0:05 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:24)
> +static void i915_gem_vm_open(struct vm_area_struct *vma)
> +{
> +       struct i915_mmap_offset *priv = vma->vm_private_data;
> +       struct drm_i915_gem_object *obj = priv->obj;
> +
> +       drm_gem_object_get(&obj->base);

Please use the right getters, i915_gem_object_get() and
i915_gem_object_put().
-Chris

* Re: [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
  2019-06-27 20:56 ` [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core Matthew Auld
  2019-06-28  0:05   ` Chris Wilson
@ 2019-06-28  0:08   ` Chris Wilson
  2019-06-28  0:09   ` Chris Wilson
  2019-06-28  0:10   ` Chris Wilson
  3 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  0:08 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:24)
> +int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
> +{
> +       struct drm_vma_offset_node *node;
> +       struct drm_file *priv = filp->private_data;
> +       struct drm_device *dev = priv->minor->dev;
> +       struct i915_mmap_offset *mmo;
> +       struct drm_gem_object *obj = NULL;
> +
> +       if (drm_dev_is_unplugged(dev))
> +               return -ENODEV;
> +
> +       drm_vma_offset_lock_lookup(dev->vma_offset_manager);
> +       node = drm_vma_offset_exact_lookup_locked(dev->vma_offset_manager,
> +                                                 vma->vm_pgoff,
> +                                                 vma_pages(vma));
> +       if (likely(node)) {
> +               mmo = container_of(node, struct i915_mmap_offset,
> +                                  vma_node);
> +
> +               /* Take a ref for our mmap_offset and gem objects. The reference is cleaned

/*
 * Take

> +                * up when the vma is closed.
> +                *
> +                * Skip 0-refcnted objects as it is in the process of being destroyed
> +                * and will be invalid when the vma manager lock is released.
> +                */
> +               if (kref_get_unless_zero(&mmo->ref)) {
> +                       obj = &mmo->obj->base;
> +                       if (!kref_get_unless_zero(&obj->refcount))
> +                               obj = NULL;
> +               }
> +       }
> +       drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
> +
> +       if (!obj)
> +               return -EINVAL;

Please check the error paths for reference leaks.
-Chris
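
[Editor's note: the lookup quoted above hinges on the kref_get_unless_zero() pattern -- a reference is taken only if the count has not already dropped to zero, i.e. the object is not mid-destruction. A small userspace model of that pattern using C11 atomics; the `ref` naming is illustrative, not the kernel API.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ref {
	atomic_uint count;
};

/* Take a reference only if someone still holds one; fails once the
 * count has reached zero (object already being destroyed). */
static bool ref_get_unless_zero(struct ref *r)
{
	unsigned int old = atomic_load(&r->count);

	while (old) {
		/* CAS retries reload `old`; exit if it ever hits zero */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the last one was dropped. */
static bool ref_put(struct ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}
```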

* Re: [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
  2019-06-27 20:56 ` [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core Matthew Auld
  2019-06-28  0:05   ` Chris Wilson
  2019-06-28  0:08   ` Chris Wilson
@ 2019-06-28  0:09   ` Chris Wilson
  2019-06-28  0:10   ` Chris Wilson
  3 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  0:09 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:24)
> +       if (node->readonly) {

Note that we can now drop the readonly field from the node, as we only
added it for our benefit. Now that we've extended the vma
implementation, we can use our obj readonly flag directly.
-Chris
-Chris

* Re: [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
  2019-06-27 20:56 ` [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core Matthew Auld
                     ` (2 preceding siblings ...)
  2019-06-28  0:09   ` Chris Wilson
@ 2019-06-28  0:10   ` Chris Wilson
  3 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  0:10 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:24)
> +       vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;

Strictly speaking, it's not actually VM_IO as we do not wrap and expose
mmio registers to userspace.
-Chris

* Re: [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  2019-06-27 20:56 ` [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET Matthew Auld
@ 2019-06-28  0:12   ` Chris Wilson
  2019-07-30  9:49   ` Daniel Vetter
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  0:12 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:25)
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> Add a new CPU mmap implementation that allows multiple fault handlers
> that depend on the object's backing pages.
> 
> Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
> and use the zero extending behaviour of drm to differentiate between
> them, when we inspect the flags.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |  2 ++
>  drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 30 ++++++++++++++++++
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
>  drivers/gpu/drm/i915/i915_drv.c               |  3 +-
>  include/uapi/drm/i915_drm.h                   | 31 +++++++++++++++++++
>  5 files changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> index ddc7f2a52b3e..5abd5b2172f2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> @@ -30,6 +30,8 @@ int i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>                         struct drm_file *file);
>  int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>                             struct drm_file *file);
> +int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
> +                              struct drm_file *file_priv);
>  int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>                          struct drm_file *file);
>  int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index 7b46f44d9c20..cbf89e80a97b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -536,12 +536,42 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>                         struct drm_file *file)
>  {
>         struct drm_i915_gem_mmap_offset *args = data;
> +       struct drm_i915_private *i915 = to_i915(dev);
> +
> +       if (args->flags & I915_MMAP_OFFSET_FLAGS)
> +               return i915_gem_mmap_offset_ioctl(dev, data, file);
> +
> +       if (!HAS_MAPPABLE_APERTURE(i915)) {
> +               DRM_ERROR("No aperture, cannot mmap via legacy GTT\n");
> +               return -ENODEV;
> +       }
>  
>         return __assign_gem_object_mmap_data(file, args->handle,
>                                              I915_MMAP_TYPE_GTT,
>                                              &args->offset);
>  }
>  
> +int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
> +                              struct drm_file *file)

This seems highly redundant and not correctly plugged in.
-Chris

* Re: [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
  2019-06-27 20:56 ` [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION Matthew Auld
@ 2019-06-28  0:22   ` Chris Wilson
  2019-06-28  5:53   ` Tvrtko Ursulin
  2019-07-30 16:17   ` Daniel Vetter
  2 siblings, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  0:22 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:56:30)
> +int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
> +                           struct drm_file *file)
> +{
> +       struct drm_i915_gem_context_param *args = data;

The plan was to use the upper_32_bits() or whatever as the class. To
future-proof, I would recommend being more explicit with a switch.

> +       if (args->param <= I915_CONTEXT_PARAM_MAX)
> +               return i915_gem_context_setparam_ioctl(dev, data, file);
> +
> +       return i915_gem_object_setparam_ioctl(dev, data, file);
> +}

>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>   * on the security mechanisms provided by hardware.
> @@ -1595,11 +1597,36 @@ struct drm_i915_gem_context_param {
>   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
>   */
>  #define I915_CONTEXT_PARAM_ENGINES     0xa
> +
> +#define I915_CONTEXT_PARAM_MAX         0xffffffff
>  /* Must be kept compact -- no holes and well documented */

Hahaha. Good one.

The rest of the patch is clearly very early proof of concept as it needs
the locking reworked.
-Chris
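
[Editor's note: a rough sketch of the dispatch suggested here, treating the upper 32 bits of the 64-bit param as a class selector and switching on it explicitly rather than comparing against a magic I915_CONTEXT_PARAM_MAX. The class constants and helper names are hypothetical, not from the series.]

```c
#include <assert.h>
#include <stdint.h>

#define PARAM_CLASS_CONTEXT 0u
#define PARAM_CLASS_OBJECT  1u

static uint32_t param_class(uint64_t param)
{
	return (uint32_t)(param >> 32);	/* upper_32_bits() */
}

/* Explicit switch on the class: unknown classes are rejected up front,
 * so future classes cannot silently fall through to the wrong handler. */
static int dispatch_setparam(uint64_t param)
{
	switch (param_class(param)) {
	case PARAM_CLASS_CONTEXT:
		return 0;	/* would call the context setparam path */
	case PARAM_CLASS_OBJECT:
		return 1;	/* would call the object setparam path */
	default:
		return -22;	/* -EINVAL: future-proof rejection */
	}
}
```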

* Re: [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
  2019-06-27 20:56 ` [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION Matthew Auld
  2019-06-28  0:22   ` Chris Wilson
@ 2019-06-28  5:53   ` Tvrtko Ursulin
  2019-07-30 16:17   ` Daniel Vetter
  2 siblings, 0 replies; 88+ messages in thread
From: Tvrtko Ursulin @ 2019-06-28  5:53 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx


On 27/06/2019 21:56, Matthew Auld wrote:
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> This call will specify the memory region in which an object should be
> placed.
> 
> Note that changing the object's backing storage should be immediately
> done after an object is created or if it's not yet in use, otherwise
> this will fail on a busy object.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_context.c |  12 ++
>   drivers/gpu/drm/i915/gem/i915_gem_context.h |   2 +
>   drivers/gpu/drm/i915/gem/i915_gem_ioctls.h  |   2 +
>   drivers/gpu/drm/i915/gem/i915_gem_object.c  | 117 ++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_drv.c             |   2 +-
>   include/uapi/drm/i915_drm.h                 |  27 +++++
>   6 files changed, 161 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 8a9787cf0cd0..157ca8247752 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -75,6 +75,7 @@
>   #include "i915_globals.h"
>   #include "i915_trace.h"
>   #include "i915_user_extensions.h"
> +#include "i915_gem_ioctls.h"
>   
>   #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
>   
> @@ -2357,6 +2358,17 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>   	return ret;
>   }
>   
> +int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
> +			    struct drm_file *file)
> +{
> +	struct drm_i915_gem_context_param *args = data;
> +
> +	if (args->param <= I915_CONTEXT_PARAM_MAX)
> +		return i915_gem_context_setparam_ioctl(dev, data, file);
> +
> +	return i915_gem_object_setparam_ioctl(dev, data, file);
> +}
> +
>   int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
>   				       void *data, struct drm_file *file)
>   {
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> index 9691dd062f72..d5a9a63bb34c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> @@ -157,6 +157,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>   				    struct drm_file *file_priv);
>   int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>   				    struct drm_file *file_priv);
> +int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
> +			    struct drm_file *file);
>   int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>   				       struct drm_file *file);
>   
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> index 5abd5b2172f2..af7465bceebd 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> @@ -32,6 +32,8 @@ int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>   			    struct drm_file *file);
>   int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>   			       struct drm_file *file_priv);
> +int i915_gem_object_setparam_ioctl(struct drm_device *dev, void *data,
> +				   struct drm_file *file_priv);
>   int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>   			 struct drm_file *file);
>   int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index 691af388e4e7..bc95f449de50 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -551,6 +551,123 @@ int __init i915_global_objects_init(void)
>   	return 0;
>   }
>   
> +static enum intel_region_id
> +__region_id(u32 region)
> +{
> +	enum intel_region_id id;
> +
> +	for (id = 0; id < ARRAY_SIZE(intel_region_map); ++id) {
> +		if (intel_region_map[id] == region)
> +			return id;
> +	}
> +
> +	return INTEL_MEMORY_UKNOWN;
> +}
> +
> +static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
> +					 struct drm_i915_gem_object_param *args,
> +					 struct drm_file *file,
> +					 struct drm_i915_gem_object *obj)
> +{
> +	struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context;
> +	u32 __user *uregions = u64_to_user_ptr(args->data);
> +	u32 uregions_copy[INTEL_MEMORY_UKNOWN];
> +	int i, ret;
> +
> +	if (args->size > ARRAY_SIZE(intel_region_map))
> +		return -EINVAL;
> +
> +	memset(uregions_copy, 0, sizeof(uregions_copy));
> +	for (i = 0; i < args->size; i++) {
> +		u32 region;
> +
> +		ret = get_user(region, uregions);
> +		if (ret)
> +			return ret;
> +
> +		uregions_copy[i] = region;
> +		++uregions;
> +	}
> +
> +	mutex_lock(&dev_priv->drm.struct_mutex);
> +	ret = i915_gem_object_prepare_move(obj);
> +	if (ret) {
> +		DRM_ERROR("Cannot set memory region, object in use\n");
> +	        goto err;
> +	}
> +
> +	if (args->size > ARRAY_SIZE(intel_region_map))
> +		return -EINVAL;
> +
> +	for (i = 0; i < args->size; i++) {
> +		u32 region = uregions_copy[i];
> +		enum intel_region_id id = __region_id(region);
> +
> +		if (id == INTEL_MEMORY_UKNOWN) {
> +			ret = -EINVAL;
> +			goto err;
> +		}
> +
> +		ret = i915_gem_object_migrate(obj, ce, id);
> +		if (!ret) {
> +			if (MEMORY_TYPE_FROM_REGION(region) ==
> +			    INTEL_LMEM) {
> +				/*
> +				 * TODO: this should be part of get_pages(),
> +				 * when async get_pages arrives
> +				 */
> +				ret = i915_gem_object_fill_blt(obj, ce, 0);
> +				if (ret) {
> +					DRM_ERROR("Failed clearing the object\n");
> +					goto err;
> +				}
> +
> +				i915_gem_object_lock(obj);
> +				ret = i915_gem_object_set_to_cpu_domain(obj, false);
> +				i915_gem_object_unlock(obj);
> +				if (ret)
> +					goto err;
> +			}
> +			break;
> +		}
> +	}
> +err:
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
> +	return ret;
> +}
> +
> +int i915_gem_object_setparam_ioctl(struct drm_device *dev, void *data,
> +				   struct drm_file *file)
> +{
> +
> +	struct drm_i915_gem_object_param *args = data;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct drm_i915_gem_object *obj;
> +	int ret;
> +
> +	obj = i915_gem_object_lookup(file, args->handle);
> +	if (!obj)
> +		return -ENOENT;
> +
> +	switch (args->param) {
> +	case I915_PARAM_MEMORY_REGION:
> +		ret = i915_gem_object_region_select(dev_priv, args, file, obj);
> +		if (ret) {
> +			DRM_ERROR("Cannot set memory region, migration failed\n");
> +			goto err;
> +		}
> +
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +
> +err:
> +	i915_gem_object_put(obj);
> +	return ret;
> +}
> +
>   #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>   #include "selftests/huge_gem_object.c"
>   #include "selftests/huge_pages.c"
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 1c3d5cb2893c..3d6fe993f26e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -3196,7 +3196,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>   	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
> -	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_setparam_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_RENDER_ALLOW),
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 729e729e2282..5cf976e7608a 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -360,6 +360,7 @@ typedef struct _drm_i915_sarea {
>   #define DRM_I915_GEM_VM_CREATE		0x3a
>   #define DRM_I915_GEM_VM_DESTROY		0x3b
>   #define DRM_I915_GEM_MMAP_OFFSET   	DRM_I915_GEM_MMAP_GTT
> +#define DRM_I915_GEM_OBJECT_SETPARAM	DRM_I915_GEM_CONTEXT_SETPARAM
>   /* Must be kept compact -- no holes */
>   
>   #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> @@ -423,6 +424,7 @@ typedef struct _drm_i915_sarea {
>   #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
>   #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
>   #define DRM_IOCTL_I915_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_OFFSET, struct drm_i915_gem_mmap_offset)
> +#define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param)
>   
>   /* Allow drivers to submit batchbuffers directly to hardware, relying
>    * on the security mechanisms provided by hardware.
> @@ -1595,11 +1597,36 @@ struct drm_i915_gem_context_param {
>    *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
>    */
>   #define I915_CONTEXT_PARAM_ENGINES	0xa
> +
> +#define I915_CONTEXT_PARAM_MAX	        0xffffffff
>   /* Must be kept compact -- no holes and well documented */
>   
>   	__u64 value;
>   };
>   
> +struct drm_i915_gem_object_param {
> +	/** Handle for the object */
> +	__u32 handle;
> +
> +	__u32 size;
> +
> +	/* Must be 1 */
> +	__u32 object_class;

What is this for? It's not used in the patch.

> +
> +	/** Set the memory region for the object listed in preference order
> +	 *  as an array of region ids within data. To force an object
> +	 *  to a particular memory region, set the region as the sole entry.
> +	 *
> +	 *  Valid region ids are derived from the id field of
> +	 *  struct drm_i915_memory_region_info.
> +	 *  See struct drm_i915_query_memory_region_info.
> +	 */

These two structs only come in the next patch so I suspect the order of 
the two needs swapping.

Regards,

Tvrtko

> +#define I915_PARAM_MEMORY_REGION 0x1
> +	__u32 param;
> +
> +	__u64 data;
> +};
> +
>   /**
>    * Context SSEU programming
>    *
> 

* Re: [PATCH v2 35/37] drm/i915/query: Expose memory regions through the query uAPI
  2019-06-27 20:56 ` [PATCH v2 35/37] drm/i915/query: Expose memory regions through the query uAPI Matthew Auld
@ 2019-06-28  5:59   ` Tvrtko Ursulin
  0 siblings, 0 replies; 88+ messages in thread
From: Tvrtko Ursulin @ 2019-06-28  5:59 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx


On 27/06/2019 21:56, Matthew Auld wrote:
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> Returns the available memory region areas supported by the HW.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_query.c | 57 +++++++++++++++++++++++++++++++
>   include/uapi/drm/i915_drm.h       | 39 +++++++++++++++++++++
>   2 files changed, 96 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> index 7b7016171057..21c4c2592d6c 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -143,10 +143,67 @@ query_engine_info(struct drm_i915_private *i915,
>   	return len;
>   }
>   
> +static int query_memregion_info(struct drm_i915_private *dev_priv,
> +				struct drm_i915_query_item *query_item)
> +{
> +	struct drm_i915_query_memory_region_info __user *query_ptr =
> +		u64_to_user_ptr(query_item->data_ptr);
> +	struct drm_i915_memory_region_info __user *info_ptr =
> +		&query_ptr->regions[0];
> +	struct drm_i915_memory_region_info info = { };
> +	struct drm_i915_query_memory_region_info query;
> +	u32 total_length;
> +	int ret, i;
> +
> +	if (query_item->flags != 0)
> +		return -EINVAL;
> +
> +	total_length = sizeof(struct drm_i915_query_memory_region_info);
> +	for (i = 0; i < ARRAY_SIZE(dev_priv->regions); ++i) {
> +		struct intel_memory_region *region = dev_priv->regions[i];
> +
> +		if (!region)
> +			continue;
> +
> +		total_length += sizeof(struct drm_i915_memory_region_info);
> +	}
> +
> +	ret = copy_query_item(&query, sizeof(query), total_length,
> +			      query_item);
> +	if (ret != 0)
> +		return ret;
> +
> +	if (query.num_regions || query.rsvd[0] || query.rsvd[1] ||
> +	    query.rsvd[2])
> +		return -EINVAL;
> +
> +	for (i = 0; i < ARRAY_SIZE(dev_priv->regions); ++i) {
> +		struct intel_memory_region *region = dev_priv->regions[i];
> +
> +		if (!region)
> +			continue;
> +
> +		info.id = region->id;
> +		info.size = resource_size(&region->region);
> +
> +		if (__copy_to_user(info_ptr, &info, sizeof(info)))
> +			return -EFAULT;
> +
> +		query.num_regions++;
> +		info_ptr++;
> +	}
> +
> +	if (__copy_to_user(query_ptr, &query, sizeof(query)))
> +		return -EFAULT;
> +
> +	return total_length;
> +}
> +
>   static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
>   					struct drm_i915_query_item *query_item) = {
>   	query_topology_info,
>   	query_engine_info,
> +	query_memregion_info,
>   };
>   
>   int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 5cf976e7608a..9b77d8af9877 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -2041,6 +2041,7 @@ struct drm_i915_query_item {
>   	__u64 query_id;
>   #define DRM_I915_QUERY_TOPOLOGY_INFO    1
>   #define DRM_I915_QUERY_ENGINE_INFO	2
> +#define DRM_I915_QUERY_MEMREGION_INFO   3
>   /* Must be kept compact -- no holes and well documented */
>   
>   	/*
> @@ -2180,6 +2181,44 @@ struct drm_i915_query_engine_info {
>   	struct drm_i915_engine_info engines[];
>   };
>   
> +struct drm_i915_memory_region_info {
> +
> +	/** Base type of a region
> +	 */
> +#define I915_SYSTEM_MEMORY         0
> +#define I915_DEVICE_MEMORY         1
> +
> +	/** The region id is encoded in a layout which makes it possible to
> +	 *  retrieve the following information:
> +	 *
> +	 *  Base type: log2(ID >> 16)
> +	 *  Instance:  log2(ID & 0xffff)
> +	 */
> +	__u32 id;

Should we consider, for simplicity and similarity with the engine 
interface, going for something like:

struct i915_memory_type_instance {
	__u16 type;
	__u16 instance;
};

struct drm_i915_memory_region_info {
	struct i915_memory_type_instance region;
	...
};

?

> +
> +	/** Reserved field. MBZ */
> +	__u32 rsvd0;
> +
> +	/** Unused for now. MBZ */
> +	__u64 flags;
> +
> +	__u64 size;
> +
> +	/** Reserved fields must be cleared to zero. */
> +	__u64 rsvd1[4];
> +};
> +
> +struct drm_i915_query_memory_region_info {
> +
> +	/** Number of struct drm_i915_memory_region_info structs */
> +	__u32 num_regions;
> +
> +	/** MBZ */
> +	__u32 rsvd[3];

It's not that important, just a note that, given some recent discussion 
on the engine query front, I wish I had more rsvd there. So maybe bump 
this up just to be extra safe.

> +
> +	struct drm_i915_memory_region_info regions[];
> +};
> +
>   #if defined(__cplusplus)
>   }
>   #endif
> 

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 02/37] drm/i915: introduce intel_memory_region
  2019-06-27 20:55 ` [PATCH v2 02/37] drm/i915: introduce intel_memory_region Matthew Auld
  2019-06-27 22:47   ` Chris Wilson
@ 2019-06-28  8:09   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  8:09 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:55:58)
> @@ -0,0 +1,107 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __INTEL_MEMORY_REGION_H__
> +#define __INTEL_MEMORY_REGION_H__
> +
> +#include <linux/ioport.h>
> +#include <linux/mutex.h>
> +#include <linux/io-mapping.h>
> +
> +#include "i915_buddy.h"
> +
> +struct drm_i915_private;
> +struct drm_i915_gem_object;
> +struct intel_memory_region;
> +struct sg_table;
> +
> +/**
> + *  Base memory type
> + */
> +enum intel_memory_type {
> +       INTEL_SMEM = 0,
> +       INTEL_LMEM,
> +       INTEL_STOLEN,
> +};
> +
> +enum intel_region_id {
> +       INTEL_MEMORY_SMEM = 0,
> +       INTEL_MEMORY_LMEM,
> +       INTEL_MEMORY_STOLEN,
> +       INTEL_MEMORY_UKNOWN, /* Should be last */
> +};
> +
> +#define REGION_SMEM     BIT(INTEL_MEMORY_SMEM)
> +#define REGION_LMEM     BIT(INTEL_MEMORY_LMEM)
> +#define REGION_STOLEN   BIT(INTEL_MEMORY_STOLEN)
> +
> +#define INTEL_MEMORY_TYPE_SHIFT 16
> +
> +#define MEMORY_TYPE_FROM_REGION(r) (ilog2(r >> INTEL_MEMORY_TYPE_SHIFT))
> +#define MEMORY_INSTANCE_FROM_REGION(r) (ilog2(r & 0xffff))
> +
> +/**
> + * Memory regions encoded as type | instance
> + */
> +static const u32 intel_region_map[] = {
> +       [INTEL_MEMORY_SMEM] = BIT(INTEL_SMEM + INTEL_MEMORY_TYPE_SHIFT) | BIT(0),
> +       [INTEL_MEMORY_LMEM] = BIT(INTEL_LMEM + INTEL_MEMORY_TYPE_SHIFT) | BIT(0),
> +       [INTEL_MEMORY_STOLEN] = BIT(INTEL_STOLEN + INTEL_MEMORY_TYPE_SHIFT) | BIT(0),
> +};

You put this array into the header, ergo a separate instance is created
for every compilation unit pulling in this header. Incoming build
failure report :-p
-Chris

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 01/37] drm/i915: buddy allocator
  2019-06-27 20:55 ` [PATCH v2 01/37] drm/i915: buddy allocator Matthew Auld
  2019-06-27 22:28   ` Chris Wilson
@ 2019-06-28  9:35   ` Chris Wilson
  1 sibling, 0 replies; 88+ messages in thread
From: Chris Wilson @ 2019-06-28  9:35 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-06-27 21:55:57)
> +static void __i915_buddy_free(struct i915_buddy_mm *mm,
> +                             struct i915_buddy_block *block)
> +{
> +       list_del_init(&block->link); /* We have ownership now */

That is an important observation. Even more important: since you didn't
own it, you shouldn't touch the previous linkage at all, just assume
control. The owner relinquished all control of the block upon freeing.
-Chris

^ permalink raw reply	[flat|nested] 88+ messages in thread

* ✗ Fi.CI.BAT: failure for Introduce memory region concept (including device local memory) (rev2)
  2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
                   ` (38 preceding siblings ...)
  2019-06-27 21:50 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-06-28  9:59 ` Patchwork
  39 siblings, 0 replies; 88+ messages in thread
From: Patchwork @ 2019-06-28  9:59 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: Introduce memory region concept (including device local memory) (rev2)
URL   : https://patchwork.freedesktop.org/series/56683/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6376 -> Patchwork_13460
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_13460 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_13460, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_13460:

### IGT changes ###

#### Possible regressions ####

  * igt@debugfs_test@read_all_entries:
    - fi-skl-iommu:       [PASS][1] -> [DMESG-WARN][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-iommu/igt@debugfs_test@read_all_entries.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-iommu/igt@debugfs_test@read_all_entries.html
    - fi-glk-dsi:         [PASS][3] -> [DMESG-WARN][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-glk-dsi/igt@debugfs_test@read_all_entries.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-glk-dsi/igt@debugfs_test@read_all_entries.html
    - fi-ivb-3770:        [PASS][5] -> [DMESG-WARN][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-ivb-3770/igt@debugfs_test@read_all_entries.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-ivb-3770/igt@debugfs_test@read_all_entries.html
    - fi-cml-u:           [PASS][7] -> [DMESG-WARN][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-cml-u/igt@debugfs_test@read_all_entries.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-cml-u/igt@debugfs_test@read_all_entries.html
    - fi-hsw-peppy:       [PASS][9] -> [DMESG-WARN][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-hsw-peppy/igt@debugfs_test@read_all_entries.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-hsw-peppy/igt@debugfs_test@read_all_entries.html
    - fi-icl-u3:          [PASS][11] -> [DMESG-WARN][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-icl-u3/igt@debugfs_test@read_all_entries.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-icl-u3/igt@debugfs_test@read_all_entries.html
    - fi-bdw-gvtdvm:      [PASS][13] -> [DMESG-WARN][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-bdw-gvtdvm/igt@debugfs_test@read_all_entries.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-bdw-gvtdvm/igt@debugfs_test@read_all_entries.html
    - fi-bxt-j4205:       [PASS][15] -> [DMESG-WARN][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-bxt-j4205/igt@debugfs_test@read_all_entries.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-bxt-j4205/igt@debugfs_test@read_all_entries.html
    - fi-kbl-7500u:       [PASS][17] -> [DMESG-WARN][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-kbl-7500u/igt@debugfs_test@read_all_entries.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-kbl-7500u/igt@debugfs_test@read_all_entries.html
    - fi-snb-2520m:       [PASS][19] -> [DMESG-WARN][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-snb-2520m/igt@debugfs_test@read_all_entries.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-snb-2520m/igt@debugfs_test@read_all_entries.html
    - fi-gdg-551:         [PASS][21] -> [DMESG-WARN][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-gdg-551/igt@debugfs_test@read_all_entries.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-gdg-551/igt@debugfs_test@read_all_entries.html
    - fi-icl-u2:          [PASS][23] -> [DMESG-WARN][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-icl-u2/igt@debugfs_test@read_all_entries.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-icl-u2/igt@debugfs_test@read_all_entries.html
    - fi-cfl-8109u:       [PASS][25] -> [DMESG-WARN][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-cfl-8109u/igt@debugfs_test@read_all_entries.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-cfl-8109u/igt@debugfs_test@read_all_entries.html
    - fi-pnv-d510:        [PASS][27] -> [DMESG-WARN][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-pnv-d510/igt@debugfs_test@read_all_entries.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-pnv-d510/igt@debugfs_test@read_all_entries.html
    - fi-ilk-650:         [PASS][29] -> [DMESG-WARN][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-ilk-650/igt@debugfs_test@read_all_entries.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-ilk-650/igt@debugfs_test@read_all_entries.html
    - fi-skl-6770hq:      [PASS][31] -> [DMESG-WARN][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-6770hq/igt@debugfs_test@read_all_entries.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-6770hq/igt@debugfs_test@read_all_entries.html
    - fi-byt-n2820:       [PASS][33] -> [DMESG-WARN][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-byt-n2820/igt@debugfs_test@read_all_entries.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-byt-n2820/igt@debugfs_test@read_all_entries.html
    - fi-elk-e7500:       [PASS][35] -> [DMESG-WARN][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-elk-e7500/igt@debugfs_test@read_all_entries.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-elk-e7500/igt@debugfs_test@read_all_entries.html
    - fi-skl-lmem:        [PASS][37] -> [DMESG-WARN][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-lmem/igt@debugfs_test@read_all_entries.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-lmem/igt@debugfs_test@read_all_entries.html
    - fi-skl-6260u:       [PASS][39] -> [DMESG-WARN][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-6260u/igt@debugfs_test@read_all_entries.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-6260u/igt@debugfs_test@read_all_entries.html
    - fi-snb-2600:        NOTRUN -> [DMESG-WARN][41]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-snb-2600/igt@debugfs_test@read_all_entries.html
    - fi-hsw-4770r:       [PASS][42] -> [DMESG-WARN][43]
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-hsw-4770r/igt@debugfs_test@read_all_entries.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-hsw-4770r/igt@debugfs_test@read_all_entries.html
    - fi-skl-gvtdvm:      [PASS][44] -> [DMESG-WARN][45]
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-gvtdvm/igt@debugfs_test@read_all_entries.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-gvtdvm/igt@debugfs_test@read_all_entries.html
    - fi-kbl-guc:         [PASS][46] -> [DMESG-WARN][47]
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-kbl-guc/igt@debugfs_test@read_all_entries.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-kbl-guc/igt@debugfs_test@read_all_entries.html
    - fi-bsw-kefka:       [PASS][48] -> [DMESG-WARN][49]
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-bsw-kefka/igt@debugfs_test@read_all_entries.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-bsw-kefka/igt@debugfs_test@read_all_entries.html
    - fi-kbl-x1275:       [PASS][50] -> [DMESG-WARN][51]
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-kbl-x1275/igt@debugfs_test@read_all_entries.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-kbl-x1275/igt@debugfs_test@read_all_entries.html
    - fi-blb-e6850:       [PASS][52] -> [DMESG-WARN][53]
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-blb-e6850/igt@debugfs_test@read_all_entries.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-blb-e6850/igt@debugfs_test@read_all_entries.html
    - fi-bwr-2160:        [PASS][54] -> [DMESG-WARN][55]
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-bwr-2160/igt@debugfs_test@read_all_entries.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-bwr-2160/igt@debugfs_test@read_all_entries.html
    - fi-bdw-5557u:       [PASS][56] -> [DMESG-WARN][57]
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-bdw-5557u/igt@debugfs_test@read_all_entries.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-bdw-5557u/igt@debugfs_test@read_all_entries.html
    - fi-kbl-r:           [PASS][58] -> [DMESG-WARN][59]
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-kbl-r/igt@debugfs_test@read_all_entries.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-kbl-r/igt@debugfs_test@read_all_entries.html
    - fi-skl-guc:         [PASS][60] -> [DMESG-WARN][61]
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-guc/igt@debugfs_test@read_all_entries.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-guc/igt@debugfs_test@read_all_entries.html
    - fi-kbl-7567u:       [PASS][62] -> [DMESG-WARN][63]
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-kbl-7567u/igt@debugfs_test@read_all_entries.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-kbl-7567u/igt@debugfs_test@read_all_entries.html
    - fi-apl-guc:         [PASS][64] -> [DMESG-WARN][65]
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-apl-guc/igt@debugfs_test@read_all_entries.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-apl-guc/igt@debugfs_test@read_all_entries.html
    - fi-kbl-8809g:       [PASS][66] -> [DMESG-WARN][67]
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-kbl-8809g/igt@debugfs_test@read_all_entries.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-kbl-8809g/igt@debugfs_test@read_all_entries.html
    - fi-skl-6600u:       [PASS][68] -> [DMESG-WARN][69]
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-6600u/igt@debugfs_test@read_all_entries.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-6600u/igt@debugfs_test@read_all_entries.html
    - fi-byt-j1900:       [PASS][70] -> [DMESG-WARN][71]
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-byt-j1900/igt@debugfs_test@read_all_entries.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-byt-j1900/igt@debugfs_test@read_all_entries.html
    - fi-bxt-dsi:         [PASS][72] -> [DMESG-WARN][73]
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-bxt-dsi/igt@debugfs_test@read_all_entries.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-bxt-dsi/igt@debugfs_test@read_all_entries.html
    - fi-cfl-8700k:       [PASS][74] -> [DMESG-WARN][75]
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-cfl-8700k/igt@debugfs_test@read_all_entries.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-cfl-8700k/igt@debugfs_test@read_all_entries.html
    - fi-cml-u2:          [PASS][76] -> [DMESG-WARN][77]
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-cml-u2/igt@debugfs_test@read_all_entries.html
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-cml-u2/igt@debugfs_test@read_all_entries.html
    - fi-whl-u:           [PASS][78] -> [DMESG-WARN][79]
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-whl-u/igt@debugfs_test@read_all_entries.html
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-whl-u/igt@debugfs_test@read_all_entries.html
    - fi-bsw-n3050:       [PASS][80] -> [DMESG-WARN][81]
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-bsw-n3050/igt@debugfs_test@read_all_entries.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-bsw-n3050/igt@debugfs_test@read_all_entries.html
    - fi-skl-6700k2:      [PASS][82] -> [DMESG-WARN][83]
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-skl-6700k2/igt@debugfs_test@read_all_entries.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-skl-6700k2/igt@debugfs_test@read_all_entries.html
    - fi-hsw-4770:        [PASS][84] -> [DMESG-WARN][85]
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-hsw-4770/igt@debugfs_test@read_all_entries.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-hsw-4770/igt@debugfs_test@read_all_entries.html
    - fi-cfl-guc:         [PASS][86] -> [DMESG-WARN][87]
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-cfl-guc/igt@debugfs_test@read_all_entries.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-cfl-guc/igt@debugfs_test@read_all_entries.html
    - fi-icl-guc:         [PASS][88] -> [DMESG-WARN][89]
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-icl-guc/igt@debugfs_test@read_all_entries.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-icl-guc/igt@debugfs_test@read_all_entries.html

  
New tests
---------

  New tests have been introduced between CI_DRM_6376 and Patchwork_13460:

### New IGT tests (1) ###

  * igt@i915_selftest@live_memory_region:
    - Statuses :
    - Exec time: [None] s

  

Known issues
------------

  Here are the changes found in Patchwork_13460 that come from known issues:

### IGT changes ###

#### Warnings ####

  * igt@runner@aborted:
    - fi-kbl-r:           [FAIL][90] ([fdo#103841] / [fdo#110992]) -> [FAIL][91] ([fdo#110992])
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6376/fi-kbl-r/igt@runner@aborted.html
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/fi-kbl-r/igt@runner@aborted.html

  
  [fdo#103841]: https://bugs.freedesktop.org/show_bug.cgi?id=103841
  [fdo#110992]: https://bugs.freedesktop.org/show_bug.cgi?id=110992


Participating hosts (53 -> 45)
------------------------------

  Additional (1): fi-snb-2600 
  Missing    (9): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-icl-y fi-icl-dsi fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6376 -> Patchwork_13460

  CI_DRM_6376: 092d19896d2abb3f132713e72608386f670ddbb0 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5071: 3c4edeba35ac699db5b39600eb17f4151c6b42fd @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13460: 2c530c39178dc8c6a2966fe82541debf5785546c @ git://anongit.freedesktop.org/gfx-ci/linux


== Kernel 32bit build ==

Warning: Kernel 32bit buildtest failed:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/build_32bit.log

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 112 modules
ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1287: recipe for target 'modules' failed
make: *** [modules] Error 2


== Linux commits ==

2c530c39178d HAX drm/i915/lmem: default userspace allocations to LMEM
f5de7bb8e2d3 HAX drm/i915: add the fake lmem region
45761cc62ad8 drm/i915/query: Expose memory regions through the query uAPI
d3e6c6c9f2a5 drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
80c18bff8a6f drm/i915: support basic object migration
8c21fd6a50e8 drm/i915: cpu-map based dumb buffers
3a94fa78f237 drm/i915: Add cpu and lmem fault handlers
5cd5c23c7ae8 drm/i915/lmem: add helper to get CPU accessible offset
4981ff369abe drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
c5129ddc06ad drm/i915: Allow i915 to manage the vma offset nodes instead of drm core
bf7a75b46b28 drm/i915: Don't try to place HWS in non-existing mappable region
6cef423853ea drm/i915: error capture with no ggtt slot
c3ebef177d7d drm/i915/selftests: check for missing aperture
5681ea496f31 drm/i915: set num_fence_regs to 0 if there is no aperture
3fb004e577d6 drm/i915: expose missing map_gtt support to users
50a3eeb8189f drm/i915: do not map aperture if it is not available.
089ac99864c6 drm/i915: define HAS_MAPPABLE_APERTURE
3fff0096edab drm/i915: treat stolen as a region
eb9166edbaaa drm/i915: treat shmem as a region
c0332db20e13 drm/i915: enumerate and init each supported region
282c4317ef1a drm/i915/lmem: support pwrite
1c5dce2c18de drm/i915/lmem: support pread
6bef737f4655 drm/i915/lmem: support CPU relocations
7e21a904a5ac drm/i915/selftest: extend coverage to include LMEM huge-pages
88e9fcca4054 drm/i915/selftests: don't just test CACHE_NONE for huge-pages
cc31698abc26 drm/i915/selftests: add write-dword test for LMEM
4de58ea97434 drm/i915/selftests: move gpu-write-dw into utils
cac31ed0edda drm/i915/blt: support copying objects
8784beb8d0cd drm/i915/lmem: support kernel mapping
894c2da949ad drm/i915: setup io-mapping for LMEM
07335c6c968c drm/i915: support creating LMEM objects
158dcac99d23 drm/i915: Add memory region information to device_info
896ab6f953bb drm/i915/region: support volatile objects
97c56199279a drm/i915/region: support continuous allocations
b199239505e9 drm/i915/region: support basic eviction
9dd43433710a drm/i915: introduce intel_memory_region
6f6eb8a214a1 drm/i915: buddy allocator

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13460/

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 16/37] drm/i915/lmem: support pread
  2019-06-27 20:56 ` [PATCH v2 16/37] drm/i915/lmem: support pread Matthew Auld
  2019-06-27 23:50   ` Chris Wilson
@ 2019-07-30  8:58   ` Daniel Vetter
  2019-07-30  9:25     ` Matthew Auld
  2019-07-30 12:05     ` Chris Wilson
  1 sibling, 2 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-07-30  8:58 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Thu, Jun 27, 2019 at 09:56:12PM +0100, Matthew Auld wrote:
> We need to add support for pread'ing an LMEM object.

Why? Usage outside of igts seems pretty dead, at least looking at iris
and anv. This was kinda a neat thing from when we hadn't yet realized that
doing clflush in userspace is both possible and more efficient.

Same for pwrite, iris just dropped it, anv doesn't seem to use it. And I
thought the mesa plan is to drop the old classic driver by the time we'll
need lmem. It's not much, but would allow us to drop a few things.
-Daniel

> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  2 +
>  drivers/gpu/drm/i915/i915_gem.c               |  6 ++
>  drivers/gpu/drm/i915/intel_region_lmem.c      | 76 +++++++++++++++++++
>  3 files changed, 84 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 80ff5ad9bc07..8cdee185251a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -52,6 +52,8 @@ struct drm_i915_gem_object_ops {
>  	void (*truncate)(struct drm_i915_gem_object *obj);
>  	void (*writeback)(struct drm_i915_gem_object *obj);
>  
> +	int (*pread)(struct drm_i915_gem_object *,
> +		     const struct drm_i915_gem_pread *arg);
>  	int (*pwrite)(struct drm_i915_gem_object *obj,
>  		      const struct drm_i915_gem_pwrite *arg);
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 85677ae89849..4ba386ab35e7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -463,6 +463,12 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>  
>  	trace_i915_gem_object_pread(obj, args->offset, args->size);
>  
> +	ret = -ENODEV;
> +	if (obj->ops->pread)
> +		ret = obj->ops->pread(obj, args);
> +	if (ret != -ENODEV)
> +		goto out;
> +
>  	ret = i915_gem_object_wait(obj,
>  				   I915_WAIT_INTERRUPTIBLE,
>  				   MAX_SCHEDULE_TIMEOUT);
> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
> index 701bcac3479e..54b2c7bf177d 100644
> --- a/drivers/gpu/drm/i915/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
> @@ -7,10 +7,86 @@
>  #include "intel_memory_region.h"
>  #include "intel_region_lmem.h"
>  
> +static int lmem_pread(struct drm_i915_gem_object *obj,
> +		      const struct drm_i915_gem_pread *arg)
> +{
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	struct intel_runtime_pm *rpm = &i915->runtime_pm;
> +	intel_wakeref_t wakeref;
> +	struct dma_fence *fence;
> +	char __user *user_data;
> +	unsigned int offset;
> +	unsigned long idx;
> +	u64 remain;
> +	int ret;
> +
> +	ret = i915_gem_object_pin_pages(obj);
> +	if (ret)
> +		return ret;
> +
> +	i915_gem_object_lock(obj);
> +	ret = i915_gem_object_set_to_wc_domain(obj, false);
> +	if (ret) {
> +		i915_gem_object_unlock(obj);
> +		goto out_unpin;
> +	}
> +
> +	fence = i915_gem_object_lock_fence(obj);
> +	i915_gem_object_unlock(obj);
> +	if (!fence) {
> +		ret = -ENOMEM;
> +		goto out_unpin;
> +	}
> +
> +	wakeref = intel_runtime_pm_get(rpm);
> +
> +	remain = arg->size;
> +	user_data = u64_to_user_ptr(arg->data_ptr);
> +	offset = offset_in_page(arg->offset);
> +	for (idx = arg->offset >> PAGE_SHIFT; remain; idx++) {
> +		unsigned long unwritten;
> +		void __iomem *vaddr;
> +		int length;
> +
> +		length = remain;
> +		if (offset + length > PAGE_SIZE)
> +			length = PAGE_SIZE - offset;
> +
> +		vaddr = i915_gem_object_lmem_io_map_page(obj, idx);
> +		if (!vaddr) {
> +			ret = -ENOMEM;
> +			goto out_put;
> +		}
> +
> +		unwritten = copy_to_user(user_data,
> +					 (void __force *)vaddr + offset,
> +					 length);
> +		io_mapping_unmap_atomic(vaddr);
> +		if (unwritten) {
> +			ret = -EFAULT;
> +			goto out_put;
> +		}
> +
> +		remain -= length;
> +		user_data += length;
> +		offset = 0;
> +	}
> +
> +out_put:
> +	intel_runtime_pm_put(rpm, wakeref);
> +	i915_gem_object_unlock_fence(obj, fence);
> +out_unpin:
> +	i915_gem_object_unpin_pages(obj);
> +
> +	return ret;
> +}
> +
>  static const struct drm_i915_gem_object_ops region_lmem_obj_ops = {
>  	.get_pages = i915_memory_region_get_pages_buddy,
>  	.put_pages = i915_memory_region_put_pages_buddy,
>  	.release = i915_gem_object_release_memory_region,
> +
> +	.pread = lmem_pread,
>  };
>  
>  static struct drm_i915_gem_object *
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 16/37] drm/i915/lmem: support pread
  2019-07-30  8:58   ` Daniel Vetter
@ 2019-07-30  9:25     ` Matthew Auld
  2019-07-30  9:50       ` Daniel Vetter
  2019-07-30 12:05     ` Chris Wilson
  1 sibling, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-07-30  9:25 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 30/07/2019 09:58, Daniel Vetter wrote:
> On Thu, Jun 27, 2019 at 09:56:12PM +0100, Matthew Auld wrote:
>> We need to add support for pread'ing an LMEM object.
> 
> Why? Usage outside from igts seems pretty dead, at least looking at iris
> and anv. This was kinda a neat thing for when we didn't yet realized that
> doing clflush in userspace is both possible and more efficient.
> 
> Same for pwrite, iris just dropped it, anv doesn't seem to use it. And I
> thought mesa plan is to drop the old classic driver for when we'll need
> lmem. It's not much, but would allow us to drop a few things.

Hmm, it was at least useful in the super early days for debugging. If we 
were to drop this, what do we do with the igts? Just use mmap?

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  2019-06-27 20:56 ` [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET Matthew Auld
  2019-06-28  0:12   ` Chris Wilson
@ 2019-07-30  9:49   ` Daniel Vetter
  2019-07-30 14:28     ` Matthew Auld
  1 sibling, 1 reply; 88+ messages in thread
From: Daniel Vetter @ 2019-07-30  9:49 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Thu, Jun 27, 2019 at 09:56:25PM +0100, Matthew Auld wrote:
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> Add a new CPU mmap implementation that allows multiple fault handlers
> that depends on the object's backing pages.
> 
> Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
> and use the zero extending behaviour of drm to differentiate between
> them, when we inspect the flags.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

So I thought that the plan is to reject invalid mmaps, i.e. mmap modes
which are not compatible with all placement options. Given that, why do
we need this?

- cpu mmap with all the flags still keeps working, as long as the only
  placement you select is smem.

- for lmem/stolen the only option we have is a wc mapping, either through
  the pci bar or through the gtt. So for objects only sitting in there
  also no problem, we can just keep using the current gtt mmap stuff (but
  redirect it internally).

- that leaves us with objects which can move around. The only option
  allowed is WC, and the gtt mmap ioctl does that already. When the
  object is in smem we'll need to redirect it to a cpu wc mmap, but I
  think we need to do that anyway.

So not really seeing what the uapi problem is you're trying to solve here?

Can you pls explain why we need this?

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |  2 ++
>  drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 30 ++++++++++++++++++
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
>  drivers/gpu/drm/i915/i915_drv.c               |  3 +-
>  include/uapi/drm/i915_drm.h                   | 31 +++++++++++++++++++
>  5 files changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> index ddc7f2a52b3e..5abd5b2172f2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> @@ -30,6 +30,8 @@ int i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>  			struct drm_file *file);
>  int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>  			    struct drm_file *file);
> +int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
> +			       struct drm_file *file_priv);
>  int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>  			 struct drm_file *file);
>  int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index 7b46f44d9c20..cbf89e80a97b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -536,12 +536,42 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>  			struct drm_file *file)
>  {
>  	struct drm_i915_gem_mmap_offset *args = data;
> +	struct drm_i915_private *i915 = to_i915(dev);
> +
> +	if (args->flags & I915_MMAP_OFFSET_FLAGS)
> +		return i915_gem_mmap_offset_ioctl(dev, data, file);
> +
> +	if (!HAS_MAPPABLE_APERTURE(i915)) {
> +		DRM_ERROR("No aperture, cannot mmap via legacy GTT\n");
> +		return -ENODEV;
> +	}
>  
>  	return __assign_gem_object_mmap_data(file, args->handle,
>  					     I915_MMAP_TYPE_GTT,
>  					     &args->offset);
>  }
>  
> +int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
> +			       struct drm_file *file)
> +{
> +	struct drm_i915_gem_mmap_offset *args = data;
> +	enum i915_mmap_type type;
> +
> +	if ((args->flags & (I915_MMAP_OFFSET_WC | I915_MMAP_OFFSET_WB)) &&
> +	    !boot_cpu_has(X86_FEATURE_PAT))
> +		return -ENODEV;
> +
> +	if (args->flags & I915_MMAP_OFFSET_WC)
> +		type = I915_MMAP_TYPE_OFFSET_WC;
> +	else if (args->flags & I915_MMAP_OFFSET_WB)
> +		type = I915_MMAP_TYPE_OFFSET_WB;
> +	else if (args->flags & I915_MMAP_OFFSET_UC)
> +		type = I915_MMAP_TYPE_OFFSET_UC;
> +
> +	return __assign_gem_object_mmap_data(file, args->handle, type,
> +					     &args->offset);
> +}
> +
>  void i915_mmap_offset_object_release(struct kref *ref)
>  {
>  	struct i915_mmap_offset *mmo = container_of(ref,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 86f358da8085..f95e54a25426 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -63,6 +63,9 @@ struct drm_i915_gem_object_ops {
>  
>  enum i915_mmap_type {
>  	I915_MMAP_TYPE_GTT = 0,
> +	I915_MMAP_TYPE_OFFSET_WC,
> +	I915_MMAP_TYPE_OFFSET_WB,
> +	I915_MMAP_TYPE_OFFSET_UC,
>  };
>  
>  struct i915_mmap_offset {
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 0f1f3b7f3029..8dadd6b9a0a9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -459,6 +459,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
>  	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
>  	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
>  	case I915_PARAM_HAS_EXEC_SUBMIT_FENCE:
> +	case I915_PARAM_MMAP_OFFSET_VERSION:
>  		/* For the time being all of these are always true;
>  		 * if some supported hardware does not have one of these
>  		 * features this value needs to be provided from
> @@ -3176,7 +3177,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_GEM_PREAD, i915_gem_pread_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_PWRITE, i915_gem_pwrite_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP, i915_gem_mmap_ioctl, DRM_RENDER_ALLOW),
> -	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP_GTT, i915_gem_mmap_gtt_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP_OFFSET, i915_gem_mmap_gtt_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_SET_DOMAIN, i915_gem_set_domain_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_SW_FINISH, i915_gem_sw_finish_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_SET_TILING, i915_gem_set_tiling_ioctl, DRM_RENDER_ALLOW),
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 328d05e77d9f..729e729e2282 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -359,6 +359,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_I915_QUERY			0x39
>  #define DRM_I915_GEM_VM_CREATE		0x3a
>  #define DRM_I915_GEM_VM_DESTROY		0x3b
> +#define DRM_I915_GEM_MMAP_OFFSET   	DRM_I915_GEM_MMAP_GTT
>  /* Must be kept compact -- no holes */
>  
>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> @@ -421,6 +422,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>  #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
>  #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
> +#define DRM_IOCTL_I915_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_OFFSET, struct drm_i915_gem_mmap_offset)
>  
>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>   * on the security mechanisms provided by hardware.
> @@ -610,6 +612,10 @@ typedef struct drm_i915_irq_wait {
>   * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
>   */
>  #define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
> +
> +/* Mmap offset ioctl */
> +#define I915_PARAM_MMAP_OFFSET_VERSION	54
> +
>  /* Must be kept compact -- no holes and well documented */
>  
>  typedef struct drm_i915_getparam {
> @@ -785,6 +791,31 @@ struct drm_i915_gem_mmap_gtt {
>  	__u64 offset;
>  };
>  
> +struct drm_i915_gem_mmap_offset {
> +	/** Handle for the object being mapped. */
> +	__u32 handle;
> +	__u32 pad;
> +	/**
> +	 * Fake offset to use for subsequent mmap call
> +	 *
> +	 * This is a fixed-size type for 32/64 compatibility.
> +	 */
> +	__u64 offset;
> +
> +	/**
> +	 * Flags for extended behaviour.
> +	 *
> +	 * It is mandatory that either one of the _WC/_WB flags
> +	 * should be passed here.
> +	 */
> +	__u64 flags;
> +#define I915_MMAP_OFFSET_WC (1 << 0)
> +#define I915_MMAP_OFFSET_WB (1 << 1)
> +#define I915_MMAP_OFFSET_UC (1 << 2)
> +#define I915_MMAP_OFFSET_FLAGS \
> +	(I915_MMAP_OFFSET_WC | I915_MMAP_OFFSET_WB | I915_MMAP_OFFSET_UC)
> +};
> +
>  struct drm_i915_gem_set_domain {
>  	/** Handle for the object */
>  	__u32 handle;
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v2 16/37] drm/i915/lmem: support pread
  2019-07-30  9:25     ` Matthew Auld
@ 2019-07-30  9:50       ` Daniel Vetter
  0 siblings, 0 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-07-30  9:50 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Jul 30, 2019 at 10:25:10AM +0100, Matthew Auld wrote:
> On 30/07/2019 09:58, Daniel Vetter wrote:
> > On Thu, Jun 27, 2019 at 09:56:12PM +0100, Matthew Auld wrote:
> > > We need to add support for pread'ing an LMEM object.
> > 
> > Why? Usage outside of igts seems pretty dead, at least looking at iris
> > and anv. This was kind of a neat thing from when we hadn't yet realized
> > that doing clflush in userspace is both possible and more efficient.
> > 
> > Same for pwrite: iris just dropped it, and anv doesn't seem to use it.
> > And I thought the Mesa plan is to drop the old classic driver by the time
> > we need lmem. It's not much, but it would allow us to drop a few things.
> 
> Hmm, it was at least useful in the super early days for debugging. If we
> were to drop this, what do we do with the igts? Just use mmap?

wc mmap is probably simplest. I think we could do a compat function in igt
that does that when pwrite isn't available. We could also have that in
libdrm_intel, in case some of the UMDs have a hard time getting off it.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v2 16/37] drm/i915/lmem: support pread
  2019-07-30  8:58   ` Daniel Vetter
  2019-07-30  9:25     ` Matthew Auld
@ 2019-07-30 12:05     ` Chris Wilson
  2019-07-30 12:42       ` Daniel Vetter
  1 sibling, 1 reply; 88+ messages in thread
From: Chris Wilson @ 2019-07-30 12:05 UTC (permalink / raw)
  To: Daniel Vetter, Matthew Auld; +Cc: intel-gfx

Quoting Daniel Vetter (2019-07-30 09:58:22)
> On Thu, Jun 27, 2019 at 09:56:12PM +0100, Matthew Auld wrote:
> > We need to add support for pread'ing an LMEM object.
> 
> Why? Usage outside of igts seems pretty dead, at least looking at iris
> and anv. This was kind of a neat thing from when we hadn't yet realized
> that doing clflush in userspace is both possible and more efficient.
> 
> Same for pwrite: iris just dropped it, and anv doesn't seem to use it.
> And I thought the Mesa plan is to drop the old classic driver by the time
> we need lmem. It's not much, but it would allow us to drop a few things.

From the opposite perspective, it should only be a wrapper around code
that is being used internally for similar transfers. (One side-effect is
that it can be used to poke more directly at those internals.) It is also
not clear what the preferred strategy will be in future, especially as
people start discussing migration-on-pagefault.

It comes down to whether the maintenance burden of maintaining a
consistent API is worth the maintenance burden of not!
-Chris

* Re: [PATCH v2 16/37] drm/i915/lmem: support pread
  2019-07-30 12:05     ` Chris Wilson
@ 2019-07-30 12:42       ` Daniel Vetter
  0 siblings, 0 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-07-30 12:42 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Matthew Auld

On Tue, Jul 30, 2019 at 2:05 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Daniel Vetter (2019-07-30 09:58:22)
> > On Thu, Jun 27, 2019 at 09:56:12PM +0100, Matthew Auld wrote:
> > > We need to add support for pread'ing an LMEM object.
> >
> > Why? Usage outside of igts seems pretty dead, at least looking at iris
> > and anv. This was kind of a neat thing from when we hadn't yet realized
> > that doing clflush in userspace is both possible and more efficient.
> >
> > Same for pwrite: iris just dropped it, and anv doesn't seem to use it.
> > And I thought the Mesa plan is to drop the old classic driver by the time
> > we need lmem. It's not much, but it would allow us to drop a few things.
>
> From the opposite perspective, it should only be a wrapper around code
> that is being used internally for similar transfers. (One side-effect is
> that it can be used to poke more directly at those internals.) It is also
> not clear what the preferred strategy will be in future, especially as
> people start discussing migration-on-pagefault.

Hm, where are we looking at migrate-on-pagefault?

I mean aside from the entire resurrection of the mappable daemon
because apparently we can't design apertures for pci bars which are
big enough (unlike amd, which fixed this now). But that's just an
lmem->lmem migration to squeeze it into the right range (and hey we
know how to do that, we even have the old code still).

> It comes down to whether the maintenance burden of maintaining a
> consistent API is worth the maintenance burden of not!

Yeah it's minor, but then pwrite has some irky corner-cases (I
stumbled over the vlc wtf that originally motivated the introduction
of the pwrite hook, and the reintroduction of the page-by-page pwrite
for shmem that's not pinned). So it's not the cleanest uapi, with almost
a decade of gunk now sitting on top. And when I went
looking at iris/anv it seems like we've sunset it for good going
forward. Note I'm not going for complete removal, just not allowing it
if you set lmem as one of the placements of your bo. So pwrite into an
upload buffer in smem, mapped through the TT to the gpu, would still
be fine. Which I guess should cover all the igt pwrites for
batchbuffers.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  2019-07-30  9:49   ` Daniel Vetter
@ 2019-07-30 14:28     ` Matthew Auld
  2019-07-30 16:22       ` Daniel Vetter
  0 siblings, 1 reply; 88+ messages in thread
From: Matthew Auld @ 2019-07-30 14:28 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On 30/07/2019 10:49, Daniel Vetter wrote:
> On Thu, Jun 27, 2019 at 09:56:25PM +0100, Matthew Auld wrote:
>> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
>>
>> Add a new CPU mmap implementation that allows multiple fault handlers
>> that depend on the object's backing pages.
>>
>> Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
>> and use the zero extending behaviour of drm to differentiate between
>> them, when we inspect the flags.
>>
>> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> 
> So I thought that the plan is to reject invalid mmaps, i.e. mmap modes
> which are not compatible with all placement options. Given that, why do
> we need this?

We are meant to reject anything !wc for LMEM. There were some patches 
for that but I guess they slipped under the radar...

> 
> - cpu mmap with all the flags still keeps working, as long as the only
>    placement you select is smem.
> 
> - for lmem/stolen the only option we have is a wc mapping, either through
>    the pci bar or through the gtt. So for objects only sitting in there
>    also no problem, we can just keep using the current gtt mmap stuff (but
>    redirect it internally).
> 
> - that leaves us with objects which can move around. The only option
>    allowed is WC, and the gtt mmap ioctl does that already. When the object is in smem
>    we'll need to redirect it to a cpu wc mmap, but I think we need to do
>    that anyway.

So for legacy, gtt_mmap will still go through the aperture, otherwise if 
LMEM is supported then there is no aperture, so we just wc mmap via cpu 
or LMEMBAR depending on the final object placement. And cpu_mmap still 
works if we don't care about LMEM. Hmm, so do we even need most of the 
previous patch then? Also, does that mean we also have to track the 
placement of an object in igt?

gem_mmap__wc:

if (supports_lmem(dev))
	gtt_mmap();
else
	gem_mmap(wc);

gem_mmap__wc:

if (placement_contains(obj, LMEM))
	gtt_mmap();
else
	gem_mmap(wc);

?

> 
> So not really seeing what the uapi problem is you're trying to solve here?
> 
> Can you pls explain why we need this?

The naming of gtt_mmap seemed confusing, since there is no aperture, and 
having one mmap ioctl to cover both smem and lmem seemed like a nice 
idea... also I think UMDs stopped using gtt_mmap (or were told to?) but 
maybe those aren't good enough reasons.


* Re: [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
  2019-06-27 20:56 ` [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION Matthew Auld
  2019-06-28  0:22   ` Chris Wilson
  2019-06-28  5:53   ` Tvrtko Ursulin
@ 2019-07-30 16:17   ` Daniel Vetter
  2 siblings, 0 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-07-30 16:17 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Thu, Jun 27, 2019 at 09:56:30PM +0100, Matthew Auld wrote:
> From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> 
> This call specifies in which memory region an object should be placed.
> 
> Note that changing the object's backing storage should be done
> immediately after an object is created, or while it's not yet in use;
> otherwise this will fail on a busy object.
> 
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c |  12 ++
>  drivers/gpu/drm/i915/gem/i915_gem_context.h |   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h  |   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.c  | 117 ++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_drv.c             |   2 +-
>  include/uapi/drm/i915_drm.h                 |  27 +++++
>  6 files changed, 161 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 8a9787cf0cd0..157ca8247752 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -75,6 +75,7 @@
>  #include "i915_globals.h"
>  #include "i915_trace.h"
>  #include "i915_user_extensions.h"
> +#include "i915_gem_ioctls.h"
>  
>  #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
>  
> @@ -2357,6 +2358,17 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  	return ret;
>  }
>  
> +int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
> +			    struct drm_file *file)
> +{
> +	struct drm_i915_gem_context_param *args = data;
> +
> +	if (args->param <= I915_CONTEXT_PARAM_MAX)
> +		return i915_gem_context_setparam_ioctl(dev, data, file);
> +
> +	return i915_gem_object_setparam_ioctl(dev, data, file);
> +}
> +
>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
>  				       void *data, struct drm_file *file)
>  {
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> index 9691dd062f72..d5a9a63bb34c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> @@ -157,6 +157,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  				    struct drm_file *file_priv);
>  int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  				    struct drm_file *file_priv);
> +int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
> +			    struct drm_file *file);
>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>  				       struct drm_file *file);
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> index 5abd5b2172f2..af7465bceebd 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
> @@ -32,6 +32,8 @@ int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>  			    struct drm_file *file);
>  int i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file_priv);
> +int i915_gem_object_setparam_ioctl(struct drm_device *dev, void *data,
> +				   struct drm_file *file_priv);
>  int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>  			 struct drm_file *file);
>  int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index 691af388e4e7..bc95f449de50 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -551,6 +551,123 @@ int __init i915_global_objects_init(void)
>  	return 0;
>  }
>  
> +static enum intel_region_id
> +__region_id(u32 region)
> +{
> +	enum intel_region_id id;
> +
> +	for (id = 0; id < ARRAY_SIZE(intel_region_map); ++id) {
> +		if (intel_region_map[id] == region)
> +			return id;
> +	}
> +
> +	return INTEL_MEMORY_UKNOWN;
> +}
> +
> +static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
> +					 struct drm_i915_gem_object_param *args,
> +					 struct drm_file *file,
> +					 struct drm_i915_gem_object *obj)
> +{
> +	struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context;
> +	u32 __user *uregions = u64_to_user_ptr(args->data);
> +	u32 uregions_copy[INTEL_MEMORY_UKNOWN];
> +	int i, ret;
> +
> +	if (args->size > ARRAY_SIZE(intel_region_map))
> +		return -EINVAL;
> +
> +	memset(uregions_copy, 0, sizeof(uregions_copy));
> +	for (i = 0; i < args->size; i++) {
> +		u32 region;
> +
> +		ret = get_user(region, uregions);
> +		if (ret)
> +			return ret;
> +
> +		uregions_copy[i] = region;
> +		++uregions;
> +	}
> +
> +	mutex_lock(&dev_priv->drm.struct_mutex);
> +	ret = i915_gem_object_prepare_move(obj);
> +	if (ret) {
> +		DRM_ERROR("Cannot set memory region, object in use\n");
> +	        goto err;

So if all that's changed is the priority of allocations, but not the
overall list, will we allow this?

I think this will be needed for GL, where figuring out the usage pattern
of a given upload buffer is very much an observational thing in many
cases. And we might later change from an lmem-then-smem priority order to
preferring smem (but still allowing lmem so that the uapi of the bo
doesn't change).

Also, if we go with this, it would be nice to make that list of possible
allocations static, since I think the plan is to use that to limit what's
possible wrt uapi (stuff like mmap, but also pwrite/pread, and all that).
-Daniel


> +	}
> +
> +	if (args->size > ARRAY_SIZE(intel_region_map))
> +		return -EINVAL;
> +
> +	for (i = 0; i < args->size; i++) {
> +		u32 region = uregions_copy[i];
> +		enum intel_region_id id = __region_id(region);
> +
> +		if (id == INTEL_MEMORY_UKNOWN) {
> +			ret = -EINVAL;
> +			goto err;
> +		}
> +
> +		ret = i915_gem_object_migrate(obj, ce, id);
> +		if (!ret) {
> +			if (MEMORY_TYPE_FROM_REGION(region) ==
> +			    INTEL_LMEM) {
> +				/*
> +				 * TODO: this should be part of get_pages(),
> +				 * when async get_pages arrives
> +				 */
> +				ret = i915_gem_object_fill_blt(obj, ce, 0);
> +				if (ret) {
> +					DRM_ERROR("Failed clearing the object\n");
> +					goto err;
> +				}
> +
> +				i915_gem_object_lock(obj);
> +				ret = i915_gem_object_set_to_cpu_domain(obj, false);
> +				i915_gem_object_unlock(obj);
> +				if (ret)
> +					goto err;
> +			}
> +			break;
> +		}
> +	}
> +err:
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
> +	return ret;
> +}
> +
> +int i915_gem_object_setparam_ioctl(struct drm_device *dev, void *data,
> +				   struct drm_file *file)
> +{
> +
> +	struct drm_i915_gem_object_param *args = data;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct drm_i915_gem_object *obj;
> +	int ret;
> +
> +	obj = i915_gem_object_lookup(file, args->handle);
> +	if (!obj)
> +		return -ENOENT;
> +
> +	switch (args->param) {
> +	case I915_PARAM_MEMORY_REGION:
> +		ret = i915_gem_object_region_select(dev_priv, args, file, obj);
> +		if (ret) {
> +			DRM_ERROR("Cannot set memory region, migration failed\n");
> +			goto err;
> +		}
> +
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +
> +err:
> +	i915_gem_object_put(obj);
> +	return ret;
> +}
> +
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>  #include "selftests/huge_gem_object.c"
>  #include "selftests/huge_pages.c"
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 1c3d5cb2893c..3d6fe993f26e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -3196,7 +3196,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
> -	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_setparam_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_RENDER_ALLOW),
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 729e729e2282..5cf976e7608a 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -360,6 +360,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_I915_GEM_VM_CREATE		0x3a
>  #define DRM_I915_GEM_VM_DESTROY		0x3b
>  #define DRM_I915_GEM_MMAP_OFFSET   	DRM_I915_GEM_MMAP_GTT
> +#define DRM_I915_GEM_OBJECT_SETPARAM	DRM_I915_GEM_CONTEXT_SETPARAM
>  /* Must be kept compact -- no holes */
>  
>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> @@ -423,6 +424,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
>  #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
>  #define DRM_IOCTL_I915_GEM_MMAP_OFFSET		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_OFFSET, struct drm_i915_gem_mmap_offset)
> +#define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param)
>  
>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>   * on the security mechanisms provided by hardware.
> @@ -1595,11 +1597,36 @@ struct drm_i915_gem_context_param {
>   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
>   */
>  #define I915_CONTEXT_PARAM_ENGINES	0xa
> +
> +#define I915_CONTEXT_PARAM_MAX	        0xffffffff
>  /* Must be kept compact -- no holes and well documented */
>  
>  	__u64 value;
>  };
>  
> +struct drm_i915_gem_object_param {
> +	/** Handle for the object */
> +	__u32 handle;
> +
> +	__u32 size;
> +
> +	/* Must be 1 */
> +	__u32 object_class;
> +
> +	/** Set the memory region for the object listed in preference order
> +	 *  as an array of region ids within data. To force an object
> +	 *  to a particular memory region, set the region as the sole entry.
> +	 *
> +	 *  Valid region ids are derived from the id field of
> +	 *  struct drm_i915_memory_region_info.
> +	 *  See struct drm_i915_query_memory_region_info.
> +	 */
> +#define I915_PARAM_MEMORY_REGION 0x1
> +	__u32 param;
> +
> +	__u64 data;
> +};
> +
>  /**
>   * Context SSEU programming
>   *
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  2019-07-30 14:28     ` Matthew Auld
@ 2019-07-30 16:22       ` Daniel Vetter
  2019-08-12 16:18         ` Daniel Vetter
  0 siblings, 1 reply; 88+ messages in thread
From: Daniel Vetter @ 2019-07-30 16:22 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Jul 30, 2019 at 03:28:11PM +0100, Matthew Auld wrote:
> On 30/07/2019 10:49, Daniel Vetter wrote:
> > On Thu, Jun 27, 2019 at 09:56:25PM +0100, Matthew Auld wrote:
> > > From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > > 
> > > Add a new CPU mmap implementation that allows multiple fault handlers
> > > that depend on the object's backing pages.
> > > 
> > > Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
> > > and use the zero extending behaviour of drm to differentiate between
> > > them, when we inspect the flags.
> > > 
> > > Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > 
> > So I thought that the plan is to reject invalid mmaps, i.e. mmap modes
> > which are not compatible with all placement options. Given that, why do
> > we need this?
> 
> We are meant to reject anything !wc for LMEM. There were some patches for
> that but I guess they slipped under the radar...
> 
> > 
> > - cpu mmap with all the flags still keeps working, as long as the only
> >    placement you select is smem.
> > 
> > - for lmem/stolen the only option we have is a wc mapping, either through
> >    the pci bar or through the gtt. So for objects only sitting in there
> >    also no problem, we can just keep using the current gtt mmap stuff (but
> >    redirect it internally).
> > 
> > - that leaves us with objects which can move around. The only option
> >    allowed is WC, and the gtt mmap ioctl does that already. When the object is in smem
> >    we'll need to redirect it to a cpu wc mmap, but I think we need to do
> >    that anyway.
> 
> So for legacy, gtt_mmap will still go through the aperture, otherwise if
> LMEM is supported then there is no aperture, so we just wc mmap via cpu or
> LMEMBAR depending on the final object placement. And cpu_mmap still works if
> we don't care about LMEM. Hmm, so do we even need most of the previous patch
> then? Also, does that mean we also have to track the placement of an object
> in igt?
> 
> gem_mmap__wc:
> 
> if (supports_lmem(dev))
> 	gtt_mmap();
> else
> 	gem_mmap(wc);
> 
> gem_mmap__wc:
> 
> if (placement_contains(obj, LMEM))
> 	gtt_mmap();
> else
> 	gem_mmap(wc);
> 
> ?

Well if you want cpu wc mmaps, then just allocate it as smem ... we might
need a new gem_mmap__lmem I guess to exercise all the possible ways to get
at stuff in lmem (including when it migrates around underneath us while we
access it through the mmap). I wouldn't try too hard to smash all these
use/testcases into one.
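
As a side note, the dispatch in the igt sketch quoted above boils down to a simple predicate. A minimal standalone illustration (every name here is hypothetical, not real igt or i915 API):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the igt sketch quoted above: an object
 * whose placement may include lmem can only be mapped WC through the
 * mmap_offset path (LMEMBAR), while smem-only objects keep the plain
 * cpu WC mmap. */
enum wc_mmap_path {
	WC_VIA_CPU,	/* regular cpu mmap with WC caching */
	WC_VIA_OFFSET,	/* mmap_offset, i.e. via LMEMBAR */
};

static enum wc_mmap_path pick_wc_mmap(bool device_has_lmem,
				      bool placement_contains_lmem)
{
	if (device_has_lmem && placement_contains_lmem)
		return WC_VIA_OFFSET;
	return WC_VIA_CPU;
}
```

Whether igt should grow such a helper, or a dedicated gem_mmap__lmem instead, is exactly the open question here.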

> > So not really seeing what the uapi problem is you're trying to solve here?
> > 
> > Can you pls explain why we need this?
> 
> The naming of gtt_mmap seemed confusing, since there is no aperture, and
> having one mmap ioctl to cover both smem and lmem seemed like a nice
> idea... also I think UMDs stopped using gtt_mmap (or were told to?) but maybe
> those aren't good enough reasons.

We stopped using gtt mmap because for many cases cpu WC mmap is faster.

Wrt having a clean slate: Not sure why this would benefit us, we just
diverge a bit more from how this works on !lmem, so a bit more complexity
(not much) everywhere for not much gain.

I'm also not sure whether there will be a whole lot of uses of such a
magic LMEMBAR wc mapping. It's probably slow for the exact same reasons
gtt mmap is slow.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-06-27 20:55 ` [PATCH v2 03/37] drm/i915/region: support basic eviction Matthew Auld
  2019-06-27 22:59   ` Chris Wilson
@ 2019-07-30 16:26   ` Daniel Vetter
  2019-08-15 10:48     ` Matthew Auld
  1 sibling, 1 reply; 88+ messages in thread
From: Daniel Vetter @ 2019-07-30 16:26 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> Support basic eviction for regions.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

So from a very high level this looks like it was largely modelled after
i915_gem_shrink.c and not i915_gem_evict.c (our other "make room, we're
running out of stuff" code). Any specific reasons?

I think i915_gem_evict is a lot closer match for what we want for vram (it
started out to manage severely limited GTT on gen2/3/4) after all. With
the complication that we'll have to manage physical memory with multiple
virtual mappings of it on top, so unfortunately we can't just reuse the
locking pattern Chris has come up with in his struct_mutex-removal branch.
But at least conceptually it should be a lot closer.

But I might be entirely off the track with reconstructing how this code
came to be, so please elaborate a bit.

Thanks, Daniel

> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  7 ++
>  drivers/gpu/drm/i915/i915_gem.c               | 16 ++++
>  drivers/gpu/drm/i915/intel_memory_region.c    | 89 ++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_memory_region.h    | 10 +++
>  .../drm/i915/selftests/intel_memory_region.c  | 73 +++++++++++++++
>  drivers/gpu/drm/i915/selftests/mock_region.c  |  1 +
>  6 files changed, 192 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 8d760e852c4b..87000fc24ab3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -72,6 +72,13 @@ struct drm_i915_gem_object {
>  	 * List of memory region blocks allocated for this object.
>  	 */
>  	struct list_head blocks;
> +	/**
> +	 * Element within memory_region->objects or memory_region->purgeable if
> +	 * the object is marked as DONTNEED. Access is protected by
> +	 * memory_region->obj_lock.
> +	 */
> +	struct list_head region_link;
> +	struct list_head eviction_link;
>  
>  	struct {
>  		/**
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index db3744b0bc80..85677ae89849 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1122,6 +1122,22 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
>  	    !i915_gem_object_has_pages(obj))
>  		i915_gem_object_truncate(obj);
>  
> +	if (obj->memory_region) {
> +		mutex_lock(&obj->memory_region->obj_lock);
> +
> +		switch (obj->mm.madv) {
> +		case I915_MADV_WILLNEED:
> +			list_move(&obj->region_link, &obj->memory_region->objects);
> +			break;
> +		default:
> +			list_move(&obj->region_link,
> +				  &obj->memory_region->purgeable);
> +			break;
> +		}
> +
> +		mutex_unlock(&obj->memory_region->obj_lock);
> +	}
> +
>  	args->retained = obj->mm.madv != __I915_MADV_PURGED;
>  	mutex_unlock(&obj->mm.lock);
>  
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index 4c89853a7769..721b47e46492 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -6,6 +6,56 @@
>  #include "intel_memory_region.h"
>  #include "i915_drv.h"
>  
> +int i915_memory_region_evict(struct intel_memory_region *mem,
> +			     resource_size_t target)
> +{
> +	struct drm_i915_gem_object *obj, *on;
> +	resource_size_t found;
> +	LIST_HEAD(purgeable);
> +	int err;
> +
> +	err = 0;
> +	found = 0;
> +
> +	mutex_lock(&mem->obj_lock);
> +
> +	list_for_each_entry(obj, &mem->purgeable, region_link) {
> +		if (!i915_gem_object_has_pages(obj))
> +			continue;
> +
> +		if (READ_ONCE(obj->pin_global))
> +			continue;
> +
> +		if (atomic_read(&obj->bind_count))
> +			continue;
> +
> +		list_add(&obj->eviction_link, &purgeable);
> +
> +		found += obj->base.size;
> +		if (found >= target)
> +			goto found;
> +	}
> +
> +	err = -ENOSPC;
> +found:
> +	list_for_each_entry_safe(obj, on, &purgeable, eviction_link) {
> +		if (!err) {
> +			__i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
> +
> +			mutex_lock_nested(&obj->mm.lock, I915_MM_SHRINKER);
> +			if (!i915_gem_object_has_pages(obj))
> +				obj->mm.madv = __I915_MADV_PURGED;
> +			mutex_unlock(&obj->mm.lock);
> +		}
> +
> +		list_del(&obj->eviction_link);
> +	}
> +
> +	mutex_unlock(&mem->obj_lock);
> +
> +	return err;
> +}
> +
>  static void
>  memory_region_free_pages(struct drm_i915_gem_object *obj,
>  			 struct sg_table *pages)
> @@ -70,7 +120,8 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
>  		unsigned int order;
>  		u64 block_size;
>  		u64 offset;
> -
> +		bool retry = true;
> +retry:
>  		order = fls(n_pages) - 1;
>  		GEM_BUG_ON(order > mem->mm.max_order);
>  
> @@ -79,9 +130,24 @@ i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj)
>  			if (!IS_ERR(block))
>  				break;
>  
> -			/* XXX: some kind of eviction pass, local to the device */
> -			if (!order--)
> -				goto err_free_blocks;
> +			if (!order--) {
> +				resource_size_t target;
> +				int err;
> +
> +				if (!retry)
> +					goto err_free_blocks;
> +
> +				target = n_pages * mem->mm.min_size;
> +
> +				mutex_unlock(&mem->mm_lock);
> +				err = i915_memory_region_evict(mem, target);
> +				mutex_lock(&mem->mm_lock);
> +				if (err)
> +					goto err_free_blocks;
> +
> +				retry = false;
> +				goto retry;
> +			}
>  		} while (1);
>  
>  		n_pages -= BIT(order);
> @@ -136,6 +202,13 @@ void i915_memory_region_release_buddy(struct intel_memory_region *mem)
>  	i915_buddy_fini(&mem->mm);
>  }
>  
> +void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj)
> +{
> +	mutex_lock(&obj->memory_region->obj_lock);
> +	list_del(&obj->region_link);
> +	mutex_unlock(&obj->memory_region->obj_lock);
> +}
> +
>  struct drm_i915_gem_object *
>  i915_gem_object_create_region(struct intel_memory_region *mem,
>  			      resource_size_t size,
> @@ -164,6 +237,10 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
>  	INIT_LIST_HEAD(&obj->blocks);
>  	obj->memory_region = mem;
>  
> +	mutex_lock(&mem->obj_lock);
> +	list_add(&obj->region_link, &mem->objects);
> +	mutex_unlock(&mem->obj_lock);
> +
>  	return obj;
>  }
>  
> @@ -188,6 +265,10 @@ intel_memory_region_create(struct drm_i915_private *i915,
>  	mem->min_page_size = min_page_size;
>  	mem->ops = ops;
>  
> +	mutex_init(&mem->obj_lock);
> +	INIT_LIST_HEAD(&mem->objects);
> +	INIT_LIST_HEAD(&mem->purgeable);
> +
>  	mutex_init(&mem->mm_lock);
>  
>  	if (ops->init) {
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
> index 8d4736bdde50..bee0c022d295 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.h
> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
> @@ -80,8 +80,16 @@ struct intel_memory_region {
>  	unsigned int type;
>  	unsigned int instance;
>  	unsigned int id;
> +
> +	/* Protects access to objects and purgeable */
> +	struct mutex obj_lock;
> +	struct list_head objects;
> +	struct list_head purgeable;
>  };
>  
> +int i915_memory_region_evict(struct intel_memory_region *mem,
> +			     resource_size_t target);
> +
>  int i915_memory_region_init_buddy(struct intel_memory_region *mem);
>  void i915_memory_region_release_buddy(struct intel_memory_region *mem);
>  
> @@ -89,6 +97,8 @@ int i915_memory_region_get_pages_buddy(struct drm_i915_gem_object *obj);
>  void i915_memory_region_put_pages_buddy(struct drm_i915_gem_object *obj,
>  					struct sg_table *pages);
>  
> +void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj);
> +
>  struct intel_memory_region *
>  intel_memory_region_create(struct drm_i915_private *i915,
>  			   resource_size_t start,
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index c3b160cfd713..ece499869747 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -76,10 +76,83 @@ static int igt_mock_fill(void *arg)
>  	return err;
>  }
>  
> +static void igt_mark_evictable(struct drm_i915_gem_object *obj)
> +{
> +	i915_gem_object_unpin_pages(obj);
> +	obj->mm.madv = I915_MADV_DONTNEED;
> +	list_move(&obj->region_link, &obj->memory_region->purgeable);
> +}
> +
> +static int igt_mock_evict(void *arg)
> +{
> +	struct intel_memory_region *mem = arg;
> +	struct drm_i915_gem_object *obj;
> +	unsigned long n_objects;
> +	LIST_HEAD(objects);
> +	resource_size_t target;
> +	resource_size_t total;
> +	int err = 0;
> +
> +	target = mem->mm.min_size;
> +	total = resource_size(&mem->region);
> +	n_objects = total / target;
> +
> +	while (n_objects--) {
> +		obj = i915_gem_object_create_region(mem, target, 0);
> +		if (IS_ERR(obj)) {
> +			err = PTR_ERR(obj);
> +			goto err_close_objects;
> +		}
> +
> +		list_add(&obj->st_link, &objects);
> +
> +		err = i915_gem_object_pin_pages(obj);
> +		if (err)
> +			goto err_close_objects;
> +
> +		/*
> +		 * Make half of the region evictable, though do so in a
> +		 * horribly fragmented fashion.
> +		 */
> +		if (n_objects % 2)
> +			igt_mark_evictable(obj);
> +	}
> +
> +	while (target <= total / 2) {
> +		obj = i915_gem_object_create_region(mem, target, 0);
> +		if (IS_ERR(obj)) {
> +			err = PTR_ERR(obj);
> +			goto err_close_objects;
> +		}
> +
> +		list_add(&obj->st_link, &objects);
> +
> +		err = i915_gem_object_pin_pages(obj);
> +		if (err) {
> +			pr_err("failed to evict for target=%pa", &target);
> +			goto err_close_objects;
> +		}
> +
> +		/* Again, half of the region should remain evictable */
> +		igt_mark_evictable(obj);
> +
> +		target <<= 1;
> +	}
> +
> +err_close_objects:
> +	close_objects(&objects);
> +
> +	if (err == -ENOMEM)
> +		err = 0;
> +
> +	return err;
> +}
> +
>  int intel_memory_region_mock_selftests(void)
>  {
>  	static const struct i915_subtest tests[] = {
>  		SUBTEST(igt_mock_fill),
> +		SUBTEST(igt_mock_evict),
>  	};
>  	struct intel_memory_region *mem;
>  	struct drm_i915_private *i915;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_region.c b/drivers/gpu/drm/i915/selftests/mock_region.c
> index cb942a461e9d..80eafdc54927 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_region.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_region.c
> @@ -8,6 +8,7 @@
>  static const struct drm_i915_gem_object_ops mock_region_obj_ops = {
>  	.get_pages = i915_memory_region_get_pages_buddy,
>  	.put_pages = i915_memory_region_put_pages_buddy,
> +	.release = i915_gem_object_release_memory_region,
>  };
>  
>  static struct drm_i915_gem_object *
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  2019-07-30 16:22       ` Daniel Vetter
@ 2019-08-12 16:18         ` Daniel Vetter
  0 siblings, 0 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-08-12 16:18 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

On Tue, Jul 30, 2019 at 6:22 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Jul 30, 2019 at 03:28:11PM +0100, Matthew Auld wrote:
> > On 30/07/2019 10:49, Daniel Vetter wrote:
> > > On Thu, Jun 27, 2019 at 09:56:25PM +0100, Matthew Auld wrote:
> > > > From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > > >
> > > > Add a new CPU mmap implementation that allows multiple fault handlers
> > > > that depend on the object's backing pages.
> > > >
> > > > Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
> > > > and use the zero extending behaviour of drm to differentiate between
> > > > them, when we inspect the flags.
> > > >
> > > > Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > >
> > > So I thought that the plan is to reject invalid mmaps, i.e. mmap modes
> > > which are not compatible with all placement options. Given that, why do
> > > we need this?
> >
> > We are meant to reject anything !wc for LMEM. There were some patches for
> > that but I guess they got lost under the radar...
> >
> > >
> > > - cpu mmap with all the flags still keep working, as long as the only
> > >    placement you select is smem.
> > >
> > > - for lmem/stolen the only option we have is a wc mapping, either through
> > >    the pci bar or through the gtt. So for objects only sitting in there
> > >    also no problem, we can just keep using the current gtt mmap stuff (but
> > >    redirect it internally).
> > >
> > > - that leaves us with objects which can move around. The only option allowed is
> > >    WC, and the gtt mmap ioctl does that already. When the object is in smem
> > >    we'll need to redirect it to a cpu wc mmap, but I think we need to do
> > >    that anyway.
> >
> > So for legacy, gtt_mmap will still go through the aperture, otherwise if
> > LMEM is supported then there is no aperture, so we just wc mmap via cpu or
> > LMEMBAR depending on the final object placement. And cpu_mmap still works if
> > we don't care about LMEM. Hmm, so do we even need most of the previous patch
> > then? ALso does that mean we also have to track the placement of an object
> > in igt?
> >
> > gem_mmap__wc:
> >
> > if (supports_lmem(dev))
> >       gtt_mmap();
> > else
> >       gem_mmap(wc);
> >
> > gem_mmap__wc:
> >
> > if (placement_contains(obj, LMEM))
> >       gtt_mmap();
> > else
> >       gem_mmap(wc);
> >
> > ?
>
> Well if you want cpu wc mmaps, then just allocate it as smem ... we might
> need a new gem_mmap__lmem I guess to exercise all the possible ways to get
> at stuff in lmem (including when it migrates around underneath us while we
> access it through the mmap). I wouldn't try too hard to smash all these
> use/testcases into one.

Chatted a lot with Joonas today, and realized I outright misread what
this does. Looking at the end result I think it's all nicely aligned
with other (discrete/ttm) drivers, so all good from that point of
view. Still not sure whether it's really a good idea to do this fairly
minor uapi cleanup tied in with lmem. But I guess we committed to that
now, so welp ...
-Daniel

> > > So not really seeing what the uapi problem is you're trying to solve here?
> > >
> > > Can you pls explain why we need this?
> >
> > The naming of gtt_mmap seemed confusing, since there is no aperture, and
> > having one mmap ioctl to cover both smem and lmem seemed like a nice
> > idea... also I think UMDs stopped using gtt_mmap (or were told to?) but maybe
> > those aren't good enough reasons.
>
> We stopped using gtt mmap because for many cases cpu WC mmap is faster.
>
> Wrt having a clean slate: Not sure why this would benefit us, we just
> diverge a bit more from how this works on !lmem, so a bit more complexity
> (not much) everywhere for not much gain.
>
> I'm also not sure whether there will be a whole lot of uses of such a
> magic LMEMBAR wc mapping. It's probably slow for the exact same reasons
> gtt mmap is slow.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-07-30 16:26   ` Daniel Vetter
@ 2019-08-15 10:48     ` Matthew Auld
  2019-08-15 14:26       ` Daniel Vetter
  2019-08-15 15:26       ` Chris Wilson
  0 siblings, 2 replies; 88+ messages in thread
From: Matthew Auld @ 2019-08-15 10:48 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel Graphics Development, Matthew Auld

On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > Support basic eviction for regions.
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
>
> So from a very high level this looks like it was largely modelled after
> i915_gem_shrink.c and not i915_gem_evict.c (our other "make room, we're
> running out of stuff" code). Any specific reasons?

IIRC I think it was originally based on the patches that exposed
stolen-memory to userspace from a few years ago.

>
> I think i915_gem_evict is a lot closer match for what we want for vram (it
> started out to manage severely limited GTT on gen2/3/4) after all. With
> the complication that we'll have to manage physical memory with multiple
> virtual mappings of it on top, so unfortunately we can't just reuse the
> locking pattern Chris has come up with in his struct_mutex-removal branch.
> But at least conceptually it should be a lot closer.

When you say make it more like i915_gem_evict, what does that mean?
Are you talking about the eviction roster stuff, or the
placement/locking of the eviction logic, with it being deep down in
get_pages?

>
> But I might be entirely off the track with reconstructing how this code
> came to be, so please elaborate a bit.
>
> Thanks, Daniel


* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-08-15 10:48     ` Matthew Auld
@ 2019-08-15 14:26       ` Daniel Vetter
  2019-08-15 14:34         ` Daniel Vetter
  2019-08-15 14:57         ` Tang, CQ
  2019-08-15 15:26       ` Chris Wilson
  1 sibling, 2 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-08-15 14:26 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld

On Thu, Aug 15, 2019 at 12:48 PM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > Support basic eviction for regions.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> >
> > So from a very high level this looks like it was largely modelled after
> > i915_gem_shrink.c and not i915_gem_evict.c (our other "make room, we're
> > running out of stuff" code). Any specific reasons?
>
> IIRC I think it was originally based on the patches that exposed
> stolen-memory to userspace from a few years ago.
>
> >
> > I think i915_gem_evict is a lot closer match for what we want for vram (it
> > started out to manage severely limited GTT on gen2/3/4) after all. With
> > the complication that we'll have to manage physical memory with multiple
> > virtual mappings of it on top, so unfortunately we can't just reuse the
> > locking pattern Chris has come up with in his struct_mutex-removal branch.
> > But at least conceptually it should be a lot closer.
>
> When you say make it more like i915_gem_evict, what does that mean?
> Are you talking about the eviction roster stuff, or the
> placement/locking of the eviction logic, with it being deep down in
> get_pages?

So there's kinda two aspects here that I meant.

First is the high-level approach of the shrinker, which is a direct
reflection of core mm low memory handling principles: Core mm just
tries to equally shrink everyone when there's low memory, which is
managed by watermarks, and a few other tricks. This is all only
best-effort, and if multiple threads want a lot of memory at the same
time then it's all going to fail with ENOMEM.

On gpus otoh, and what we do in i915_gem_eviction.c for gtt (and very
much needed with the tiny gtt for everything in gen2/3/4/5) is that
when we run out of space, we stall, throw out everyone else, and have
exclusive access to the entire gpu space. Then the next batchbuffer
goes through the same dance. With this you guarantee that if you have
a series of batchbuffers which all need e.g. 60% of lmem, they will
all be able to execute. With the shrinker-style of low-memory handling
eventually you're unlucky, both threads will only get up to 50%, fail
with ENOSPC, and userspace crashes. Which is not good.
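
To make the contrast concrete, here is a toy model of that evict-and-retry dance (nothing here is real i915 code; region, alloc and evict are invented names for illustration only):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy region: fixed capacity, some space pinned by the current user,
 * the rest held by evictable objects belonging to everyone else. */
struct region {
	size_t capacity;
	size_t pinned;
	size_t evictable;
};

static int region_alloc(struct region *mem, size_t size)
{
	if (mem->capacity - mem->pinned - mem->evictable < size)
		return -ENOSPC;
	mem->pinned += size;
	return 0;
}

static int region_evict(struct region *mem)
{
	if (!mem->evictable)
		return -ENOSPC;	/* nothing left to throw out */
	mem->evictable = 0;	/* stall and evict everyone else */
	return 0;
}

/* The dance: on failure, evict and retry until the request either fits
 * or there is genuinely nothing evictable left. */
static int alloc_with_evict(struct region *mem, size_t size)
{
	for (;;) {
		if (!region_alloc(mem, size))
			return 0;
		if (region_evict(mem))
			return -ENOSPC;
	}
}
```

In this model a batch needing 60% of the region always succeeds as long as the competing 60% is evictable, which is the guarantee the shrinker-style approach cannot give.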

The other bit is locking. Since we need to free pages from the
shrinker there are tricky locking rules involved. Worse, we cannot back
off from the shrinker down to e.g. the kmalloc or alloc_pages call
that put us into reclaim. Which means the usual deadlock avoidance
trick of having a slowpath where you first drop all the locks, then
reacquire them in the right order doesn't work - in some cases the
caller of kmalloc or alloc_pages could be holding a lock that we'd
need to unlock first. Hence why the shrinker uses the
best-effort-might-fail solution of trylocks, encoded in shrinker_lock.

But for lmem we don't have such an excuse, because it's all our own
code. The locking design can (and should!) assume that it can get out
of any deadlock and always acquire all the locks it needs. Without
that you can't achieve the first part about guaranteeing execution of
batches which collectively need more than 100% of lmem, but
individually all fit. As an example if you look at the amdgpu command
submission ioctl, that passes around ttm_operation_ctx which tracks a
few things about locks and other bits, and if they hit a possible
deadlock situation they can unwind the entire CS and restart by taking
the locks in the right order.
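
The trylock-style, best-effort locking the shrinker is stuck with looks roughly like this (a userspace sketch using pthreads; the names are purely illustrative, not the real i915 shrinker):

```c
#include <assert.h>
#include <pthread.h>

/* Sketch of the shrinker's deadlock avoidance: since reclaim may be
 * entered while arbitrary locks are already held, the shrinker can
 * only trylock and bail out on contention. */
static pthread_mutex_t region_lock = PTHREAD_MUTEX_INITIALIZER;

static long shrink_region(long nr_to_scan)
{
	long freed = 0;

	if (pthread_mutex_trylock(&region_lock))
		return 0;	/* contended: give up rather than deadlock */

	while (nr_to_scan-- > 0)
		freed++;	/* stand-in for actually releasing pages */

	pthread_mutex_unlock(&region_lock);
	return freed;
}
```

The lmem path, by contrast, can afford to take all the locks it needs in a well-defined order, or unwind and restart amdgpu-style, which is what makes a hard eviction guarantee possible at all.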

I thought I typed that up somewhere, but I guess it got lost ...

Cheers, Daniel

>
> >
> > But I might be entirely off the track with reconstructing how this code
> > came to be, so please elaborate a bit.
> >
> > Thanks, Daniel



--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-08-15 14:26       ` Daniel Vetter
@ 2019-08-15 14:34         ` Daniel Vetter
  2019-08-15 14:57         ` Tang, CQ
  1 sibling, 0 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-08-15 14:34 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld

On Thu, Aug 15, 2019 at 4:26 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Aug 15, 2019 at 12:48 PM Matthew Auld
> <matthew.william.auld@gmail.com> wrote:
> >
> > On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > > Support basic eviction for regions.
> > > >
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > >
> > > So from a very high level this looks like it was largely modelled after
> > > i915_gem_shrink.c and not i915_gem_evict.c (our other "make room, we're
> > > running out of stuff" code). Any specific reasons?
> >
> > IIRC I think it was originally based on the patches that exposed
> > stolen-memory to userspace from a few years ago.

Forgot to add this:

Yeah, I guess those have been best-effort at most. Or at least I
never really looked at them seriously since the open source userspace
never went anywhere.
-Daniel

> > > I think i915_gem_evict is a lot closer match for what we want for vram (it
> > > started out to manage severely limited GTT on gen2/3/4) after all. With
> > > the complication that we'll have to manage physical memory with multiple
> > > virtual mappings of it on top, so unfortunately we can't just reuse the
> > > locking pattern Chris has come up with in his struct_mutex-removal branch.
> > > But at least conceptually it should be a lot closer.
> >
> > When you say make it more like i915_gem_evict, what does that mean?
> > Are you talking about the eviction roster stuff, or the
> > placement/locking of the eviction logic, with it being deep down in
> > get_pages?
>
> So there's kinda two aspects here that I meant.
>
> First is the high-level approach of the shrinker, which is a direct
> reflection of core mm low memory handling principles: Core mm just
> tries to equally shrink everyone when there's low memory, which is
> managed by watermarks, and a few other tricks. This is all only
> best-effort, and if multiple threads want a lot of memory at the same
> time then it's all going to fail with ENOMEM.
>
> On gpus otoh, and what we do in i915_gem_eviction.c for gtt (and very
> much needed with the tiny gtt for everything in gen2/3/4/5) is that
> when we run out of space, we stall, throw out everyone else, and have
> exclusive access to the entire gpu space. Then the next batchbuffer
> goes through the same dance. With this you guarantee that if you have
> a series of batchbuffers which all need e.g. 60% of lmem, they will
> all be able to execute. With the shrinker-style of low-memory handling
> eventually you're unlucky, both threads will only get up to 50%, fail
> with ENOSPC, and userspace crashes. Which is not good.
>
> The other bit is locking. Since we need to free pages from the
> shrinker there are tricky locking rules involved. Worse, we cannot back
> off from the shrinker down to e.g. the kmalloc or alloc_pages call
> that put us into reclaim. Which means the usual deadlock avoidance
> trick of having a slowpath where you first drop all the locks, then
> reacquire them in the right order doesn't work - in some cases the
> caller of kmalloc or alloc_pages could be holding a lock that we'd
> need to unlock first. Hence why the shrinker uses the
> best-effort-might-fail solution of trylocks, encoded in shrinker_lock.
>
> But for lmem we don't have such an excuse, because it's all our own
> code. The locking design can (and should!) assume that it can get out
> of any deadlock and always acquire all the locks it needs. Without
> that you can't achieve the first part about guaranteeing execution of
> batches which collectively need more than 100% of lmem, but
> individually all fit. As an example if you look at the amdgpu command
> submission ioctl, that passes around ttm_operation_ctx which tracks a
> few things about locks and other bits, and if they hit a possible
> deadlock situation they can unwind the entire CS and restart by taking
> the locks in the right order.
>
> I thought I typed that up somewhere, but I guess it got lost ...
>
> Cheers, Daniel
>
> >
> > >
> > > But I might be entirely off the track with reconstructing how this code
> > > came to be, so please elaborate a bit.
> > >
> > > Thanks, Daniel
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-08-15 14:26       ` Daniel Vetter
  2019-08-15 14:34         ` Daniel Vetter
@ 2019-08-15 14:57         ` Tang, CQ
  2019-08-15 16:20           ` Daniel Vetter
  1 sibling, 1 reply; 88+ messages in thread
From: Tang, CQ @ 2019-08-15 14:57 UTC (permalink / raw)
  To: Daniel Vetter, Matthew Auld; +Cc: Intel Graphics Development, Auld, Matthew



> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Daniel Vetter
> Sent: Thursday, August 15, 2019 7:27 AM
> To: Matthew Auld <matthew.william.auld@gmail.com>
> Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>; Auld,
> Matthew <matthew.auld@intel.com>
> Subject: Re: [Intel-gfx] [PATCH v2 03/37] drm/i915/region: support basic
> eviction
> 
> On Thu, Aug 15, 2019 at 12:48 PM Matthew Auld
> <matthew.william.auld@gmail.com> wrote:
> >
> > On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > > Support basic eviction for regions.
> > > >
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > >
> > > So from a very high level this looks like it was largely modelled
> > > after i915_gem_shrink.c and not i915_gem_evict.c (our other "make
> > > room, we're running out of stuff" code). Any specific reasons?
> >
> > IIRC I think it was originally based on the patches that exposed
> > stolen-memory to userspace from a few years ago.
> >
> > >
> > > I think i915_gem_evict is a lot closer match for what we want for
> > > vram (it started out to manage severely limited GTT on gen2/3/4)
> > > after all. With the complication that we'll have to manage physical
> > > memory with multiple virtual mappings of it on top, so unfortunately
> > > we can't just reuse the locking pattern Chris has come up with in his
> struct_mutex-removal branch.
> > > But at least conceptually it should be a lot closer.
> >
> > When you say make it more like i915_gem_evict, what does that mean?
> > Are you talking about the eviction roster stuff, or the
> > placement/locking of the eviction logic, with it being deep down in
> > get_pages?
> 
> So there's kinda two aspects here that I meant.
> 
> First is the high-level approach of the shrinker, which is a direct reflection of
> core mm low memory handling principles: Core mm just tries to equally
> shrink everyone when there's low memory, which is managed by
> watermarks, and a few other tricks. This is all only best-effort, and if multiple
> threads want a lot of memory at the same time then it's all going to fail with
> ENOMEM.
> 
> On gpus otoh, and what we do in i915_gem_evict.c for gtt (and very much
> needed with the tiny gtt for everything in gen2/3/4/5) is that when we run
> out of space, we stall, throw out everyone else, and have exclusive access to
> the entire gpu space. Then the next batchbuffer goes through the same
> dance. With this you guarantee that if you have a series of batchbuffers
> which all need e.g. 60% of lmem, they will all be able to execute. With the
> shrinker-style of low-memory handling eventually you're unlucky, both
> threads will only get up to 50%, fail with ENOSPC, and userspace crashes.
> Which is not good.
> 
> The other bit is locking. Since we need to free pages from the shrinker
> there's tricky locking rules involved. Worse, we cannot back off from the
> shrinker down to e.g. the kmalloc or alloc_pages call that put us into
> reclaim. Which means the usual deadlock avoidance trick of having a
> slowpath where you first drop all the locks, then reacquire them in the right
> order doesn't work - in some cases the caller of kmalloc or alloc_pages could
> be holding a lock that we'd need to unlock first. Hence why the shrinker uses
> the best-effort-might-fail solution of trylocks, encoded in shrinker_lock.
> 
> But for lmem we don't have such an excuse, because it's all our own code.
> The locking design can (and should!) assume that it can get out of any
> deadlock and always acquire all the locks it needs. Without that you can't
> achieve the first part about guaranteeing execution of batches which
> collectively need more than 100% of lmem, but individually all fit. As an
> example if you look at the amdgpu command submission ioctl, that passes
> around ttm_operation_ctx which tracks a few things about locks and other
> bits, and if they hit a possible deadlock situation they can unwind the entire
> CS and restart by taking the locks in the right order.
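
The stall-evict-retry policy described above can be sketched in a few lines. The following is a minimal userspace simulation under assumed names (`region`, `region_alloc` and `evict_until` are illustrative, not i915 code): when a request does not fit, resident objects are evicted until it does, so a series of batches that each need e.g. 60% of the region all succeed in turn.

```c
/* Minimal simulation of the stall-and-evict policy described above:
 * when an allocation does not fit, evict other resident objects until
 * it does, so a series of large requests each succeed in sequence.
 * Illustrative only -- names and structures are not from i915. */
#include <assert.h>
#include <stddef.h>

struct region {
	size_t size;	/* total lmem */
	size_t used;	/* currently allocated */
};

struct obj {
	size_t size;
	int resident;	/* 1 if currently backed by region memory */
};

/* Evict residents until at least `need` bytes are free. */
static void evict_until(struct region *r, struct obj *objs, int n, size_t need)
{
	for (int i = 0; i < n && r->size - r->used < need; i++) {
		if (objs[i].resident) {
			objs[i].resident = 0;
			r->used -= objs[i].size;
		}
	}
}

/* Stall-and-evict allocation: unlike a best-effort shrinker, this
 * succeeds whenever the object fits in the region at all. */
static int region_alloc(struct region *r, struct obj *objs, int n,
			struct obj *o)
{
	if (o->size > r->size)
		return -1;			  /* can never fit */
	if (r->size - r->used < o->size)
		evict_until(r, objs, n, o->size); /* throw out everyone else */
	r->used += o->size;
	o->resident = 1;
	return 0;
}
```

With a 100-unit region, two successive 60-unit allocations both succeed, the second by evicting the first -- whereas with best-effort shrinker-style handling both threads can get stuck around 50% and fail.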

Thank you for the explanation.

What exactly does our 'struct_mutex' protect? As an example, I see that when blitter engine functions are called, we hold 'struct_mutex' first.

Can we replace 'struct_mutex' with some fine-grained locks, so that we can lock obj->mm.lock first and then take these fine-grained locks?

I need some background info about 'struct_mutex' design.

--CQ

> 
> I thought I typed that up somewhere, but I guess it got lost ...
> 
> Cheers, Daniel
> 
> >
> > >
> > > But I might be entirely off the track with reconstructing how this
> > > code came to be, so please elaborate a bit.
> > >
> > > Thanks, Daniel
> 
> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-08-15 10:48     ` Matthew Auld
  2019-08-15 14:26       ` Daniel Vetter
@ 2019-08-15 15:26       ` Chris Wilson
  2019-08-15 16:23         ` Daniel Vetter
  1 sibling, 1 reply; 88+ messages in thread
From: Chris Wilson @ 2019-08-15 15:26 UTC (permalink / raw)
  To: Daniel Vetter, Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld

Quoting Matthew Auld (2019-08-15 11:48:04)
> On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > Support basic eviction for regions.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> >
> > So from a very high level this looks like it was largely modelled after
> > i915_gem_shrink.c and not i915_gem_evict.c (our other "make room, we're
> > running out of stuff" code). Any specific reasons?
> 
> IIRC I think it was originally based on the patches that exposed
> stolen-memory to userspace from a few years ago.
> 
> >
> > I think i915_gem_evict is a lot closer match for what we want for vram (it
> > started out to manage severely limited GTT on gen2/3/4) after all. With
> > the complication that we'll have to manage physical memory with multiple
> > virtual mappings of it on top, so unfortunately we can't just reuse the
> > locking pattern Chris has come up with in his struct_mutex-removal branch.
> > But at least conceptually it should be a lot closer.
> 
> When you say make it more like i915_gem_evict, what does that mean?
> Are you talking about the eviction roster stuff, or the
> placement/locking of the eviction logic, with it being deep down in
> get_pages?

The biggest difference would be the lack of region coalescing; the
eviction code only tries to free what would result in a successful
allocation. With the order being put into the scanner somewhat relevant,
in practice, fragmentation effects cause the range search to be somewhat
slow and we much prefer the random replacement -- while harmful, it is
not biased as to whom it harms, and so is a consistent overhead. However,
since you don't need to find a slot inside a small range within a few
million objects, I would expect LRU or even MRU (recently used objects
in games tend to be more ephemeral and so make good eviction targets, at
least according to John Carmack back in the day) to require fewer major
faults.
https://github.com/ESWAT/john-carmack-plan-archive/blob/master/by_day/johnc_plan_20000307.txt

You would need a very similar scanner to keep a journal of the potential
frees from which to track the coalescing (slightly more complicated due
to the disjoint nature of the buddy merges). One suspects that adding
the scanner would shape the buddy_nodes more towards drm_mm_nodes.

This is also a case where real world testing of a thrashing load beats
simulation.  So just make sure the eviction doesn't stall the entire GPU
and submission pipeline and you will be forgiven most transgressions.
-Chris
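
The scan/journal idea above -- collect potential frees, and commit them only if the accumulated total would make the allocation succeed -- can be sketched as follows. `struct scan`, `scan_add` and the fixed-size journal are hypothetical illustrations, not the drm_mm or buddy-allocator code.

```c
/* Sketch of the eviction scanner described above: candidate objects are
 * journalled as potential frees, and only once enough has accumulated
 * to satisfy the pending allocation is anything actually evicted;
 * otherwise the scan is aborted and nothing is freed.  Hypothetical
 * code, not drm_mm. */
#include <assert.h>
#include <stddef.h>

#define SCAN_MAX 16

struct scan {
	size_t need;			/* bytes the allocation requires */
	size_t found;			/* bytes journalled so far */
	int candidates[SCAN_MAX];	/* indices of journalled objects */
	int nr;
};

static void scan_init(struct scan *s, size_t need)
{
	s->need = need;
	s->found = 0;
	s->nr = 0;
}

/* Journal object `idx` of size `size` as a potential free.
 * Returns 1 once enough has been found to satisfy the allocation,
 * at which point the caller would evict exactly the journalled set. */
static int scan_add(struct scan *s, int idx, size_t size)
{
	if (s->nr < SCAN_MAX) {
		s->candidates[s->nr++] = idx;
		s->found += size;
	}
	return s->found >= s->need;
}
```

The feed order into `scan_add` is where an LRU, MRU, or random replacement policy would plug in; coalescing of buddy merges would need extra bookkeeping on top of this.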
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-08-15 14:57         ` Tang, CQ
@ 2019-08-15 16:20           ` Daniel Vetter
  2019-08-15 16:35             ` Tang, CQ
  0 siblings, 1 reply; 88+ messages in thread
From: Daniel Vetter @ 2019-08-15 16:20 UTC (permalink / raw)
  To: Tang, CQ; +Cc: Intel Graphics Development, Auld, Matthew

On Thu, Aug 15, 2019 at 4:58 PM Tang, CQ <cq.tang@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> > Of Daniel Vetter
> > Sent: Thursday, August 15, 2019 7:27 AM
> > To: Matthew Auld <matthew.william.auld@gmail.com>
> > Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>; Auld,
> > Matthew <matthew.auld@intel.com>
> > Subject: Re: [Intel-gfx] [PATCH v2 03/37] drm/i915/region: support basic
> > eviction
> >
> > On Thu, Aug 15, 2019 at 12:48 PM Matthew Auld
> > <matthew.william.auld@gmail.com> wrote:
> > >
> > > On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > > > Support basic eviction for regions.
> > > > >
> > > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > > >
> > > > So from a very high level this looks like it was largely modelled
> > > > after i915_gem_shrink.c and not i915_gem_evict.c (our other "make
> > > > room, we're running out of stuff" code). Any specific reasons?
> > >
> > > IIRC I think it was originally based on the patches that exposed
> > > stolen-memory to userspace from a few years ago.
> > >
> > > >
> > > > I think i915_gem_evict is a lot closer match for what we want for
> > > > vram (it started out to manage severely limited GTT on gen2/3/4)
> > > > after all. With the complication that we'll have to manage physical
> > > > memory with multiple virtual mappings of it on top, so unfortunately
> > > > we can't just reuse the locking pattern Chris has come up with in his
> > struct_mutex-removal branch.
> > > > But at least conceptually it should be a lot closer.
> > >
> > > When you say make it more like i915_gem_evict, what does that mean?
> > > Are you talking about the eviction roster stuff, or the
> > > placement/locking of the eviction logic, with it being deep down in
> > > get_pages?
> >
> > So there's kinda two aspects here that I meant.
> >
> > First is the high-level approach of the shrinker, which is a direct reflection of
> > core mm low memory handling principles: Core mm just tries to equally
> > shrink everyone when there's low memory, which is managed by
> > watermarks, and a few other tricks. This is all only best-effort, and if multiple
> > threads want a lot of memory at the same time then it's all going to fail with
> > ENOMEM.
> >
> > On gpus otoh, and what we do in i915_gem_evict.c for gtt (and very much
> > needed with the tiny gtt for everything in gen2/3/4/5) is that when we run
> > out of space, we stall, throw out everyone else, and have exclusive access to
> > the entire gpu space. Then the next batchbuffer goes through the same
> > dance. With this you guarantee that if you have a series of batchbuffers
> > which all need e.g. 60% of lmem, they will all be able to execute. With the
> > shrinker-style of low-memory handling eventually you're unlucky, both
> > threads will only get up to 50%, fail with ENOSPC, and userspace crashes.
> > Which is not good.
> >
> > The other bit is locking. Since we need to free pages from the shrinker
> > there's tricky locking rules involved. Worse, we cannot back off from the
> > shrinker down to e.g. the kmalloc or alloc_pages call that put us into
> > reclaim. Which means the usual deadlock avoidance trick of having a
> > slowpath where you first drop all the locks, then reacquire them in the right
> > order doesn't work - in some cases the caller of kmalloc or alloc_pages could
> > be holding a lock that we'd need to unlock first. Hence why the shrinker uses
> > the best-effort-might-fail solution of trylocks, encoded in shrinker_lock.
> >
> > But for lmem we don't have such an excuse, because it's all our own code.
> > The locking design can (and should!) assume that it can get out of any
> > deadlock and always acquire all the locks it needs. Without that you can't
> > achieve the first part about guaranteeing execution of batches which
> > collectively need more than 100% of lmem, but individually all fit. As an
> > example if you look at the amdgpu command submission ioctl, that passes
> > around ttm_operation_ctx which tracks a few things about locks and other
> > bits, and if they hit a possible deadlock situation they can unwind the entire
> > CS and restart by taking the locks in the right order.
>
> Thank you for the explanation.
>
> What exactly does our 'struct_mutex' protect? As an example, I see that when blitter engine functions are called, we hold 'struct_mutex' first.
>
> Can we replace 'struct_mutex' with some fine-grained locks, so that we can lock obj->mm.lock first and then take these fine-grained locks?

Sure. With lots of effort.

> I need some background info about 'struct_mutex' design.

There's not really a design behind it, it's just 10+ years of evolution.
-Daniel

> --CQ
>
> >
> > I thought I typed that up somewhere, but I guess it got lost ...
> >
> > Cheers, Daniel
> >
> > >
> > > >
> > > > But I might be entirely off the track with reconstructing how this
> > > > code came to be, so please elaborate a bit.
> > > >
> > > > Thanks, Daniel
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-08-15 15:26       ` Chris Wilson
@ 2019-08-15 16:23         ` Daniel Vetter
  0 siblings, 0 replies; 88+ messages in thread
From: Daniel Vetter @ 2019-08-15 16:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Thu, Aug 15, 2019 at 5:26 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Matthew Auld (2019-08-15 11:48:04)
> > On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > > Support basic eviction for regions.
> > > >
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > >
> > > So from a very high level this looks like it was largely modelled after
> > > i915_gem_shrink.c and not i915_gem_evict.c (our other "make room, we're
> > > running out of stuff" code). Any specific reasons?
> >
> > IIRC I think it was originally based on the patches that exposed
> > stolen-memory to userspace from a few years ago.
> >
> > >
> > > I think i915_gem_evict is a lot closer match for what we want for vram (it
> > > started out to manage severely limited GTT on gen2/3/4) after all. With
> > > the complication that we'll have to manage physical memory with multiple
> > > virtual mappings of it on top, so unfortunately we can't just reuse the
> > > locking pattern Chris has come up with in his struct_mutex-removal branch.
> > > But at least conceptually it should be a lot closer.
> >
> > When you say make it more like i915_gem_evict, what does that mean?
> > Are you talking about the eviction roster stuff, or the
> > placement/locking of the eviction logic, with it being deep down in
> > get_pages?
>
> The biggest difference would be the lack of region coalescing; the
> eviction code only tries to free what would result in a successful
> allocation. With the order being put into the scanner somewhat relevant,
> in practice, fragmentation effects cause the range search to be somewhat
> slow and we much prefer the random replacement -- while harmful, it is
> not biased as to whom it harms, and so is a consistent overhead. However,
> since you don't need to find a slot inside a small range within a few
> million objects, I would expect LRU or even MRU (recently used objects
> in games tend to be more ephemeral and so make good eviction targets, at
> least according to John Carmack back in the day) to require fewer major
> faults.
> https://github.com/ESWAT/john-carmack-plan-archive/blob/master/by_day/johnc_plan_20000307.txt
>
> You would need a very similar scanner to keep a journal of the potential
> frees from which to track the coalescing (slightly more complicated due
> to the disjoint nature of the buddy merges). One suspects that adding
> the scanner would shape the buddy_nodes more towards drm_mm_nodes.
>
> This is also a case where real world testing of a thrashing load beats
> simulation.  So just make sure the eviction doesn't stall the entire GPU
> and submission pipeline and you will be forgiven most transgressions.

Yeah the fancy roster is definitely not on the wishlist until we have
this all optimized already. And even then it's probably better to not
be fancy, since we don't really need a contiguous block for pretty
much anything.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 03/37] drm/i915/region: support basic eviction
  2019-08-15 16:20           ` Daniel Vetter
@ 2019-08-15 16:35             ` Tang, CQ
  0 siblings, 0 replies; 88+ messages in thread
From: Tang, CQ @ 2019-08-15 16:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel Graphics Development, Auld, Matthew



> -----Original Message-----
> From: Daniel Vetter [mailto:daniel@ffwll.ch]
> Sent: Thursday, August 15, 2019 9:21 AM
> To: Tang, CQ <cq.tang@intel.com>
> Cc: Matthew Auld <matthew.william.auld@gmail.com>; Intel Graphics
> Development <intel-gfx@lists.freedesktop.org>; Auld, Matthew
> <matthew.auld@intel.com>
> Subject: Re: [Intel-gfx] [PATCH v2 03/37] drm/i915/region: support basic
> eviction
> 
> On Thu, Aug 15, 2019 at 4:58 PM Tang, CQ <cq.tang@intel.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On
> > > Behalf Of Daniel Vetter
> > > Sent: Thursday, August 15, 2019 7:27 AM
> > > To: Matthew Auld <matthew.william.auld@gmail.com>
> > > Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>;
> > > Auld, Matthew <matthew.auld@intel.com>
> > > Subject: Re: [Intel-gfx] [PATCH v2 03/37] drm/i915/region: support
> > > basic eviction
> > >
> > > On Thu, Aug 15, 2019 at 12:48 PM Matthew Auld
> > > <matthew.william.auld@gmail.com> wrote:
> > > >
> > > > On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > > > > Support basic eviction for regions.
> > > > > >
> > > > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > > > Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> > > > >
> > > > > So from a very high level this looks like it was largely
> > > > > modelled after i915_gem_shrink.c and not i915_gem_evict.c (our
> > > > > other "make room, we're running out of stuff" code). Any specific
> reasons?
> > > >
> > > > IIRC I think it was originally based on the patches that exposed
> > > > stolen-memory to userspace from a few years ago.
> > > >
> > > > >
> > > > > I think i915_gem_evict is a lot closer match for what we want
> > > > > for vram (it started out to manage severely limited GTT on
> > > > > gen2/3/4) after all. With the complication that we'll have to
> > > > > manage physical memory with multiple virtual mappings of it on
> > > > > top, so unfortunately we can't just reuse the locking pattern
> > > > > Chris has come up with in his
> > > struct_mutex-removal branch.
> > > > > But at least conceptually it should be a lot closer.
> > > >
> > > > When you say make it more like i915_gem_evict, what does that mean?
> > > > Are you talking about the eviction roster stuff, or the
> > > > placement/locking of the eviction logic, with it being deep down
> > > > in get_pages?
> > >
> > > So there's kinda two aspects here that I meant.
> > >
> > > First is the high-level approach of the shrinker, which is a direct
> > > reflection of core mm low memory handling principles: Core mm just
> > > tries to equally shrink everyone when there's low memory, which is
> > > managed by watermarks, and a few other tricks. This is all only
> > > best-effort, and if multiple threads want a lot of memory at the
> > > same time then it's all going to fail with ENOMEM.
> > >
> > > On gpus otoh, and what we do in i915_gem_evict.c for gtt (and
> > > very much needed with the tiny gtt for everything in gen2/3/4/5) is
> > > that when we run out of space, we stall, throw out everyone else,
> > > and have exclusive access to the entire gpu space. Then the next
> > > batchbuffer goes through the same dance. With this you guarantee
> > > that if you have a series of batchbuffers which all need e.g. 60% of
> > > lmem, they will all be able to execute. With the shrinker-style of
> > > low-memory handling eventually you're unlucky, both threads will only
> get up to 50%, fail with ENOSPC, and userspace crashes.
> > > Which is not good.
> > >
> > > The other bit is locking. Since we need to free pages from the
> > > shrinker there's tricky locking rules involved. Worse, we cannot
> > > back off from the shrinker down to e.g. the kmalloc or alloc_pages
> > > call that put us into reclaim. Which means the usual deadlock
> > > avoidance trick of having a slowpath where you first drop all the
> > > locks, then reacquire them in the right order doesn't work - in some
> > > cases the caller of kmalloc or alloc_pages could be holding a lock
> > > that we'd need to unlock first. Hence why the shrinker uses the best-
> effort-might-fail solution of trylocks, encoded in shrinker_lock.
> > >
> > > But for lmem we don't have such an excuse, because it's all our own code.
> > > The locking design can (and should!) assume that it can get out of
> > > any deadlock and always acquire all the locks it needs. Without that
> > > you can't achieve the first part about guaranteeing execution of
> > > batches which collectively need more than 100% of lmem, but
> > > individually all fit. As an example if you look at the amdgpu
> > > command submission ioctl, that passes around ttm_operation_ctx which
> > > tracks a few things about locks and other bits, and if they hit a
> > > possible deadlock situation they can unwind the entire CS and restart by
> taking the locks in the right order.
> >
> > Thank you for the explanation.
> >
> > What exactly does our 'struct_mutex' protect? As an example, I see that
> when blitter engine functions are called, we hold 'struct_mutex' first.
> >
> > Can we replace 'struct_mutex' with some fine-grained locks, so that we can
> lock obj->mm.lock first and then take these fine-grained locks?
> 
> Sure. With lots of effort.
> 
> > I need some background info about 'struct_mutex' design.
> 
> There's not really a design behind it, it's just 10+ years of evolution.

Yes, in the old days a big coarse-grained lock was OK. Now, with so many engines in new hardware for new computation workloads, we might need fine-grained locks.
--CQ


> -Daniel
> 
> > --CQ
> >
> > >
> > > I thought I typed that up somewhere, but I guess it got lost ...
> > >
> > > Cheers, Daniel
> > >
> > > >
> > > > >
> > > > > But I might be entirely off the track with reconstructing how
> > > > > this code came to be, so please elaborate a bit.
> > > > >
> > > > > Thanks, Daniel
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 88+ messages in thread

end of thread, other threads:[~2019-08-15 16:35 UTC | newest]

Thread overview: 88+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-27 20:55 [PATCH v2 00/37] Introduce memory region concept (including device local memory) Matthew Auld
2019-06-27 20:55 ` [PATCH v2 01/37] drm/i915: buddy allocator Matthew Auld
2019-06-27 22:28   ` Chris Wilson
2019-06-28  9:35   ` Chris Wilson
2019-06-27 20:55 ` [PATCH v2 02/37] drm/i915: introduce intel_memory_region Matthew Auld
2019-06-27 22:47   ` Chris Wilson
2019-06-28  8:09   ` Chris Wilson
2019-06-27 20:55 ` [PATCH v2 03/37] drm/i915/region: support basic eviction Matthew Auld
2019-06-27 22:59   ` Chris Wilson
2019-07-30 16:26   ` Daniel Vetter
2019-08-15 10:48     ` Matthew Auld
2019-08-15 14:26       ` Daniel Vetter
2019-08-15 14:34         ` Daniel Vetter
2019-08-15 14:57         ` Tang, CQ
2019-08-15 16:20           ` Daniel Vetter
2019-08-15 16:35             ` Tang, CQ
2019-08-15 15:26       ` Chris Wilson
2019-08-15 16:23         ` Daniel Vetter
2019-06-27 20:56 ` [PATCH v2 04/37] drm/i915/region: support continuous allocations Matthew Auld
2019-06-27 23:01   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 05/37] drm/i915/region: support volatile objects Matthew Auld
2019-06-27 23:03   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 06/37] drm/i915: Add memory region information to device_info Matthew Auld
2019-06-27 23:05   ` Chris Wilson
2019-06-27 23:08   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 07/37] drm/i915: support creating LMEM objects Matthew Auld
2019-06-27 23:11   ` Chris Wilson
2019-06-27 23:16   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 08/37] drm/i915: setup io-mapping for LMEM Matthew Auld
2019-06-27 20:56 ` [PATCH v2 09/37] drm/i915/lmem: support kernel mapping Matthew Auld
2019-06-27 23:27   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 10/37] drm/i915/blt: support copying objects Matthew Auld
2019-06-27 23:35   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 11/37] drm/i915/selftests: move gpu-write-dw into utils Matthew Auld
2019-06-27 20:56 ` [PATCH v2 12/37] drm/i915/selftests: add write-dword test for LMEM Matthew Auld
2019-06-27 20:56 ` [PATCH v2 13/37] drm/i915/selftests: don't just test CACHE_NONE for huge-pages Matthew Auld
2019-06-27 23:40   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 14/37] drm/i915/selftest: extend coverage to include LMEM huge-pages Matthew Auld
2019-06-27 23:42   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 15/37] drm/i915/lmem: support CPU relocations Matthew Auld
2019-06-27 23:46   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 16/37] drm/i915/lmem: support pread Matthew Auld
2019-06-27 23:50   ` Chris Wilson
2019-07-30  8:58   ` Daniel Vetter
2019-07-30  9:25     ` Matthew Auld
2019-07-30  9:50       ` Daniel Vetter
2019-07-30 12:05     ` Chris Wilson
2019-07-30 12:42       ` Daniel Vetter
2019-06-27 20:56 ` [PATCH v2 17/37] drm/i915/lmem: support pwrite Matthew Auld
2019-06-27 20:56 ` [PATCH v2 18/37] drm/i915: enumerate and init each supported region Matthew Auld
2019-06-27 20:56 ` [PATCH v2 19/37] drm/i915: treat shmem as a region Matthew Auld
2019-06-27 23:55   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 20/37] drm/i915: treat stolen " Matthew Auld
2019-06-27 20:56 ` [PATCH v2 21/37] drm/i915: define HAS_MAPPABLE_APERTURE Matthew Auld
2019-06-27 20:56 ` [PATCH v2 22/37] drm/i915: do not map aperture if it is not available Matthew Auld
2019-06-27 20:56 ` [PATCH v2 23/37] drm/i915: expose missing map_gtt support to users Matthew Auld
2019-06-27 23:59   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 24/37] drm/i915: set num_fence_regs to 0 if there is no aperture Matthew Auld
2019-06-28  0:00   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 25/37] drm/i915/selftests: check for missing aperture Matthew Auld
2019-06-27 20:56 ` [PATCH v2 26/37] drm/i915: error capture with no ggtt slot Matthew Auld
2019-06-27 20:56 ` [PATCH v2 27/37] drm/i915: Don't try to place HWS in non-existing mappable region Matthew Auld
2019-06-27 20:56 ` [PATCH v2 28/37] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core Matthew Auld
2019-06-28  0:05   ` Chris Wilson
2019-06-28  0:08   ` Chris Wilson
2019-06-28  0:09   ` Chris Wilson
2019-06-28  0:10   ` Chris Wilson
2019-06-27 20:56 ` [PATCH v2 29/37] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET Matthew Auld
2019-06-28  0:12   ` Chris Wilson
2019-07-30  9:49   ` Daniel Vetter
2019-07-30 14:28     ` Matthew Auld
2019-07-30 16:22       ` Daniel Vetter
2019-08-12 16:18         ` Daniel Vetter
2019-06-27 20:56 ` [PATCH v2 30/37] drm/i915/lmem: add helper to get CPU accessible offset Matthew Auld
2019-06-27 20:56 ` [PATCH v2 31/37] drm/i915: Add cpu and lmem fault handlers Matthew Auld
2019-06-27 20:56 ` [PATCH v2 32/37] drm/i915: cpu-map based dumb buffers Matthew Auld
2019-06-27 20:56 ` [PATCH v2 33/37] drm/i915: support basic object migration Matthew Auld
2019-06-27 20:56 ` [PATCH v2 34/37] drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION Matthew Auld
2019-06-28  0:22   ` Chris Wilson
2019-06-28  5:53   ` Tvrtko Ursulin
2019-07-30 16:17   ` Daniel Vetter
2019-06-27 20:56 ` [PATCH v2 35/37] drm/i915/query: Expose memory regions through the query uAPI Matthew Auld
2019-06-28  5:59   ` Tvrtko Ursulin
2019-06-27 20:56 ` [PATCH v2 36/37] HAX drm/i915: add the fake lmem region Matthew Auld
2019-06-27 20:56 ` [PATCH v2 37/37] HAX drm/i915/lmem: default userspace allocations to LMEM Matthew Auld
2019-06-27 21:36 ` ✗ Fi.CI.CHECKPATCH: warning for Introduce memory region concept (including device local memory) (rev2) Patchwork
2019-06-27 21:50 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-06-28  9:59 ` ✗ Fi.CI.BAT: failure " Patchwork
