* [PATCH v6 0/4] drm/i915: Prepare error capture for asynchronous migration
@ 2021-11-08 17:45 ` Thomas Hellström
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Hellström @ 2021-11-08 17:45 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

This patch series prepares error capture for asynchronous migration,
where the vma pages may not reflect the pages the GPU is currently
executing from but may be several migrations ahead.

The first patch gets rid of some contiguous memory allocations that may
blow up due to the change of allocation mode introduced in patch 2/4.

The third patch introduces vma state snapshots that record the vma state
at request submission time.
It also takes additional measures to make sure that
the capture list and request do not disappear from under us while
capturing. The latter may otherwise happen if a heartbeat-triggered parallel
capture is running during a manual reset which retires the request.

Finally the last patch is more of a POC patch and not strictly needed yet,
but will be (or at least something very similar) soon for async unbinding.
It will make sure that unbinding doesn't complete or signal completion
before capture is done. Async reuse of memory can't happen until unbinding
signals completion, and without waiting for capture to finish we might
capture the contents of reused memory.
Before the last patch the vma active is instead still keeping the vma alive,
but that will not work with async unbinding anymore, and also it is still
not clear how we guarantee keeping the vma alive long enough to even
grab an active reference during capture.

v2:
- Mostly fixes for selftests and rebinding. See patch 3.
v3:
- Honor the unbind fence also when evicting for suspend on gen6.
- Minor cleanups on patch 3.
v4:
- Break out patch 2 from patch 1.
v5:
- Ditch a patch from the series since it's already committed.
- Use __GFP_KSWAPD_RECLAIM rather than GFP_NOWAIT in patch 2.
v6:
- Reorder patches and introduce patch 1/4 which gets rid of some
  contiguous allocations that are likely to fail after introduction of
  patch 2/4.
- Use #if IS_ENABLED() rather than #ifdef, similar to the rest of the
  driver.

Thomas Hellström (4):
  drm/i915: Avoid allocating a page array for the gpu coredump
  drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
  drm/i915: Update error capture code to avoid using the current vma
    state
  drm/i915: Initial introduction of vma resources

 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 137 +++++++++--
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   8 +-
 .../drm/i915/gt/intel_execlists_submission.c  |   3 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         | 228 ++++++++++++------
 drivers/gpu/drm/i915/i915_gpu_error.h         |   4 +-
 drivers/gpu/drm/i915/i915_request.c           |  63 +++--
 drivers/gpu/drm/i915/i915_request.h           |  20 +-
 drivers/gpu/drm/i915/i915_vma.c               | 206 +++++++++++++++-
 drivers/gpu/drm/i915/i915_vma.h               |  20 +-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      | 131 ++++++++++
 drivers/gpu/drm/i915/i915_vma_snapshot.h      | 112 +++++++++
 drivers/gpu/drm/i915/i915_vma_types.h         |   5 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  98 +++++---
 14 files changed, 865 insertions(+), 171 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
 create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v6 1/4] drm/i915: Avoid allocating a page array for the gpu coredump
  2021-11-08 17:45 ` [Intel-gfx] " Thomas Hellström
@ 2021-11-08 17:45   ` Thomas Hellström
  -1 siblings, 0 replies; 18+ messages in thread
From: Thomas Hellström @ 2021-11-08 17:45 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

The gpu coredump typically takes place in a dma_fence signalling
critical path, and hence can't use GFP_KERNEL allocations, as that
means we might hit deadlocks under memory pressure. However,
changing to __GFP_KSWAPD_RECLAIM, which will be done in an upcoming
patch, will instead mean a lower chance of the allocation succeeding,
in particular for large contiguous allocations like the coredump page
vector.
Remove the page vector in favor of a linked list of single pages.
Use the page lru list head as the list link, as the page owner is
allowed to do that.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 50 +++++++++++++++------------
 drivers/gpu/drm/i915/i915_gpu_error.h |  4 +--
 2 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2a2d7643b551..14de64282697 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -275,16 +275,16 @@ static bool compress_start(struct i915_vma_compress *c)
 static void *compress_next_page(struct i915_vma_compress *c,
 				struct i915_vma_coredump *dst)
 {
-	void *page;
+	void *page_addr;
+	struct page *page;
 
-	if (dst->page_count >= dst->num_pages)
-		return ERR_PTR(-ENOSPC);
-
-	page = pool_alloc(&c->pool, ALLOW_FAIL);
-	if (!page)
+	page_addr = pool_alloc(&c->pool, ALLOW_FAIL);
+	if (!page_addr)
 		return ERR_PTR(-ENOMEM);
 
-	return dst->pages[dst->page_count++] = page;
+	page = virt_to_page(page_addr);
+	list_add_tail(&page->lru, &dst->page_list);
+	return page_addr;
 }
 
 static int compress_page(struct i915_vma_compress *c,
@@ -397,7 +397,7 @@ static int compress_page(struct i915_vma_compress *c,
 
 	if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE)))
 		memcpy(ptr, src, PAGE_SIZE);
-	dst->pages[dst->page_count++] = ptr;
+	list_add_tail(&virt_to_page(ptr)->lru, &dst->page_list);
 	cond_resched();
 
 	return 0;
@@ -614,7 +614,7 @@ static void print_error_vma(struct drm_i915_error_state_buf *m,
 			    const struct i915_vma_coredump *vma)
 {
 	char out[ASCII85_BUFSZ];
-	int page;
+	struct page *page;
 
 	if (!vma)
 		return;
@@ -628,16 +628,17 @@ static void print_error_vma(struct drm_i915_error_state_buf *m,
 		err_printf(m, "gtt_page_sizes = 0x%08x\n", vma->gtt_page_sizes);
 
 	err_compression_marker(m);
-	for (page = 0; page < vma->page_count; page++) {
+	list_for_each_entry(page, &vma->page_list, lru) {
 		int i, len;
+		const u32 *addr = page_address(page);
 
 		len = PAGE_SIZE;
-		if (page == vma->page_count - 1)
+		if (page == list_last_entry(&vma->page_list, typeof(*page), lru))
 			len -= vma->unused;
 		len = ascii85_encode_len(len);
 
 		for (i = 0; i < len; i++)
-			err_puts(m, ascii85_encode(vma->pages[page][i], out));
+			err_puts(m, ascii85_encode(addr[i], out));
 	}
 	err_puts(m, "\n");
 }
@@ -946,10 +947,12 @@ static void i915_vma_coredump_free(struct i915_vma_coredump *vma)
 {
 	while (vma) {
 		struct i915_vma_coredump *next = vma->next;
-		int page;
+		struct page *page, *n;
 
-		for (page = 0; page < vma->page_count; page++)
-			free_page((unsigned long)vma->pages[page]);
+		list_for_each_entry_safe(page, n, &vma->page_list, lru) {
+			list_del_init(&page->lru);
+			__free_page(page);
+		}
 
 		kfree(vma);
 		vma = next;
@@ -1016,7 +1019,6 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	struct i915_ggtt *ggtt = gt->ggtt;
 	const u64 slot = ggtt->error_capture.start;
 	struct i915_vma_coredump *dst;
-	unsigned long num_pages;
 	struct sgt_iter iter;
 	int ret;
 
@@ -1025,9 +1027,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	if (!vma || !vma->pages || !compress)
 		return NULL;
 
-	num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
-	num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worstcase zlib growth */
-	dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ALLOW_FAIL);
+	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
 	if (!dst)
 		return NULL;
 
@@ -1036,14 +1036,13 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 		return NULL;
 	}
 
+	INIT_LIST_HEAD(&dst->page_list);
 	strcpy(dst->name, name);
 	dst->next = NULL;
 
 	dst->gtt_offset = vma->node.start;
 	dst->gtt_size = vma->node.size;
 	dst->gtt_page_sizes = vma->page_sizes.gtt;
-	dst->num_pages = num_pages;
-	dst->page_count = 0;
 	dst->unused = 0;
 
 	ret = -EINVAL;
@@ -1106,8 +1105,13 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	}
 
 	if (ret || compress_flush(compress, dst)) {
-		while (dst->page_count--)
-			pool_free(&compress->pool, dst->pages[dst->page_count]);
+		struct page *page, *n;
+
+		list_for_each_entry_safe_reverse(page, n, &dst->page_list, lru) {
+			list_del_init(&page->lru);
+			pool_free(&compress->pool, page_address(page));
+		}
+
 		kfree(dst);
 		dst = NULL;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index b98d8cdbe4f2..5aedf5129814 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -39,10 +39,8 @@ struct i915_vma_coredump {
 	u64 gtt_size;
 	u32 gtt_page_sizes;
 
-	int num_pages;
-	int page_count;
 	int unused;
-	u32 *pages[];
+	struct list_head page_list;
 };
 
 struct i915_request_coredump {
-- 
2.31.1
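
The approach in patch 1/4 (replace one large pointer array with a list
of single pages, linked through a field that already exists per page)
can be illustrated with a minimal userspace sketch. This is not the
driver code: struct list_node, struct sketch_page, struct sketch_coredump
and every function below are hypothetical stand-ins for the kernel's
struct list_head, struct page and struct i915_vma_coredump. In the
kernel the struct page metadata already exists for each page, so no
extra allocation is needed for the link; here it is malloc'ed only to
keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

/* Link @node in just before @head, i.e. at the tail of the list. */
static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink @node and leave it pointing at itself. */
static void list_del_node(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

#define SKETCH_PAGE_SIZE 4096

/* Stand-in for struct page: the list link is embedded in the metadata. */
struct sketch_page {
	struct list_node lru;
	unsigned char *data;
};

/* Stand-in for the coredump: only a list head, no sized page array. */
struct sketch_coredump {
	struct list_node page_list;
};

/* Recover the page from its embedded link (what list_entry() does). */
#define page_from_link(ptr) \
	((struct sketch_page *)((char *)(ptr) - offsetof(struct sketch_page, lru)))

/* Allocate one page and append it; no worst-case page count is needed. */
static struct sketch_page *coredump_add_page(struct sketch_coredump *dump)
{
	struct sketch_page *page = malloc(sizeof(*page));

	if (!page)
		return NULL;

	page->data = calloc(1, SKETCH_PAGE_SIZE);
	if (!page->data) {
		free(page);
		return NULL;
	}

	list_add_tail_node(&page->lru, &dump->page_list);
	return page;
}

/* Walk the list; this replaces indexing 0..page_count-1 into an array. */
static size_t coredump_count_pages(const struct sketch_coredump *dump)
{
	const struct list_node *pos;
	size_t n = 0;

	for (pos = dump->page_list.next; pos != &dump->page_list; pos = pos->next)
		n++;
	return n;
}

/* Free every page; the next pointer is saved before the node is unlinked. */
static void coredump_free_pages(struct sketch_coredump *dump)
{
	struct list_node *pos = dump->page_list.next;

	while (pos != &dump->page_list) {
		struct list_node *next = pos->next;
		struct sketch_page *page = page_from_link(pos);

		list_del_node(pos);
		free(page->data);
		free(page);
		pos = next;
	}
}
```

A caller would list_init() the page_list once, call coredump_add_page()
for each captured page, and finish with coredump_free_pages(). Every
allocation is then a single page (plus, in this sketch only, its small
metadata), so nothing depends on one large contiguous allocation
succeeding.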


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 2/4] drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
  2021-11-08 17:45 ` [Intel-gfx] " Thomas Hellström
@ 2021-11-08 17:45   ` Thomas Hellström
  -1 siblings, 0 replies; 18+ messages in thread
From: Thomas Hellström @ 2021-11-08 17:45 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

The capture code is typically run entirely in the fence signalling
critical path. We're about to add lockdep annotation in an upcoming patch
which reveals a lockdep splat similar to the one below.

Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
(which is the same as GFP_NOWAIT, but open-coded for clarity) rather than
GFP_KERNEL for memory allocation in the capture path. This has the
potential drawback that capture might fail in situations with memory
pressure.

[  234.842048] WARNING: possible circular locking dependency detected
[  234.842050] 5.15.0-rc7+ #20 Tainted: G     U  W
[  234.842052] ------------------------------------------------------
[  234.842054] gem_exec_captur/1180 is trying to acquire lock:
[  234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
[  234.842063]
               but task is already holding lock:
[  234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842138]
               which lock already depends on the new lock.

[  234.842140]
               the existing dependency chain (in reverse order) is:
[  234.842142]
               -> #2 (dma_fence_map){++++}-{0:0}:
[  234.842145]        __dma_fence_might_wait+0x41/0xa0
[  234.842149]        dma_resv_lockdep+0x1dc/0x28f
[  234.842151]        do_one_initcall+0x58/0x2d0
[  234.842154]        kernel_init_freeable+0x273/0x2bf
[  234.842157]        kernel_init+0x16/0x120
[  234.842160]        ret_from_fork+0x1f/0x30
[  234.842163]
               -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[  234.842166]        fs_reclaim_acquire+0x6d/0xd0
[  234.842168]        __kmalloc_node+0x51/0x3a0
[  234.842171]        alloc_cpumask_var_node+0x1b/0x30
[  234.842174]        native_smp_prepare_cpus+0xc7/0x292
[  234.842177]        kernel_init_freeable+0x160/0x2bf
[  234.842179]        kernel_init+0x16/0x120
[  234.842181]        ret_from_fork+0x1f/0x30
[  234.842184]
               -> #0 (fs_reclaim){+.+.}-{0:0}:
[  234.842186]        __lock_acquire+0x1161/0x1dc0
[  234.842189]        lock_acquire+0xb5/0x2b0
[  234.842192]        fs_reclaim_acquire+0xa1/0xd0
[  234.842193]        __kmalloc+0x4d/0x330
[  234.842196]        i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842253]        intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842307]        __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842365]        i915_capture_error_state+0x57/0xa0 [i915]
[  234.842415]        intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.842462]        intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.842504]        simple_attr_write+0xc1/0xe0
[  234.842507]        full_proxy_write+0x53/0x80
[  234.842509]        vfs_write+0xbc/0x350
[  234.842513]        ksys_write+0x58/0xd0
[  234.842514]        do_syscall_64+0x38/0x90
[  234.842516]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.842519]
               other info that might help us debug this:

[  234.842521] Chain exists of:
                 fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map

[  234.842526]  Possible unsafe locking scenario:

[  234.842528]        CPU0                    CPU1
[  234.842529]        ----                    ----
[  234.842531]   lock(dma_fence_map);
[  234.842532]                                lock(mmu_notifier_invalidate_range_start);
[  234.842535]                                lock(dma_fence_map);
[  234.842537]   lock(fs_reclaim);
[  234.842539]
                *** DEADLOCK ***

[  234.842540] 4 locks held by gem_exec_captur/1180:
[  234.842543]  #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
[  234.842547]  #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
[  234.842552]  #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
[  234.842602]  #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842656]
               stack backtrace:
[  234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G     U  W         5.15.0-rc7+ #20
[  234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[  234.842664] Call Trace:
[  234.842666]  dump_stack_lvl+0x57/0x72
[  234.842669]  check_noncircular+0xde/0x100
[  234.842672]  ? __lock_acquire+0x3bf/0x1dc0
[  234.842675]  __lock_acquire+0x1161/0x1dc0
[  234.842678]  lock_acquire+0xb5/0x2b0
[  234.842680]  ? __kmalloc+0x4d/0x330
[  234.842683]  ? finish_task_switch.isra.0+0xf2/0x360
[  234.842686]  ? i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842734]  fs_reclaim_acquire+0xa1/0xd0
[  234.842737]  ? __kmalloc+0x4d/0x330
[  234.842739]  __kmalloc+0x4d/0x330
[  234.842742]  i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842793]  ? capture_vma+0xbe/0x110 [i915]
[  234.842844]  intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842892]  __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842939]  i915_capture_error_state+0x57/0xa0 [i915]
[  234.842985]  intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.843032]  ? __mutex_lock+0x81/0x830
[  234.843035]  ? simple_attr_write+0x3a/0xe0
[  234.843038]  ? __lock_acquire+0x3bf/0x1dc0
[  234.843041]  intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.843083]  ? _copy_from_user+0x45/0x80
[  234.843086]  simple_attr_write+0xc1/0xe0
[  234.843089]  full_proxy_write+0x53/0x80
[  234.843091]  vfs_write+0xbc/0x350
[  234.843094]  ksys_write+0x58/0xd0
[  234.843096]  do_syscall_64+0x38/0x90
[  234.843098]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.843101] RIP: 0033:0x7fa467480877
[  234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
[  234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
[  234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
[  234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
[  234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005

v5:
- Use __GFP_KSWAPD_RECLAIM rather than GFP_NOWAIT for clarity.
  (Daniel Vetter)
v6:
- Include an instance in execlists_capture_work().
- Rework the commit message due to patch reordering.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 3 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c                | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index ca03880fa7e4..a69df5e9e77a 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2186,7 +2186,8 @@ struct execlists_capture {
 static void execlists_capture_work(struct work_struct *work)
 {
 	struct execlists_capture *cap = container_of(work, typeof(*cap), work);
-	const gfp_t gfp = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN;
+	const gfp_t gfp = __GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL |
+		__GFP_NOWARN;
 	struct intel_engine_cs *engine = cap->rq->engine;
 	struct intel_gt_coredump *gt = cap->error->gt;
 	struct intel_engine_capture_vma *vma;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 14de64282697..b1e4ce0f798f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -49,7 +49,7 @@
 #include "i915_memcpy.h"
 #include "i915_scatterlist.h"
 
-#define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
+#define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
 
 static void __sg_set_buf(struct scatterlist *sg,
-- 
2.31.1
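
The effect of the flag change in patch 2/4 can be modelled in userspace
as below. This is only a sketch: the SK_* names and the two helper
functions are hypothetical, and the bit values are illustrative (the
real definitions live in include/linux/gfp.h and differ between kernel
versions). What it relies on is that GFP_KERNEL is composed of the
direct-reclaim, kswapd-reclaim, IO and FS bits, and that an allocation
may only block in reclaim when the direct-reclaim bit is set, which is
the same test the kernel's gfpflags_allow_blocking() performs.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int sketch_gfp_t;

/* Illustrative bit values only, not taken from any particular kernel. */
#define SK_GFP_IO             0x40u
#define SK_GFP_FS             0x80u
#define SK_GFP_DIRECT_RECLAIM 0x400u
#define SK_GFP_KSWAPD_RECLAIM 0x800u

/* Same composition as __GFP_RECLAIM and GFP_KERNEL in the kernel. */
#define SK_GFP_RECLAIM (SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)
#define SK_GFP_KERNEL  (SK_GFP_RECLAIM | SK_GFP_IO | SK_GFP_FS)

/* True if the caller may sleep in direct reclaim to satisfy the request. */
static bool sketch_allows_blocking(sketch_gfp_t flags)
{
	return (flags & SK_GFP_DIRECT_RECLAIM) != 0;
}

/* True if the allocation may wake the background reclaim thread. */
static bool sketch_wakes_kswapd(sketch_gfp_t flags)
{
	return (flags & SK_GFP_KSWAPD_RECLAIM) != 0;
}
```

With these definitions, sketch_allows_blocking(SK_GFP_KERNEL) is true,
while sketch_allows_blocking(SK_GFP_KSWAPD_RECLAIM) is false and
sketch_wakes_kswapd(SK_GFP_KSWAPD_RECLAIM) is true. An allocation that
never enters direct reclaim does not take the fs_reclaim lockdep map
seen in the splat above, at the price of failing more readily under
memory pressure.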


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Intel-gfx] [PATCH v6 2/4] drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
@ 2021-11-08 17:45   ` Thomas Hellström
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Hellström @ 2021-11-08 17:45 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

The capture code is typically run entirely in the fence signalling
critical path. We're about to add lockdep annotation in an upcoming patch
which reveals a lockdep splat similar to the below one.

Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
(which is the same as GFP_WAIT, but open-coded for clarity) rather than
GFP_KERNEL for memory allocation in the capture path. This has the
potential drawback that capture might fail in situations with memory
pressure.

[  234.842048] WARNING: possible circular locking dependency detected
[  234.842050] 5.15.0-rc7+ #20 Tainted: G     U  W
[  234.842052] ------------------------------------------------------
[  234.842054] gem_exec_captur/1180 is trying to acquire lock:
[  234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
[  234.842063]
               but task is already holding lock:
[  234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842138]
               which lock already depends on the new lock.

[  234.842140]
               the existing dependency chain (in reverse order) is:
[  234.842142]
               -> #2 (dma_fence_map){++++}-{0:0}:
[  234.842145]        __dma_fence_might_wait+0x41/0xa0
[  234.842149]        dma_resv_lockdep+0x1dc/0x28f
[  234.842151]        do_one_initcall+0x58/0x2d0
[  234.842154]        kernel_init_freeable+0x273/0x2bf
[  234.842157]        kernel_init+0x16/0x120
[  234.842160]        ret_from_fork+0x1f/0x30
[  234.842163]
               -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[  234.842166]        fs_reclaim_acquire+0x6d/0xd0
[  234.842168]        __kmalloc_node+0x51/0x3a0
[  234.842171]        alloc_cpumask_var_node+0x1b/0x30
[  234.842174]        native_smp_prepare_cpus+0xc7/0x292
[  234.842177]        kernel_init_freeable+0x160/0x2bf
[  234.842179]        kernel_init+0x16/0x120
[  234.842181]        ret_from_fork+0x1f/0x30
[  234.842184]
               -> #0 (fs_reclaim){+.+.}-{0:0}:
[  234.842186]        __lock_acquire+0x1161/0x1dc0
[  234.842189]        lock_acquire+0xb5/0x2b0
[  234.842192]        fs_reclaim_acquire+0xa1/0xd0
[  234.842193]        __kmalloc+0x4d/0x330
[  234.842196]        i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842253]        intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842307]        __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842365]        i915_capture_error_state+0x57/0xa0 [i915]
[  234.842415]        intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.842462]        intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.842504]        simple_attr_write+0xc1/0xe0
[  234.842507]        full_proxy_write+0x53/0x80
[  234.842509]        vfs_write+0xbc/0x350
[  234.842513]        ksys_write+0x58/0xd0
[  234.842514]        do_syscall_64+0x38/0x90
[  234.842516]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.842519]
               other info that might help us debug this:

[  234.842521] Chain exists of:
                 fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map

[  234.842526]  Possible unsafe locking scenario:

[  234.842528]        CPU0                    CPU1
[  234.842529]        ----                    ----
[  234.842531]   lock(dma_fence_map);
[  234.842532]                                lock(mmu_notifier_invalidate_range_start);
[  234.842535]                                lock(dma_fence_map);
[  234.842537]   lock(fs_reclaim);
[  234.842539]
                *** DEADLOCK ***

[  234.842540] 4 locks held by gem_exec_captur/1180:
[  234.842543]  #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
[  234.842547]  #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
[  234.842552]  #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
[  234.842602]  #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842656]
               stack backtrace:
[  234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G     U  W         5.15.0-rc7+ #20
[  234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[  234.842664] Call Trace:
[  234.842666]  dump_stack_lvl+0x57/0x72
[  234.842669]  check_noncircular+0xde/0x100
[  234.842672]  ? __lock_acquire+0x3bf/0x1dc0
[  234.842675]  __lock_acquire+0x1161/0x1dc0
[  234.842678]  lock_acquire+0xb5/0x2b0
[  234.842680]  ? __kmalloc+0x4d/0x330
[  234.842683]  ? finish_task_switch.isra.0+0xf2/0x360
[  234.842686]  ? i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842734]  fs_reclaim_acquire+0xa1/0xd0
[  234.842737]  ? __kmalloc+0x4d/0x330
[  234.842739]  __kmalloc+0x4d/0x330
[  234.842742]  i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842793]  ? capture_vma+0xbe/0x110 [i915]
[  234.842844]  intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842892]  __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842939]  i915_capture_error_state+0x57/0xa0 [i915]
[  234.842985]  intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.843032]  ? __mutex_lock+0x81/0x830
[  234.843035]  ? simple_attr_write+0x3a/0xe0
[  234.843038]  ? __lock_acquire+0x3bf/0x1dc0
[  234.843041]  intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.843083]  ? _copy_from_user+0x45/0x80
[  234.843086]  simple_attr_write+0xc1/0xe0
[  234.843089]  full_proxy_write+0x53/0x80
[  234.843091]  vfs_write+0xbc/0x350
[  234.843094]  ksys_write+0x58/0xd0
[  234.843096]  do_syscall_64+0x38/0x90
[  234.843098]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.843101] RIP: 0033:0x7fa467480877
[  234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
[  234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
[  234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
[  234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
[  234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005

v5:
- Use __GFP_KSWAPD_RECLAIM rather than GFP_NOWAIT for clarity.
  (Daniel Vetter)
v6:
- Include an instance in execlists_capture_work().
- Rework the commit message due to patch reordering.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 3 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c                | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index ca03880fa7e4..a69df5e9e77a 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2186,7 +2186,8 @@ struct execlists_capture {
 static void execlists_capture_work(struct work_struct *work)
 {
 	struct execlists_capture *cap = container_of(work, typeof(*cap), work);
-	const gfp_t gfp = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN;
+	const gfp_t gfp = __GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL |
+		__GFP_NOWARN;
 	struct intel_engine_cs *engine = cap->rq->engine;
 	struct intel_gt_coredump *gt = cap->error->gt;
 	struct intel_engine_capture_vma *vma;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 14de64282697..b1e4ce0f798f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -49,7 +49,7 @@
 #include "i915_memcpy.h"
 #include "i915_scatterlist.h"
 
-#define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
+#define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
 
 static void __sg_set_buf(struct scatterlist *sg,
-- 
2.31.1


* [PATCH v6 3/4] drm/i915: Update error capture code to avoid using the current vma state
From: Thomas Hellström @ 2021-11-08 17:45 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

With asynchronous migrations, the vma state may be several migrations
ahead of the state that matches the request we're capturing.
Address that by introducing an i915_vma_snapshot structure that
can be used to snapshot relevant state at request submission.
In order to make sure we access the correct memory, the snapshots take
references on relevant sg-tables and memory regions.

Also move the capture list allocation out of the fence signaling
critical path and use the CONFIG_DRM_I915_CAPTURE_ERROR define to
avoid compiling in members and functions used for error capture
when they're not used.

Finally, introduce lockdep annotation.

v4:
- Break out the capture allocation mode change to a separate patch.
v5:
- Fix compilation error in the !CONFIG_DRM_I915_CAPTURE_ERROR case
  (kernel test robot)
v6:
- Use #if IS_ENABLED() instead of #ifdef to match driver style.
- Move yet another change of allocation mode to the separate patch.
- Commit message rework due to patch reordering.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
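Note on the stage/commit/release split: the idea is that anything that can
allocate runs before the fence signaling critical path, and the critical
path itself only hands over pointers. Below is a minimal userspace sketch of
that pattern. It is not the i915 code: struct capture_node, struct request,
struct submit_ctx and the capture_*() helpers are hypothetical stand-ins,
malloc() stands in for the GFP_KERNEL allocations, and an int payload stands
in for the vma snapshot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the i915 types. */
struct capture_node {
	struct capture_node *next;
	int payload;			/* stands in for the vma snapshot */
};

struct request {
	struct capture_node *capture_list;
};

struct submit_ctx {
	struct capture_node *staged;	/* owned by the submitter until commit */
};

/* Stage: may allocate and may fail. Runs before the critical section. */
static void capture_stage(struct submit_ctx *ctx, const int *payloads, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct capture_node *node = malloc(sizeof(*node));

		if (!node)
			continue;	/* capture is best effort */

		node->payload = payloads[i];
		node->next = ctx->staged;
		ctx->staged = node;
	}
}

/* Commit: pointer handover only, no allocation. */
static void capture_commit(struct submit_ctx *ctx, struct request *rq)
{
	rq->capture_list = ctx->staged;
	ctx->staged = NULL;
}

/* Free a list, including the nodes themselves. Returns the node count. */
static size_t capture_free_list(struct capture_node *node)
{
	size_t freed = 0;

	while (node) {
		struct capture_node *next = node->next;

		free(node);
		node = next;
		freed++;
	}

	return freed;
}

/* Release: free whatever was staged but never committed (error paths). */
static size_t capture_release(struct submit_ctx *ctx)
{
	size_t freed = capture_free_list(ctx->staged);

	ctx->staged = NULL;
	return freed;
}
```

After a successful commit the release step finds nothing to free, so it can
be called unconditionally on every exit path. The committed list is then
freed by whoever owns the request.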
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 135 +++++++++++---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   8 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         | 176 +++++++++++++-----
 drivers/gpu/drm/i915/i915_request.c           |  63 +++++--
 drivers/gpu/drm/i915/i915_request.h           |  20 +-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      | 137 ++++++++++++++
 drivers/gpu/drm/i915/i915_vma_snapshot.h      | 112 +++++++++++
 8 files changed, 557 insertions(+), 95 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
 create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h
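
Note on the snapshot lifetime rules: a snapshot is reference counted, and
the onstack flag only controls whether the final put frees the storage. A
minimal userspace sketch of that behaviour follows. It is an illustration
under stated assumptions, not the kernel implementation: struct snapshot and
the snapshot_*() helpers are hypothetical names, and a plain non-atomic
counter replaces struct kref, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; the patch uses struct kref instead. */
struct snapshot {
	unsigned int refcount;
	bool onstack;	/* embedded or stack storage: never free() it */
	bool present;	/* cleared when the last reference is dropped */
};

static void snapshot_init(struct snapshot *s, bool onstack)
{
	s->refcount = 1;
	s->onstack = onstack;
	s->present = true;
}

static struct snapshot *snapshot_get(struct snapshot *s)
{
	s->refcount++;
	return s;
}

/* Returns true if this put dropped the last reference. */
static bool snapshot_put(struct snapshot *s)
{
	if (--s->refcount)
		return false;

	s->present = false;
	if (!s->onstack)
		free(s);

	return true;
}

/* For embedded snapshots the owner's put must be the final one. */
static void snapshot_put_onstack(struct snapshot *s)
{
	bool last = snapshot_put(s);

	assert(last);
	(void)last;	/* silence unused warning when NDEBUG is defined */
}
```

The assertion in snapshot_put_onstack() plays the role of the GEM_BUG_ON()
in the patch: if somebody still holds a reference when the embedding
structure goes away, that is a bug in the caller.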

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 7d0d0b814670..b4372e0a802f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -174,6 +174,7 @@ i915-y += \
 	  i915_trace_points.o \
 	  i915_ttm_buddy_manager.o \
 	  i915_vma.o \
+	  i915_vma_snapshot.o \
 	  intel_wopcm.o
 
 # general-purpose microcontroller (GuC) support
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index ea5b7b2a4d70..ddca899f2396 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -29,6 +29,7 @@
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
+#include "i915_vma_snapshot.h"
 
 struct eb_vma {
 	struct i915_vma *vma;
@@ -307,11 +308,15 @@ struct i915_execbuffer {
 
 	struct eb_fence *fences;
 	unsigned long num_fences;
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+	struct i915_capture_list *capture_lists[MAX_ENGINE_INSTANCE + 1];
+#endif
 };
 
 static int eb_parse(struct i915_execbuffer *eb);
 static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle);
 static void eb_unpin_engine(struct i915_execbuffer *eb);
+static void eb_capture_release(struct i915_execbuffer *eb);
 
 static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
 {
@@ -1054,6 +1059,7 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
 			i915_vma_put(vma);
 	}
 
+	eb_capture_release(eb);
 	eb_unpin_engine(eb);
 }
 
@@ -1891,36 +1897,113 @@ eb_find_first_request_added(struct i915_execbuffer *eb)
 	return NULL;
 }
 
-static int eb_move_to_gpu(struct i915_execbuffer *eb)
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+
+/* Stage with GFP_KERNEL allocations before we enter the signaling critical path */
+static void eb_capture_stage(struct i915_execbuffer *eb)
 {
 	const unsigned int count = eb->buffer_count;
-	unsigned int i = count;
-	int err = 0, j;
+	unsigned int i = count, j;
+	struct i915_vma_snapshot *vsnap;
 
 	while (i--) {
 		struct eb_vma *ev = &eb->vma[i];
 		struct i915_vma *vma = ev->vma;
 		unsigned int flags = ev->flags;
-		struct drm_i915_gem_object *obj = vma->obj;
 
-		assert_vma_held(vma);
+		if (!(flags & EXEC_OBJECT_CAPTURE))
+			continue;
 
-		if (flags & EXEC_OBJECT_CAPTURE) {
+		vsnap = i915_vma_snapshot_alloc(GFP_KERNEL);
+		if (!vsnap)
+			continue;
+
+		i915_vma_snapshot_init(vsnap, vma, "user");
+		for_each_batch_create_order(eb, j) {
 			struct i915_capture_list *capture;
 
-			for_each_batch_create_order(eb, j) {
-				if (!eb->requests[j])
-					break;
+			capture = kmalloc(sizeof(*capture), GFP_KERNEL);
+			if (!capture)
+				continue;
 
-				capture = kmalloc(sizeof(*capture), GFP_KERNEL);
-				if (capture) {
-					capture->next =
-						eb->requests[j]->capture_list;
-					capture->vma = vma;
-					eb->requests[j]->capture_list = capture;
-				}
-			}
+			capture->next = eb->capture_lists[j];
+			capture->vma_snapshot = i915_vma_snapshot_get(vsnap);
+			eb->capture_lists[j] = capture;
+		}
+		i915_vma_snapshot_put(vsnap);
+	}
+}
+
+/* Commit once we're in the critical path */
+static void eb_capture_commit(struct i915_execbuffer *eb)
+{
+	unsigned int j;
+
+	for_each_batch_create_order(eb, j) {
+		struct i915_request *rq = eb->requests[j];
+
+		if (!rq)
+			break;
+
+		rq->capture_list = eb->capture_lists[j];
+		eb->capture_lists[j] = NULL;
+	}
+}
+
+/*
+ * Release anything that didn't get committed due to errors.
+ * The capture_list will otherwise be freed at request retire.
+ */
+static void eb_capture_release(struct i915_execbuffer *eb)
+{
+	unsigned int j;
+
+	for_each_batch_create_order(eb, j) {
+		if (eb->capture_lists[j]) {
+			i915_request_free_capture_list(eb->capture_lists[j]);
+			eb->capture_lists[j] = NULL;
 		}
+	}
+}
+
+static void eb_capture_list_clear(struct i915_execbuffer *eb)
+{
+	memset(eb->capture_lists, 0, sizeof(eb->capture_lists));
+}
+
+#else
+
+static void eb_capture_stage(struct i915_execbuffer *eb)
+{
+}
+
+static void eb_capture_commit(struct i915_execbuffer *eb)
+{
+}
+
+static void eb_capture_release(struct i915_execbuffer *eb)
+{
+}
+
+static void eb_capture_list_clear(struct i915_execbuffer *eb)
+{
+}
+
+#endif
+
+static int eb_move_to_gpu(struct i915_execbuffer *eb)
+{
+	const unsigned int count = eb->buffer_count;
+	unsigned int i = count;
+	int err = 0, j;
+
+	while (i--) {
+		struct eb_vma *ev = &eb->vma[i];
+		struct i915_vma *vma = ev->vma;
+		unsigned int flags = ev->flags;
+		struct drm_i915_gem_object *obj = vma->obj;
+
+		assert_vma_held(vma);
 
 		/*
 		 * If the GPU is not _reading_ through the CPU cache, we need
@@ -2001,6 +2084,8 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 
 	/* Unconditionally flush any chipset caches (for streaming writes). */
 	intel_gt_chipset_flush(eb->gt);
+	eb_capture_commit(eb);
+
 	return 0;
 
 err_skip:
@@ -3143,13 +3228,14 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 		}
 
 		/*
-		 * Whilst this request exists, batch_obj will be on the
-		 * active_list, and so will hold the active reference. Only when
-		 * this request is retired will the batch_obj be moved onto
-		 * the inactive_list and lose its active reference. Hence we do
-		 * not need to explicitly hold another reference here.
+		 * Not really on stack, but we don't want to call
+		 * kfree on the batch_snapshot when we put it, so use the
+		 * _onstack interface.
 		 */
-		eb->requests[i]->batch = eb->batches[i]->vma;
+		if (eb->batches[i]->vma)
+			i915_vma_snapshot_init_onstack(&eb->requests[i]->batch_snapshot,
+						       eb->batches[i]->vma,
+						       "batch");
 		if (eb->batch_pool) {
 			GEM_BUG_ON(intel_context_is_parallel(eb->context));
 			intel_gt_buffer_pool_mark_active(eb->batch_pool,
@@ -3198,6 +3284,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.fences = NULL;
 	eb.num_fences = 0;
 
+	eb_capture_list_clear(&eb);
+
 	memset(eb.requests, 0, sizeof(struct i915_request *) *
 	       ARRAY_SIZE(eb.requests));
 	eb.composite_fence = NULL;
@@ -3284,6 +3372,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	}
 
 	ww_acquire_done(&eb.ww.ctx);
+	eb_capture_stage(&eb);
 
 	out_fence = eb_requests_create(&eb, in_fence, out_fence_fd);
 	if (IS_ERR(out_fence)) {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 332756036007..f2ccd5b53d42 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1676,14 +1676,18 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 
 static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
 {
+	struct i915_vma_snapshot *vsnap = &rq->batch_snapshot;
 	void *ring;
 	int size;
 
+	if (!i915_vma_snapshot_present(vsnap))
+		vsnap = NULL;
+
 	drm_printf(m,
 		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
 		   rq->head, rq->postfix, rq->tail,
-		   rq->batch ? upper_32_bits(rq->batch->node.start) : ~0u,
-		   rq->batch ? lower_32_bits(rq->batch->node.start) : ~0u);
+		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
+		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
 
 	size = rq->tail - rq->head;
 	if (rq->tail < rq->head)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index b1e4ce0f798f..c61f9aaa40f9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -48,6 +48,7 @@
 #include "i915_gpu_error.h"
 #include "i915_memcpy.h"
 #include "i915_scatterlist.h"
+#include "i915_vma_snapshot.h"
 
 #define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
@@ -1012,8 +1013,7 @@ void __i915_gpu_coredump_free(struct kref *error_ref)
 
 static struct i915_vma_coredump *
 i915_vma_coredump_create(const struct intel_gt *gt,
-			 const struct i915_vma *vma,
-			 const char *name,
+			 const struct i915_vma_snapshot *vsnap,
 			 struct i915_vma_compress *compress)
 {
 	struct i915_ggtt *ggtt = gt->ggtt;
@@ -1024,7 +1024,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 
 	might_sleep();
 
-	if (!vma || !vma->pages || !compress)
+	if (!vsnap || !vsnap->pages || !compress)
 		return NULL;
 
 	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
@@ -1037,12 +1037,12 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	}
 
 	INIT_LIST_HEAD(&dst->page_list);
-	strcpy(dst->name, name);
+	strcpy(dst->name, vsnap->name);
 	dst->next = NULL;
 
-	dst->gtt_offset = vma->node.start;
-	dst->gtt_size = vma->node.size;
-	dst->gtt_page_sizes = vma->page_sizes.gtt;
+	dst->gtt_offset = vsnap->gtt_offset;
+	dst->gtt_size = vsnap->gtt_size;
+	dst->gtt_page_sizes = vsnap->page_sizes;
 	dst->unused = 0;
 
 	ret = -EINVAL;
@@ -1050,7 +1050,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 		void __iomem *s;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vma->pages) {
+		for_each_sgt_daddr(dma, iter, vsnap->pages) {
 			mutex_lock(&ggtt->error_mutex);
 			ggtt->vm.insert_page(&ggtt->vm, dma, slot,
 					     I915_CACHE_NONE, 0);
@@ -1068,11 +1068,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 			if (ret)
 				break;
 		}
-	} else if (__i915_gem_object_is_lmem(vma->obj)) {
-		struct intel_memory_region *mem = vma->obj->mm.region;
+	} else if (vsnap->mr && vsnap->mr->type != INTEL_MEMORY_SYSTEM) {
+		struct intel_memory_region *mem = vsnap->mr;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vma->pages) {
+		for_each_sgt_daddr(dma, iter, vsnap->pages) {
 			void __iomem *s;
 
 			s = io_mapping_map_wc(&mem->iomap,
@@ -1088,7 +1088,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	} else {
 		struct page *page;
 
-		for_each_sgt_page(page, iter, vma->pages) {
+		for_each_sgt_page(page, iter, vsnap->pages) {
 			void *s;
 
 			drm_clflush_pages(&page, 1);
@@ -1324,37 +1324,71 @@ static bool record_context(struct i915_gem_context_coredump *e,
 
 struct intel_engine_capture_vma {
 	struct intel_engine_capture_vma *next;
-	struct i915_vma *vma;
+	struct i915_vma_snapshot *vsnap;
 	char name[16];
+	bool lockdep_cookie;
 };
 
 static struct intel_engine_capture_vma *
-capture_vma(struct intel_engine_capture_vma *next,
-	    struct i915_vma *vma,
-	    const char *name,
-	    gfp_t gfp)
+capture_vma_snapshot(struct intel_engine_capture_vma *next,
+		     struct i915_vma_snapshot *vsnap,
+		     gfp_t gfp)
 {
 	struct intel_engine_capture_vma *c;
 
-	if (!vma)
+	if (!i915_vma_snapshot_present(vsnap))
 		return next;
 
 	c = kmalloc(sizeof(*c), gfp);
 	if (!c)
 		return next;
 
-	if (!i915_active_acquire_if_busy(&vma->active)) {
+	if (!i915_vma_snapshot_resource_pin(vsnap, &c->lockdep_cookie)) {
 		kfree(c);
 		return next;
 	}
 
-	strcpy(c->name, name);
-	c->vma = vma; /* reference held while active */
+	strcpy(c->name, vsnap->name);
+	c->vsnap = vsnap;
+	i915_vma_snapshot_get(vsnap);
 
 	c->next = next;
 	return c;
 }
 
+static struct intel_engine_capture_vma *
+capture_vma(struct intel_engine_capture_vma *next,
+	    struct i915_vma *vma,
+	    const char *name,
+	    gfp_t gfp)
+{
+	struct i915_vma_snapshot *vsnap;
+
+	if (!vma)
+		return next;
+
+	/*
+	 * If the vma isn't pinned, then the vma should be snapshotted
+	 * to a struct i915_vma_snapshot at command submission time.
+	 * Not here.
+	 */
+	GEM_WARN_ON(!i915_vma_is_pinned(vma));
+	if (!i915_vma_is_pinned(vma))
+		return next;
+
+	vsnap = i915_vma_snapshot_alloc(gfp);
+	if (!vsnap)
+		return next;
+
+	i915_vma_snapshot_init(vsnap, vma, name);
+	next = capture_vma_snapshot(next, vsnap, gfp);
+
+	/* FIXME: Replace on async unbind. */
+	i915_vma_snapshot_put(vsnap);
+
+	return next;
+}
+
 static struct intel_engine_capture_vma *
 capture_user(struct intel_engine_capture_vma *capture,
 	     const struct i915_request *rq,
@@ -1363,7 +1397,7 @@ capture_user(struct intel_engine_capture_vma *capture,
 	struct i915_capture_list *c;
 
 	for (c = rq->capture_list; c; c = c->next)
-		capture = capture_vma(capture, c->vma, "user", gfp);
+		capture = capture_vma_snapshot(capture, c->vma_snapshot, gfp);
 
 	return capture;
 }
@@ -1377,6 +1411,36 @@ static void add_vma(struct intel_engine_coredump *ee,
 	}
 }
 
+static struct i915_vma_coredump *
+create_vma_coredump(const struct intel_gt *gt, struct i915_vma *vma,
+		    const char *name, struct i915_vma_compress *compress)
+{
+	struct i915_vma_coredump *ret = NULL;
+	struct i915_vma_snapshot tmp;
+	bool lockdep_cookie;
+
+	if (!vma)
+		return NULL;
+
+	i915_vma_snapshot_init_onstack(&tmp, vma, name);
+	if (i915_vma_snapshot_resource_pin(&tmp, &lockdep_cookie)) {
+		ret = i915_vma_coredump_create(gt, &tmp, compress);
+		i915_vma_snapshot_resource_unpin(&tmp, lockdep_cookie);
+	}
+	i915_vma_snapshot_put_onstack(&tmp);
+
+	return ret;
+}
+
+static void add_vma_coredump(struct intel_engine_coredump *ee,
+			     const struct intel_gt *gt,
+			     struct i915_vma *vma,
+			     const char *name,
+			     struct i915_vma_compress *compress)
+{
+	add_vma(ee, create_vma_coredump(gt, vma, name, compress));
+}
+
 struct intel_engine_coredump *
 intel_engine_coredump_alloc(struct intel_engine_cs *engine, gfp_t gfp)
 {
@@ -1410,7 +1474,7 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee,
 	 * as the simplest method to avoid being overwritten
 	 * by userspace.
 	 */
-	vma = capture_vma(vma, rq->batch, "batch", gfp);
+	vma = capture_vma_snapshot(vma, &rq->batch_snapshot, gfp);
 	vma = capture_user(vma, rq, gfp);
 	vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
 	vma = capture_vma(vma, rq->context->state, "HW context", gfp);
@@ -1431,30 +1495,24 @@ intel_engine_coredump_add_vma(struct intel_engine_coredump *ee,
 
 	while (capture) {
 		struct intel_engine_capture_vma *this = capture;
-		struct i915_vma *vma = this->vma;
+		struct i915_vma_snapshot *vsnap = this->vsnap;
 
 		add_vma(ee,
 			i915_vma_coredump_create(engine->gt,
-						 vma, this->name,
-						 compress));
+						 vsnap, compress));
 
-		i915_active_release(&vma->active);
+		i915_vma_snapshot_resource_unpin(vsnap, this->lockdep_cookie);
+		i915_vma_snapshot_put(vsnap);
 
 		capture = this->next;
 		kfree(this);
 	}
 
-	add_vma(ee,
-		i915_vma_coredump_create(engine->gt,
-					 engine->status_page.vma,
-					 "HW Status",
-					 compress));
+	add_vma_coredump(ee, engine->gt, engine->status_page.vma,
+			 "HW Status", compress);
 
-	add_vma(ee,
-		i915_vma_coredump_create(engine->gt,
-					 engine->wa_ctx.vma,
-					 "WA context",
-					 compress));
+	add_vma_coredump(ee, engine->gt, engine->wa_ctx.vma,
+			 "WA context", compress);
 }
 
 static struct intel_engine_coredump *
@@ -1490,17 +1548,25 @@ capture_engine(struct intel_engine_cs *engine,
 		}
 	}
 	if (rq)
-		capture = intel_engine_coredump_add_request(ee, rq,
-							    ATOMIC_MAYFAIL);
+		rq = i915_request_get_rcu(rq);
+
+	if (!rq)
+		goto no_request_capture;
+
+	capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL);
 	if (!capture) {
-no_request_capture:
-		kfree(ee);
-		return NULL;
+		i915_request_put(rq);
+		goto no_request_capture;
 	}
 
 	intel_engine_coredump_add_vma(ee, capture, compress);
+	i915_request_put(rq);
 
 	return ee;
+
+no_request_capture:
+	kfree(ee);
+	return NULL;
 }
 
 static void
@@ -1554,10 +1620,8 @@ gt_record_uc(struct intel_gt_coredump *gt,
 	 */
 	error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL);
 	error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL);
-	error_uc->guc_log =
-		i915_vma_coredump_create(gt->_gt,
-					 uc->guc.log.vma, "GuC log buffer",
-					 compress);
+	error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
+						"GuC log buffer", compress);
 
 	return error_uc;
 }
@@ -1843,8 +1907,8 @@ void i915_vma_capture_finish(struct intel_gt_coredump *gt,
 	kfree(compress);
 }
 
-struct i915_gpu_coredump *
-i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
+static struct i915_gpu_coredump *
+__i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
 {
 	struct drm_i915_private *i915 = gt->i915;
 	struct i915_gpu_coredump *error;
@@ -1885,6 +1949,22 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
 	return error;
 }
 
+struct i915_gpu_coredump *
+i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
+{
+	static DEFINE_MUTEX(capture_mutex);
+	int ret = mutex_lock_interruptible(&capture_mutex);
+	struct i915_gpu_coredump *dump;
+
+	if (ret)
+		return ERR_PTR(ret);
+
+	dump = __i915_gpu_coredump(gt, engine_mask);
+	mutex_unlock(&capture_mutex);
+
+	return dump;
+}
+
 void i915_error_state_store(struct i915_gpu_coredump *error)
 {
 	struct drm_i915_private *i915;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 623273aca09e..ee855dc86972 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -113,6 +113,10 @@ static void i915_fence_release(struct dma_fence *fence)
 	GEM_BUG_ON(rq->guc_prio != GUC_PRIO_INIT &&
 		   rq->guc_prio != GUC_PRIO_FINI);
 
+	i915_request_free_capture_list(fetch_and_zero(&rq->capture_list));
+	if (i915_vma_snapshot_present(&rq->batch_snapshot))
+		i915_vma_snapshot_put_onstack(&rq->batch_snapshot);
+
 	/*
 	 * The request is put onto a RCU freelist (i.e. the address
 	 * is immediately reused), mark the fences as being freed now.
@@ -186,19 +190,6 @@ void i915_request_notify_execute_cb_imm(struct i915_request *rq)
 	__notify_execute_cb(rq, irq_work_imm);
 }
 
-static void free_capture_list(struct i915_request *request)
-{
-	struct i915_capture_list *capture;
-
-	capture = fetch_and_zero(&request->capture_list);
-	while (capture) {
-		struct i915_capture_list *next = capture->next;
-
-		kfree(capture);
-		capture = next;
-	}
-}
-
 static void __i915_request_fill(struct i915_request *rq, u8 val)
 {
 	void *vaddr = rq->ring->vaddr;
@@ -303,6 +294,38 @@ static void __rq_cancel_watchdog(struct i915_request *rq)
 		i915_request_put(rq);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+
+/**
+ * i915_request_free_capture_list - Free a capture list
+ * @capture: Pointer to the first list item or NULL
+ *
+ */
+void i915_request_free_capture_list(struct i915_capture_list *capture)
+{
+	while (capture) {
+		struct i915_capture_list *next = capture->next;
+
+		i915_vma_snapshot_put(capture->vma_snapshot);
+		kfree(capture);
+		capture = next;
+	}
+}
+
+#define assert_capture_list_is_null(_rq) GEM_BUG_ON((_rq)->capture_list)
+
+#define clear_capture_list(_rq) ((_rq)->capture_list = NULL)
+
+#else
+
+#define i915_request_free_capture_list(_a) do {} while (0)
+
+#define assert_capture_list_is_null(_a) do {} while (0)
+
+#define clear_capture_list(_rq) do {} while (0)
+
+#endif
+
 bool i915_request_retire(struct i915_request *rq)
 {
 	if (!__i915_request_is_complete(rq))
@@ -359,7 +381,6 @@ bool i915_request_retire(struct i915_request *rq)
 	intel_context_exit(rq->context);
 	intel_context_unpin(rq->context);
 
-	free_capture_list(rq);
 	i915_sched_node_fini(&rq->sched);
 	i915_request_put(rq);
 
@@ -829,11 +850,18 @@ static void __i915_request_ctor(void *arg)
 	i915_sw_fence_init(&rq->submit, submit_notify);
 	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
 
-	rq->capture_list = NULL;
+	clear_capture_list(rq);
+	rq->batch_snapshot.present = false;
 
 	init_llist_head(&rq->execute_cb);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#define clear_batch_ptr(_rq) ((_rq)->batch = NULL)
+#else
+#define clear_batch_ptr(_a) do {} while (0)
+#endif
+
 struct i915_request *
 __i915_request_create(struct intel_context *ce, gfp_t gfp)
 {
@@ -925,10 +953,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	i915_sched_node_reinit(&rq->sched);
 
 	/* No zalloc, everything must be cleared after use */
-	rq->batch = NULL;
+	clear_batch_ptr(rq);
 	__rq_init_watchdog(rq);
-	GEM_BUG_ON(rq->capture_list);
+	assert_capture_list_is_null(rq);
 	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
+	GEM_BUG_ON(i915_vma_snapshot_present(&rq->batch_snapshot));
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index dc359242d1ae..16aba0cb32d6 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -40,6 +40,7 @@
 #include "i915_scheduler.h"
 #include "i915_selftest.h"
 #include "i915_sw_fence.h"
+#include "i915_vma_snapshot.h"
 
 #include <uapi/drm/i915_drm.h>
 
@@ -48,11 +49,17 @@ struct drm_i915_gem_object;
 struct drm_printer;
 struct i915_request;
 
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 struct i915_capture_list {
+	struct i915_vma_snapshot *vma_snapshot;
 	struct i915_capture_list *next;
-	struct i915_vma *vma;
 };
 
+void i915_request_free_capture_list(struct i915_capture_list *capture);
+#else
+#define i915_request_free_capture_list(_a) do {} while (0)
+#endif
+
 #define RQ_TRACE(rq, fmt, ...) do {					\
 	const struct i915_request *rq__ = (rq);				\
 	ENGINE_TRACE(rq__->engine, "fence %llx:%lld, current %d " fmt,	\
@@ -289,10 +296,12 @@ struct i915_request {
 	/** Preallocate space in the ring for the emitting the request */
 	u32 reserved_space;
 
-	/** Batch buffer related to this request if any (used for
-	 * error state dump only).
-	 */
-	struct i915_vma *batch;
+	/** Batch buffer pointer for selftest internal use. */
+	I915_SELFTEST_DECLARE(struct i915_vma *batch);
+
+	struct i915_vma_snapshot batch_snapshot;
+
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 	/**
 	 * Additional buffers requested by userspace to be captured upon
 	 * a GPU hang. The vma/obj on this list are protected by their
@@ -300,6 +309,7 @@ struct i915_request {
 	 * on the active_list (of their final request).
 	 */
 	struct i915_capture_list *capture_list;
+#endif
 
 	/** Time at which this request was emitted, in jiffies. */
 	unsigned long emitted_jiffies;
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
new file mode 100644
index 000000000000..44985d600f96
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
@@ -0,0 +1,137 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include "i915_vma_snapshot.h"
+#include "i915_vma_types.h"
+#include "i915_vma.h"
+
+/**
+ * i915_vma_snapshot_init - Initialize a struct i915_vma_snapshot from
+ * a struct i915_vma.
+ * @vsnap: The i915_vma_snapshot to init.
+ * @vma: A struct i915_vma used to initialize @vsnap.
+ * @name: Name associated with the snapshot. The character pointer needs to
+ * stay alive over the lifetime of the snapshot.
+ */
+void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
+			    struct i915_vma *vma,
+			    const char *name)
+{
+	if (!i915_vma_is_pinned(vma))
+		assert_object_held(vma->obj);
+
+	vsnap->name = name;
+	vsnap->size = vma->size;
+	vsnap->obj_size = vma->obj->base.size;
+	vsnap->gtt_offset = vma->node.start;
+	vsnap->gtt_size = vma->node.size;
+	vsnap->page_sizes = vma->page_sizes.gtt;
+	vsnap->pages = vma->pages;
+	vsnap->pages_rsgt = NULL;
+	vsnap->mr = NULL;
+	if (vma->obj->mm.rsgt)
+		vsnap->pages_rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
+	if (vma->obj->mm.region)
+		vsnap->mr = intel_memory_region_get(vma->obj->mm.region);
+	kref_init(&vsnap->kref);
+	vsnap->vma_resource = &vma->active;
+	vsnap->onstack = false;
+	vsnap->present = true;
+}
+
+/**
+ * i915_vma_snapshot_init_onstack - Initialize a struct i915_vma_snapshot from
+ * a struct i915_vma, but avoid kfreeing it on last put.
+ * @vsnap: The i915_vma_snapshot to init.
+ * @vma: A struct i915_vma used to initialize @vsnap.
+ * @name: Name associated with the snapshot. The character pointer needs to
+ * stay alive over the lifetime of the snapshot.
+ */
+void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
+				    struct i915_vma *vma,
+				    const char *name)
+{
+	i915_vma_snapshot_init(vsnap, vma, name);
+	vsnap->onstack = true;
+}
+
+static void vma_snapshot_release(struct kref *ref)
+{
+	struct i915_vma_snapshot *vsnap =
+		container_of(ref, typeof(*vsnap), kref);
+
+	vsnap->present = false;
+	if (vsnap->mr)
+		intel_memory_region_put(vsnap->mr);
+	if (vsnap->pages_rsgt)
+		i915_refct_sgt_put(vsnap->pages_rsgt);
+	if (!vsnap->onstack)
+		kfree(vsnap);
+}
+
+/**
+ * i915_vma_snapshot_put - Put an i915_vma_snapshot pointer reference
+ * @vsnap: The pointer reference
+ */
+void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap)
+{
+	kref_put(&vsnap->kref, vma_snapshot_release);
+}
+
+/**
+ * i915_vma_snapshot_put_onstack - Put an onstack i915_vma_snapshot pointer
+ * reference and verify that the structure is released
+ * @vsnap: The pointer reference
+ *
+ * This function is intended to be paired with
+ * i915_vma_snapshot_init_onstack(). Call it before exiting the scope that
+ * declared @vsnap, or before freeing the structure that embeds it, to verify
+ * that all references have been released.
+ */
+void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
+{
+	if (!kref_put(&vsnap->kref, vma_snapshot_release))
+		GEM_BUG_ON(1);
+}
+
+/**
+ * i915_vma_snapshot_resource_pin - Temporarily block the memory the
+ * vma snapshot is pointing to from being released.
+ * @vsnap: The vma snapshot.
+ * @lockdep_cookie: Pointer to bool needed for lockdep support. This needs
+ * to be passed to the paired i915_vma_snapshot_resource_unpin.
+ *
+ * This function will temporarily try to hold up a fence or similar structure
+ * and will therefore enter a fence signaling critical section.
+ *
+ * Return: true if we succeeded in blocking the memory from being released,
+ * false otherwise.
+ */
+bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
+				    bool *lockdep_cookie)
+{
+	bool pinned = i915_active_acquire_if_busy(vsnap->vma_resource);
+
+	if (pinned)
+		*lockdep_cookie = dma_fence_begin_signalling();
+
+	return pinned;
+}
+
+/**
+ * i915_vma_snapshot_resource_unpin - Unblock vma snapshot memory from
+ * being released.
+ * @vsnap: The vma snapshot.
+ * @lockdep_cookie: Cookie returned from the matching
+ * i915_vma_snapshot_resource_pin().
+ *
+ * Might leave a fence signalling critical section and signal a fence.
+ */
+void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
+				      bool lockdep_cookie)
+{
+	dma_fence_end_signalling(lockdep_cookie);
+
+	i915_active_release(vsnap->vma_resource);
+}
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
new file mode 100644
index 000000000000..940581df4622
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+#ifndef _I915_VMA_SNAPSHOT_H_
+#define _I915_VMA_SNAPSHOT_H_
+
+#include <linux/kref.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+struct i915_active;
+struct i915_refct_sgt;
+struct i915_vma;
+struct intel_memory_region;
+struct sg_table;
+
+/**
+ * DOC: Simple utilities for snapshotting GPU vma metadata, later used for
+ * error capture. We use a separate header for this to avoid issues due to
+ * recursive header includes.
+ */
+
+/**
+ * struct i915_vma_snapshot - Snapshot of vma metadata.
+ * @size: The vma size in bytes.
+ * @obj_size: The size of the underlying object in bytes.
+ * @gtt_offset: The gtt offset the vma is bound to.
+ * @gtt_size: The size in bytes allocated for the vma in the GTT.
+ * @pages: The struct sg_table pointing to the pages bound.
+ * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
+ * @mr: The memory region backing the bound pages.
+ * @kref: Reference for this structure.
+ * @vma_resource: FIXME: A means to keep the unbind fence from signaling.
+ * Temporarily while we have only sync unbinds, and still use the vma
+ * active, we use that. With async unbinding we need a signaling refcount
+ * for the unbind fence.
+ * @page_sizes: The vma GTT page sizes information.
+ * @onstack: Whether the structure shouldn't be freed on final put.
+ * @present: Whether the structure is present and initialized.
+ */
+struct i915_vma_snapshot {
+	const char *name;
+	size_t size;
+	size_t obj_size;
+	size_t gtt_offset;
+	size_t gtt_size;
+	struct sg_table *pages;
+	struct i915_refct_sgt *pages_rsgt;
+	struct intel_memory_region *mr;
+	struct kref kref;
+	struct i915_active *vma_resource;
+	u32 page_sizes;
+	bool onstack:1;
+	bool present:1;
+};
+
+void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
+			    struct i915_vma *vma,
+			    const char *name);
+
+void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
+				    struct i915_vma *vma,
+				    const char *name);
+
+void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap);
+
+void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap);
+
+bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
+				    bool *lockdep_cookie);
+
+void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
+				      bool lockdep_cookie);
+
+/**
+ * i915_vma_snapshot_alloc - Allocate a struct i915_vma_snapshot
+ * @gfp: Allocation mode.
+ *
+ * Return: A pointer to a struct i915_vma_snapshot if successful.
+ * NULL otherwise.
+ */
+static inline struct i915_vma_snapshot *i915_vma_snapshot_alloc(gfp_t gfp)
+{
+	return kmalloc(sizeof(struct i915_vma_snapshot), gfp);
+}
+
+/**
+ * i915_vma_snapshot_get - Take a reference on a struct i915_vma_snapshot
+ *
+ * Return: A pointer to a struct i915_vma_snapshot.
+ */
+static inline struct i915_vma_snapshot *
+i915_vma_snapshot_get(struct i915_vma_snapshot *vsnap)
+{
+	kref_get(&vsnap->kref);
+	return vsnap;
+}
+
+/**
+ * i915_vma_snapshot_present - Whether a struct i915_vma_snapshot is
+ * present and initialized.
+ *
+ * Return: true if present and initialized; false otherwise.
+ */
+static inline bool
+i915_vma_snapshot_present(const struct i915_vma_snapshot *vsnap)
+{
+	return vsnap && vsnap->present;
+}
+
+#endif
-- 
2.31.1



* [Intel-gfx] [PATCH v6 3/4] drm/i915: Update error capture code to avoid using the current vma state
@ 2021-11-08 17:45   ` Thomas Hellström
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Hellström @ 2021-11-08 17:45 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

With asynchronous migrations, the vma state may be several migrations
ahead of the state that matches the request we're capturing.
Address that by introducing an i915_vma_snapshot structure that
can be used to snapshot relevant state at request submission.
In order to make sure we access the correct memory, the snapshots take
references on relevant sg-tables and memory regions.

Also move the capture list allocation out of the fence signaling
critical path and use the CONFIG_DRM_I915_CAPTURE_ERROR define to
avoid compiling in members and functions used for error capture
when they're not used.

Finally, introduce lockdep annotation.

v4:
- Break out the capture allocation mode change to a separate patch.
v5:
- Fix compilation error in the !CONFIG_DRM_I915_CAPTURE_ERROR case
  (kernel test robot)
v6:
- Use #if IS_ENABLED() instead of #ifdef to match driver style.
- Move yet another change of allocation mode to the separate patch.
- Commit message rework due to patch reordering.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
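Note (not for the commit log): the stage/commit/release split described in
the commit message can be sketched in plain userspace C as below. This is
only an illustration under stated assumptions, not the driver code: the
names `submission`, `stage()`, `commit()`, `release()` and `NUM_SLOTS` are
hypothetical stand-ins for `struct i915_execbuffer`, `eb_capture_stage()`,
`eb_capture_commit()`, `eb_capture_release()` and the size of
`capture_lists[]`, and `malloc()` stands in for a GFP_KERNEL allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_SLOTS 4	/* hypothetical; stands in for the capture_lists[] size */

struct node {
	struct node *next;
	int payload;	/* stands in for the vma snapshot pointer */
};

struct submission {
	struct node *staged[NUM_SLOTS];		/* like eb->capture_lists[] */
	struct node *committed[NUM_SLOTS];	/* like rq->capture_list */
};

/* Stage: may allocate and may fail; runs before the critical section. */
static void stage(struct submission *sub, unsigned int slot, int payload)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return;	/* best effort: a failed allocation is skipped */

	n->payload = payload;
	n->next = sub->staged[slot];
	sub->staged[slot] = n;
}

/* Commit: only moves pointers, so nothing is allocated at this point. */
static void commit(struct submission *sub, unsigned int slot)
{
	sub->committed[slot] = sub->staged[slot];
	sub->staged[slot] = NULL;
}

/* Free a whole list, including the list nodes themselves. */
static void free_list(struct node *n)
{
	while (n) {
		struct node *next = n->next;

		free(n);
		n = next;
	}
}

/* Release: frees whatever was staged but never committed. */
static void release(struct submission *sub)
{
	for (unsigned int i = 0; i < NUM_SLOTS; i++) {
		free_list(sub->staged[i]);
		sub->staged[i] = NULL;
	}
}
```

Because commit() clears the staged slot, release() can run unconditionally
on every exit path: it frees only what was never handed over.
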
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 135 +++++++++++---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   8 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         | 176 +++++++++++++-----
 drivers/gpu/drm/i915/i915_request.c           |  63 +++++--
 drivers/gpu/drm/i915/i915_request.h           |  20 +-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      | 137 ++++++++++++++
 drivers/gpu/drm/i915/i915_vma_snapshot.h      | 112 +++++++++++
 8 files changed, 557 insertions(+), 95 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
 create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h
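
Note (not for the commit log): the snapshot lifetime rules introduced by
this patch (reference counted, with an "onstack" flag that suppresses the
final free) can be sketched in plain userspace C as below. This is a
simplified illustration under stated assumptions, not the driver code:
`struct snap` and the `snap_*()` helpers are hypothetical stand-ins for
`struct i915_vma_snapshot` and its helpers, and a plain non-atomic counter
replaces the kref, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct i915_vma_snapshot. */
struct snap {
	unsigned int refcount;	/* plays the role of the kref (not atomic) */
	bool onstack;		/* embedded or on stack: never free() it */
	bool present;		/* initialized and not yet released */
};

static void snap_init(struct snap *s, bool onstack)
{
	s->refcount = 1;
	s->onstack = onstack;
	s->present = true;
}

static struct snap *snap_get(struct snap *s)
{
	s->refcount++;
	return s;
}

/* Returns true if this put dropped the last reference. */
static bool snap_put(struct snap *s)
{
	if (--s->refcount)
		return false;

	s->present = false;
	if (!s->onstack)
		free(s);
	return true;
}

/* The onstack variant insists that it drops the final reference. */
static void snap_put_onstack(struct snap *s)
{
	bool last = snap_put(s);

	assert(last);
	(void)last;	/* silence unused warning when NDEBUG is set */
}
```

The assertion in snap_put_onstack() mirrors the intent of the GEM_BUG_ON()
in i915_vma_snapshot_put_onstack(): the storage is about to go away, so a
reference that is still outstanding there would be a use-after-free in
waiting.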

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 7d0d0b814670..b4372e0a802f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -174,6 +174,7 @@ i915-y += \
 	  i915_trace_points.o \
 	  i915_ttm_buddy_manager.o \
 	  i915_vma.o \
+	  i915_vma_snapshot.o \
 	  intel_wopcm.o
 
 # general-purpose microcontroller (GuC) support
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index ea5b7b2a4d70..ddca899f2396 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -29,6 +29,7 @@
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
+#include "i915_vma_snapshot.h"
 
 struct eb_vma {
 	struct i915_vma *vma;
@@ -307,11 +308,15 @@ struct i915_execbuffer {
 
 	struct eb_fence *fences;
 	unsigned long num_fences;
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+	struct i915_capture_list *capture_lists[MAX_ENGINE_INSTANCE + 1];
+#endif
 };
 
 static int eb_parse(struct i915_execbuffer *eb);
 static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle);
 static void eb_unpin_engine(struct i915_execbuffer *eb);
+static void eb_capture_release(struct i915_execbuffer *eb);
 
 static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
 {
@@ -1054,6 +1059,7 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
 			i915_vma_put(vma);
 	}
 
+	eb_capture_release(eb);
 	eb_unpin_engine(eb);
 }
 
@@ -1891,36 +1897,113 @@ eb_find_first_request_added(struct i915_execbuffer *eb)
 	return NULL;
 }
 
-static int eb_move_to_gpu(struct i915_execbuffer *eb)
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+
+/* Stage with GFP_KERNEL allocations before we enter the signaling critical path */
+static void eb_capture_stage(struct i915_execbuffer *eb)
 {
 	const unsigned int count = eb->buffer_count;
-	unsigned int i = count;
-	int err = 0, j;
+	unsigned int i = count, j;
+	struct i915_vma_snapshot *vsnap;
 
 	while (i--) {
 		struct eb_vma *ev = &eb->vma[i];
 		struct i915_vma *vma = ev->vma;
 		unsigned int flags = ev->flags;
-		struct drm_i915_gem_object *obj = vma->obj;
 
-		assert_vma_held(vma);
+		if (!(flags & EXEC_OBJECT_CAPTURE))
+			continue;
 
-		if (flags & EXEC_OBJECT_CAPTURE) {
+		vsnap = i915_vma_snapshot_alloc(GFP_KERNEL);
+		if (!vsnap)
+			continue;
+
+		i915_vma_snapshot_init(vsnap, vma, "user");
+		for_each_batch_create_order(eb, j) {
 			struct i915_capture_list *capture;
 
-			for_each_batch_create_order(eb, j) {
-				if (!eb->requests[j])
-					break;
+			capture = kmalloc(sizeof(*capture), GFP_KERNEL);
+			if (!capture)
+				continue;
 
-				capture = kmalloc(sizeof(*capture), GFP_KERNEL);
-				if (capture) {
-					capture->next =
-						eb->requests[j]->capture_list;
-					capture->vma = vma;
-					eb->requests[j]->capture_list = capture;
-				}
-			}
+			capture->next = eb->capture_lists[j];
+			capture->vma_snapshot = i915_vma_snapshot_get(vsnap);
+			eb->capture_lists[j] = capture;
+		}
+		i915_vma_snapshot_put(vsnap);
+	}
+}
+
+/* Commit once we're in the critical path */
+static void eb_capture_commit(struct i915_execbuffer *eb)
+{
+	unsigned int j;
+
+	for_each_batch_create_order(eb, j) {
+		struct i915_request *rq = eb->requests[j];
+
+		if (!rq)
+			break;
+
+		rq->capture_list = eb->capture_lists[j];
+		eb->capture_lists[j] = NULL;
+	}
+}
+
+/*
+ * Release anything that didn't get committed due to errors.
+ * The capture_list will otherwise be freed at request retire.
+ */
+static void eb_capture_release(struct i915_execbuffer *eb)
+{
+	unsigned int j;
+
+	for_each_batch_create_order(eb, j) {
+		if (eb->capture_lists[j]) {
+			i915_request_free_capture_list(eb->capture_lists[j]);
+			eb->capture_lists[j] = NULL;
 		}
+	}
+}
+
+static void eb_capture_list_clear(struct i915_execbuffer *eb)
+{
+	memset(eb->capture_lists, 0, sizeof(eb->capture_lists));
+}
+
+#else
+
+static void eb_capture_stage(struct i915_execbuffer *eb)
+{
+}
+
+static void eb_capture_commit(struct i915_execbuffer *eb)
+{
+}
+
+static void eb_capture_release(struct i915_execbuffer *eb)
+{
+}
+
+static void eb_capture_list_clear(struct i915_execbuffer *eb)
+{
+}
+
+#endif
+
+static int eb_move_to_gpu(struct i915_execbuffer *eb)
+{
+	const unsigned int count = eb->buffer_count;
+	unsigned int i = count;
+	int err = 0, j;
+
+	while (i--) {
+		struct eb_vma *ev = &eb->vma[i];
+		struct i915_vma *vma = ev->vma;
+		unsigned int flags = ev->flags;
+		struct drm_i915_gem_object *obj = vma->obj;
+
+		assert_vma_held(vma);
 
 		/*
 		 * If the GPU is not _reading_ through the CPU cache, we need
@@ -2001,6 +2084,8 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 
 	/* Unconditionally flush any chipset caches (for streaming writes). */
 	intel_gt_chipset_flush(eb->gt);
+	eb_capture_commit(eb);
+
 	return 0;
 
 err_skip:
@@ -3143,13 +3228,14 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 		}
 
 		/*
-		 * Whilst this request exists, batch_obj will be on the
-		 * active_list, and so will hold the active reference. Only when
-		 * this request is retired will the batch_obj be moved onto
-		 * the inactive_list and lose its active reference. Hence we do
-		 * not need to explicitly hold another reference here.
+		 * Not really on stack, but we don't want to call
+		 * kfree on the batch_snapshot when we put it, so use the
+		 * _onstack interface.
 		 */
-		eb->requests[i]->batch = eb->batches[i]->vma;
+		if (eb->batches[i]->vma)
+			i915_vma_snapshot_init_onstack(&eb->requests[i]->batch_snapshot,
+						       eb->batches[i]->vma,
+						       "batch");
 		if (eb->batch_pool) {
 			GEM_BUG_ON(intel_context_is_parallel(eb->context));
 			intel_gt_buffer_pool_mark_active(eb->batch_pool,
@@ -3198,6 +3284,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.fences = NULL;
 	eb.num_fences = 0;
 
+	eb_capture_list_clear(&eb);
+
 	memset(eb.requests, 0, sizeof(struct i915_request *) *
 	       ARRAY_SIZE(eb.requests));
 	eb.composite_fence = NULL;
@@ -3284,6 +3372,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	}
 
 	ww_acquire_done(&eb.ww.ctx);
+	eb_capture_stage(&eb);
 
 	out_fence = eb_requests_create(&eb, in_fence, out_fence_fd);
 	if (IS_ERR(out_fence)) {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 332756036007..f2ccd5b53d42 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1676,14 +1676,18 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 
 static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
 {
+	struct i915_vma_snapshot *vsnap = &rq->batch_snapshot;
 	void *ring;
 	int size;
 
+	if (!i915_vma_snapshot_present(vsnap))
+		vsnap = NULL;
+
 	drm_printf(m,
 		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
 		   rq->head, rq->postfix, rq->tail,
-		   rq->batch ? upper_32_bits(rq->batch->node.start) : ~0u,
-		   rq->batch ? lower_32_bits(rq->batch->node.start) : ~0u);
+		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
+		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
 
 	size = rq->tail - rq->head;
 	if (rq->tail < rq->head)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index b1e4ce0f798f..c61f9aaa40f9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -48,6 +48,7 @@
 #include "i915_gpu_error.h"
 #include "i915_memcpy.h"
 #include "i915_scatterlist.h"
+#include "i915_vma_snapshot.h"
 
 #define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
@@ -1012,8 +1013,7 @@ void __i915_gpu_coredump_free(struct kref *error_ref)
 
 static struct i915_vma_coredump *
 i915_vma_coredump_create(const struct intel_gt *gt,
-			 const struct i915_vma *vma,
-			 const char *name,
+			 const struct i915_vma_snapshot *vsnap,
 			 struct i915_vma_compress *compress)
 {
 	struct i915_ggtt *ggtt = gt->ggtt;
@@ -1024,7 +1024,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 
 	might_sleep();
 
-	if (!vma || !vma->pages || !compress)
+	if (!vsnap || !vsnap->pages || !compress)
 		return NULL;
 
 	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
@@ -1037,12 +1037,12 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	}
 
 	INIT_LIST_HEAD(&dst->page_list);
-	strcpy(dst->name, name);
+	strcpy(dst->name, vsnap->name);
 	dst->next = NULL;
 
-	dst->gtt_offset = vma->node.start;
-	dst->gtt_size = vma->node.size;
-	dst->gtt_page_sizes = vma->page_sizes.gtt;
+	dst->gtt_offset = vsnap->gtt_offset;
+	dst->gtt_size = vsnap->gtt_size;
+	dst->gtt_page_sizes = vsnap->page_sizes;
 	dst->unused = 0;
 
 	ret = -EINVAL;
@@ -1050,7 +1050,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 		void __iomem *s;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vma->pages) {
+		for_each_sgt_daddr(dma, iter, vsnap->pages) {
 			mutex_lock(&ggtt->error_mutex);
 			ggtt->vm.insert_page(&ggtt->vm, dma, slot,
 					     I915_CACHE_NONE, 0);
@@ -1068,11 +1068,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 			if (ret)
 				break;
 		}
-	} else if (__i915_gem_object_is_lmem(vma->obj)) {
-		struct intel_memory_region *mem = vma->obj->mm.region;
+	} else if (vsnap->mr && vsnap->mr->type != INTEL_MEMORY_SYSTEM) {
+		struct intel_memory_region *mem = vsnap->mr;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vma->pages) {
+		for_each_sgt_daddr(dma, iter, vsnap->pages) {
 			void __iomem *s;
 
 			s = io_mapping_map_wc(&mem->iomap,
@@ -1088,7 +1088,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	} else {
 		struct page *page;
 
-		for_each_sgt_page(page, iter, vma->pages) {
+		for_each_sgt_page(page, iter, vsnap->pages) {
 			void *s;
 
 			drm_clflush_pages(&page, 1);
@@ -1324,37 +1324,71 @@ static bool record_context(struct i915_gem_context_coredump *e,
 
 struct intel_engine_capture_vma {
 	struct intel_engine_capture_vma *next;
-	struct i915_vma *vma;
+	struct i915_vma_snapshot *vsnap;
 	char name[16];
+	bool lockdep_cookie;
 };
 
 static struct intel_engine_capture_vma *
-capture_vma(struct intel_engine_capture_vma *next,
-	    struct i915_vma *vma,
-	    const char *name,
-	    gfp_t gfp)
+capture_vma_snapshot(struct intel_engine_capture_vma *next,
+		     struct i915_vma_snapshot *vsnap,
+		     gfp_t gfp)
 {
 	struct intel_engine_capture_vma *c;
 
-	if (!vma)
+	if (!i915_vma_snapshot_present(vsnap))
 		return next;
 
 	c = kmalloc(sizeof(*c), gfp);
 	if (!c)
 		return next;
 
-	if (!i915_active_acquire_if_busy(&vma->active)) {
+	if (!i915_vma_snapshot_resource_pin(vsnap, &c->lockdep_cookie)) {
 		kfree(c);
 		return next;
 	}
 
-	strcpy(c->name, name);
-	c->vma = vma; /* reference held while active */
+	strcpy(c->name, vsnap->name);
+	c->vsnap = vsnap;
+	i915_vma_snapshot_get(vsnap);
 
 	c->next = next;
 	return c;
 }
 
+static struct intel_engine_capture_vma *
+capture_vma(struct intel_engine_capture_vma *next,
+	    struct i915_vma *vma,
+	    const char *name,
+	    gfp_t gfp)
+{
+	struct i915_vma_snapshot *vsnap;
+
+	if (!vma)
+		return next;
+
+	/*
+	 * If the vma isn't pinned, then the vma should be snapshotted
+	 * to a struct i915_vma_snapshot at command submission time.
+	 * Not here.
+	 */
+	GEM_WARN_ON(!i915_vma_is_pinned(vma));
+	if (!i915_vma_is_pinned(vma))
+		return next;
+
+	vsnap = i915_vma_snapshot_alloc(gfp);
+	if (!vsnap)
+		return next;
+
+	i915_vma_snapshot_init(vsnap, vma, name);
+	next = capture_vma_snapshot(next, vsnap, gfp);
+
+	/* FIXME: Replace on async unbind. */
+	i915_vma_snapshot_put(vsnap);
+
+	return next;
+}
+
 static struct intel_engine_capture_vma *
 capture_user(struct intel_engine_capture_vma *capture,
 	     const struct i915_request *rq,
@@ -1363,7 +1397,7 @@ capture_user(struct intel_engine_capture_vma *capture,
 	struct i915_capture_list *c;
 
 	for (c = rq->capture_list; c; c = c->next)
-		capture = capture_vma(capture, c->vma, "user", gfp);
+		capture = capture_vma_snapshot(capture, c->vma_snapshot, gfp);
 
 	return capture;
 }
@@ -1377,6 +1411,36 @@ static void add_vma(struct intel_engine_coredump *ee,
 	}
 }
 
+static struct i915_vma_coredump *
+create_vma_coredump(const struct intel_gt *gt, struct i915_vma *vma,
+		    const char *name, struct i915_vma_compress *compress)
+{
+	struct i915_vma_coredump *ret = NULL;
+	struct i915_vma_snapshot tmp;
+	bool lockdep_cookie;
+
+	if (!vma)
+		return NULL;
+
+	i915_vma_snapshot_init_onstack(&tmp, vma, name);
+	if (i915_vma_snapshot_resource_pin(&tmp, &lockdep_cookie)) {
+		ret = i915_vma_coredump_create(gt, &tmp, compress);
+		i915_vma_snapshot_resource_unpin(&tmp, lockdep_cookie);
+	}
+	i915_vma_snapshot_put_onstack(&tmp);
+
+	return ret;
+}
+
+static void add_vma_coredump(struct intel_engine_coredump *ee,
+			     const struct intel_gt *gt,
+			     struct i915_vma *vma,
+			     const char *name,
+			     struct i915_vma_compress *compress)
+{
+	add_vma(ee, create_vma_coredump(gt, vma, name, compress));
+}
+
 struct intel_engine_coredump *
 intel_engine_coredump_alloc(struct intel_engine_cs *engine, gfp_t gfp)
 {
@@ -1410,7 +1474,7 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee,
 	 * as the simplest method to avoid being overwritten
 	 * by userspace.
 	 */
-	vma = capture_vma(vma, rq->batch, "batch", gfp);
+	vma = capture_vma_snapshot(vma, &rq->batch_snapshot, gfp);
 	vma = capture_user(vma, rq, gfp);
 	vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
 	vma = capture_vma(vma, rq->context->state, "HW context", gfp);
@@ -1431,30 +1495,24 @@ intel_engine_coredump_add_vma(struct intel_engine_coredump *ee,
 
 	while (capture) {
 		struct intel_engine_capture_vma *this = capture;
-		struct i915_vma *vma = this->vma;
+		struct i915_vma_snapshot *vsnap = this->vsnap;
 
 		add_vma(ee,
 			i915_vma_coredump_create(engine->gt,
-						 vma, this->name,
-						 compress));
+						 vsnap, compress));
 
-		i915_active_release(&vma->active);
+		i915_vma_snapshot_resource_unpin(vsnap, this->lockdep_cookie);
+		i915_vma_snapshot_put(vsnap);
 
 		capture = this->next;
 		kfree(this);
 	}
 
-	add_vma(ee,
-		i915_vma_coredump_create(engine->gt,
-					 engine->status_page.vma,
-					 "HW Status",
-					 compress));
+	add_vma_coredump(ee, engine->gt, engine->status_page.vma,
+			 "HW Status", compress);
 
-	add_vma(ee,
-		i915_vma_coredump_create(engine->gt,
-					 engine->wa_ctx.vma,
-					 "WA context",
-					 compress));
+	add_vma_coredump(ee, engine->gt, engine->wa_ctx.vma,
+			 "WA context", compress);
 }
 
 static struct intel_engine_coredump *
@@ -1490,17 +1548,25 @@ capture_engine(struct intel_engine_cs *engine,
 		}
 	}
 	if (rq)
-		capture = intel_engine_coredump_add_request(ee, rq,
-							    ATOMIC_MAYFAIL);
+		rq = i915_request_get_rcu(rq);
+
+	if (!rq)
+		goto no_request_capture;
+
+	capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL);
 	if (!capture) {
-no_request_capture:
-		kfree(ee);
-		return NULL;
+		i915_request_put(rq);
+		goto no_request_capture;
 	}
 
 	intel_engine_coredump_add_vma(ee, capture, compress);
+	i915_request_put(rq);
 
 	return ee;
+
+no_request_capture:
+	kfree(ee);
+	return NULL;
 }
 
 static void
@@ -1554,10 +1620,8 @@ gt_record_uc(struct intel_gt_coredump *gt,
 	 */
 	error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL);
 	error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL);
-	error_uc->guc_log =
-		i915_vma_coredump_create(gt->_gt,
-					 uc->guc.log.vma, "GuC log buffer",
-					 compress);
+	error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
+						"GuC log buffer", compress);
 
 	return error_uc;
 }
@@ -1843,8 +1907,8 @@ void i915_vma_capture_finish(struct intel_gt_coredump *gt,
 	kfree(compress);
 }
 
-struct i915_gpu_coredump *
-i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
+static struct i915_gpu_coredump *
+__i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
 {
 	struct drm_i915_private *i915 = gt->i915;
 	struct i915_gpu_coredump *error;
@@ -1885,6 +1949,22 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
 	return error;
 }
 
+struct i915_gpu_coredump *
+i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
+{
+	static DEFINE_MUTEX(capture_mutex);
+	int ret = mutex_lock_interruptible(&capture_mutex);
+	struct i915_gpu_coredump *dump;
+
+	if (ret)
+		return ERR_PTR(ret);
+
+	dump = __i915_gpu_coredump(gt, engine_mask);
+	mutex_unlock(&capture_mutex);
+
+	return dump;
+}
+
 void i915_error_state_store(struct i915_gpu_coredump *error)
 {
 	struct drm_i915_private *i915;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 623273aca09e..ee855dc86972 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -113,6 +113,10 @@ static void i915_fence_release(struct dma_fence *fence)
 	GEM_BUG_ON(rq->guc_prio != GUC_PRIO_INIT &&
 		   rq->guc_prio != GUC_PRIO_FINI);
 
+	i915_request_free_capture_list(fetch_and_zero(&rq->capture_list));
+	if (i915_vma_snapshot_present(&rq->batch_snapshot))
+		i915_vma_snapshot_put_onstack(&rq->batch_snapshot);
+
 	/*
 	 * The request is put onto a RCU freelist (i.e. the address
 	 * is immediately reused), mark the fences as being freed now.
@@ -186,19 +190,6 @@ void i915_request_notify_execute_cb_imm(struct i915_request *rq)
 	__notify_execute_cb(rq, irq_work_imm);
 }
 
-static void free_capture_list(struct i915_request *request)
-{
-	struct i915_capture_list *capture;
-
-	capture = fetch_and_zero(&request->capture_list);
-	while (capture) {
-		struct i915_capture_list *next = capture->next;
-
-		kfree(capture);
-		capture = next;
-	}
-}
-
 static void __i915_request_fill(struct i915_request *rq, u8 val)
 {
 	void *vaddr = rq->ring->vaddr;
@@ -303,6 +294,37 @@ static void __rq_cancel_watchdog(struct i915_request *rq)
 		i915_request_put(rq);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+
+/**
+ * i915_request_free_capture_list - Free a capture list
+ * @capture: Pointer to the first list item or NULL
+ *
+ */
+void i915_request_free_capture_list(struct i915_capture_list *capture)
+{
+	while (capture) {
+		struct i915_capture_list *next = capture->next;
+
+		i915_vma_snapshot_put(capture->vma_snapshot);
+		capture = next;
+	}
+}
+
+#define assert_capture_list_is_null(_rq) GEM_BUG_ON((_rq)->capture_list)
+
+#define clear_capture_list(_rq) ((_rq)->capture_list = NULL)
+
+#else
+
+#define i915_request_free_capture_list(_a) do {} while (0)
+
+#define assert_capture_list_is_null(_a) do {} while (0)
+
+#define clear_capture_list(_rq) do {} while (0)
+
+#endif
+
 bool i915_request_retire(struct i915_request *rq)
 {
 	if (!__i915_request_is_complete(rq))
@@ -359,7 +381,6 @@ bool i915_request_retire(struct i915_request *rq)
 	intel_context_exit(rq->context);
 	intel_context_unpin(rq->context);
 
-	free_capture_list(rq);
 	i915_sched_node_fini(&rq->sched);
 	i915_request_put(rq);
 
@@ -829,11 +850,18 @@ static void __i915_request_ctor(void *arg)
 	i915_sw_fence_init(&rq->submit, submit_notify);
 	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
 
-	rq->capture_list = NULL;
+	clear_capture_list(rq);
+	rq->batch_snapshot.present = false;
 
 	init_llist_head(&rq->execute_cb);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#define clear_batch_ptr(_rq) ((_rq)->batch = NULL)
+#else
+#define clear_batch_ptr(_a) do {} while (0)
+#endif
+
 struct i915_request *
 __i915_request_create(struct intel_context *ce, gfp_t gfp)
 {
@@ -925,10 +953,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	i915_sched_node_reinit(&rq->sched);
 
 	/* No zalloc, everything must be cleared after use */
-	rq->batch = NULL;
+	clear_batch_ptr(rq);
 	__rq_init_watchdog(rq);
-	GEM_BUG_ON(rq->capture_list);
+	assert_capture_list_is_null(rq);
 	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
+	GEM_BUG_ON(i915_vma_snapshot_present(&rq->batch_snapshot));
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index dc359242d1ae..16aba0cb32d6 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -40,6 +40,7 @@
 #include "i915_scheduler.h"
 #include "i915_selftest.h"
 #include "i915_sw_fence.h"
+#include "i915_vma_snapshot.h"
 
 #include <uapi/drm/i915_drm.h>
 
@@ -48,11 +49,17 @@ struct drm_i915_gem_object;
 struct drm_printer;
 struct i915_request;
 
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 struct i915_capture_list {
+	struct i915_vma_snapshot *vma_snapshot;
 	struct i915_capture_list *next;
-	struct i915_vma *vma;
 };
 
+void i915_request_free_capture_list(struct i915_capture_list *capture);
+#else
+#define i915_request_free_capture_list(_a) do {} while (0)
+#endif
+
 #define RQ_TRACE(rq, fmt, ...) do {					\
 	const struct i915_request *rq__ = (rq);				\
 	ENGINE_TRACE(rq__->engine, "fence %llx:%lld, current %d " fmt,	\
@@ -289,10 +296,12 @@ struct i915_request {
 	/** Preallocate space in the ring for the emitting the request */
 	u32 reserved_space;
 
-	/** Batch buffer related to this request if any (used for
-	 * error state dump only).
-	 */
-	struct i915_vma *batch;
+	/** Batch buffer pointer for selftest internal use. */
+	I915_SELFTEST_DECLARE(struct i915_vma *batch);
+
+	struct i915_vma_snapshot batch_snapshot;
+
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 	/**
 	 * Additional buffers requested by userspace to be captured upon
 	 * a GPU hang. The vma/obj on this list are protected by their
@@ -300,6 +309,7 @@ struct i915_request {
 	 * on the active_list (of their final request).
 	 */
 	struct i915_capture_list *capture_list;
+#endif
 
 	/** Time at which this request was emitted, in jiffies. */
 	unsigned long emitted_jiffies;
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
new file mode 100644
index 000000000000..44985d600f96
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
@@ -0,0 +1,137 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include "i915_vma_snapshot.h"
+#include "i915_vma_types.h"
+#include "i915_vma.h"
+
+/**
+ * i915_vma_snapshot_init - Initialize a struct i915_vma_snapshot from
+ * a struct i915_vma.
+ * @vsnap: The i915_vma_snapshot to init.
+ * @vma: A struct i915_vma used to initialize @vsnap.
+ * @name: Name associated with the snapshot. The character pointer needs to
+ * stay alive over the lifetime of the snapshot
+ */
+void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
+			    struct i915_vma *vma,
+			    const char *name)
+{
+	if (!i915_vma_is_pinned(vma))
+		assert_object_held(vma->obj);
+
+	vsnap->name = name;
+	vsnap->size = vma->size;
+	vsnap->obj_size = vma->obj->base.size;
+	vsnap->gtt_offset = vma->node.start;
+	vsnap->gtt_size = vma->node.size;
+	vsnap->page_sizes = vma->page_sizes.gtt;
+	vsnap->pages = vma->pages;
+	vsnap->pages_rsgt = NULL;
+	vsnap->mr = NULL;
+	if (vma->obj->mm.rsgt)
+		vsnap->pages_rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
+	if (vma->obj->mm.region)
+		vsnap->mr = intel_memory_region_get(vma->obj->mm.region);
+	kref_init(&vsnap->kref);
+	vsnap->vma_resource = &vma->active;
+	vsnap->onstack = false;
+	vsnap->present = true;
+}
+
+/**
+ * i915_vma_snapshot_init_onstack - Initialize a struct i915_vma_snapshot from
+ * a struct i915_vma, but avoid kfreeing it on last put.
+ * @vsnap: The i915_vma_snapshot to init.
+ * @vma: A struct i915_vma used to initialize @vsnap.
+ * @name: Name associated with the snapshot. The character pointer needs to
+ * stay alive over the lifetime of the snapshot
+ */
+void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
+				    struct i915_vma *vma,
+				    const char *name)
+{
+	i915_vma_snapshot_init(vsnap, vma, name);
+	vsnap->onstack = true;
+}
+
+static void vma_snapshot_release(struct kref *ref)
+{
+	struct i915_vma_snapshot *vsnap =
+		container_of(ref, typeof(*vsnap), kref);
+
+	vsnap->present = false;
+	if (vsnap->mr)
+		intel_memory_region_put(vsnap->mr);
+	if (vsnap->pages_rsgt)
+		i915_refct_sgt_put(vsnap->pages_rsgt);
+	if (!vsnap->onstack)
+		kfree(vsnap);
+}
+
+/**
+ * i915_vma_snapshot_put - Put an i915_vma_snapshot pointer reference
+ * @vsnap: The pointer reference
+ */
+void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap)
+{
+	kref_put(&vsnap->kref, vma_snapshot_release);
+}
+
+/**
+ * i915_vma_snapshot_put_onstack - Put an onstack i915_vma_snapshot pointer
+ * reference and verify that the structure is released
+ * @vsnap: The pointer reference
+ *
+ * This function is intended to be paired with i915_vma_snapshot_init_onstack()
+ * and should be called before exiting the scope that declared or
+ * freeing the structure that embedded @vsnap to verify that all references
+ * have been released.
+ */
+void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
+{
+	if (!kref_put(&vsnap->kref, vma_snapshot_release))
+		GEM_BUG_ON(1);
+}
+
+/**
+ * i915_vma_snapshot_resource_pin - Temporarily block the memory the
+ * vma snapshot is pointing to from being released.
+ * @vsnap: The vma snapshot.
+ * @lockdep_cookie: Pointer to bool needed for lockdep support. This needs
+ * to be passed to the paired i915_vma_snapshot_resource_unpin.
+ *
+ * This function will temporarily try to hold up a fence or similar structure
+ * and will therefore enter a fence signaling critical section.
+ *
+ * Return: true if we succeeded in blocking the memory from being released,
+ * false otherwise.
+ */
+bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
+				    bool *lockdep_cookie)
+{
+	bool pinned = i915_active_acquire_if_busy(vsnap->vma_resource);
+
+	if (pinned)
+		*lockdep_cookie = dma_fence_begin_signalling();
+
+	return pinned;
+}
+
+/**
+ * i915_vma_snapshot_resource_unpin - Unblock vma snapshot memory from
+ * being released.
+ * @vsnap: The vma snapshot.
+ * @lockdep_cookie: Cookie from the matching i915_vma_snapshot_resource_pin().
+ *
+ * Might leave a fence signalling critical section and signal a fence.
+ */
+void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
+				      bool lockdep_cookie)
+{
+	dma_fence_end_signalling(lockdep_cookie);
+
+	return i915_active_release(vsnap->vma_resource);
+}
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
new file mode 100644
index 000000000000..940581df4622
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+#ifndef _I915_VMA_SNAPSHOT_H_
+#define _I915_VMA_SNAPSHOT_H_
+
+#include <linux/kref.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+struct i915_active;
+struct i915_refct_sgt;
+struct i915_vma;
+struct intel_memory_region;
+struct sg_table;
+
+/**
+ * DOC: Simple utilities for snapshotting GPU vma metadata, later used for
+ * error capture. We use a separate header for this to avoid issues due to
+ * recursive header includes.
+ */
+
+/**
+ * struct i915_vma_snapshot - Snapshot of vma metadata.
+ * @size: The vma size in bytes.
+ * @obj_size: The size of the underlying object in bytes.
+ * @gtt_offset: The gtt offset the vma is bound to.
+ * @gtt_size: The size in bytes allocated for the vma in the GTT.
+ * @pages: The struct sg_table pointing to the pages bound.
+ * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
+ * @mr: The memory region backing the pages bound.
+ * @kref: Reference for this structure.
+ * @vma_resource: FIXME: A means to keep the unbind fence from signaling.
+ * Temporarily while we have only sync unbinds, and still use the vma
+ * active, we use that. With async unbinding we need a signaling refcount
+ * for the unbind fence.
+ * @page_sizes: The vma GTT page sizes information.
+ * @onstack: Whether the structure shouldn't be freed on final put.
+ * @present: Whether the structure is present and initialized.
+ */
+struct i915_vma_snapshot {
+	const char *name;
+	size_t size;
+	size_t obj_size;
+	size_t gtt_offset;
+	size_t gtt_size;
+	struct sg_table *pages;
+	struct i915_refct_sgt *pages_rsgt;
+	struct intel_memory_region *mr;
+	struct kref kref;
+	struct i915_active *vma_resource;
+	u32 page_sizes;
+	bool onstack:1;
+	bool present:1;
+};
+
+void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
+			    struct i915_vma *vma,
+			    const char *name);
+
+void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
+				    struct i915_vma *vma,
+				    const char *name);
+
+void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap);
+
+void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap);
+
+bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
+				    bool *lockdep_cookie);
+
+void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
+				      bool lockdep_cookie);
+
+/**
+ * i915_vma_snapshot_alloc - Allocate a struct i915_vma_snapshot
+ * @gfp: Allocation mode.
+ *
+ * Return: A pointer to a struct i915_vma_snapshot if successful.
+ * NULL otherwise.
+ */
+static inline struct i915_vma_snapshot *i915_vma_snapshot_alloc(gfp_t gfp)
+{
+	return kmalloc(sizeof(struct i915_vma_snapshot), gfp);
+}
+
+/**
+ * i915_vma_snapshot_get - Take a reference on a struct i915_vma_snapshot
+ *
+ * Return: A pointer to a struct i915_vma_snapshot.
+ */
+static inline struct i915_vma_snapshot *
+i915_vma_snapshot_get(struct i915_vma_snapshot *vsnap)
+{
+	kref_get(&vsnap->kref);
+	return vsnap;
+}
+
+/**
+ * i915_vma_snapshot_present - Whether a struct i915_vma_snapshot is
+ * present and initialized.
+ *
+ * Return: true if present and initialized; false otherwise.
+ */
+static inline bool
+i915_vma_snapshot_present(const struct i915_vma_snapshot *vsnap)
+{
+	return vsnap && vsnap->present;
+}
+
+#endif
-- 
2.31.1


* [PATCH v6 4/4] drm/i915: Initial introduction of vma resources
  2021-11-08 17:45 ` [Intel-gfx] " Thomas Hellström
@ 2021-11-08 17:45   ` Thomas Hellström
  -1 siblings, 0 replies; 18+ messages in thread
From: Thomas Hellström @ 2021-11-08 17:45 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

Vma resources are needed for asynchronous bind management and are
similar to TTM resources. They contain the data needed for
asynchronous unbinding (typically the vm range, any backend
private information and a means to do refcounting and to hold
the unbinding for error capture).

When a vma is bound, a vma resource is created and attached to the
vma, and on async unbinding it is detached from the vma, and instead
the vm records the fence marking unbind complete. This fence needs to
be waited on before we can bind the same region again, so either
the fence can be recorded for this particular range only, using an
interval tree, or as a simpler approach, for the whole vm. The latter
means no binding can take place on a vm until all detached vma
resources scheduled for unbind are signaled. With an interval tree
fence recording, the interval tree needs to be searched for fences
to be signaled before binding can take place.
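
As a rough illustration only (not part of this patch), the two recording
policies above can be modelled in plain userspace C. All toy_* names are
hypothetical, a bool stands in for the unbind fence, and a linear scan
stands in for the interval tree lookup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one detached vma resource awaiting unbind. */
struct toy_pending_unbind {
	uint64_t start;	/* first byte of the vm range */
	uint64_t end;	/* one past the last byte of the vm range */
	bool signaled;	/* stands in for the unbind fence state */
};

/* Hypothetical model of a vm tracking its pending unbinds. */
struct toy_vm {
	struct toy_pending_unbind pending[8];
	size_t count;
};

/* Whole-vm policy: any unsignaled unbind blocks every bind on the vm. */
static bool toy_can_bind_whole_vm(const struct toy_vm *vm)
{
	for (size_t i = 0; i < vm->count; i++)
		if (!vm->pending[i].signaled)
			return false;
	return true;
}

/*
 * Per-range policy: only unsignaled unbinds overlapping [start, end)
 * block the bind. An interval tree would replace this linear scan.
 */
static bool toy_can_bind_range(const struct toy_vm *vm,
			       uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < vm->count; i++) {
		const struct toy_pending_unbind *p = &vm->pending[i];

		if (!p->signaled && start < p->end && p->start < end)
			return false;
	}
	return true;
}
```

The whole-vm variant needs no range bookkeeping, at the cost of stalling
binds to unrelated ranges until every pending unbind has signaled.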

But most of that is for later. This patch only introduces stub vma
resources without unbind capability, whose fences are waited on
synchronously during unbinding. At this point we're interested in the
hold capability as a POC for error capture. Note that the current sync
wait at unbind time is uninterruptible, but that's OK since we only
ever wait during error capture, and in that case there's very little
gpu activity (if any) that can stall.
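
To make the hold semantics concrete, here is a minimal userspace sketch,
assuming C11 atomics in place of refcount_t and a plain flag in place of
the unbind fence. The toy_* names are hypothetical and the lockdep
annotations (dma_fence_begin_signalling() and friends) are left out:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the hold count and unbind fence. */
struct toy_vma_res {
	atomic_int hold_count;	/* models refcount_t hold_count */
	atomic_bool signaled;	/* models the unbind fence being signaled */
};

static void toy_res_init(struct toy_vma_res *res)
{
	/* The vma itself keeps one hold until unbind is scheduled. */
	atomic_init(&res->hold_count, 1);
	atomic_init(&res->signaled, false);
}

/* Models refcount_inc_not_zero(): fails once the count has hit zero. */
static bool toy_res_hold(struct toy_vma_res *res)
{
	int old = atomic_load(&res->hold_count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&res->hold_count,
						 &old, old + 1))
			return true;
	}
	return false;
}

/* Models refcount_dec_and_test() followed by signaling the fence. */
static void toy_res_unhold(struct toy_vma_res *res)
{
	if (atomic_fetch_sub(&res->hold_count, 1) == 1)
		atomic_store(&res->signaled, true);
}
```

In this model, a capture that takes a hold before unbind drops the vma's
own hold keeps the fence unsignaled until the capture unholds, and a hold
attempted after the count has reached zero fails.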

v2:
- Fix the mock gtt selftest to bind with vma resources.
- Update a code comment.
- Account for rebinding the same vma with different I915_VMA_*_BIND flags
v3:
- Some style fixups.
- Move the sync fence wait to __i915_vma_evict instead of __i915_vma_unbind
  to also catch the evict case on suspend.
v4:
- Remove a minor fix that incorrectly landed in this patch.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
 drivers/gpu/drm/i915/i915_vma.c               | 206 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_vma.h               |  20 +-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      |  14 +-
 drivers/gpu/drm/i915/i915_vma_snapshot.h      |   2 +-
 drivers/gpu/drm/i915/i915_vma_types.h         |   5 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  98 +++++----
 7 files changed, 288 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index ddca899f2396..a4f12d29c856 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1376,7 +1376,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 		    GRAPHICS_VER(eb->i915) == 6) {
 			err = i915_vma_bind(target->vma,
 					    target->vma->obj->cache_level,
-					    PIN_GLOBAL, NULL);
+					    PIN_GLOBAL, NULL, NULL);
 			if (err)
 				return err;
 		}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 8781c4f61952..d0ca5c618763 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -38,7 +38,35 @@
 #include "i915_trace.h"
 #include "i915_vma.h"
 
+/**
+ * struct i915_vma_resource - Snapshotted unbind information.
+ * @unbind_fence: Fence to mark unbinding complete. Note that this fence
+ * is not considered published until unbind is scheduled, and as such it
+ * is illegal to access this fence before scheduled unbind other than
+ * for refcounting.
+ * @lock: The @unbind_fence lock. We're also using it to protect the weak
+ * pointer to the struct i915_vma, @vma during lookup and takedown.
+ * @vma: Weak back-pointer to the parent vma struct. This pointer is
+ * protected by @lock, and a reference on @vma needs to be taken
+ * using kref_get_unless_zero.
+ * @hold_count: Number of holders blocking the fence from finishing.
+ * The vma itself is keeping a hold, which is released when unbind
+ * is scheduled.
+ */
+struct i915_vma_resource {
+	struct dma_fence unbind_fence;
+	/* See above for description of the lock. */
+	spinlock_t lock;
+	struct i915_vma *vma;
+	refcount_t hold_count;
+};
+
 static struct kmem_cache *slab_vmas;
+static struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
+#ifndef CONFIG_DRM_I915_SELFTEST
+static void i915_vma_resource_init(struct i915_vma_resource *vma_res,
+				   struct i915_vma *vma);
+#endif
 
 struct i915_vma *i915_vma_alloc(void)
 {
@@ -363,6 +391,8 @@ int i915_vma_wait_for_bind(struct i915_vma *vma)
  * @cache_level: mapping cache level
  * @flags: flags like global or local mapping
  * @work: preallocated worker for allocating and binding the PTE
+ * @vma_res: pointer to a preallocated vma resource. The resource is either
+ * consumed or freed.
  *
  * DMA addresses are taken from the scatter-gather table of this object (or of
  * this VMA in case of non-default GGTT views) and PTE entries set up.
@@ -371,7 +401,8 @@ int i915_vma_wait_for_bind(struct i915_vma *vma)
 int i915_vma_bind(struct i915_vma *vma,
 		  enum i915_cache_level cache_level,
 		  u32 flags,
-		  struct i915_vma_work *work)
+		  struct i915_vma_work *work,
+		  struct i915_vma_resource *vma_res)
 {
 	u32 bind_flags;
 	u32 vma_flags;
@@ -381,11 +412,15 @@ int i915_vma_bind(struct i915_vma *vma,
 
 	if (GEM_DEBUG_WARN_ON(range_overflows(vma->node.start,
 					      vma->node.size,
-					      vma->vm->total)))
+					      vma->vm->total))) {
+		kfree(vma_res);
 		return -ENODEV;
+	}
 
-	if (GEM_DEBUG_WARN_ON(!flags))
+	if (GEM_DEBUG_WARN_ON(!flags)) {
+		kfree(vma_res);
 		return -EINVAL;
+	}
 
 	bind_flags = flags;
 	bind_flags &= I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
@@ -394,11 +429,25 @@ int i915_vma_bind(struct i915_vma *vma,
 	vma_flags &= I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
 
 	bind_flags &= ~vma_flags;
-	if (bind_flags == 0)
+	if (bind_flags == 0) {
+		kfree(vma_res);
 		return 0;
+	}
 
 	GEM_BUG_ON(!vma->pages);
 
+	if (!i915_vma_is_pinned(vma))
+		lockdep_assert_held(&vma->vm->mutex);
+
+	if (vma->resource || !vma_res) {
+		/* Rebinding with an additional I915_VMA_*_BIND */
+		GEM_WARN_ON(!vma_flags);
+		kfree(vma_res);
+	} else {
+		lockdep_assert_held(&vma->vm->mutex);
+		i915_vma_resource_init(vma_res, vma);
+		vma->resource = vma_res;
+	}
 	trace_i915_vma_bind(vma, bind_flags);
 	if (work && bind_flags & vma->vm->bind_async_flags) {
 		struct dma_fence *prev;
@@ -870,6 +919,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 		    u64 size, u64 alignment, u64 flags)
 {
 	struct i915_vma_work *work = NULL;
+	struct i915_vma_resource *vma_res;
 	intel_wakeref_t wakeref = 0;
 	unsigned int bound;
 	int err;
@@ -923,6 +973,12 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 		}
 	}
 
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res)) {
+		err = PTR_ERR(vma_res);
+		goto err_fence;
+	}
+
 	/*
 	 * Differentiate between user/kernel vma inside the aliasing-ppgtt.
 	 *
@@ -984,7 +1040,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	GEM_BUG_ON(!vma->pages);
 	err = i915_vma_bind(vma,
 			    vma->obj ? vma->obj->cache_level : 0,
-			    flags, work);
+			    flags, work, vma_res);
 	if (err)
 		goto err_remove;
 
@@ -1014,6 +1070,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	if (wakeref)
 		intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
 	vma_put_pages(vma);
+
 	return err;
 }
 
@@ -1288,6 +1345,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 
 void __i915_vma_evict(struct i915_vma *vma)
 {
+	struct dma_fence *unbind_fence;
+
 	GEM_BUG_ON(i915_vma_is_pinned(vma));
 
 	if (i915_vma_is_map_and_fenceable(vma)) {
@@ -1327,6 +1386,16 @@ void __i915_vma_evict(struct i915_vma *vma)
 
 	i915_vma_detach(vma);
 	vma_unbind_pages(vma);
+
+	unbind_fence = i915_vma_resource_unbind(vma->resource);
+
+	/*
+	 * This uninterruptible wait under the vm mutex is currently
+	 * only ever blocking while the vma is being captured from.
+	 * With async unbinding, this wait here will be removed.
+	 */
+	dma_fence_wait(unbind_fence, false);
+	dma_fence_put(unbind_fence);
 }
 
 int __i915_vma_unbind(struct i915_vma *vma)
@@ -1356,6 +1425,7 @@ int __i915_vma_unbind(struct i915_vma *vma)
 	__i915_vma_evict(vma);
 
 	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
 	return 0;
 }
 
@@ -1388,7 +1458,6 @@ int i915_vma_unbind(struct i915_vma *vma)
 
 	err = __i915_vma_unbind(vma);
 	mutex_unlock(&vm->mutex);
-
 out_rpm:
 	if (wakeref)
 		intel_runtime_pm_put(&vm->i915->runtime_pm, wakeref);
@@ -1411,6 +1480,131 @@ void i915_vma_make_purgeable(struct i915_vma *vma)
 	i915_gem_object_make_purgeable(vma->obj);
 }
 
+static const char *get_driver_name(struct dma_fence *fence)
+{
+	return "vma unbind fence";
+}
+
+static const char *get_timeline_name(struct dma_fence *fence)
+{
+	return "unbound";
+}
+
+static struct dma_fence_ops unbind_fence_ops = {
+	.get_driver_name = get_driver_name,
+	.get_timeline_name = get_timeline_name,
+};
+
+struct i915_vma_resource *i915_vma_resource_alloc(void)
+{
+	struct i915_vma_resource *vma_res =
+		kzalloc(sizeof(*vma_res), GFP_KERNEL);
+
+	return vma_res ? vma_res : ERR_PTR(-ENOMEM);
+}
+
+I915_SELFTEST_EXPORT void
+i915_vma_resource_init(struct i915_vma_resource *vma_res,
+		       struct i915_vma *vma)
+{
+	vma_res->vma = vma;
+	spin_lock_init(&vma_res->lock);
+	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
+		       &vma_res->lock, 0, 0);
+	refcount_set(&vma_res->hold_count, 1);
+}
+
+static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
+{
+	if (refcount_dec_and_test(&vma_res->hold_count))
+		dma_fence_signal(&vma_res->unbind_fence);
+}
+
+/**
+ * i915_vma_resource_unhold - Unhold the signaling of the vma resource unbind
+ * fence.
+ * @vma_res: The vma resource.
+ * @lockdep_cookie: The lockdep cookie returned from i915_vma_resource_hold.
+ *
+ * The function may leave a dma_fence critical section.
+ */
+void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
+			      bool lockdep_cookie)
+{
+	dma_fence_end_signalling(lockdep_cookie);
+
+	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
+		unsigned long irq_flags;
+
+		/* Inefficient open-coded might_lock_irqsave() */
+		spin_lock_irqsave(&vma_res->lock, irq_flags);
+		spin_unlock_irqrestore(&vma_res->lock, irq_flags);
+	}
+
+	__i915_vma_resource_unhold(vma_res);
+}
+
+/**
+ * i915_vma_resource_hold - Hold the signaling of the vma resource unbind fence.
+ * @vma_res: The vma resource.
+ * @lockdep_cookie: Pointer to a bool serving as a lockdep cookie that should
+ * be given as an argument to the pairing i915_vma_resource_unhold.
+ *
+ * If returning true, the function enters a dma_fence signalling critical
+ * section if not in one already.
+ *
+ * Return: true if holding successful, false if not.
+ */
+bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
+			    bool *lockdep_cookie)
+{
+	bool held = refcount_inc_not_zero(&vma_res->hold_count);
+
+	if (held)
+		*lockdep_cookie = dma_fence_begin_signalling();
+
+	return held;
+}
+
+/**
+ * i915_vma_get_current_resource - Return the vma's current vma resource
+ * @vma: The vma referencing the resource.
+ *
+ * Return: A refcounted pointer to the vma's current vma resource.
+ */
+struct i915_vma_resource *i915_vma_get_current_resource(struct i915_vma *vma)
+{
+	GEM_BUG_ON(!vma->resource);
+
+	dma_fence_get(&vma->resource->unbind_fence);
+	return vma->resource;
+}
+
+/**
+ * i915_vma_resource_put - Release a reference to a struct i915_vma_resource
+ * @vma_res: The resource
+ */
+void i915_vma_resource_put(struct i915_vma_resource *vma_res)
+{
+	dma_fence_put(&vma_res->unbind_fence);
+}
+
+static struct dma_fence *
+i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
+{
+	/* Reference is transferred to the returned dma_fence pointer */
+	vma_res->vma->resource = NULL;
+
+	spin_lock(&vma_res->lock);
+	/* Kill the weak reference under the spinlock. */
+	vma_res->vma = NULL;
+	spin_unlock(&vma_res->lock);
+
+	/* With async unbind, schedule it here. */
+	__i915_vma_resource_unhold(vma_res);
+	return &vma_res->unbind_fence;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/i915_vma.c"
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 648dbe744c96..aa13d0d5bb91 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -206,7 +206,8 @@ struct i915_vma_work *i915_vma_work(void);
 int i915_vma_bind(struct i915_vma *vma,
 		  enum i915_cache_level cache_level,
 		  u32 flags,
-		  struct i915_vma_work *work);
+		  struct i915_vma_work *work,
+		  struct i915_vma_resource *vma_res);
 
 bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color);
 bool i915_vma_misplaced(const struct i915_vma *vma,
@@ -433,7 +434,24 @@ static inline int i915_vma_sync(struct i915_vma *vma)
 	return i915_active_wait(&vma->active);
 }
 
+bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
+			    bool *lockdep_cookie);
+
+void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
+			      bool lockdep_cookie);
+
+void i915_vma_resource_put(struct i915_vma_resource *vma_res);
+
+struct i915_vma_resource *i915_vma_get_current_resource(struct i915_vma *vma);
+
+struct i915_vma_resource *i915_vma_resource_alloc(void);
+
 void i915_vma_module_exit(void);
 int i915_vma_module_init(void);
 
+#ifdef CONFIG_DRM_I915_SELFTEST
+void i915_vma_resource_init(struct i915_vma_resource *vma_res,
+			    struct i915_vma *vma);
+#endif
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
index 44985d600f96..b4ee8220df85 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
@@ -36,7 +36,7 @@ void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
 	if (vma->obj->mm.region)
 		vsnap->mr = intel_memory_region_get(vma->obj->mm.region);
 	kref_init(&vsnap->kref);
-	vsnap->vma_resource = &vma->active;
+	vsnap->vma_resource = i915_vma_get_current_resource(vma);
 	vsnap->onstack = false;
 	vsnap->present = true;
 }
@@ -63,6 +63,7 @@ static void vma_snapshot_release(struct kref *ref)
 		container_of(ref, typeof(*vsnap), kref);
 
 	vsnap->present = false;
+	i915_vma_resource_put(vsnap->vma_resource);
 	if (vsnap->mr)
 		intel_memory_region_put(vsnap->mr);
 	if (vsnap->pages_rsgt)
@@ -112,12 +113,7 @@ void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
 bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
 				    bool *lockdep_cookie)
 {
-	bool pinned = i915_active_acquire_if_busy(vsnap->vma_resource);
-
-	if (pinned)
-		*lockdep_cookie = dma_fence_begin_signalling();
-
-	return pinned;
+	return i915_vma_resource_hold(vsnap->vma_resource, lockdep_cookie);
 }
 
 /**
@@ -131,7 +127,5 @@ bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
 void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
 				      bool lockdep_cookie)
 {
-	dma_fence_end_signalling(lockdep_cookie);
-
-	return i915_active_release(vsnap->vma_resource);
+	i915_vma_resource_unhold(vsnap->vma_resource, lockdep_cookie);
 }
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
index 940581df4622..d083b6bf1b11 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
@@ -49,7 +49,7 @@ struct i915_vma_snapshot {
 	struct i915_refct_sgt *pages_rsgt;
 	struct intel_memory_region *mr;
 	struct kref kref;
-	struct i915_active *vma_resource;
+	struct i915_vma_resource *vma_resource;
 	u32 page_sizes;
 	bool onstack:1;
 	bool present:1;
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index 4ee6e54799f4..a5cd7ac0f517 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -95,6 +95,8 @@ enum i915_cache_level;
  *
  */
 
+struct i915_vma_resource;
+
 struct intel_remapped_plane_info {
 	/* in gtt pages */
 	u32 offset:31;
@@ -293,6 +295,9 @@ struct i915_vma {
 	struct list_head evict_link;
 
 	struct list_head closed_link;
+
+	/** The async vma resource. Protected by the vm_mutex */
+	struct i915_vma_resource *resource;
 };
 
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 46f4236039a9..30201a6906a9 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -1336,6 +1336,33 @@ static int igt_mock_drunk(void *arg)
 	return exercise_mock(ggtt->vm.i915, drunk_hole);
 }
 
+static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
+{
+	struct i915_address_space *vm = vma->vm;
+	struct i915_vma_resource *vma_res;
+	struct drm_i915_gem_object *obj = vma->obj;
+	int err;
+
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res))
+		return PTR_ERR(vma_res);
+
+	mutex_lock(&vm->mutex);
+	err = i915_gem_gtt_reserve(vm, &vma->node, obj->base.size,
+				   offset,
+				   obj->cache_level,
+				   0);
+	if (!err) {
+		i915_vma_resource_init(vma_res, vma);
+		vma->resource = vma_res;
+	} else {
+		kfree(vma_res);
+	}
+	mutex_unlock(&vm->mutex);
+
+	return err;
+}
+
 static int igt_gtt_reserve(void *arg)
 {
 	struct i915_ggtt *ggtt = arg;
@@ -1370,20 +1397,13 @@ static int igt_gtt_reserve(void *arg)
 		}
 
 		list_add(&obj->st_link, &objects);
-
 		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   total,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, total);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 1) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1429,13 +1449,7 @@ static int igt_gtt_reserve(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   total,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, total);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 2) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1476,13 +1490,7 @@ static int igt_gtt_reserve(void *arg)
 					   2 * I915_GTT_PAGE_SIZE,
 					   I915_GTT_MIN_ALIGNMENT);
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   offset,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, offset);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 3) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1509,6 +1517,31 @@ static int igt_gtt_reserve(void *arg)
 	return err;
 }
 
+static int insert_gtt_with_resource(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+	struct i915_vma_resource *vma_res;
+	struct drm_i915_gem_object *obj = vma->obj;
+	int err;
+
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res))
+		return PTR_ERR(vma_res);
+
+	mutex_lock(&vm->mutex);
+	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
+				  obj->cache_level, 0, vm->total, 0);
+	if (!err) {
+		i915_vma_resource_init(vma_res, vma);
+		vma->resource = vma_res;
+	} else {
+		kfree(vma_res);
+	}
+	mutex_unlock(&vm->mutex);
+
+	return err;
+}
+
 static int igt_gtt_insert(void *arg)
 {
 	struct i915_ggtt *ggtt = arg;
@@ -1593,12 +1626,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err == -ENOSPC) {
 			/* maxed out the GGTT space */
 			i915_gem_object_put(obj);
@@ -1653,12 +1681,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 2) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1702,12 +1725,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 3) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
-- 
2.31.1


-
 out_rpm:
 	if (wakeref)
 		intel_runtime_pm_put(&vm->i915->runtime_pm, wakeref);
@@ -1411,6 +1480,131 @@ void i915_vma_make_purgeable(struct i915_vma *vma)
 	i915_gem_object_make_purgeable(vma->obj);
 }
 
+static const char *get_driver_name(struct dma_fence *fence)
+{
+	return "vma unbind fence";
+}
+
+static const char *get_timeline_name(struct dma_fence *fence)
+{
+	return "unbound";
+}
+
+static const struct dma_fence_ops unbind_fence_ops = {
+	.get_driver_name = get_driver_name,
+	.get_timeline_name = get_timeline_name,
+};
+
+struct i915_vma_resource *i915_vma_resource_alloc(void)
+{
+	struct i915_vma_resource *vma_res =
+		kzalloc(sizeof(*vma_res), GFP_KERNEL);
+
+	return vma_res ? vma_res : ERR_PTR(-ENOMEM);
+}
+
+I915_SELFTEST_EXPORT void
+i915_vma_resource_init(struct i915_vma_resource *vma_res,
+		       struct i915_vma *vma)
+{
+	vma_res->vma = vma;
+	spin_lock_init(&vma_res->lock);
+	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
+		       &vma_res->lock, 0, 0);
+	refcount_set(&vma_res->hold_count, 1);
+}
+
+static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
+{
+	if (refcount_dec_and_test(&vma_res->hold_count))
+		dma_fence_signal(&vma_res->unbind_fence);
+}
+
+/**
+ * i915_vma_resource_unhold - Unhold the signaling of the vma resource unbind
+ * fence.
+ * @vma_res: The vma resource.
+ * @lockdep_cookie: The lockdep cookie returned from i915_vma_resource_hold.
+ *
+ * The function may leave a dma_fence critical section.
+ */
+void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
+			      bool lockdep_cookie)
+{
+	dma_fence_end_signalling(lockdep_cookie);
+
+	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
+		unsigned long irq_flags;
+
+		/* Inefficient open-coded might_lock_irqsave() */
+		spin_lock_irqsave(&vma_res->lock, irq_flags);
+		spin_unlock_irqrestore(&vma_res->lock, irq_flags);
+	}
+
+	__i915_vma_resource_unhold(vma_res);
+}
+
+/**
+ * i915_vma_resource_hold - Hold the signaling of the vma resource unbind fence.
+ * @vma_res: The vma resource.
+ * @lockdep_cookie: Pointer to a bool serving as a lockdep cookie that should
+ * be given as an argument to the pairing i915_vma_resource_unhold.
+ *
+ * If returning true, the function enters a dma_fence signalling critical
+ * section if not in one already.
+ *
+ * Return: true if holding successful, false if not.
+ */
+bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
+			    bool *lockdep_cookie)
+{
+	bool held = refcount_inc_not_zero(&vma_res->hold_count);
+
+	if (held)
+		*lockdep_cookie = dma_fence_begin_signalling();
+
+	return held;
+}
+
+/**
+ * i915_vma_get_current_resource - Return the vma's current vma resource
+ * @vma: The vma referencing the resource.
+ *
+ * Return: A refcounted pointer to the vma's current vma resource.
+ */
+struct i915_vma_resource *i915_vma_get_current_resource(struct i915_vma *vma)
+{
+	GEM_BUG_ON(!vma->resource);
+
+	dma_fence_get(&vma->resource->unbind_fence);
+	return vma->resource;
+}
+
+/**
+ * i915_vma_resource_put - Release a reference to a struct i915_vma_resource
+ * @vma_res: The resource
+ */
+void i915_vma_resource_put(struct i915_vma_resource *vma_res)
+{
+	dma_fence_put(&vma_res->unbind_fence);
+}
+
+static struct dma_fence *
+i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
+{
+	/* Reference is transferred to the returned dma_fence pointer */
+	vma_res->vma->resource = NULL;
+
+	spin_lock(&vma_res->lock);
+	/* Kill the weak reference under the spinlock. */
+	vma_res->vma = NULL;
+	spin_unlock(&vma_res->lock);
+
+	/* With async unbind, schedule it here. */
+	__i915_vma_resource_unhold(vma_res);
+	return &vma_res->unbind_fence;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/i915_vma.c"
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 648dbe744c96..aa13d0d5bb91 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -206,7 +206,8 @@ struct i915_vma_work *i915_vma_work(void);
 int i915_vma_bind(struct i915_vma *vma,
 		  enum i915_cache_level cache_level,
 		  u32 flags,
-		  struct i915_vma_work *work);
+		  struct i915_vma_work *work,
+		  struct i915_vma_resource *vma_res);
 
 bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color);
 bool i915_vma_misplaced(const struct i915_vma *vma,
@@ -433,7 +434,24 @@ static inline int i915_vma_sync(struct i915_vma *vma)
 	return i915_active_wait(&vma->active);
 }
 
+bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
+			    bool *lockdep_cookie);
+
+void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
+			      bool lockdep_cookie);
+
+void i915_vma_resource_put(struct i915_vma_resource *vma_res);
+
+struct i915_vma_resource *i915_vma_get_current_resource(struct i915_vma *vma);
+
+struct i915_vma_resource *i915_vma_resource_alloc(void);
+
 void i915_vma_module_exit(void);
 int i915_vma_module_init(void);
 
+#ifdef CONFIG_DRM_I915_SELFTEST
+void i915_vma_resource_init(struct i915_vma_resource *vma_res,
+			    struct i915_vma *vma);
+#endif
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
index 44985d600f96..b4ee8220df85 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
@@ -36,7 +36,7 @@ void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
 	if (vma->obj->mm.region)
 		vsnap->mr = intel_memory_region_get(vma->obj->mm.region);
 	kref_init(&vsnap->kref);
-	vsnap->vma_resource = &vma->active;
+	vsnap->vma_resource = i915_vma_get_current_resource(vma);
 	vsnap->onstack = false;
 	vsnap->present = true;
 }
@@ -63,6 +63,7 @@ static void vma_snapshot_release(struct kref *ref)
 		container_of(ref, typeof(*vsnap), kref);
 
 	vsnap->present = false;
+	i915_vma_resource_put(vsnap->vma_resource);
 	if (vsnap->mr)
 		intel_memory_region_put(vsnap->mr);
 	if (vsnap->pages_rsgt)
@@ -112,12 +113,7 @@ void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
 bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
 				    bool *lockdep_cookie)
 {
-	bool pinned = i915_active_acquire_if_busy(vsnap->vma_resource);
-
-	if (pinned)
-		*lockdep_cookie = dma_fence_begin_signalling();
-
-	return pinned;
+	return i915_vma_resource_hold(vsnap->vma_resource, lockdep_cookie);
 }
 
 /**
@@ -131,7 +127,5 @@ bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
 void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
 				      bool lockdep_cookie)
 {
-	dma_fence_end_signalling(lockdep_cookie);
-
-	return i915_active_release(vsnap->vma_resource);
+	i915_vma_resource_unhold(vsnap->vma_resource, lockdep_cookie);
 }
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
index 940581df4622..d083b6bf1b11 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
@@ -49,7 +49,7 @@ struct i915_vma_snapshot {
 	struct i915_refct_sgt *pages_rsgt;
 	struct intel_memory_region *mr;
 	struct kref kref;
-	struct i915_active *vma_resource;
+	struct i915_vma_resource *vma_resource;
 	u32 page_sizes;
 	bool onstack:1;
 	bool present:1;
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index 4ee6e54799f4..a5cd7ac0f517 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -95,6 +95,8 @@ enum i915_cache_level;
  *
  */
 
+struct i915_vma_resource;
+
 struct intel_remapped_plane_info {
 	/* in gtt pages */
 	u32 offset:31;
@@ -293,6 +295,9 @@ struct i915_vma {
 	struct list_head evict_link;
 
 	struct list_head closed_link;
+
+	/** The async vma resource. Protected by the vm_mutex */
+	struct i915_vma_resource *resource;
 };
 
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 46f4236039a9..30201a6906a9 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -1336,6 +1336,33 @@ static int igt_mock_drunk(void *arg)
 	return exercise_mock(ggtt->vm.i915, drunk_hole);
 }
 
+static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
+{
+	struct i915_address_space *vm = vma->vm;
+	struct i915_vma_resource *vma_res;
+	struct drm_i915_gem_object *obj = vma->obj;
+	int err;
+
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res))
+		return PTR_ERR(vma_res);
+
+	mutex_lock(&vm->mutex);
+	err = i915_gem_gtt_reserve(vm, &vma->node, obj->base.size,
+				   offset,
+				   obj->cache_level,
+				   0);
+	if (!err) {
+		i915_vma_resource_init(vma_res, vma);
+		vma->resource = vma_res;
+	} else {
+		kfree(vma_res);
+	}
+	mutex_unlock(&vm->mutex);
+
+	return err;
+}
+
 static int igt_gtt_reserve(void *arg)
 {
 	struct i915_ggtt *ggtt = arg;
@@ -1370,20 +1397,13 @@ static int igt_gtt_reserve(void *arg)
 		}
 
 		list_add(&obj->st_link, &objects);
-
 		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   total,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, total);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 1) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1429,13 +1449,7 @@ static int igt_gtt_reserve(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   total,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, total);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 2) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1476,13 +1490,7 @@ static int igt_gtt_reserve(void *arg)
 					   2 * I915_GTT_PAGE_SIZE,
 					   I915_GTT_MIN_ALIGNMENT);
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   offset,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, offset);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 3) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1509,6 +1517,31 @@ static int igt_gtt_reserve(void *arg)
 	return err;
 }
 
+static int insert_gtt_with_resource(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+	struct i915_vma_resource *vma_res;
+	struct drm_i915_gem_object *obj = vma->obj;
+	int err;
+
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res))
+		return PTR_ERR(vma_res);
+
+	mutex_lock(&vm->mutex);
+	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
+				  obj->cache_level, 0, vm->total, 0);
+	if (!err) {
+		i915_vma_resource_init(vma_res, vma);
+		vma->resource = vma_res;
+	} else {
+		kfree(vma_res);
+	}
+	mutex_unlock(&vm->mutex);
+
+	return err;
+}
+
 static int igt_gtt_insert(void *arg)
 {
 	struct i915_ggtt *ggtt = arg;
@@ -1593,12 +1626,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err == -ENOSPC) {
 			/* maxed out the GGTT space */
 			i915_gem_object_put(obj);
@@ -1653,12 +1681,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 2) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1702,12 +1725,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 3) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
-- 
2.31.1
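
Reader's note on the new @vma_res argument of i915_vma_bind(): the kernel-doc says the preallocated resource is "either consumed or freed", so the caller never frees it after the call, whatever the return value. The following is a minimal userspace sketch of that ownership convention, not the driver code. The names demo_res, demo_vma and demo_bind() are hypothetical, malloc()/free() stand in for kzalloc()/kfree(), and the "valid" flag stands in for the argument checks at the top of the real function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct i915_vma_resource and struct i915_vma. */
struct demo_res {
	int unused;
};

struct demo_vma {
	struct demo_res *resource;
};

/*
 * Takes ownership of @res on every path: it is either attached to the
 * vma (consumed) or freed. The caller must not touch @res afterwards.
 */
static int demo_bind(struct demo_vma *vma, bool valid, struct demo_res *res)
{
	if (!valid) {
		/* Early error exit: the preallocated resource is freed. */
		free(res);
		return -EINVAL;
	}

	if (vma->resource || !res) {
		/* Rebind: the vma keeps its resource, the spare one is freed. */
		free(res);
		return 0;
	}

	/* First bind: the resource is consumed by the vma. */
	vma->resource = res;
	return 0;
}
```

With this convention the caller's error path needs no cleanup for the resource, which matches how i915_vma_pin_ww() in the patch jumps to err_remove without freeing vma_res.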


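Reader's note on the hold/unhold scheme in i915_vma.c: the hold count starts at 1 (owned by the binding), i915_vma_resource_hold() only succeeds while the count is non-zero, and the unbind fence is signaled when the last hold is dropped. The sketch below models just that counting in userspace and is not kernel code. The names model_res, model_init(), model_hold(), model_unhold() and model_unbind() are hypothetical, C11 atomics stand in for refcount_t, and a plain bool stands in for signaling the unbind fence, so the flag itself is only meaningful in a single-threaded run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct i915_vma_resource. */
struct model_res {
	atomic_uint hold_count;	/* models refcount_t hold_count */
	bool signaled;		/* models the unbind fence being signaled */
};

static void model_init(struct model_res *res)
{
	/* The binding itself owns the initial hold. */
	atomic_init(&res->hold_count, 1);
	res->signaled = false;
}

/* Models refcount_inc_not_zero(): take a hold unless the count is zero. */
static bool model_hold(struct model_res *res)
{
	unsigned int old = atomic_load(&res->hold_count);

	while (old != 0) {
		/* On failure, old is reloaded and the zero check is repeated. */
		if (atomic_compare_exchange_weak(&res->hold_count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models refcount_dec_and_test() followed by dma_fence_signal(). */
static void model_unhold(struct model_res *res)
{
	unsigned int old = atomic_fetch_sub(&res->hold_count, 1);

	assert(old > 0);
	if (old == 1)
		res->signaled = true;
}

/* Models unbind: drop the hold that was owned by the binding. */
static void model_unbind(struct model_res *res)
{
	model_unhold(res);
}
```

In this model, a capture that took a hold before model_unbind() keeps the flag unset until it calls model_unhold(), and a hold attempted after the last unhold returns false, mirroring the false return of i915_vma_resource_hold().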
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Prepare error capture for asynchronous migration (rev2)
  2021-11-08 17:45 ` [Intel-gfx] " Thomas Hellström
                   ` (4 preceding siblings ...)
  (?)
@ 2021-11-08 18:29 ` Patchwork
  -1 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2021-11-08 18:29 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Prepare error capture for asynchronous migration (rev2)
URL   : https://patchwork.freedesktop.org/series/96493/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
1948fce512ee drm/i915: Avoid allocating a page array for the gpu coredump
04cf071c7201 drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
c10709188aa4 drm/i915: Update error capture code to avoid using the current vma state
-:793: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#793: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 945 lines checked
6cd0fb32331c drm/i915: Initial introduction of vma resources



* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Prepare error capture for asynchronous migration (rev2)
  2021-11-08 17:45 ` [Intel-gfx] " Thomas Hellström
                   ` (5 preceding siblings ...)
  (?)
@ 2021-11-08 19:02 ` Patchwork
  -1 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2021-11-08 19:02 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Prepare error capture for asynchronous migration (rev2)
URL   : https://patchwork.freedesktop.org/series/96493/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10854 -> Patchwork_21535
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/index.html

Participating hosts (40 -> 35)
------------------------------

  Additional (1): fi-icl-u2 
  Missing    (6): bat-dg1-6 bat-dg1-5 fi-bsw-cyan bat-adlp-4 fi-ctg-p8600 fi-bdw-samus 

New tests
---------

  New tests have been introduced between CI_DRM_10854 and Patchwork_21535:

### New IGT tests (1) ###

  * igt@gem_exec_suspend@basic-s0:
    - Statuses : 1 incomplete(s) 32 pass(s) 1 skip(s)
    - Exec time: [0.0, 20.41] s

  

Known issues
------------

  Here are the changes found in Patchwork_21535 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_cs_nop@fork-gfx0:
    - fi-icl-u2:          NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-icl-u2/igt@amdgpu/amd_cs_nop@fork-gfx0.html

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
    - fi-snb-2600:        NOTRUN -> [SKIP][2] ([fdo#109271]) +17 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-snb-2600/igt@amdgpu/amd_cs_nop@sync-fork-compute0.html

  * igt@debugfs_test@read_all_entries:
    - fi-apl-guc:         [PASS][3] -> [DMESG-WARN][4] ([i915#1610])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/fi-apl-guc/igt@debugfs_test@read_all_entries.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-apl-guc/igt@debugfs_test@read_all_entries.html

  * igt@gem_huc_copy@huc-copy:
    - fi-icl-u2:          NOTRUN -> [SKIP][5] ([i915#2190])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-icl-u2/igt@gem_huc_copy@huc-copy.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-bdw-5557u:       [PASS][6] -> [DMESG-FAIL][7] ([i915#541])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/fi-bdw-5557u/igt@i915_selftest@live@gt_heartbeat.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-bdw-5557u/igt@i915_selftest@live@gt_heartbeat.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-icl-u2:          NOTRUN -> [SKIP][8] ([fdo#111827]) +8 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
    - fi-icl-u2:          NOTRUN -> [SKIP][9] ([fdo#109278]) +2 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-icl-u2/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_force_connector_basic@force-load-detect:
    - fi-icl-u2:          NOTRUN -> [SKIP][10] ([fdo#109285])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-icl-u2/igt@kms_force_connector_basic@force-load-detect.html

  * igt@prime_vgem@basic-userptr:
    - fi-icl-u2:          NOTRUN -> [SKIP][11] ([i915#3301])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-icl-u2/igt@prime_vgem@basic-userptr.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-skl-6600u:       [FAIL][12] ([i915#3239]) -> [PASS][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/fi-skl-6600u/igt@i915_pm_rpm@basic-pci-d3-state.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-skl-6600u/igt@i915_pm_rpm@basic-pci-d3-state.html

  * igt@i915_selftest@live@hangcheck:
    - fi-snb-2600:        [INCOMPLETE][14] ([i915#3921]) -> [PASS][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/fi-snb-2600/igt@i915_selftest@live@hangcheck.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-snb-2600/igt@i915_selftest@live@hangcheck.html

  * igt@kms_frontbuffer_tracking@basic:
    - {fi-hsw-gt1}:       [DMESG-WARN][16] ([i915#4290]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/fi-hsw-gt1/igt@kms_frontbuffer_tracking@basic.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/fi-hsw-gt1/igt@kms_frontbuffer_tracking@basic.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1610]: https://gitlab.freedesktop.org/drm/intel/issues/1610
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#3239]: https://gitlab.freedesktop.org/drm/intel/issues/3239
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3921]: https://gitlab.freedesktop.org/drm/intel/issues/3921
  [i915#4290]: https://gitlab.freedesktop.org/drm/intel/issues/4290
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541


Build changes
-------------

  * Linux: CI_DRM_10854 -> Patchwork_21535

  CI-20190529: 20190529
  CI_DRM_10854: 895fb34d3265137c84fe3e9dd48fb9ad2e00fd36 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6274: 569de51145fba197a8d93b2417348d47507bf485 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_21535: 6cd0fb32331c9c588c51fdfaa4f5ca69af86c5f2 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

6cd0fb32331c drm/i915: Initial introduction of vma resources
c10709188aa4 drm/i915: Update error capture code to avoid using the current vma state
04cf071c7201 drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
1948fce512ee drm/i915: Avoid allocating a page array for the gpu coredump

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/index.html

* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Prepare error capture for asynchronous migration (rev2)
  2021-11-08 17:45 ` [Intel-gfx] " Thomas Hellström
                   ` (6 preceding siblings ...)
  (?)
@ 2021-11-08 20:21 ` Patchwork
  -1 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2021-11-08 20:21 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Prepare error capture for asynchronous migration (rev2)
URL   : https://patchwork.freedesktop.org/series/96493/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10854_full -> Patchwork_21535_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (11 -> 10)
------------------------------

  Missing    (1): pig-kbl-iris 

New tests
---------

  New tests have been introduced between CI_DRM_10854_full and Patchwork_21535_full:

### New IGT tests (190) ###

  * igt@debugfs_test@sysfs:
    - Statuses : 7 pass(s)
    - Exec time: [0.01, 0.08] s

  * igt@gem_ctx_exec@basic-nohangcheck:
    - Statuses : 7 pass(s)
    - Exec time: [0.05, 0.98] s

  * igt@gem_ctx_persistence@file:
    - Statuses :
    - Exec time: [None] s

  * igt@gem_ctx_persistence@idempotent:
    - Statuses : 4 pass(s) 1 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ctx_persistence@process:
    - Statuses : 6 pass(s) 1 skip(s)
    - Exec time: [0.0, 0.27] s

  * igt@gem_ctx_persistence@processes:
    - Statuses : 6 pass(s) 1 skip(s)
    - Exec time: [0.0, 0.45] s

  * igt@gem_ctx_persistence@smoketest:
    - Statuses : 7 pass(s) 1 skip(s)
    - Exec time: [0.0, 35.50] s

  * igt@gem_eio@kms:
    - Statuses : 7 pass(s)
    - Exec time: [12.03, 17.34] s

  * igt@gem_exec_suspend@basic-s0:
    - Statuses : 6 pass(s)
    - Exec time: [1.58, 20.35] s

  * igt@gem_mmap_gtt@close-race:
    - Statuses : 8 pass(s)
    - Exec time: [20.06, 20.77] s

  * igt@gem_mmap_gtt@cpuset-basic-small-copy:
    - Statuses : 7 pass(s)
    - Exec time: [1.01, 6.44] s

  * igt@gem_mmap_gtt@cpuset-basic-small-copy-odd:
    - Statuses : 7 pass(s)
    - Exec time: [1.51, 9.92] s

  * igt@gem_mmap_gtt@cpuset-basic-small-copy-xy:
    - Statuses : 6 pass(s)
    - Exec time: [1.03, 7.61] s

  * igt@gem_mmap_gtt@cpuset-big-copy:
    - Statuses : 7 pass(s)
    - Exec time: [3.26, 29.09] s

  * igt@gem_mmap_gtt@cpuset-big-copy-odd:
    - Statuses : 7 pass(s)
    - Exec time: [3.98, 26.96] s

  * igt@gem_mmap_gtt@cpuset-big-copy-xy:
    - Statuses : 6 pass(s)
    - Exec time: [4.30, 30.75] s

  * igt@gem_mmap_gtt@cpuset-medium-copy:
    - Statuses : 5 pass(s)
    - Exec time: [1.85, 11.14] s

  * igt@gem_mmap_gtt@cpuset-medium-copy-odd:
    - Statuses : 7 pass(s)
    - Exec time: [2.40, 15.09] s

  * igt@gem_mmap_gtt@cpuset-medium-copy-xy:
    - Statuses : 7 pass(s)
    - Exec time: [1.95, 15.39] s

  * igt@gem_mmap_gtt@flink-race:
    - Statuses : 7 pass(s)
    - Exec time: [20.07, 20.68] s

  * igt@gem_mmap_gtt@isolation:
    - Statuses : 5 pass(s)
    - Exec time: [0.00] s

  * igt@gem_pread@exhaustion:
    - Statuses : 6 warn(s)
    - Exec time: [2.65, 48.45] s

  * igt@gem_pwrite@basic-exhaustion:
    - Statuses : 1 skip(s) 7 warn(s)
    - Exec time: [0.0, 49.60] s

  * igt@gem_pwrite@basic-self:
    - Statuses : 7 pass(s)
    - Exec time: [0.05, 0.30] s

  * igt@gem_userptr_blits@nohangcheck:
    - Statuses : 7 pass(s)
    - Exec time: [0.05, 0.96] s

  * igt@gem_userptr_blits@stress-purge:
    - Statuses : 7 pass(s)
    - Exec time: [5.38, 12.29] s

  * igt@i915_pm_dc@dc5-dpms:
    - Statuses : 6 pass(s) 1 skip(s)
    - Exec time: [0.0, 2.23] s

  * igt@i915_pm_dc@dc5-psr:
    - Statuses : 3 pass(s) 4 skip(s)
    - Exec time: [0.0, 3.44] s

  * igt@i915_pm_dc@dc6-dpms:
    - Statuses : 3 fail(s) 1 pass(s) 3 skip(s)
    - Exec time: [0.0, 3.93] s

  * igt@i915_pm_dc@dc6-psr:
    - Statuses : 2 fail(s) 1 pass(s) 4 skip(s)
    - Exec time: [0.0, 3.78] s

  * igt@kms_atomic_interruptible@atomic-setmode@dp-1-pipe-a:
    - Statuses : 2 pass(s)
    - Exec time: [6.25, 6.33] s

  * igt@kms_atomic_interruptible@atomic-setmode@edp-1-pipe-a:
    - Statuses : 2 pass(s)
    - Exec time: [6.90, 6.91] s

  * igt@kms_atomic_interruptible@atomic-setmode@hdmi-a-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.59] s

  * igt@kms_atomic_interruptible@atomic-setmode@vga-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.15] s

  * igt@kms_atomic_interruptible@legacy-cursor@dp-1-pipe-a:
    - Statuses : 2 pass(s)
    - Exec time: [6.22, 6.35] s

  * igt@kms_atomic_interruptible@legacy-cursor@edp-1-pipe-a:
    - Statuses : 3 pass(s)
    - Exec time: [7.53, 7.85] s

  * igt@kms_atomic_interruptible@legacy-cursor@hdmi-a-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.60] s

  * igt@kms_atomic_interruptible@legacy-cursor@vga-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.16] s

  * igt@kms_atomic_interruptible@legacy-dpms@dp-1-pipe-a:
    - Statuses : 2 pass(s)
    - Exec time: [6.25, 6.36] s

  * igt@kms_atomic_interruptible@legacy-dpms@edp-1-pipe-a:
    - Statuses : 3 pass(s)
    - Exec time: [7.53, 7.97] s

  * igt@kms_atomic_interruptible@legacy-dpms@hdmi-a-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.63] s

  * igt@kms_atomic_interruptible@legacy-dpms@vga-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.18] s

  * igt@kms_atomic_interruptible@legacy-pageflip@dp-1-pipe-a:
    - Statuses : 2 pass(s)
    - Exec time: [6.26, 6.34] s

  * igt@kms_atomic_interruptible@legacy-pageflip@edp-1-pipe-a:
    - Statuses : 3 pass(s)
    - Exec time: [7.51, 7.88] s

  * igt@kms_atomic_interruptible@legacy-pageflip@hdmi-a-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.57] s

  * igt@kms_atomic_interruptible@legacy-pageflip@vga-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.19] s

  * igt@kms_atomic_interruptible@legacy-setmode@dp-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.25] s

  * igt@kms_atomic_interruptible@legacy-setmode@edp-1-pipe-a:
    - Statuses : 2 pass(s)
    - Exec time: [6.87, 6.89] s

  * igt@kms_atomic_interruptible@legacy-setmode@hdmi-a-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.52] s

  * igt@kms_atomic_interruptible@legacy-setmode@vga-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.15] s

  * igt@kms_atomic_interruptible@universal-setplane-primary@dp-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.37] s

  * igt@kms_atomic_interruptible@universal-setplane-primary@edp-1-pipe-a:
    - Statuses : 3 pass(s)
    - Exec time: [7.49, 7.82] s

  * igt@kms_atomic_interruptible@universal-setplane-primary@hdmi-a-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.62] s

  * igt@kms_atomic_interruptible@universal-setplane-primary@vga-1-pipe-a:
    - Statuses : 1 pass(s)
    - Exec time: [6.19] s

  * igt@kms_chamelium@dp-mode-timings:
    - Statuses : 8 skip(s)
    - Exec time: [0.0] s

  * igt@kms_chamelium@hdmi-aspect-ratio:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_chamelium@hdmi-mode-timings:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@kms_color@pipe-d-ctm-0-25:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.02] s

  * igt@kms_color@pipe-d-ctm-0-5:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.04] s

  * igt@kms_color@pipe-d-ctm-0-75:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.54] s

  * igt@kms_color@pipe-d-ctm-blue-to-red:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.03] s

  * igt@kms_color@pipe-d-ctm-green-to-red:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 1.06] s

  * igt@kms_color@pipe-d-ctm-max:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.05] s

  * igt@kms_color@pipe-d-ctm-negative:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.03] s

  * igt@kms_color@pipe-d-ctm-red-to-blue:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.07] s

  * igt@kms_color@pipe-d-degamma:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 1.34] s

  * igt@kms_color@pipe-d-gamma:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.31] s

  * igt@kms_color@pipe-d-legacy-gamma:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 1.05] s

  * igt@kms_color@pipe-d-legacy-gamma-reset:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 0.02] s

  * igt@kms_content_protection@content_type_change:
    - Statuses : 7 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@kms_content_protection@lic:
    - Statuses : 5 skip(s) 1 timeout(s)
    - Exec time: [0.0, 120.81] s

  * igt@kms_content_protection@mei_interface:
    - Statuses : 7 skip(s)
    - Exec time: [0.00, 0.03] s

  * igt@kms_content_protection@srm:
    - Statuses : 5 skip(s) 2 timeout(s)
    - Exec time: [0.00, 121.38] s

  * igt@kms_content_protection@type1:
    - Statuses : 7 skip(s)
    - Exec time: [0.0, 0.01] s

  * igt@kms_content_protection@uevent:
    - Statuses : 2 fail(s) 5 skip(s)
    - Exec time: [0.0, 34.60] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x128-offscreen:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 2.60] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x128-onscreen:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.79] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x128-random:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x128-rapid-movement:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.23] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x128-sliding:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.96] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x42-offscreen:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.58] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x42-onscreen:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.83] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x42-random:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 4.02] s

  * igt@kms_cursor_crc@pipe-d-cursor-128x42-sliding:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 4.03] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x256-offscreen:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.58] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x256-onscreen:
    - Statuses :
    - Exec time: [None] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x256-random:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 4.04] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x256-rapid-movement:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.33] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x256-sliding:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.89] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x85-offscreen:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 2.60] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x85-onscreen:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.83] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x85-random:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 4.00] s

  * igt@kms_cursor_crc@pipe-d-cursor-256x85-sliding:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x170-offscreen:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x170-onscreen:
    - Statuses : 8 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x170-random:
    - Statuses : 8 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x170-sliding:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-offscreen:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-onscreen:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-random:
    - Statuses :
    - Exec time: [None] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-rapid-movement:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-sliding:
    - Statuses : 8 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x21-offscreen:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.60] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x21-onscreen:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.88] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x21-random:
    - Statuses : 1 pass(s) 4 skip(s)
    - Exec time: [0.0, 4.07] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x21-sliding:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.91] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x64-offscreen:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.58] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x64-onscreen:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.77] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x64-random:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 4.01] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x64-rapid-movement:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 1.24] s

  * igt@kms_cursor_crc@pipe-d-cursor-64x64-sliding:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.92] s

  * igt@kms_cursor_crc@pipe-d-cursor-alpha-opaque:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.28] s

  * igt@kms_cursor_crc@pipe-d-cursor-alpha-transparent:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.29] s

  * igt@kms_cursor_crc@pipe-d-cursor-dpms:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.72] s

  * igt@kms_cursor_crc@pipe-d-cursor-size-change:
    - Statuses :
    - Exec time: [None] s

  * igt@kms_cursor_crc@pipe-d-cursor-suspend:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 5.87] s

  * igt@kms_cursor_edge_walk@pipe-d-128x128-bottom-edge:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.50] s

  * igt@kms_cursor_edge_walk@pipe-d-128x128-left-edge:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.50] s

  * igt@kms_cursor_edge_walk@pipe-d-128x128-right-edge:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 3.50] s

  * igt@kms_cursor_edge_walk@pipe-d-128x128-top-edge:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.49] s

  * igt@kms_cursor_edge_walk@pipe-d-256x256-bottom-edge:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.51] s

  * igt@kms_cursor_edge_walk@pipe-d-256x256-left-edge:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.54] s

  * igt@kms_cursor_edge_walk@pipe-d-256x256-right-edge:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.51] s

  * igt@kms_cursor_edge_walk@pipe-d-256x256-top-edge:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.47] s

  * igt@kms_cursor_edge_walk@pipe-d-64x64-bottom-edge:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.50] s

  * igt@kms_cursor_edge_walk@pipe-d-64x64-left-edge:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.50] s

  * igt@kms_cursor_edge_walk@pipe-d-64x64-right-edge:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.50] s

  * igt@kms_cursor_edge_walk@pipe-d-64x64-top-edge:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.51] s

  * igt@kms_cursor_legacy@pipe-d-forked-bo:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 21.54] s

  * igt@kms_cursor_legacy@pipe-d-forked-move:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 21.55] s

  * igt@kms_cursor_legacy@pipe-d-single-bo:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@kms_cursor_legacy@pipe-d-single-move:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 21.49] s

  * igt@kms_cursor_legacy@pipe-d-torture-bo:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 21.63] s

  * igt@kms_cursor_legacy@pipe-d-torture-move:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 21.64] s

  * igt@kms_dp_tiled_display@basic-test-pattern:
    - Statuses : 6 skip(s)
    - Exec time: [0.0, 0.00] s

  * igt@kms_pipe_crc_basic@nonblocking-crc-pipe-d-frame-sequence:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.49] s

  * igt@kms_pipe_crc_basic@read-crc-pipe-d-frame-sequence:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 1.46] s

  * igt@kms_plane_alpha_blend@pipe-d-alpha-7efc:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 12.04] s

  * igt@kms_plane_alpha_blend@pipe-d-alpha-basic:
    - Statuses : 1 pass(s) 4 skip(s)
    - Exec time: [0.0, 7.19] s

  * igt@kms_plane_alpha_blend@pipe-d-alpha-opaque-fb:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.50] s

  * igt@kms_plane_alpha_blend@pipe-d-constant-alpha-max:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.41] s

  * igt@kms_plane_alpha_blend@pipe-d-constant-alpha-mid:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.94] s

  * igt@kms_plane_alpha_blend@pipe-d-constant-alpha-min:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.96] s

  * igt@kms_plane_alpha_blend@pipe-d-coverage-7efc:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 11.98] s

  * igt@kms_plane_alpha_blend@pipe-d-coverage-vs-premult-vs-constant:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.39] s

  * igt@kms_plane_cursor@pipe-d-overlay-size-128:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 3.02] s

  * igt@kms_plane_cursor@pipe-d-overlay-size-256:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.04] s

  * igt@kms_plane_cursor@pipe-d-overlay-size-64:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.07] s

  * igt@kms_plane_cursor@pipe-d-primary-size-128:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.06] s

  * igt@kms_plane_cursor@pipe-d-primary-size-256:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.07] s

  * igt@kms_plane_cursor@pipe-d-primary-size-64:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 3.07] s

  * igt@kms_plane_cursor@pipe-d-viewport-size-128:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.04] s

  * igt@kms_plane_cursor@pipe-d-viewport-size-256:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 3.07] s

  * igt@kms_plane_cursor@pipe-d-viewport-size-64:
    - Statuses :
    - Exec time: [None] s

  * igt@kms_plane_lowres@pipe-d-tiling-none:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_plane_lowres@pipe-d-tiling-x:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_plane_lowres@pipe-d-tiling-y:
    - Statuses : 7 skip(s)
    - Exec time: [0.0] s

  * igt@kms_plane_lowres@pipe-d-tiling-yf:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@kms_plane_multiple@atomic-pipe-d-tiling-none:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 1.44] s

  * igt@kms_plane_multiple@atomic-pipe-d-tiling-x:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 1.42] s

  * igt@kms_plane_multiple@atomic-pipe-d-tiling-y:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.50] s

  * igt@kms_plane_multiple@atomic-pipe-d-tiling-yf:
    - Statuses : 7 skip(s)
    - Exec time: [0.0, 1.21] s

  * igt@kms_prime@basic-crc:
    - Statuses :
    - Exec time: [None] s

  * igt@kms_universal_plane@universal-plane-pipe-d-functional:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.71] s

  * igt@kms_universal_plane@universal-plane-pipe-d-sanity:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.64] s

  * igt@kms_vblank@pipe-d-accuracy-idle:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.87] s

  * igt@kms_vblank@pipe-d-query-busy:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.00] s

  * igt@kms_vblank@pipe-d-query-busy-hang:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@kms_vblank@pipe-d-query-forked:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.92] s

  * igt@kms_vblank@pipe-d-query-forked-busy:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 3.00] s

  * igt@kms_vblank@pipe-d-query-forked-busy-hang:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.02] s

  * igt@kms_vblank@pipe-d-query-forked-hang:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 16.89] s

  * igt@kms_vblank@pipe-d-query-idle:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.91] s

  * igt@kms_vblank@pipe-d-query-idle-hang:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 16.33] s

  * igt@kms_vblank@pipe-d-ts-continuation-dpms-rpm:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 3.64] s

  * igt@kms_vblank@pipe-d-ts-continuation-dpms-suspend:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.73] s

  * igt@kms_vblank@pipe-d-ts-continuation-idle:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 1.54] s

  * igt@kms_vblank@pipe-d-ts-continuation-idle-hang:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 16.44] s

  * igt@kms_vblank@pipe-d-ts-continuation-modeset:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.72] s

  * igt@kms_vblank@pipe-d-ts-continuation-modeset-hang:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 16.82] s

  * igt@kms_vblank@pipe-d-ts-continuation-modeset-rpm:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 3.68] s

  * igt@kms_vblank@pipe-d-ts-continuation-suspend:
    - Statuses : 1 incomplete(s) 6 skip(s)
    - Exec time: [0.0] s

  * igt@kms_vblank@pipe-d-wait-busy:
    - Statuses : 1 pass(s) 7 skip(s)
    - Exec time: [0.0, 3.03] s

  * igt@kms_vblank@pipe-d-wait-busy-hang:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 16.89] s

  * igt@kms_vblank@pipe-d-wait-forked:
    - Statuses : 1 pass(s) 5 skip(s)
    - Exec time: [0.0, 2.91] s

  * igt@kms_vblank@pipe-d-wait-forked-busy:
    - Statuses :
    - Exec time: [None] s

  * igt@kms_vblank@pipe-d-wait-forked-busy-hang:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 16.79] s

  * igt@kms_vblank@pipe-d-wait-forked-hang:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 16.71] s

  * igt@kms_vblank@pipe-d-wait-idle:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 2.87] s

  * igt@kms_vblank@pipe-d-wait-idle-hang:
    - Statuses : 1 pass(s) 6 skip(s)
    - Exec time: [0.0, 16.87] s

  

Known issues
------------

  Here are the changes found in Patchwork_21535_full that come from known issues:

### CI changes ###

#### Possible fixes ####

  * boot:
    - shard-snb:          ([PASS][1], [PASS][2], [PASS][3], [PASS][4], [PASS][5], [PASS][6], [PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [FAIL][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25]) ([i915#4338]) -> ([PASS][26], [PASS][27], [PASS][28], [PASS][29], [PASS][30], [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49], [PASS][50])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb7/boot.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb7/boot.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb7/boot.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb7/boot.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb7/boot.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb6/boot.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb6/boot.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb6/boot.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb6/boot.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb5/boot.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb5/boot.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb5/boot.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb5/boot.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb5/boot.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb5/boot.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb4/boot.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb4/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb4/boot.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb4/boot.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb4/boot.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb2/boot.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb2/boot.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb2/boot.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb2/boot.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-snb2/boot.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb7/boot.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb7/boot.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb7/boot.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb7/boot.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb7/boot.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb6/boot.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb6/boot.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb6/boot.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb6/boot.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb6/boot.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb6/boot.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb5/boot.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb5/boot.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb5/boot.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb5/boot.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb5/boot.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb4/boot.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb4/boot.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb4/boot.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb4/boot.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb4/boot.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb2/boot.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb2/boot.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb2/boot.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb2/boot.html

  

### IGT changes ###

#### Issues hit ####

  * igt@feature_discovery@display-3x:
    - shard-tglb:         NOTRUN -> [SKIP][51] ([i915#1839])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-tglb8/igt@feature_discovery@display-3x.html

  * igt@gem_ctx_persistence@hostile:
    - shard-snb:          NOTRUN -> [SKIP][52] ([fdo#109271] / [i915#1099])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-snb6/igt@gem_ctx_persistence@hostile.html

  * igt@gem_ctx_sseu@engines:
    - shard-tglb:         NOTRUN -> [SKIP][53] ([i915#280])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-tglb3/igt@gem_ctx_sseu@engines.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-glk:          [PASS][54] -> [FAIL][55] ([i915#2846])
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-glk8/igt@gem_exec_fair@basic-deadline.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-glk9/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-flow@rcs0:
    - shard-tglb:         [PASS][56] -> [FAIL][57] ([i915#2842])
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-tglb5/igt@gem_exec_fair@basic-flow@rcs0.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-tglb7/igt@gem_exec_fair@basic-flow@rcs0.html

  * igt@gem_exec_fair@basic-none-vip@rcs0:
    - shard-tglb:         NOTRUN -> [FAIL][58] ([i915#2842]) +10 similar issues
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-tglb3/igt@gem_exec_fair@basic-none-vip@rcs0.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
    - shard-kbl:          [PASS][59] -> [FAIL][60] ([i915#2842])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-kbl3/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-kbl7/igt@gem_exec_fair@basic-pace-solo@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-iclb:         [PASS][61] -> [FAIL][62] ([i915#2842])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10854/shard-iclb8/igt@gem_exec_fair@basic-pace@vecs0.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-iclb7/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-iclb:         NOTRUN -> [FAIL][63] ([i915#2849])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-iclb6/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_pread@exhaustion (NEW):
    - shard-apl:          NOTRUN -> [WARN][64] ([i915#2658])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-apl4/igt@gem_pread@exhaustion.html

  * igt@gem_pwrite@basic-exhaustion (NEW):
    - {shard-rkl}:        NOTRUN -> [SKIP][65] ([i915#3282])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/shard-rkl-1/igt@gem_pwrite@basic-exhaustion.html

  * igt@gem_pxp@create-protected-buffer:
    - shard-iclb:         NOTRUN -> [SKIP]

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21535/index.html

[-- Attachment #2: Type: text/html, Size: 38315 bytes --]


* Re: [PATCH v6 1/4] drm/i915: Avoid allocating a page array for the gpu coredump
  2021-11-08 17:45   ` [Intel-gfx] " Thomas Hellström
@ 2021-11-22 15:08     ` Ramalingam C
  -1 siblings, 0 replies; 18+ messages in thread
From: Ramalingam C @ 2021-11-22 15:08 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx, matthew.auld, dri-devel

On 2021-11-08 at 18:45:44 +0100, Thomas Hellström wrote:
> The gpu coredump typically takes place in a dma_fence signalling
> critical path, and hence can't use GFP_KERNEL allocations, as that
> means we might hit deadlocks under memory pressure. However,
> changing to __GFP_KSWAPD_RECLAIM, which will be done in an upcoming
> patch, will instead mean a lower chance of the allocation succeeding,
> in particular for large contiguous allocations like the coredump page
> vector.
> Remove the page vector in favor of a linked list of single pages.
> Use the page lru list head as the list link, as the page owner is
> allowed to do that.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Looks good to me

Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
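
For readers following along, the idea in the commit message (single pages chained through a link embedded in each page descriptor, instead of one large pointer array) can be sketched in userspace roughly as below. This is only an illustration under stated assumptions: struct fake_page, struct dump, page_of() and the dump_*() helpers are hypothetical names, the list helpers are a minimal stand-in for the kernel's list.h, and malloc() replaces the capture pool. It is not the i915 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

/*
 * Hypothetical stand-in for struct page. In the kernel the descriptor and
 * the page contents are separate; here they share one allocation for brevity.
 */
struct fake_page {
	struct list_head lru;		/* the embedded link, like page->lru */
	unsigned char data[4096];
};

/* Recover the page from its embedded link (what list_entry() does). */
#define page_of(ptr) \
	((struct fake_page *)((char *)(ptr) - offsetof(struct fake_page, lru)))

/* Hypothetical stand-in for the coredump: only a list head, no array. */
struct dump {
	struct list_head page_list;
};

/* Append one page; no contiguous array has to grow or be preallocated. */
static struct fake_page *dump_add_page(struct dump *d)
{
	struct fake_page *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	list_add_tail(&p->lru, &d->page_list);
	return p;
}

static size_t dump_count(const struct dump *d)
{
	const struct list_head *pos;
	size_t n = 0;

	for (pos = d->page_list.next; pos != &d->page_list; pos = pos->next)
		n++;
	return n;
}

/* Free all pages, saving the next link before unlinking each entry. */
static void dump_free(struct dump *d)
{
	struct list_head *pos = d->page_list.next, *next;

	while (pos != &d->page_list) {
		next = pos->next;
		list_del(pos);
		free(page_of(pos));
		pos = next;
	}
	assert(d->page_list.next == &d->page_list);
}
```

The only allocation that depends on the object size is now the per-page one, which is why the sketch never needs to know the page count up front.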

> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c | 50 +++++++++++++++------------
>  drivers/gpu/drm/i915/i915_gpu_error.h |  4 +--
>  2 files changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2a2d7643b551..14de64282697 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -275,16 +275,16 @@ static bool compress_start(struct i915_vma_compress *c)
>  static void *compress_next_page(struct i915_vma_compress *c,
>  				struct i915_vma_coredump *dst)
>  {
> -	void *page;
> +	void *page_addr;
> +	struct page *page;
>  
> -	if (dst->page_count >= dst->num_pages)
> -		return ERR_PTR(-ENOSPC);
> -
> -	page = pool_alloc(&c->pool, ALLOW_FAIL);
> -	if (!page)
> +	page_addr = pool_alloc(&c->pool, ALLOW_FAIL);
> +	if (!page_addr)
>  		return ERR_PTR(-ENOMEM);
>  
> -	return dst->pages[dst->page_count++] = page;
> +	page = virt_to_page(page_addr);
> +	list_add_tail(&page->lru, &dst->page_list);
> +	return page_addr;
>  }
>  
>  static int compress_page(struct i915_vma_compress *c,
> @@ -397,7 +397,7 @@ static int compress_page(struct i915_vma_compress *c,
>  
>  	if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE)))
>  		memcpy(ptr, src, PAGE_SIZE);
> -	dst->pages[dst->page_count++] = ptr;
> +	list_add_tail(&virt_to_page(ptr)->lru, &dst->page_list);
>  	cond_resched();
>  
>  	return 0;
> @@ -614,7 +614,7 @@ static void print_error_vma(struct drm_i915_error_state_buf *m,
>  			    const struct i915_vma_coredump *vma)
>  {
>  	char out[ASCII85_BUFSZ];
> -	int page;
> +	struct page *page;
>  
>  	if (!vma)
>  		return;
> @@ -628,16 +628,17 @@ static void print_error_vma(struct drm_i915_error_state_buf *m,
>  		err_printf(m, "gtt_page_sizes = 0x%08x\n", vma->gtt_page_sizes);
>  
>  	err_compression_marker(m);
> -	for (page = 0; page < vma->page_count; page++) {
> +	list_for_each_entry(page, &vma->page_list, lru) {
>  		int i, len;
> +		const u32 *addr = page_address(page);
>  
>  		len = PAGE_SIZE;
> -		if (page == vma->page_count - 1)
> +		if (page == list_last_entry(&vma->page_list, typeof(*page), lru))
>  			len -= vma->unused;
>  		len = ascii85_encode_len(len);
>  
>  		for (i = 0; i < len; i++)
> -			err_puts(m, ascii85_encode(vma->pages[page][i], out));
> +			err_puts(m, ascii85_encode(addr[i], out));
>  	}
>  	err_puts(m, "\n");
>  }
> @@ -946,10 +947,12 @@ static void i915_vma_coredump_free(struct i915_vma_coredump *vma)
>  {
>  	while (vma) {
>  		struct i915_vma_coredump *next = vma->next;
> -		int page;
> +		struct page *page, *n;
>  
> -		for (page = 0; page < vma->page_count; page++)
> -			free_page((unsigned long)vma->pages[page]);
> +		list_for_each_entry_safe(page, n, &vma->page_list, lru) {
> +			list_del_init(&page->lru);
> +			__free_page(page);
> +		}
>  
>  		kfree(vma);
>  		vma = next;
> @@ -1016,7 +1019,6 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	struct i915_ggtt *ggtt = gt->ggtt;
>  	const u64 slot = ggtt->error_capture.start;
>  	struct i915_vma_coredump *dst;
> -	unsigned long num_pages;
>  	struct sgt_iter iter;
>  	int ret;
>  
> @@ -1025,9 +1027,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	if (!vma || !vma->pages || !compress)
>  		return NULL;
>  
> -	num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
> -	num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worstcase zlib growth */
> -	dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ALLOW_FAIL);
> +	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
>  	if (!dst)
>  		return NULL;
>  
> @@ -1036,14 +1036,13 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  		return NULL;
>  	}
>  
> +	INIT_LIST_HEAD(&dst->page_list);
>  	strcpy(dst->name, name);
>  	dst->next = NULL;
>  
>  	dst->gtt_offset = vma->node.start;
>  	dst->gtt_size = vma->node.size;
>  	dst->gtt_page_sizes = vma->page_sizes.gtt;
> -	dst->num_pages = num_pages;
> -	dst->page_count = 0;
>  	dst->unused = 0;
>  
>  	ret = -EINVAL;
> @@ -1106,8 +1105,13 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	}
>  
>  	if (ret || compress_flush(compress, dst)) {
> -		while (dst->page_count--)
> -			pool_free(&compress->pool, dst->pages[dst->page_count]);
> +		struct page *page, *n;
> +
> +		list_for_each_entry_safe_reverse(page, n, &dst->page_list, lru) {
> +			list_del_init(&page->lru);
> +			pool_free(&compress->pool, page_address(page));
> +		}
> +
>  		kfree(dst);
>  		dst = NULL;
>  	}
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
> index b98d8cdbe4f2..5aedf5129814 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.h
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.h
> @@ -39,10 +39,8 @@ struct i915_vma_coredump {
>  	u64 gtt_size;
>  	u32 gtt_page_sizes;
>  
> -	int num_pages;
> -	int page_count;
>  	int unused;
> -	u32 *pages[];
> +	struct list_head page_list;
>  };
>  
>  struct i915_request_coredump {
> -- 
> 2.31.1
> 
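
To put a rough number on the allocation the patch removes: the old code sized the pages[] array as the page count scaled by 10/8 for worst-case zlib growth. The sketch below redoes that arithmetic in userspace. It assumes 4 KiB pages, takes the pointer size as a parameter, and replaces the min_t() of the vma and object sizes with a single size argument; old_page_vector_bytes() is a hypothetical helper name, not something in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u	/* assumption: 4 KiB pages */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Bytes the removed flexible array would have needed, on top of
 * sizeof(struct i915_vma_coredump), for an object of the given size.
 */
static uint64_t old_page_vector_bytes(uint64_t size, uint64_t ptr_size)
{
	uint64_t num_pages;

	assert(ptr_size > 0);
	num_pages = size >> PAGE_SHIFT;
	num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worst-case zlib growth */
	return num_pages * ptr_size;
}
```

Under those assumptions a 64 MiB object with 8-byte pointers gives 16384 pages, 20480 array slots, and 163840 bytes (160 KiB) in a single kmalloc(), which is the kind of request that becomes less likely to succeed once direct reclaim is no longer allowed.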


* Re: [Intel-gfx] [PATCH v6 1/4] drm/i915: Avoid allocating a page array for the gpu coredump
@ 2021-11-22 15:08     ` Ramalingam C
  0 siblings, 0 replies; 18+ messages in thread
From: Ramalingam C @ 2021-11-22 15:08 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx, matthew.auld, dri-devel

On 2021-11-08 at 18:45:44 +0100, Thomas Hellström wrote:
> The gpu coredump typically takes place in a dma_fence signalling
> critical path, and hence can't use GFP_KERNEL allocations, as that
> means we might hit deadlocks under memory pressure. However
> changing to __GFP_KSWAPD_RECLAIM which will be done in an upcoming
> patch will instead mean a lower chance of the allocation succeeding.
> In particular large contigous allocations like the coredump page
> vector.
> Remove the page vector in favor of a linked list of single pages.
> Use the page lru list head as the list link, as the page owner is
> allowed to do that.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Looks good to me

Reviewed-by: Ramalingam C <ramalingam.c@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c | 50 +++++++++++++++------------
>  drivers/gpu/drm/i915/i915_gpu_error.h |  4 +--
>  2 files changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2a2d7643b551..14de64282697 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -275,16 +275,16 @@ static bool compress_start(struct i915_vma_compress *c)
>  static void *compress_next_page(struct i915_vma_compress *c,
>  				struct i915_vma_coredump *dst)
>  {
> -	void *page;
> +	void *page_addr;
> +	struct page *page;
>  
> -	if (dst->page_count >= dst->num_pages)
> -		return ERR_PTR(-ENOSPC);
> -
> -	page = pool_alloc(&c->pool, ALLOW_FAIL);
> -	if (!page)
> +	page_addr = pool_alloc(&c->pool, ALLOW_FAIL);
> +	if (!page_addr)
>  		return ERR_PTR(-ENOMEM);
>  
> -	return dst->pages[dst->page_count++] = page;
> +	page = virt_to_page(page_addr);
> +	list_add_tail(&page->lru, &dst->page_list);
> +	return page_addr;
>  }
>  
>  static int compress_page(struct i915_vma_compress *c,
> @@ -397,7 +397,7 @@ static int compress_page(struct i915_vma_compress *c,
>  
>  	if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE)))
>  		memcpy(ptr, src, PAGE_SIZE);
> -	dst->pages[dst->page_count++] = ptr;
> +	list_add_tail(&virt_to_page(ptr)->lru, &dst->page_list);
>  	cond_resched();
>  
>  	return 0;
> @@ -614,7 +614,7 @@ static void print_error_vma(struct drm_i915_error_state_buf *m,
>  			    const struct i915_vma_coredump *vma)
>  {
>  	char out[ASCII85_BUFSZ];
> -	int page;
> +	struct page *page;
>  
>  	if (!vma)
>  		return;
> @@ -628,16 +628,17 @@ static void print_error_vma(struct drm_i915_error_state_buf *m,
>  		err_printf(m, "gtt_page_sizes = 0x%08x\n", vma->gtt_page_sizes);
>  
>  	err_compression_marker(m);
> -	for (page = 0; page < vma->page_count; page++) {
> +	list_for_each_entry(page, &vma->page_list, lru) {
>  		int i, len;
> +		const u32 *addr = page_address(page);
>  
>  		len = PAGE_SIZE;
> -		if (page == vma->page_count - 1)
> +		if (page == list_last_entry(&vma->page_list, typeof(*page), lru))
>  			len -= vma->unused;
>  		len = ascii85_encode_len(len);
>  
>  		for (i = 0; i < len; i++)
> -			err_puts(m, ascii85_encode(vma->pages[page][i], out));
> +			err_puts(m, ascii85_encode(addr[i], out));
>  	}
>  	err_puts(m, "\n");
>  }
> @@ -946,10 +947,12 @@ static void i915_vma_coredump_free(struct i915_vma_coredump *vma)
>  {
>  	while (vma) {
>  		struct i915_vma_coredump *next = vma->next;
> -		int page;
> +		struct page *page, *n;
>  
> -		for (page = 0; page < vma->page_count; page++)
> -			free_page((unsigned long)vma->pages[page]);
> +		list_for_each_entry_safe(page, n, &vma->page_list, lru) {
> +			list_del_init(&page->lru);
> +			__free_page(page);
> +		}
>  
>  		kfree(vma);
>  		vma = next;
> @@ -1016,7 +1019,6 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	struct i915_ggtt *ggtt = gt->ggtt;
>  	const u64 slot = ggtt->error_capture.start;
>  	struct i915_vma_coredump *dst;
> -	unsigned long num_pages;
>  	struct sgt_iter iter;
>  	int ret;
>  
> @@ -1025,9 +1027,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	if (!vma || !vma->pages || !compress)
>  		return NULL;
>  
> -	num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
> -	num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worstcase zlib growth */
> -	dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ALLOW_FAIL);
> +	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
>  	if (!dst)
>  		return NULL;
>  
> @@ -1036,14 +1036,13 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  		return NULL;
>  	}
>  
> +	INIT_LIST_HEAD(&dst->page_list);
>  	strcpy(dst->name, name);
>  	dst->next = NULL;
>  
>  	dst->gtt_offset = vma->node.start;
>  	dst->gtt_size = vma->node.size;
>  	dst->gtt_page_sizes = vma->page_sizes.gtt;
> -	dst->num_pages = num_pages;
> -	dst->page_count = 0;
>  	dst->unused = 0;
>  
>  	ret = -EINVAL;
> @@ -1106,8 +1105,13 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	}
>  
>  	if (ret || compress_flush(compress, dst)) {
> -		while (dst->page_count--)
> -			pool_free(&compress->pool, dst->pages[dst->page_count]);
> +		struct page *page, *n;
> +
> +		list_for_each_entry_safe_reverse(page, n, &dst->page_list, lru) {
> +			list_del_init(&page->lru);
> +			pool_free(&compress->pool, page_address(page));
> +		}
> +
>  		kfree(dst);
>  		dst = NULL;
>  	}
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
> index b98d8cdbe4f2..5aedf5129814 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.h
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.h
> @@ -39,10 +39,8 @@ struct i915_vma_coredump {
>  	u64 gtt_size;
>  	u32 gtt_page_sizes;
>  
> -	int num_pages;
> -	int page_count;
>  	int unused;
> -	u32 *pages[];
> +	struct list_head page_list;
>  };
>  
>  struct i915_request_coredump {
> -- 
> 2.31.1
> 

* Re: [Intel-gfx] [PATCH v6 2/4] drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
  2021-11-08 17:45   ` [Intel-gfx] " Thomas Hellström
  (?)
@ 2021-11-22 16:50   ` Ramalingam C
  -1 siblings, 0 replies; 18+ messages in thread
From: Ramalingam C @ 2021-11-22 16:50 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx, matthew.auld, dri-devel

On 2021-11-08 at 18:45:45 +0100, Thomas Hellström wrote:
> The capture code is typically run entirely in the fence signalling
> critical path. We're about to add lockdep annotation in an upcoming patch
> which reveals a lockdep splat similar to the below one.
> 
> Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
> (which is the same as GFP_NOWAIT, but open-coded for clarity) rather than
> GFP_KERNEL for memory allocation in the capture path. This has the
> potential drawback that capture might fail in situations with memory
> pressure.
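
A minimal userspace sketch of the flag arithmetic behind this change, assuming the mask composition used in include/linux/gfp.h around v5.15 (GFP_KERNEL includes direct reclaim, GFP_NOWAIT is kswapd reclaim only). The X_* names, their bit values and the helper x_may_direct_reclaim() are hypothetical and exist only for illustration; just the relationships between the masks are meant to carry over.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the real definitions live in include/linux/gfp.h. */
#define X_DIRECT_RECLAIM 0x01u /* caller may itself enter reclaim and block */
#define X_KSWAPD_RECLAIM 0x02u /* caller may only wake kswapd */
#define X_IO             0x04u
#define X_FS             0x08u
#define X_RETRY_MAYFAIL  0x10u
#define X_NOWARN         0x20u

/* Assumed composition, mirroring GFP_KERNEL and GFP_NOWAIT. */
#define X_GFP_KERNEL (X_DIRECT_RECLAIM | X_KSWAPD_RECLAIM | X_IO | X_FS)
#define X_GFP_NOWAIT (X_KSWAPD_RECLAIM)

/* The two variants of ALLOW_FAIL, before and after this patch. */
#define X_ALLOW_FAIL_OLD (X_GFP_KERNEL | X_RETRY_MAYFAIL | X_NOWARN)
#define X_ALLOW_FAIL_NEW (X_KSWAPD_RECLAIM | X_RETRY_MAYFAIL | X_NOWARN)

/* True if an allocation with these flags may block in direct reclaim. */
static bool x_may_direct_reclaim(unsigned int flags)
{
	return (flags & X_DIRECT_RECLAIM) != 0;
}
```

With the old mask the helper returns true, with the new one false, which is the property that matters in the fence signalling critical path. The price is the one the commit message names: the allocation can fail under memory pressure.
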
> 
> [  234.842048] WARNING: possible circular locking dependency detected
> [  234.842050] 5.15.0-rc7+ #20 Tainted: G     U  W
> [  234.842052] ------------------------------------------------------
> [  234.842054] gem_exec_captur/1180 is trying to acquire lock:
> [  234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
> [  234.842063]
>                but task is already holding lock:
> [  234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
> [  234.842138]
>                which lock already depends on the new lock.
> 
> [  234.842140]
>                the existing dependency chain (in reverse order) is:
> [  234.842142]
>                -> #2 (dma_fence_map){++++}-{0:0}:
> [  234.842145]        __dma_fence_might_wait+0x41/0xa0
> [  234.842149]        dma_resv_lockdep+0x1dc/0x28f
> [  234.842151]        do_one_initcall+0x58/0x2d0
> [  234.842154]        kernel_init_freeable+0x273/0x2bf
> [  234.842157]        kernel_init+0x16/0x120
> [  234.842160]        ret_from_fork+0x1f/0x30
> [  234.842163]
>                -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
> [  234.842166]        fs_reclaim_acquire+0x6d/0xd0
> [  234.842168]        __kmalloc_node+0x51/0x3a0
> [  234.842171]        alloc_cpumask_var_node+0x1b/0x30
> [  234.842174]        native_smp_prepare_cpus+0xc7/0x292
> [  234.842177]        kernel_init_freeable+0x160/0x2bf
> [  234.842179]        kernel_init+0x16/0x120
> [  234.842181]        ret_from_fork+0x1f/0x30
> [  234.842184]
>                -> #0 (fs_reclaim){+.+.}-{0:0}:
> [  234.842186]        __lock_acquire+0x1161/0x1dc0
> [  234.842189]        lock_acquire+0xb5/0x2b0
> [  234.842192]        fs_reclaim_acquire+0xa1/0xd0
> [  234.842193]        __kmalloc+0x4d/0x330
> [  234.842196]        i915_vma_coredump_create+0x78/0x5b0 [i915]
> [  234.842253]        intel_engine_coredump_add_vma+0x36/0xe0 [i915]
> [  234.842307]        __i915_gpu_coredump+0x290/0x5e0 [i915]
> [  234.842365]        i915_capture_error_state+0x57/0xa0 [i915]
> [  234.842415]        intel_gt_handle_error+0x348/0x3e0 [i915]
> [  234.842462]        intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
> [  234.842504]        simple_attr_write+0xc1/0xe0
> [  234.842507]        full_proxy_write+0x53/0x80
> [  234.842509]        vfs_write+0xbc/0x350
> [  234.842513]        ksys_write+0x58/0xd0
> [  234.842514]        do_syscall_64+0x38/0x90
> [  234.842516]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  234.842519]
>                other info that might help us debug this:
> 
> [  234.842521] Chain exists of:
>                  fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
> 
> [  234.842526]  Possible unsafe locking scenario:
> 
> [  234.842528]        CPU0                    CPU1
> [  234.842529]        ----                    ----
> [  234.842531]   lock(dma_fence_map);
> [  234.842532]                                lock(mmu_notifier_invalidate_range_start);
> [  234.842535]                                lock(dma_fence_map);
> [  234.842537]   lock(fs_reclaim);
> [  234.842539]
>                 *** DEADLOCK ***
> 
> [  234.842540] 4 locks held by gem_exec_captur/1180:
> [  234.842543]  #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
> [  234.842547]  #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
> [  234.842552]  #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
> [  234.842602]  #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
> [  234.842656]
>                stack backtrace:
> [  234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G     U  W         5.15.0-rc7+ #20
> [  234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
> [  234.842664] Call Trace:
> [  234.842666]  dump_stack_lvl+0x57/0x72
> [  234.842669]  check_noncircular+0xde/0x100
> [  234.842672]  ? __lock_acquire+0x3bf/0x1dc0
> [  234.842675]  __lock_acquire+0x1161/0x1dc0
> [  234.842678]  lock_acquire+0xb5/0x2b0
> [  234.842680]  ? __kmalloc+0x4d/0x330
> [  234.842683]  ? finish_task_switch.isra.0+0xf2/0x360
> [  234.842686]  ? i915_vma_coredump_create+0x78/0x5b0 [i915]
> [  234.842734]  fs_reclaim_acquire+0xa1/0xd0
> [  234.842737]  ? __kmalloc+0x4d/0x330
> [  234.842739]  __kmalloc+0x4d/0x330
> [  234.842742]  i915_vma_coredump_create+0x78/0x5b0 [i915]
> [  234.842793]  ? capture_vma+0xbe/0x110 [i915]
> [  234.842844]  intel_engine_coredump_add_vma+0x36/0xe0 [i915]
> [  234.842892]  __i915_gpu_coredump+0x290/0x5e0 [i915]
> [  234.842939]  i915_capture_error_state+0x57/0xa0 [i915]
> [  234.842985]  intel_gt_handle_error+0x348/0x3e0 [i915]
> [  234.843032]  ? __mutex_lock+0x81/0x830
> [  234.843035]  ? simple_attr_write+0x3a/0xe0
> [  234.843038]  ? __lock_acquire+0x3bf/0x1dc0
> [  234.843041]  intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
> [  234.843083]  ? _copy_from_user+0x45/0x80
> [  234.843086]  simple_attr_write+0xc1/0xe0
> [  234.843089]  full_proxy_write+0x53/0x80
> [  234.843091]  vfs_write+0xbc/0x350
> [  234.843094]  ksys_write+0x58/0xd0
> [  234.843096]  do_syscall_64+0x38/0x90
> [  234.843098]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  234.843101] RIP: 0033:0x7fa467480877
> [  234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> [  234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [  234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
> [  234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
> [  234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
> [  234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
> [  234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005
> 
> v5:
> - Use __GFP_KSWAPD_RECLAIM rather than GFP_NOWAIT for clarity.
>   (Daniel Vetter)
> v6:
> - Include an instance in execlists_capture_work().
> - Rework the commit message due to patch reordering.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Looks good to me

Reviewed-by: Ramalingam C <ramalingam.c@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 3 ++-
>  drivers/gpu/drm/i915/i915_gpu_error.c                | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index ca03880fa7e4..a69df5e9e77a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -2186,7 +2186,8 @@ struct execlists_capture {
>  static void execlists_capture_work(struct work_struct *work)
>  {
>  	struct execlists_capture *cap = container_of(work, typeof(*cap), work);
> -	const gfp_t gfp = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN;
> +	const gfp_t gfp = __GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL |
> +		__GFP_NOWARN;
>  	struct intel_engine_cs *engine = cap->rq->engine;
>  	struct intel_gt_coredump *gt = cap->error->gt;
>  	struct intel_engine_capture_vma *vma;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 14de64282697..b1e4ce0f798f 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -49,7 +49,7 @@
>  #include "i915_memcpy.h"
>  #include "i915_scatterlist.h"
>  
> -#define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
> +#define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
>  #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
>  
>  static void __sg_set_buf(struct scatterlist *sg,
> -- 
> 2.31.1
> 

* Re: [PATCH v6 3/4] drm/i915: Update error capture code to avoid using the current vma state
  2021-11-08 17:45   ` [Intel-gfx] " Thomas Hellström
@ 2021-11-29  9:13     ` Ramalingam C
  -1 siblings, 0 replies; 18+ messages in thread
From: Ramalingam C @ 2021-11-29  9:13 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx, matthew.auld, dri-devel

On 2021-11-08 at 18:45:46 +0100, Thomas Hellström wrote:
> With asynchronous migrations, the vma state may be several migrations
> ahead of the state that matches the request we're capturing.
> Address that by introducing an i915_vma_snapshot structure that
> can be used to snapshot relevant state at request submission.
> In order to make sure we access the correct memory, the snapshots take
> references on relevant sg-tables and memory regions.
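
A rough userspace sketch of the snapshot idea described above, not the i915 implementation: state is copied at submission time and the snapshot holds its own reference on the backing store, so a later migration of the live object cannot pull the memory out from under a capture. All x_* types and functions are hypothetical, and the plain int refcount assumes a single thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted backing store (e.g. an sg-table). */
struct x_backing {
	int refcount;            /* no atomics: single-threaded sketch */
	unsigned long first_pfn;
};

/* Hypothetical stand-in for the live, mutable vma. */
struct x_vma {
	unsigned long gtt_offset;
	struct x_backing *pages;
};

/* State frozen at submission time; owns a reference on the backing store. */
struct x_snapshot {
	unsigned long gtt_offset;
	struct x_backing *pages;
};

static struct x_backing *x_backing_create(unsigned long first_pfn)
{
	struct x_backing *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->refcount = 1;
	b->first_pfn = first_pfn;
	return b;
}

static struct x_backing *x_backing_get(struct x_backing *b)
{
	b->refcount++;
	return b;
}

static void x_backing_put(struct x_backing *b)
{
	if (--b->refcount == 0)
		free(b);
}

/* Copy the relevant state and pin the backing store it refers to. */
static void x_snapshot_init(struct x_snapshot *s, const struct x_vma *vma)
{
	s->gtt_offset = vma->gtt_offset;
	s->pages = x_backing_get(vma->pages);
}

static void x_snapshot_fini(struct x_snapshot *s)
{
	x_backing_put(s->pages);
	s->pages = NULL;
}

/* Simulate a migration: the vma drops its old pages and adopts new ones. */
static void x_vma_migrate(struct x_vma *vma, struct x_backing *new_pages)
{
	x_backing_put(vma->pages);
	vma->pages = new_pages;
}
```

After x_vma_migrate() the live object points at the new pages, while the snapshot still refers to the pages that were current at submission, kept alive by the snapshot's reference.
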
> 
> Also move the capture list allocation out of the fence signaling
> critical path and use the CONFIG_DRM_I915_CAPTURE_ERROR define to
> avoid compiling in members and functions used for error capture
> when they're not used.
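
The stage/commit/release split described above can be sketched in plain C as follows. This is an illustration under assumptions, not the driver code: the x_* names are hypothetical, malloc() stands in for a blocking GFP_KERNEL allocation, and "commit" stands in for the step that runs inside the signalling critical path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct x_capture {
	struct x_capture *next;
	int object_id;
};

struct x_request {
	struct x_capture *capture_list;
};

struct x_submit {
	struct x_capture *staged;
};

/* Stage: may allocate (and so may block); a failed allocation skips the entry. */
static void x_stage(struct x_submit *s, const int *ids, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct x_capture *c = malloc(sizeof(*c));

		if (!c)
			continue;
		c->object_id = ids[i];
		c->next = s->staged;
		s->staged = c;
	}
}

/* Commit: pointer hand-over only, no allocation, usable in the critical path. */
static void x_commit(struct x_submit *s, struct x_request *rq)
{
	rq->capture_list = s->staged;
	s->staged = NULL;
}

/* Free a list node by node. */
static void x_free_list(struct x_capture *c)
{
	while (c) {
		struct x_capture *next = c->next;

		free(c);
		c = next;
	}
}

/* Release: free whatever was staged but never committed (error paths). */
static void x_release(struct x_submit *s)
{
	x_free_list(s->staged);
	s->staged = NULL;
}
```

Calling x_release() after a successful x_commit() is a no-op, so an error path can call it unconditionally.
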
> 
> Finally, introduce lockdep annotation.
> 
> v4:
> - Break out the capture allocation mode change to a separate patch.
> v5:
> - Fix compilation error in the !CONFIG_DRM_I915_CAPTURE_ERROR case
>   (kernel test robot)
> v6:
> - Use #if IS_ENABLED() instead of #ifdef to match driver style.
> - Move yet another change of allocation mode to the separate patch.
> - Commit message rework due to patch reordering.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Looks good to me

Reviewed-by: Ramalingam C <ramalingam.c@intel.com>

> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 135 +++++++++++---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   8 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c         | 176 +++++++++++++-----
>  drivers/gpu/drm/i915/i915_request.c           |  63 +++++--
>  drivers/gpu/drm/i915/i915_request.h           |  20 +-
>  drivers/gpu/drm/i915/i915_vma_snapshot.c      | 137 ++++++++++++++
>  drivers/gpu/drm/i915/i915_vma_snapshot.h      | 112 +++++++++++
>  8 files changed, 557 insertions(+), 95 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
>  create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 7d0d0b814670..b4372e0a802f 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -174,6 +174,7 @@ i915-y += \
>  	  i915_trace_points.o \
>  	  i915_ttm_buddy_manager.o \
>  	  i915_vma.o \
> +	  i915_vma_snapshot.o \
>  	  intel_wopcm.o
>  
>  # general-purpose microcontroller (GuC) support
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index ea5b7b2a4d70..ddca899f2396 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -29,6 +29,7 @@
>  #include "i915_gem_ioctls.h"
>  #include "i915_trace.h"
>  #include "i915_user_extensions.h"
> +#include "i915_vma_snapshot.h"
>  
>  struct eb_vma {
>  	struct i915_vma *vma;
> @@ -307,11 +308,15 @@ struct i915_execbuffer {
>  
>  	struct eb_fence *fences;
>  	unsigned long num_fences;
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> +	struct i915_capture_list *capture_lists[MAX_ENGINE_INSTANCE + 1];
> +#endif
>  };
>  
>  static int eb_parse(struct i915_execbuffer *eb);
>  static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle);
>  static void eb_unpin_engine(struct i915_execbuffer *eb);
> +static void eb_capture_release(struct i915_execbuffer *eb);
>  
>  static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
>  {
> @@ -1054,6 +1059,7 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
>  			i915_vma_put(vma);
>  	}
>  
> +	eb_capture_release(eb);
>  	eb_unpin_engine(eb);
>  }
>  
> @@ -1891,36 +1897,113 @@ eb_find_first_request_added(struct i915_execbuffer *eb)
>  	return NULL;
>  }
>  
> -static int eb_move_to_gpu(struct i915_execbuffer *eb)
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> +
> +/* Stage with GFP_KERNEL allocations before we enter the signaling critical path */
> +static void eb_capture_stage(struct i915_execbuffer *eb)
>  {
>  	const unsigned int count = eb->buffer_count;
> -	unsigned int i = count;
> -	int err = 0, j;
> +	unsigned int i = count, j;
> +	struct i915_vma_snapshot *vsnap;
>  
>  	while (i--) {
>  		struct eb_vma *ev = &eb->vma[i];
>  		struct i915_vma *vma = ev->vma;
>  		unsigned int flags = ev->flags;
> -		struct drm_i915_gem_object *obj = vma->obj;
>  
> -		assert_vma_held(vma);
> +		if (!(flags & EXEC_OBJECT_CAPTURE))
> +			continue;
>  
> -		if (flags & EXEC_OBJECT_CAPTURE) {
> +		vsnap = i915_vma_snapshot_alloc(GFP_KERNEL);
> +		if (!vsnap)
> +			continue;
> +
> +		i915_vma_snapshot_init(vsnap, vma, "user");
> +		for_each_batch_create_order(eb, j) {
>  			struct i915_capture_list *capture;
>  
> -			for_each_batch_create_order(eb, j) {
> -				if (!eb->requests[j])
> -					break;
> +			capture = kmalloc(sizeof(*capture), GFP_KERNEL);
> +			if (!capture)
> +				continue;
>  
> -				capture = kmalloc(sizeof(*capture), GFP_KERNEL);
> -				if (capture) {
> -					capture->next =
> -						eb->requests[j]->capture_list;
> -					capture->vma = vma;
> -					eb->requests[j]->capture_list = capture;
> -				}
> -			}
> +			capture->next = eb->capture_lists[j];
> +			capture->vma_snapshot = i915_vma_snapshot_get(vsnap);
> +			eb->capture_lists[j] = capture;
> +		}
> +		i915_vma_snapshot_put(vsnap);
> +	}
> +}
> +
> +/* Commit once we're in the critical path */
> +static void eb_capture_commit(struct i915_execbuffer *eb)
> +{
> +	unsigned int j;
> +
> +	for_each_batch_create_order(eb, j) {
> +		struct i915_request *rq = eb->requests[j];
> +
> +		if (!rq)
> +			break;
> +
> +		rq->capture_list = eb->capture_lists[j];
> +		eb->capture_lists[j] = NULL;
> +	}
> +}
> +
> +/*
> + * Release anything that didn't get committed due to errors.
> + * The capture_list will otherwise be freed at request retire.
> + */
> +static void eb_capture_release(struct i915_execbuffer *eb)
> +{
> +	unsigned int j;
> +
> +	for_each_batch_create_order(eb, j) {
> +		if (eb->capture_lists[j]) {
> +			i915_request_free_capture_list(eb->capture_lists[j]);
> +			eb->capture_lists[j] = NULL;
>  		}
> +	}
> +}
> +
> +static void eb_capture_list_clear(struct i915_execbuffer *eb)
> +{
> +	memset(eb->capture_lists, 0, sizeof(eb->capture_lists));
> +}
> +
> +#else
> +
> +static void eb_capture_stage(struct i915_execbuffer *eb)
> +{
> +}
> +
> +static void eb_capture_commit(struct i915_execbuffer *eb)
> +{
> +}
> +
> +static void eb_capture_release(struct i915_execbuffer *eb)
> +{
> +}
> +
> +static void eb_capture_list_clear(struct i915_execbuffer *eb)
> +{
> +}
> +
> +#endif
> +
> +static int eb_move_to_gpu(struct i915_execbuffer *eb)
> +{
> +	const unsigned int count = eb->buffer_count;
> +	unsigned int i = count;
> +	int err = 0, j;
> +
> +	while (i--) {
> +		struct eb_vma *ev = &eb->vma[i];
> +		struct i915_vma *vma = ev->vma;
> +		unsigned int flags = ev->flags;
> +		struct drm_i915_gem_object *obj = vma->obj;
> +
> +		assert_vma_held(vma);
>  
>  		/*
>  		 * If the GPU is not _reading_ through the CPU cache, we need
> @@ -2001,6 +2084,8 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
>  
>  	/* Unconditionally flush any chipset caches (for streaming writes). */
>  	intel_gt_chipset_flush(eb->gt);
> +	eb_capture_commit(eb);
> +
>  	return 0;
>  
>  err_skip:
> @@ -3143,13 +3228,14 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
>  		}
>  
>  		/*
> -		 * Whilst this request exists, batch_obj will be on the
> -		 * active_list, and so will hold the active reference. Only when
> -		 * this request is retired will the batch_obj be moved onto
> -		 * the inactive_list and lose its active reference. Hence we do
> -		 * not need to explicitly hold another reference here.
> +		 * Not really on stack, but we don't want to call
> +		 * kfree on the batch_snapshot when we put it, so use the
> +		 * _onstack interface.
>  		 */
> -		eb->requests[i]->batch = eb->batches[i]->vma;
> +		if (eb->batches[i]->vma)
> +			i915_vma_snapshot_init_onstack(&eb->requests[i]->batch_snapshot,
> +						       eb->batches[i]->vma,
> +						       "batch");
>  		if (eb->batch_pool) {
>  			GEM_BUG_ON(intel_context_is_parallel(eb->context));
>  			intel_gt_buffer_pool_mark_active(eb->batch_pool,
> @@ -3198,6 +3284,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  	eb.fences = NULL;
>  	eb.num_fences = 0;
>  
> +	eb_capture_list_clear(&eb);
> +
>  	memset(eb.requests, 0, sizeof(struct i915_request *) *
>  	       ARRAY_SIZE(eb.requests));
>  	eb.composite_fence = NULL;
> @@ -3284,6 +3372,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  	}
>  
>  	ww_acquire_done(&eb.ww.ctx);
> +	eb_capture_stage(&eb);
>  
>  	out_fence = eb_requests_create(&eb, in_fence, out_fence_fd);
>  	if (IS_ERR(out_fence)) {
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 332756036007..f2ccd5b53d42 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1676,14 +1676,18 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>  
>  static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
>  {
> +	struct i915_vma_snapshot *vsnap = &rq->batch_snapshot;
>  	void *ring;
>  	int size;
>  
> +	if (!i915_vma_snapshot_present(vsnap))
> +		vsnap = NULL;
> +
>  	drm_printf(m,
>  		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
>  		   rq->head, rq->postfix, rq->tail,
> -		   rq->batch ? upper_32_bits(rq->batch->node.start) : ~0u,
> -		   rq->batch ? lower_32_bits(rq->batch->node.start) : ~0u);
> +		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
> +		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
>  
>  	size = rq->tail - rq->head;
>  	if (rq->tail < rq->head)
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index b1e4ce0f798f..c61f9aaa40f9 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -48,6 +48,7 @@
>  #include "i915_gpu_error.h"
>  #include "i915_memcpy.h"
>  #include "i915_scatterlist.h"
> +#include "i915_vma_snapshot.h"
>  
>  #define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
>  #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
> @@ -1012,8 +1013,7 @@ void __i915_gpu_coredump_free(struct kref *error_ref)
>  
>  static struct i915_vma_coredump *
>  i915_vma_coredump_create(const struct intel_gt *gt,
> -			 const struct i915_vma *vma,
> -			 const char *name,
> +			 const struct i915_vma_snapshot *vsnap,
>  			 struct i915_vma_compress *compress)
>  {
>  	struct i915_ggtt *ggtt = gt->ggtt;
> @@ -1024,7 +1024,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  
>  	might_sleep();
>  
> -	if (!vma || !vma->pages || !compress)
> +	if (!vsnap || !vsnap->pages || !compress)
>  		return NULL;
>  
>  	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
> @@ -1037,12 +1037,12 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	}
>  
>  	INIT_LIST_HEAD(&dst->page_list);
> -	strcpy(dst->name, name);
> +	strcpy(dst->name, vsnap->name);
>  	dst->next = NULL;
>  
> -	dst->gtt_offset = vma->node.start;
> -	dst->gtt_size = vma->node.size;
> -	dst->gtt_page_sizes = vma->page_sizes.gtt;
> +	dst->gtt_offset = vsnap->gtt_offset;
> +	dst->gtt_size = vsnap->gtt_size;
> +	dst->gtt_page_sizes = vsnap->page_sizes;
>  	dst->unused = 0;
>  
>  	ret = -EINVAL;
> @@ -1050,7 +1050,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  		void __iomem *s;
>  		dma_addr_t dma;
>  
> -		for_each_sgt_daddr(dma, iter, vma->pages) {
> +		for_each_sgt_daddr(dma, iter, vsnap->pages) {
>  			mutex_lock(&ggtt->error_mutex);
>  			ggtt->vm.insert_page(&ggtt->vm, dma, slot,
>  					     I915_CACHE_NONE, 0);
> @@ -1068,11 +1068,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  			if (ret)
>  				break;
>  		}
> -	} else if (__i915_gem_object_is_lmem(vma->obj)) {
> -		struct intel_memory_region *mem = vma->obj->mm.region;
> +	} else if (vsnap->mr && vsnap->mr->type != INTEL_MEMORY_SYSTEM) {
> +		struct intel_memory_region *mem = vsnap->mr;
>  		dma_addr_t dma;
>  
> -		for_each_sgt_daddr(dma, iter, vma->pages) {
> +		for_each_sgt_daddr(dma, iter, vsnap->pages) {
>  			void __iomem *s;
>  
>  			s = io_mapping_map_wc(&mem->iomap,
> @@ -1088,7 +1088,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	} else {
>  		struct page *page;
>  
> -		for_each_sgt_page(page, iter, vma->pages) {
> +		for_each_sgt_page(page, iter, vsnap->pages) {
>  			void *s;
>  
>  			drm_clflush_pages(&page, 1);
> @@ -1324,37 +1324,71 @@ static bool record_context(struct i915_gem_context_coredump *e,
>  
>  struct intel_engine_capture_vma {
>  	struct intel_engine_capture_vma *next;
> -	struct i915_vma *vma;
> +	struct i915_vma_snapshot *vsnap;
>  	char name[16];
> +	bool lockdep_cookie;
>  };
>  
>  static struct intel_engine_capture_vma *
> -capture_vma(struct intel_engine_capture_vma *next,
> -	    struct i915_vma *vma,
> -	    const char *name,
> -	    gfp_t gfp)
> +capture_vma_snapshot(struct intel_engine_capture_vma *next,
> +		     struct i915_vma_snapshot *vsnap,
> +		     gfp_t gfp)
>  {
>  	struct intel_engine_capture_vma *c;
>  
> -	if (!vma)
> +	if (!i915_vma_snapshot_present(vsnap))
>  		return next;
>  
>  	c = kmalloc(sizeof(*c), gfp);
>  	if (!c)
>  		return next;
>  
> -	if (!i915_active_acquire_if_busy(&vma->active)) {
> +	if (!i915_vma_snapshot_resource_pin(vsnap, &c->lockdep_cookie)) {
>  		kfree(c);
>  		return next;
>  	}
>  
> -	strcpy(c->name, name);
> -	c->vma = vma; /* reference held while active */
> +	strcpy(c->name, vsnap->name);
> +	c->vsnap = vsnap;
> +	i915_vma_snapshot_get(vsnap);
>  
>  	c->next = next;
>  	return c;
>  }
>  
> +static struct intel_engine_capture_vma *
> +capture_vma(struct intel_engine_capture_vma *next,
> +	    struct i915_vma *vma,
> +	    const char *name,
> +	    gfp_t gfp)
> +{
> +	struct i915_vma_snapshot *vsnap;
> +
> +	if (!vma)
> +		return next;
> +
> +	/*
> +	 * If the vma isn't pinned, then the vma should be snapshotted
> +	 * to a struct i915_vma_snapshot at command submission time.
> +	 * Not here.
> +	 */
> +	GEM_WARN_ON(!i915_vma_is_pinned(vma));
> +	if (!i915_vma_is_pinned(vma))
> +		return next;
> +
> +	vsnap = i915_vma_snapshot_alloc(gfp);
> +	if (!vsnap)
> +		return next;
> +
> +	i915_vma_snapshot_init(vsnap, vma, name);
> +	next = capture_vma_snapshot(next, vsnap, gfp);
> +
> +	/* FIXME: Replace on async unbind. */
> +	i915_vma_snapshot_put(vsnap);
> +
> +	return next;
> +}
> +
>  static struct intel_engine_capture_vma *
>  capture_user(struct intel_engine_capture_vma *capture,
>  	     const struct i915_request *rq,
> @@ -1363,7 +1397,7 @@ capture_user(struct intel_engine_capture_vma *capture,
>  	struct i915_capture_list *c;
>  
>  	for (c = rq->capture_list; c; c = c->next)
> -		capture = capture_vma(capture, c->vma, "user", gfp);
> +		capture = capture_vma_snapshot(capture, c->vma_snapshot, gfp);
>  
>  	return capture;
>  }
> @@ -1377,6 +1411,36 @@ static void add_vma(struct intel_engine_coredump *ee,
>  	}
>  }
>  
> +static struct i915_vma_coredump *
> +create_vma_coredump(const struct intel_gt *gt, struct i915_vma *vma,
> +		    const char *name, struct i915_vma_compress *compress)
> +{
> +	struct i915_vma_coredump *ret = NULL;
> +	struct i915_vma_snapshot tmp;
> +	bool lockdep_cookie;
> +
> +	if (!vma)
> +		return NULL;
> +
> +	i915_vma_snapshot_init_onstack(&tmp, vma, name);
> +	if (i915_vma_snapshot_resource_pin(&tmp, &lockdep_cookie)) {
> +		ret = i915_vma_coredump_create(gt, &tmp, compress);
> +		i915_vma_snapshot_resource_unpin(&tmp, lockdep_cookie);
> +	}
> +	i915_vma_snapshot_put_onstack(&tmp);
> +
> +	return ret;
> +}
> +
> +static void add_vma_coredump(struct intel_engine_coredump *ee,
> +			     const struct intel_gt *gt,
> +			     struct i915_vma *vma,
> +			     const char *name,
> +			     struct i915_vma_compress *compress)
> +{
> +	add_vma(ee, create_vma_coredump(gt, vma, name, compress));
> +}
> +
>  struct intel_engine_coredump *
>  intel_engine_coredump_alloc(struct intel_engine_cs *engine, gfp_t gfp)
>  {
> @@ -1410,7 +1474,7 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee,
>  	 * as the simplest method to avoid being overwritten
>  	 * by userspace.
>  	 */
> -	vma = capture_vma(vma, rq->batch, "batch", gfp);
> +	vma = capture_vma_snapshot(vma, &rq->batch_snapshot, gfp);
>  	vma = capture_user(vma, rq, gfp);
>  	vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
>  	vma = capture_vma(vma, rq->context->state, "HW context", gfp);
> @@ -1431,30 +1495,24 @@ intel_engine_coredump_add_vma(struct intel_engine_coredump *ee,
>  
>  	while (capture) {
>  		struct intel_engine_capture_vma *this = capture;
> -		struct i915_vma *vma = this->vma;
> +		struct i915_vma_snapshot *vsnap = this->vsnap;
>  
>  		add_vma(ee,
>  			i915_vma_coredump_create(engine->gt,
> -						 vma, this->name,
> -						 compress));
> +						 vsnap, compress));
>  
> -		i915_active_release(&vma->active);
> +		i915_vma_snapshot_resource_unpin(vsnap, this->lockdep_cookie);
> +		i915_vma_snapshot_put(vsnap);
>  
>  		capture = this->next;
>  		kfree(this);
>  	}
>  
> -	add_vma(ee,
> -		i915_vma_coredump_create(engine->gt,
> -					 engine->status_page.vma,
> -					 "HW Status",
> -					 compress));
> +	add_vma_coredump(ee, engine->gt, engine->status_page.vma,
> +			 "HW Status", compress);
>  
> -	add_vma(ee,
> -		i915_vma_coredump_create(engine->gt,
> -					 engine->wa_ctx.vma,
> -					 "WA context",
> -					 compress));
> +	add_vma_coredump(ee, engine->gt, engine->wa_ctx.vma,
> +			 "WA context", compress);
>  }
>  
>  static struct intel_engine_coredump *
> @@ -1490,17 +1548,25 @@ capture_engine(struct intel_engine_cs *engine,
>  		}
>  	}
>  	if (rq)
> -		capture = intel_engine_coredump_add_request(ee, rq,
> -							    ATOMIC_MAYFAIL);
> +		rq = i915_request_get_rcu(rq);
> +
> +	if (!rq)
> +		goto no_request_capture;
> +
> +	capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL);
>  	if (!capture) {
> -no_request_capture:
> -		kfree(ee);
> -		return NULL;
> +		i915_request_put(rq);
> +		goto no_request_capture;
>  	}
>  
>  	intel_engine_coredump_add_vma(ee, capture, compress);
> +	i915_request_put(rq);
>  
>  	return ee;
> +
> +no_request_capture:
> +	kfree(ee);
> +	return NULL;
>  }
>  
>  static void
> @@ -1554,10 +1620,8 @@ gt_record_uc(struct intel_gt_coredump *gt,
>  	 */
>  	error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL);
>  	error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL);
> -	error_uc->guc_log =
> -		i915_vma_coredump_create(gt->_gt,
> -					 uc->guc.log.vma, "GuC log buffer",
> -					 compress);
> +	error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
> +						"GuC log buffer", compress);
>  
>  	return error_uc;
>  }
> @@ -1843,8 +1907,8 @@ void i915_vma_capture_finish(struct intel_gt_coredump *gt,
>  	kfree(compress);
>  }
>  
> -struct i915_gpu_coredump *
> -i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
> +static struct i915_gpu_coredump *
> +__i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
>  {
>  	struct drm_i915_private *i915 = gt->i915;
>  	struct i915_gpu_coredump *error;
> @@ -1885,6 +1949,22 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
>  	return error;
>  }
>  
> +struct i915_gpu_coredump *
> +i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
> +{
> +	static DEFINE_MUTEX(capture_mutex);
> +	int ret = mutex_lock_interruptible(&capture_mutex);
> +	struct i915_gpu_coredump *dump;
> +
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	dump = __i915_gpu_coredump(gt, engine_mask);
> +	mutex_unlock(&capture_mutex);
> +
> +	return dump;
> +}
> +
>  void i915_error_state_store(struct i915_gpu_coredump *error)
>  {
>  	struct drm_i915_private *i915;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 623273aca09e..ee855dc86972 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -113,6 +113,10 @@ static void i915_fence_release(struct dma_fence *fence)
>  	GEM_BUG_ON(rq->guc_prio != GUC_PRIO_INIT &&
>  		   rq->guc_prio != GUC_PRIO_FINI);
>  
> +	i915_request_free_capture_list(fetch_and_zero(&rq->capture_list));
> +	if (i915_vma_snapshot_present(&rq->batch_snapshot))
> +		i915_vma_snapshot_put_onstack(&rq->batch_snapshot);
> +
>  	/*
>  	 * The request is put onto a RCU freelist (i.e. the address
>  	 * is immediately reused), mark the fences as being freed now.
> @@ -186,19 +190,6 @@ void i915_request_notify_execute_cb_imm(struct i915_request *rq)
>  	__notify_execute_cb(rq, irq_work_imm);
>  }
>  
> -static void free_capture_list(struct i915_request *request)
> -{
> -	struct i915_capture_list *capture;
> -
> -	capture = fetch_and_zero(&request->capture_list);
> -	while (capture) {
> -		struct i915_capture_list *next = capture->next;
> -
> -		kfree(capture);
> -		capture = next;
> -	}
> -}
> -
>  static void __i915_request_fill(struct i915_request *rq, u8 val)
>  {
>  	void *vaddr = rq->ring->vaddr;
> @@ -303,6 +294,37 @@ static void __rq_cancel_watchdog(struct i915_request *rq)
>  		i915_request_put(rq);
>  }
>  
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> +
> +/**
> + * i915_request_free_capture_list - Free a capture list
> + * @capture: Pointer to the first list item or NULL
> + *
> + */
> +void i915_request_free_capture_list(struct i915_capture_list *capture)
> +{
> +	while (capture) {
> +		struct i915_capture_list *next = capture->next;
> +
> +		i915_vma_snapshot_put(capture->vma_snapshot);
> +		capture = next;
> +	}
> +}
> +
> +#define assert_capture_list_is_null(_rq) GEM_BUG_ON((_rq)->capture_list)
> +
> +#define clear_capture_list(_rq) ((_rq)->capture_list = NULL)
> +
> +#else
> +
> +#define i915_request_free_capture_list(_a) do {} while (0)
> +
> +#define assert_capture_list_is_null(_a) do {} while (0)
> +
> +#define clear_capture_list(_rq) do {} while (0)
> +
> +#endif
> +
>  bool i915_request_retire(struct i915_request *rq)
>  {
>  	if (!__i915_request_is_complete(rq))
> @@ -359,7 +381,6 @@ bool i915_request_retire(struct i915_request *rq)
>  	intel_context_exit(rq->context);
>  	intel_context_unpin(rq->context);
>  
> -	free_capture_list(rq);
>  	i915_sched_node_fini(&rq->sched);
>  	i915_request_put(rq);
>  
> @@ -829,11 +850,18 @@ static void __i915_request_ctor(void *arg)
>  	i915_sw_fence_init(&rq->submit, submit_notify);
>  	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
>  
> -	rq->capture_list = NULL;
> +	clear_capture_list(rq);
> +	rq->batch_snapshot.present = false;
>  
>  	init_llist_head(&rq->execute_cb);
>  }
>  
> +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> +#define clear_batch_ptr(_rq) ((_rq)->batch = NULL)
> +#else
> +#define clear_batch_ptr(_a) do {} while (0)
> +#endif
> +
>  struct i915_request *
>  __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  {
> @@ -925,10 +953,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  	i915_sched_node_reinit(&rq->sched);
>  
>  	/* No zalloc, everything must be cleared after use */
> -	rq->batch = NULL;
> +	clear_batch_ptr(rq);
>  	__rq_init_watchdog(rq);
> -	GEM_BUG_ON(rq->capture_list);
> +	assert_capture_list_is_null(rq);
>  	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
> +	GEM_BUG_ON(i915_vma_snapshot_present(&rq->batch_snapshot));
>  
>  	/*
>  	 * Reserve space in the ring buffer for all the commands required to
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index dc359242d1ae..16aba0cb32d6 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -40,6 +40,7 @@
>  #include "i915_scheduler.h"
>  #include "i915_selftest.h"
>  #include "i915_sw_fence.h"
> +#include "i915_vma_snapshot.h"
>  
>  #include <uapi/drm/i915_drm.h>
>  
> @@ -48,11 +49,17 @@ struct drm_i915_gem_object;
>  struct drm_printer;
>  struct i915_request;
>  
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
>  struct i915_capture_list {
> +	struct i915_vma_snapshot *vma_snapshot;
>  	struct i915_capture_list *next;
> -	struct i915_vma *vma;
>  };
>  
> +void i915_request_free_capture_list(struct i915_capture_list *capture);
> +#else
> +#define i915_request_free_capture_list(_a) do {} while (0)
> +#endif
> +
>  #define RQ_TRACE(rq, fmt, ...) do {					\
>  	const struct i915_request *rq__ = (rq);				\
>  	ENGINE_TRACE(rq__->engine, "fence %llx:%lld, current %d " fmt,	\
> @@ -289,10 +296,12 @@ struct i915_request {
>  	/** Preallocate space in the ring for the emitting the request */
>  	u32 reserved_space;
>  
> -	/** Batch buffer related to this request if any (used for
> -	 * error state dump only).
> -	 */
> -	struct i915_vma *batch;
> +	/** Batch buffer pointer for selftest internal use. */
> +	I915_SELFTEST_DECLARE(struct i915_vma *batch);
> +
> +	struct i915_vma_snapshot batch_snapshot;
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
>  	/**
>  	 * Additional buffers requested by userspace to be captured upon
>  	 * a GPU hang. The vma/obj on this list are protected by their
> @@ -300,6 +309,7 @@ struct i915_request {
>  	 * on the active_list (of their final request).
>  	 */
>  	struct i915_capture_list *capture_list;
> +#endif
>  
>  	/** Time at which this request was emitted, in jiffies. */
>  	unsigned long emitted_jiffies;
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> new file mode 100644
> index 000000000000..44985d600f96
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> @@ -0,0 +1,137 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#include "i915_vma_snapshot.h"
> +#include "i915_vma_types.h"
> +#include "i915_vma.h"
> +
> +/**
> + * i915_vma_snapshot_init - Initialize a struct i915_vma_snapshot from
> + * a struct i915_vma.
> + * @vsnap: The i915_vma_snapshot to init.
> + * @vma: A struct i915_vma used to initialize @vsnap.
> + * @name: Name associated with the snapshot. The character pointer needs to
> + * stay alive over the lifetime of the snapshot.
> + */
> +void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
> +			    struct i915_vma *vma,
> +			    const char *name)
> +{
> +	if (!i915_vma_is_pinned(vma))
> +		assert_object_held(vma->obj);
> +
> +	vsnap->name = name;
> +	vsnap->size = vma->size;
> +	vsnap->obj_size = vma->obj->base.size;
> +	vsnap->gtt_offset = vma->node.start;
> +	vsnap->gtt_size = vma->node.size;
> +	vsnap->page_sizes = vma->page_sizes.gtt;
> +	vsnap->pages = vma->pages;
> +	vsnap->pages_rsgt = NULL;
> +	vsnap->mr = NULL;
> +	if (vma->obj->mm.rsgt)
> +		vsnap->pages_rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
> +	if (vma->obj->mm.region)
> +		vsnap->mr = intel_memory_region_get(vma->obj->mm.region);
> +	kref_init(&vsnap->kref);
> +	vsnap->vma_resource = &vma->active;
> +	vsnap->onstack = false;
> +	vsnap->present = true;
> +}
> +
> +/**
> + * i915_vma_snapshot_init_onstack - Initialize a struct i915_vma_snapshot from
> + * a struct i915_vma, but avoid kfreeing it on last put.
> + * @vsnap: The i915_vma_snapshot to init.
> + * @vma: A struct i915_vma used to initialize @vsnap.
> + * @name: Name associated with the snapshot. The character pointer needs to
> + * stay alive over the lifetime of the snapshot.
> + */
> +void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
> +				    struct i915_vma *vma,
> +				    const char *name)
> +{
> +	i915_vma_snapshot_init(vsnap, vma, name);
> +	vsnap->onstack = true;
> +}
> +
> +static void vma_snapshot_release(struct kref *ref)
> +{
> +	struct i915_vma_snapshot *vsnap =
> +		container_of(ref, typeof(*vsnap), kref);
> +
> +	vsnap->present = false;
> +	if (vsnap->mr)
> +		intel_memory_region_put(vsnap->mr);
> +	if (vsnap->pages_rsgt)
> +		i915_refct_sgt_put(vsnap->pages_rsgt);
> +	if (!vsnap->onstack)
> +		kfree(vsnap);
> +}
> +
> +/**
> + * i915_vma_snapshot_put - Put an i915_vma_snapshot pointer reference
> + * @vsnap: The pointer reference
> + */
> +void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap)
> +{
> +	kref_put(&vsnap->kref, vma_snapshot_release);
> +}
> +
> +/**
> + * i915_vma_snapshot_put_onstack - Put an onstack i915_vma_snapshot pointer
> + * reference and verify that the structure is released
> + * @vsnap: The pointer reference
> + *
> + * This function is intended to be paired with i915_vma_snapshot_init_onstack()
> + * and should be called before exiting the scope that declared, or
> + * freeing the structure that embedded, @vsnap to verify that all references
> + * have been released.
> + */
> +void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
> +{
> +	if (!kref_put(&vsnap->kref, vma_snapshot_release))
> +		GEM_BUG_ON(1);
> +}
> +
> +/**
> + * i915_vma_snapshot_resource_pin - Temporarily block the memory the
> + * vma snapshot is pointing to from being released.
> + * @vsnap: The vma snapshot.
> + * @lockdep_cookie: Pointer to bool needed for lockdep support. This needs
> + * to be passed to the paired i915_vma_snapshot_resource_unpin.
> + *
> + * This function will temporarily try to hold up a fence or similar structure
> + * and will therefore enter a fence signaling critical section.
> + *
> + * Return: true if we succeeded in blocking the memory from being released,
> + * false otherwise.
> + */
> +bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
> +				    bool *lockdep_cookie)
> +{
> +	bool pinned = i915_active_acquire_if_busy(vsnap->vma_resource);
> +
> +	if (pinned)
> +		*lockdep_cookie = dma_fence_begin_signalling();
> +
> +	return pinned;
> +}
> +
> +/**
> + * i915_vma_snapshot_resource_unpin - Unblock vma snapshot memory from
> + * being released.
> + * @vsnap: The vma snapshot.
> + * @lockdep_cookie: Cookie returned from matching i915_vma_snapshot_resource_pin().
> + *
> + * Might leave a fence signalling critical section and signal a fence.
> + */
> +void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
> +				      bool lockdep_cookie)
> +{
> +	dma_fence_end_signalling(lockdep_cookie);
> +
> +	return i915_active_release(vsnap->vma_resource);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> new file mode 100644
> index 000000000000..940581df4622
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> @@ -0,0 +1,112 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +#ifndef _I915_VMA_SNAPSHOT_H_
> +#define _I915_VMA_SNAPSHOT_H_
> +
> +#include <linux/kref.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +
> +struct i915_active;
> +struct i915_refct_sgt;
> +struct i915_vma;
> +struct intel_memory_region;
> +struct sg_table;
> +
> +/**
> + * DOC: Simple utilities for snapshotting GPU vma metadata, later used for
> + * error capture. We use a separate header for this to avoid issues due to
> + * recursive header includes.
> + */
> +
> +/**
> + * struct i915_vma_snapshot - Snapshot of vma metadata.
> + * @size: The vma size in bytes.
> + * @obj_size: The size of the underlying object in bytes.
> + * @gtt_offset: The gtt offset the vma is bound to.
> + * @gtt_size: The size in bytes allocated for the vma in the GTT.
> + * @pages: The struct sg_table pointing to the pages bound.
> + * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
> + * @mr: The memory region backing the pages bound.
> + * @kref: Reference for this structure.
> + * @vma_resource: FIXME: A means to keep the unbind fence from signaling.
> + * Temporarily while we have only sync unbinds, and still use the vma
> + * active, we use that. With async unbinding we need a signaling refcount
> + * for the unbind fence.
> + * @page_sizes: The vma GTT page sizes information.
> + * @onstack: Whether the structure shouldn't be freed on final put.
> + * @present: Whether the structure is present and initialized.
> + */
> +struct i915_vma_snapshot {
> +	const char *name;
> +	size_t size;
> +	size_t obj_size;
> +	size_t gtt_offset;
> +	size_t gtt_size;
> +	struct sg_table *pages;
> +	struct i915_refct_sgt *pages_rsgt;
> +	struct intel_memory_region *mr;
> +	struct kref kref;
> +	struct i915_active *vma_resource;
> +	u32 page_sizes;
> +	bool onstack:1;
> +	bool present:1;
> +};
> +
> +void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
> +			    struct i915_vma *vma,
> +			    const char *name);
> +
> +void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
> +				    struct i915_vma *vma,
> +				    const char *name);
> +
> +void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap);
> +
> +void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap);
> +
> +bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
> +				    bool *lockdep_cookie);
> +
> +void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
> +				      bool lockdep_cookie);
> +
> +/**
> + * i915_vma_snapshot_alloc - Allocate a struct i915_vma_snapshot
> + * @gfp: Allocation mode.
> + *
> + * Return: A pointer to a struct i915_vma_snapshot if successful.
> + * NULL otherwise.
> + */
> +static inline struct i915_vma_snapshot *i915_vma_snapshot_alloc(gfp_t gfp)
> +{
> +	return kmalloc(sizeof(struct i915_vma_snapshot), gfp);
> +}
> +
> +/**
> + * i915_vma_snapshot_get - Take a reference on a struct i915_vma_snapshot
> + *
> + * Return: A pointer to a struct i915_vma_snapshot.
> + */
> +static inline struct i915_vma_snapshot *
> +i915_vma_snapshot_get(struct i915_vma_snapshot *vsnap)
> +{
> +	kref_get(&vsnap->kref);
> +	return vsnap;
> +}
> +
> +/**
> + * i915_vma_snapshot_present - Whether a struct i915_vma_snapshot is
> + * present and initialized.
> + *
> + * Return: true if present and initialized; false otherwise.
> + */
> +static inline bool
> +i915_vma_snapshot_present(const struct i915_vma_snapshot *vsnap)
> +{
> +	return vsnap && vsnap->present;
> +}
> +
> +#endif
> -- 
> 2.31.1
> 
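
For readers following the lifetime rules of the snapshot above, here is a
minimal userspace sketch of the same idea: copy the state you need at init
time, refcount the copy, and use an "onstack" flag so that the final put
skips the free for embedded or stack instances. All names here (struct
snapshot, struct live_vma, snapshot_*) are hypothetical stand-ins and not the
i915 API; a plain counter replaces struct kref, so this is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the live object whose state keeps changing. */
struct live_vma {
	size_t size;
};

/* Hypothetical stand-in for struct i915_vma_snapshot. */
struct snapshot {
	const char *name;	/* must outlive the snapshot */
	size_t size;		/* value copied at snapshot time */
	unsigned int refs;	/* plain counter standing in for struct kref */
	bool onstack;		/* embedded or on stack: never free() */
	bool present;		/* initialized and not yet released */
};

static void snapshot_init(struct snapshot *s, const struct live_vma *vma,
			  const char *name)
{
	s->name = name;
	s->size = vma->size;	/* later changes to *vma are not seen */
	s->refs = 1;
	s->onstack = false;
	s->present = true;
}

static void snapshot_init_onstack(struct snapshot *s,
				  const struct live_vma *vma,
				  const char *name)
{
	snapshot_init(s, vma, name);
	s->onstack = true;
}

static struct snapshot *snapshot_get(struct snapshot *s)
{
	s->refs++;
	return s;
}

/* Returns true if this call dropped the last reference, like kref_put(). */
static bool snapshot_put(struct snapshot *s)
{
	if (--s->refs)
		return false;

	s->present = false;
	if (!s->onstack)
		free(s);
	return true;
}
```

With this shape, a caller that embeds the snapshot in a larger structure can
check the return value of the final put to verify that nobody else still
holds a reference, which is what the _put_onstack() variant in the patch
asserts.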


* Re: [Intel-gfx] [PATCH v6 3/4] drm/i915: Update error capture code to avoid using the current vma state
@ 2021-11-29  9:13     ` Ramalingam C
  0 siblings, 0 replies; 18+ messages in thread
From: Ramalingam C @ 2021-11-29  9:13 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx, matthew.auld, dri-devel

On 2021-11-08 at 18:45:46 +0100, Thomas Hellström wrote:
> With asynchronous migrations, the vma state may be several migrations
> ahead of the state that matches the request we're capturing.
> Address that by introducing an i915_vma_snapshot structure that
> can be used to snapshot relevant state at request submission.
> In order to make sure we access the correct memory, the snapshots take
> references on relevant sg-tables and memory regions.
> 
> Also move the capture list allocation out of the fence signaling
> critical path and use the CONFIG_DRM_I915_CAPTURE_ERROR define to
> avoid compiling in members and functions used for error capture
> when they're not used.
> 
> Finally, introduce lockdep annotation.
> 
> v4:
> - Break out the capture allocation mode change to a separate patch.
> v5:
> - Fix compilation error in the !CONFIG_DRM_I915_CAPTURE_ERROR case
>   (kernel test robot)
> v6:
> - Use #if IS_ENABLED() instead of #ifdef to match driver style.
> - Move yet another change of allocation mode to the separate patch.
> - Commit message rework due to patch reordering.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Looks good to me

Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
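
As an aside, the "move the capture list allocation out of the fence
signaling critical path" part of the commit message boils down to a
stage / commit / release pattern. A rough userspace sketch follows, assuming
hypothetical names (struct submission, capture_stage() and friends, NUM_SLOTS)
that only mirror the shape of the eb_capture_*() helpers in the patch and are
not the driver code itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_SLOTS 4	/* hypothetical; stands in for the per-engine array size */

struct capture_node {
	struct capture_node *next;
	int payload;	/* stands in for the snapshot pointer */
};

struct request {
	struct capture_node *capture_list;
};

struct submission {
	struct request *requests[NUM_SLOTS];
	struct capture_node *staged[NUM_SLOTS];
};

/* Stage: may allocate and may fail; runs before the critical section. */
static void capture_stage(struct submission *sub, int payload)
{
	for (size_t j = 0; j < NUM_SLOTS; j++) {
		struct capture_node *node = malloc(sizeof(*node));

		if (!node)
			continue;	/* capture is best effort */

		node->payload = payload;
		node->next = sub->staged[j];
		sub->staged[j] = node;
	}
}

/* Commit: pointer handover only, no allocation inside the critical section. */
static void capture_commit(struct submission *sub)
{
	for (size_t j = 0; j < NUM_SLOTS; j++) {
		struct request *rq = sub->requests[j];

		if (!rq)
			break;

		rq->capture_list = sub->staged[j];
		sub->staged[j] = NULL;
	}
}

static void capture_list_free(struct capture_node *node)
{
	while (node) {
		struct capture_node *next = node->next;

		free(node);
		node = next;
	}
}

/* Release: free whatever was staged but never committed. */
static void capture_release(struct submission *sub)
{
	for (size_t j = 0; j < NUM_SLOTS; j++) {
		capture_list_free(sub->staged[j]);
		sub->staged[j] = NULL;
	}
}
```

The point of the split is that the commit step cannot fail and cannot sleep,
so it is safe to run where allocations are not allowed, while the release
step makes the error paths leak-free regardless of how far submission got.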

> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 135 +++++++++++---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   8 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c         | 176 +++++++++++++-----
>  drivers/gpu/drm/i915/i915_request.c           |  63 +++++--
>  drivers/gpu/drm/i915/i915_request.h           |  20 +-
>  drivers/gpu/drm/i915/i915_vma_snapshot.c      | 137 ++++++++++++++
>  drivers/gpu/drm/i915/i915_vma_snapshot.h      | 112 +++++++++++
>  8 files changed, 557 insertions(+), 95 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
>  create mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 7d0d0b814670..b4372e0a802f 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -174,6 +174,7 @@ i915-y += \
>  	  i915_trace_points.o \
>  	  i915_ttm_buddy_manager.o \
>  	  i915_vma.o \
> +	  i915_vma_snapshot.o \
>  	  intel_wopcm.o
>  
>  # general-purpose microcontroller (GuC) support
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index ea5b7b2a4d70..ddca899f2396 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -29,6 +29,7 @@
>  #include "i915_gem_ioctls.h"
>  #include "i915_trace.h"
>  #include "i915_user_extensions.h"
> +#include "i915_vma_snapshot.h"
>  
>  struct eb_vma {
>  	struct i915_vma *vma;
> @@ -307,11 +308,15 @@ struct i915_execbuffer {
>  
>  	struct eb_fence *fences;
>  	unsigned long num_fences;
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> +	struct i915_capture_list *capture_lists[MAX_ENGINE_INSTANCE + 1];
> +#endif
>  };
>  
>  static int eb_parse(struct i915_execbuffer *eb);
>  static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle);
>  static void eb_unpin_engine(struct i915_execbuffer *eb);
> +static void eb_capture_release(struct i915_execbuffer *eb);
>  
>  static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
>  {
> @@ -1054,6 +1059,7 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
>  			i915_vma_put(vma);
>  	}
>  
> +	eb_capture_release(eb);
>  	eb_unpin_engine(eb);
>  }
>  
> @@ -1891,36 +1897,113 @@ eb_find_first_request_added(struct i915_execbuffer *eb)
>  	return NULL;
>  }
>  
> -static int eb_move_to_gpu(struct i915_execbuffer *eb)
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> +
> +/* Stage with GFP_KERNEL allocations before we enter the signaling critical path */
> +static void eb_capture_stage(struct i915_execbuffer *eb)
>  {
>  	const unsigned int count = eb->buffer_count;
> -	unsigned int i = count;
> -	int err = 0, j;
> +	unsigned int i = count, j;
> +	struct i915_vma_snapshot *vsnap;
>  
>  	while (i--) {
>  		struct eb_vma *ev = &eb->vma[i];
>  		struct i915_vma *vma = ev->vma;
>  		unsigned int flags = ev->flags;
> -		struct drm_i915_gem_object *obj = vma->obj;
>  
> -		assert_vma_held(vma);
> +		if (!(flags & EXEC_OBJECT_CAPTURE))
> +			continue;
>  
> -		if (flags & EXEC_OBJECT_CAPTURE) {
> +		vsnap = i915_vma_snapshot_alloc(GFP_KERNEL);
> +		if (!vsnap)
> +			continue;
> +
> +		i915_vma_snapshot_init(vsnap, vma, "user");
> +		for_each_batch_create_order(eb, j) {
>  			struct i915_capture_list *capture;
>  
> -			for_each_batch_create_order(eb, j) {
> -				if (!eb->requests[j])
> -					break;
> +			capture = kmalloc(sizeof(*capture), GFP_KERNEL);
> +			if (!capture)
> +				continue;
>  
> -				capture = kmalloc(sizeof(*capture), GFP_KERNEL);
> -				if (capture) {
> -					capture->next =
> -						eb->requests[j]->capture_list;
> -					capture->vma = vma;
> -					eb->requests[j]->capture_list = capture;
> -				}
> -			}
> +			capture->next = eb->capture_lists[j];
> +			capture->vma_snapshot = i915_vma_snapshot_get(vsnap);
> +			eb->capture_lists[j] = capture;
> +		}
> +		i915_vma_snapshot_put(vsnap);
> +	}
> +}
> +
> +/* Commit once we're in the critical path */
> +static void eb_capture_commit(struct i915_execbuffer *eb)
> +{
> +	unsigned int j;
> +
> +	for_each_batch_create_order(eb, j) {
> +		struct i915_request *rq = eb->requests[j];
> +
> +		if (!rq)
> +			break;
> +
> +		rq->capture_list = eb->capture_lists[j];
> +		eb->capture_lists[j] = NULL;
> +	}
> +}
> +
> +/*
> + * Release anything that didn't get committed due to errors.
> + * The capture_list will otherwise be freed at request retire.
> + */
> +static void eb_capture_release(struct i915_execbuffer *eb)
> +{
> +	unsigned int j;
> +
> +	for_each_batch_create_order(eb, j) {
> +		if (eb->capture_lists[j]) {
> +			i915_request_free_capture_list(eb->capture_lists[j]);
> +			eb->capture_lists[j] = NULL;
>  		}
> +	}
> +}
> +
> +static void eb_capture_list_clear(struct i915_execbuffer *eb)
> +{
> +	memset(eb->capture_lists, 0, sizeof(eb->capture_lists));
> +}
> +
> +#else
> +
> +static void eb_capture_stage(struct i915_execbuffer *eb)
> +{
> +}
> +
> +static void eb_capture_commit(struct i915_execbuffer *eb)
> +{
> +}
> +
> +static void eb_capture_release(struct i915_execbuffer *eb)
> +{
> +}
> +
> +static void eb_capture_list_clear(struct i915_execbuffer *eb)
> +{
> +}
> +
> +#endif
> +
> +static int eb_move_to_gpu(struct i915_execbuffer *eb)
> +{
> +	const unsigned int count = eb->buffer_count;
> +	unsigned int i = count;
> +	int err = 0, j;
> +
> +	while (i--) {
> +		struct eb_vma *ev = &eb->vma[i];
> +		struct i915_vma *vma = ev->vma;
> +		unsigned int flags = ev->flags;
> +		struct drm_i915_gem_object *obj = vma->obj;
> +
> +		assert_vma_held(vma);
>  
>  		/*
>  		 * If the GPU is not _reading_ through the CPU cache, we need
> @@ -2001,6 +2084,8 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
>  
>  	/* Unconditionally flush any chipset caches (for streaming writes). */
>  	intel_gt_chipset_flush(eb->gt);
> +	eb_capture_commit(eb);
> +
>  	return 0;
>  
>  err_skip:
> @@ -3143,13 +3228,14 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
>  		}
>  
>  		/*
> -		 * Whilst this request exists, batch_obj will be on the
> -		 * active_list, and so will hold the active reference. Only when
> -		 * this request is retired will the batch_obj be moved onto
> -		 * the inactive_list and lose its active reference. Hence we do
> -		 * not need to explicitly hold another reference here.
> +		 * Not really on stack, but we don't want to call
> +		 * kfree on the batch_snapshot when we put it, so use the
> +		 * _onstack interface.
>  		 */
> -		eb->requests[i]->batch = eb->batches[i]->vma;
> +		if (eb->batches[i]->vma)
> +			i915_vma_snapshot_init_onstack(&eb->requests[i]->batch_snapshot,
> +						       eb->batches[i]->vma,
> +						       "batch");
>  		if (eb->batch_pool) {
>  			GEM_BUG_ON(intel_context_is_parallel(eb->context));
>  			intel_gt_buffer_pool_mark_active(eb->batch_pool,
> @@ -3198,6 +3284,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  	eb.fences = NULL;
>  	eb.num_fences = 0;
>  
> +	eb_capture_list_clear(&eb);
> +
>  	memset(eb.requests, 0, sizeof(struct i915_request *) *
>  	       ARRAY_SIZE(eb.requests));
>  	eb.composite_fence = NULL;
> @@ -3284,6 +3372,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  	}
>  
>  	ww_acquire_done(&eb.ww.ctx);
> +	eb_capture_stage(&eb);
>  
>  	out_fence = eb_requests_create(&eb, in_fence, out_fence_fd);
>  	if (IS_ERR(out_fence)) {
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 332756036007..f2ccd5b53d42 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1676,14 +1676,18 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
>  
>  static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
>  {
> +	struct i915_vma_snapshot *vsnap = &rq->batch_snapshot;
>  	void *ring;
>  	int size;
>  
> +	if (!i915_vma_snapshot_present(vsnap))
> +		vsnap = NULL;
> +
>  	drm_printf(m,
>  		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
>  		   rq->head, rq->postfix, rq->tail,
> -		   rq->batch ? upper_32_bits(rq->batch->node.start) : ~0u,
> -		   rq->batch ? lower_32_bits(rq->batch->node.start) : ~0u);
> +		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
> +		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
>  
>  	size = rq->tail - rq->head;
>  	if (rq->tail < rq->head)
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index b1e4ce0f798f..c61f9aaa40f9 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -48,6 +48,7 @@
>  #include "i915_gpu_error.h"
>  #include "i915_memcpy.h"
>  #include "i915_scatterlist.h"
> +#include "i915_vma_snapshot.h"
>  
>  #define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
>  #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
> @@ -1012,8 +1013,7 @@ void __i915_gpu_coredump_free(struct kref *error_ref)
>  
>  static struct i915_vma_coredump *
>  i915_vma_coredump_create(const struct intel_gt *gt,
> -			 const struct i915_vma *vma,
> -			 const char *name,
> +			 const struct i915_vma_snapshot *vsnap,
>  			 struct i915_vma_compress *compress)
>  {
>  	struct i915_ggtt *ggtt = gt->ggtt;
> @@ -1024,7 +1024,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  
>  	might_sleep();
>  
> -	if (!vma || !vma->pages || !compress)
> +	if (!vsnap || !vsnap->pages || !compress)
>  		return NULL;
>  
>  	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
> @@ -1037,12 +1037,12 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	}
>  
>  	INIT_LIST_HEAD(&dst->page_list);
> -	strcpy(dst->name, name);
> +	strcpy(dst->name, vsnap->name);
>  	dst->next = NULL;
>  
> -	dst->gtt_offset = vma->node.start;
> -	dst->gtt_size = vma->node.size;
> -	dst->gtt_page_sizes = vma->page_sizes.gtt;
> +	dst->gtt_offset = vsnap->gtt_offset;
> +	dst->gtt_size = vsnap->gtt_size;
> +	dst->gtt_page_sizes = vsnap->page_sizes;
>  	dst->unused = 0;
>  
>  	ret = -EINVAL;
> @@ -1050,7 +1050,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  		void __iomem *s;
>  		dma_addr_t dma;
>  
> -		for_each_sgt_daddr(dma, iter, vma->pages) {
> +		for_each_sgt_daddr(dma, iter, vsnap->pages) {
>  			mutex_lock(&ggtt->error_mutex);
>  			ggtt->vm.insert_page(&ggtt->vm, dma, slot,
>  					     I915_CACHE_NONE, 0);
> @@ -1068,11 +1068,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  			if (ret)
>  				break;
>  		}
> -	} else if (__i915_gem_object_is_lmem(vma->obj)) {
> -		struct intel_memory_region *mem = vma->obj->mm.region;
> +	} else if (vsnap->mr && vsnap->mr->type != INTEL_MEMORY_SYSTEM) {
> +		struct intel_memory_region *mem = vsnap->mr;
>  		dma_addr_t dma;
>  
> -		for_each_sgt_daddr(dma, iter, vma->pages) {
> +		for_each_sgt_daddr(dma, iter, vsnap->pages) {
>  			void __iomem *s;
>  
>  			s = io_mapping_map_wc(&mem->iomap,
> @@ -1088,7 +1088,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  	} else {
>  		struct page *page;
>  
> -		for_each_sgt_page(page, iter, vma->pages) {
> +		for_each_sgt_page(page, iter, vsnap->pages) {
>  			void *s;
>  
>  			drm_clflush_pages(&page, 1);
> @@ -1324,37 +1324,71 @@ static bool record_context(struct i915_gem_context_coredump *e,
>  
>  struct intel_engine_capture_vma {
>  	struct intel_engine_capture_vma *next;
> -	struct i915_vma *vma;
> +	struct i915_vma_snapshot *vsnap;
>  	char name[16];
> +	bool lockdep_cookie;
>  };
>  
>  static struct intel_engine_capture_vma *
> -capture_vma(struct intel_engine_capture_vma *next,
> -	    struct i915_vma *vma,
> -	    const char *name,
> -	    gfp_t gfp)
> +capture_vma_snapshot(struct intel_engine_capture_vma *next,
> +		     struct i915_vma_snapshot *vsnap,
> +		     gfp_t gfp)
>  {
>  	struct intel_engine_capture_vma *c;
>  
> -	if (!vma)
> +	if (!i915_vma_snapshot_present(vsnap))
>  		return next;
>  
>  	c = kmalloc(sizeof(*c), gfp);
>  	if (!c)
>  		return next;
>  
> -	if (!i915_active_acquire_if_busy(&vma->active)) {
> +	if (!i915_vma_snapshot_resource_pin(vsnap, &c->lockdep_cookie)) {
>  		kfree(c);
>  		return next;
>  	}
>  
> -	strcpy(c->name, name);
> -	c->vma = vma; /* reference held while active */
> +	strcpy(c->name, vsnap->name);
> +	c->vsnap = vsnap;
> +	i915_vma_snapshot_get(vsnap);
>  
>  	c->next = next;
>  	return c;
>  }
>  
> +static struct intel_engine_capture_vma *
> +capture_vma(struct intel_engine_capture_vma *next,
> +	    struct i915_vma *vma,
> +	    const char *name,
> +	    gfp_t gfp)
> +{
> +	struct i915_vma_snapshot *vsnap;
> +
> +	if (!vma)
> +		return next;
> +
> +	/*
> +	 * If the vma isn't pinned, then the vma should be snapshotted
> +	 * to a struct i915_vma_snapshot at command submission time.
> +	 * Not here.
> +	 */
> +	GEM_WARN_ON(!i915_vma_is_pinned(vma));
> +	if (!i915_vma_is_pinned(vma))
> +		return next;
> +
> +	vsnap = i915_vma_snapshot_alloc(gfp);
> +	if (!vsnap)
> +		return next;
> +
> +	i915_vma_snapshot_init(vsnap, vma, name);
> +	next = capture_vma_snapshot(next, vsnap, gfp);
> +
> +	/* FIXME: Replace on async unbind. */
> +	i915_vma_snapshot_put(vsnap);
> +
> +	return next;
> +}
> +
>  static struct intel_engine_capture_vma *
>  capture_user(struct intel_engine_capture_vma *capture,
>  	     const struct i915_request *rq,
> @@ -1363,7 +1397,7 @@ capture_user(struct intel_engine_capture_vma *capture,
>  	struct i915_capture_list *c;
>  
>  	for (c = rq->capture_list; c; c = c->next)
> -		capture = capture_vma(capture, c->vma, "user", gfp);
> +		capture = capture_vma_snapshot(capture, c->vma_snapshot, gfp);
>  
>  	return capture;
>  }
> @@ -1377,6 +1411,36 @@ static void add_vma(struct intel_engine_coredump *ee,
>  	}
>  }
>  
> +static struct i915_vma_coredump *
> +create_vma_coredump(const struct intel_gt *gt, struct i915_vma *vma,
> +		    const char *name, struct i915_vma_compress *compress)
> +{
> +	struct i915_vma_coredump *ret = NULL;
> +	struct i915_vma_snapshot tmp;
> +	bool lockdep_cookie;
> +
> +	if (!vma)
> +		return NULL;
> +
> +	i915_vma_snapshot_init_onstack(&tmp, vma, name);
> +	if (i915_vma_snapshot_resource_pin(&tmp, &lockdep_cookie)) {
> +		ret = i915_vma_coredump_create(gt, &tmp, compress);
> +		i915_vma_snapshot_resource_unpin(&tmp, lockdep_cookie);
> +	}
> +	i915_vma_snapshot_put_onstack(&tmp);
> +
> +	return ret;
> +}
> +
> +static void add_vma_coredump(struct intel_engine_coredump *ee,
> +			     const struct intel_gt *gt,
> +			     struct i915_vma *vma,
> +			     const char *name,
> +			     struct i915_vma_compress *compress)
> +{
> +	add_vma(ee, create_vma_coredump(gt, vma, name, compress));
> +}
> +
>  struct intel_engine_coredump *
>  intel_engine_coredump_alloc(struct intel_engine_cs *engine, gfp_t gfp)
>  {
> @@ -1410,7 +1474,7 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee,
>  	 * as the simplest method to avoid being overwritten
>  	 * by userspace.
>  	 */
> -	vma = capture_vma(vma, rq->batch, "batch", gfp);
> +	vma = capture_vma_snapshot(vma, &rq->batch_snapshot, gfp);
>  	vma = capture_user(vma, rq, gfp);
>  	vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
>  	vma = capture_vma(vma, rq->context->state, "HW context", gfp);
> @@ -1431,30 +1495,24 @@ intel_engine_coredump_add_vma(struct intel_engine_coredump *ee,
>  
>  	while (capture) {
>  		struct intel_engine_capture_vma *this = capture;
> -		struct i915_vma *vma = this->vma;
> +		struct i915_vma_snapshot *vsnap = this->vsnap;
>  
>  		add_vma(ee,
>  			i915_vma_coredump_create(engine->gt,
> -						 vma, this->name,
> -						 compress));
> +						 vsnap, compress));
>  
> -		i915_active_release(&vma->active);
> +		i915_vma_snapshot_resource_unpin(vsnap, this->lockdep_cookie);
> +		i915_vma_snapshot_put(vsnap);
>  
>  		capture = this->next;
>  		kfree(this);
>  	}
>  
> -	add_vma(ee,
> -		i915_vma_coredump_create(engine->gt,
> -					 engine->status_page.vma,
> -					 "HW Status",
> -					 compress));
> +	add_vma_coredump(ee, engine->gt, engine->status_page.vma,
> +			 "HW Status", compress);
>  
> -	add_vma(ee,
> -		i915_vma_coredump_create(engine->gt,
> -					 engine->wa_ctx.vma,
> -					 "WA context",
> -					 compress));
> +	add_vma_coredump(ee, engine->gt, engine->wa_ctx.vma,
> +			 "WA context", compress);
>  }
>  
>  static struct intel_engine_coredump *
> @@ -1490,17 +1548,25 @@ capture_engine(struct intel_engine_cs *engine,
>  		}
>  	}
>  	if (rq)
> -		capture = intel_engine_coredump_add_request(ee, rq,
> -							    ATOMIC_MAYFAIL);
> +		rq = i915_request_get_rcu(rq);
> +
> +	if (!rq)
> +		goto no_request_capture;
> +
> +	capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL);
>  	if (!capture) {
> -no_request_capture:
> -		kfree(ee);
> -		return NULL;
> +		i915_request_put(rq);
> +		goto no_request_capture;
>  	}
>  
>  	intel_engine_coredump_add_vma(ee, capture, compress);
> +	i915_request_put(rq);
>  
>  	return ee;
> +
> +no_request_capture:
> +	kfree(ee);
> +	return NULL;
>  }
>  
>  static void
> @@ -1554,10 +1620,8 @@ gt_record_uc(struct intel_gt_coredump *gt,
>  	 */
>  	error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL);
>  	error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL);
> -	error_uc->guc_log =
> -		i915_vma_coredump_create(gt->_gt,
> -					 uc->guc.log.vma, "GuC log buffer",
> -					 compress);
> +	error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
> +						"GuC log buffer", compress);
>  
>  	return error_uc;
>  }
> @@ -1843,8 +1907,8 @@ void i915_vma_capture_finish(struct intel_gt_coredump *gt,
>  	kfree(compress);
>  }
>  
> -struct i915_gpu_coredump *
> -i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
> +static struct i915_gpu_coredump *
> +__i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
>  {
>  	struct drm_i915_private *i915 = gt->i915;
>  	struct i915_gpu_coredump *error;
> @@ -1885,6 +1949,22 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
>  	return error;
>  }
>  
> +struct i915_gpu_coredump *
> +i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
> +{
> +	static DEFINE_MUTEX(capture_mutex);
> +	int ret = mutex_lock_interruptible(&capture_mutex);
> +	struct i915_gpu_coredump *dump;
> +
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	dump = __i915_gpu_coredump(gt, engine_mask);
> +	mutex_unlock(&capture_mutex);
> +
> +	return dump;
> +}
> +
>  void i915_error_state_store(struct i915_gpu_coredump *error)
>  {
>  	struct drm_i915_private *i915;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 623273aca09e..ee855dc86972 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -113,6 +113,10 @@ static void i915_fence_release(struct dma_fence *fence)
>  	GEM_BUG_ON(rq->guc_prio != GUC_PRIO_INIT &&
>  		   rq->guc_prio != GUC_PRIO_FINI);
>  
> +	i915_request_free_capture_list(fetch_and_zero(&rq->capture_list));
> +	if (i915_vma_snapshot_present(&rq->batch_snapshot))
> +		i915_vma_snapshot_put_onstack(&rq->batch_snapshot);
> +
>  	/*
>  	 * The request is put onto a RCU freelist (i.e. the address
>  	 * is immediately reused), mark the fences as being freed now.
> @@ -186,19 +190,6 @@ void i915_request_notify_execute_cb_imm(struct i915_request *rq)
>  	__notify_execute_cb(rq, irq_work_imm);
>  }
>  
> -static void free_capture_list(struct i915_request *request)
> -{
> -	struct i915_capture_list *capture;
> -
> -	capture = fetch_and_zero(&request->capture_list);
> -	while (capture) {
> -		struct i915_capture_list *next = capture->next;
> -
> -		kfree(capture);
> -		capture = next;
> -	}
> -}
> -
>  static void __i915_request_fill(struct i915_request *rq, u8 val)
>  {
>  	void *vaddr = rq->ring->vaddr;
> @@ -303,6 +294,37 @@ static void __rq_cancel_watchdog(struct i915_request *rq)
>  		i915_request_put(rq);
>  }
>  
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
> +
> +/**
> + * i915_request_free_capture_list - Free a capture list
> + * @capture: Pointer to the first list item or NULL
> + *
> + */
> +void i915_request_free_capture_list(struct i915_capture_list *capture)
> +{
> +	while (capture) {
> +		struct i915_capture_list *next = capture->next;
> +
> +		i915_vma_snapshot_put(capture->vma_snapshot);
> +		capture = next;
> +	}
> +}
> +
> +#define assert_capture_list_is_null(_rq) GEM_BUG_ON((_rq)->capture_list)
> +
> +#define clear_capture_list(_rq) ((_rq)->capture_list = NULL)
> +
> +#else
> +
> +#define i915_request_free_capture_list(_a) do {} while (0)
> +
> +#define assert_capture_list_is_null(_a) do {} while (0)
> +
> +#define clear_capture_list(_rq) do {} while (0)
> +
> +#endif
> +
>  bool i915_request_retire(struct i915_request *rq)
>  {
>  	if (!__i915_request_is_complete(rq))
> @@ -359,7 +381,6 @@ bool i915_request_retire(struct i915_request *rq)
>  	intel_context_exit(rq->context);
>  	intel_context_unpin(rq->context);
>  
> -	free_capture_list(rq);
>  	i915_sched_node_fini(&rq->sched);
>  	i915_request_put(rq);
>  
> @@ -829,11 +850,18 @@ static void __i915_request_ctor(void *arg)
>  	i915_sw_fence_init(&rq->submit, submit_notify);
>  	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
>  
> -	rq->capture_list = NULL;
> +	clear_capture_list(rq);
> +	rq->batch_snapshot.present = false;
>  
>  	init_llist_head(&rq->execute_cb);
>  }
>  
> +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> +#define clear_batch_ptr(_rq) ((_rq)->batch = NULL)
> +#else
> +#define clear_batch_ptr(_a) do {} while (0)
> +#endif
> +
>  struct i915_request *
>  __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  {
> @@ -925,10 +953,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  	i915_sched_node_reinit(&rq->sched);
>  
>  	/* No zalloc, everything must be cleared after use */
> -	rq->batch = NULL;
> +	clear_batch_ptr(rq);
>  	__rq_init_watchdog(rq);
> -	GEM_BUG_ON(rq->capture_list);
> +	assert_capture_list_is_null(rq);
>  	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
> +	GEM_BUG_ON(i915_vma_snapshot_present(&rq->batch_snapshot));
>  
>  	/*
>  	 * Reserve space in the ring buffer for all the commands required to
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index dc359242d1ae..16aba0cb32d6 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -40,6 +40,7 @@
>  #include "i915_scheduler.h"
>  #include "i915_selftest.h"
>  #include "i915_sw_fence.h"
> +#include "i915_vma_snapshot.h"
>  
>  #include <uapi/drm/i915_drm.h>
>  
> @@ -48,11 +49,17 @@ struct drm_i915_gem_object;
>  struct drm_printer;
>  struct i915_request;
>  
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
>  struct i915_capture_list {
> +	struct i915_vma_snapshot *vma_snapshot;
>  	struct i915_capture_list *next;
> -	struct i915_vma *vma;
>  };
>  
> +void i915_request_free_capture_list(struct i915_capture_list *capture);
> +#else
> +#define i915_request_free_capture_list(_a) do {} while (0)
> +#endif
> +
>  #define RQ_TRACE(rq, fmt, ...) do {					\
>  	const struct i915_request *rq__ = (rq);				\
>  	ENGINE_TRACE(rq__->engine, "fence %llx:%lld, current %d " fmt,	\
> @@ -289,10 +296,12 @@ struct i915_request {
>  	/** Preallocate space in the ring for the emitting the request */
>  	u32 reserved_space;
>  
> -	/** Batch buffer related to this request if any (used for
> -	 * error state dump only).
> -	 */
> -	struct i915_vma *batch;
> +	/** Batch buffer pointer for selftest internal use. */
> +	I915_SELFTEST_DECLARE(struct i915_vma *batch);
> +
> +	struct i915_vma_snapshot batch_snapshot;
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
>  	/**
>  	 * Additional buffers requested by userspace to be captured upon
>  	 * a GPU hang. The vma/obj on this list are protected by their
> @@ -300,6 +309,7 @@ struct i915_request {
>  	 * on the active_list (of their final request).
>  	 */
>  	struct i915_capture_list *capture_list;
> +#endif
>  
>  	/** Time at which this request was emitted, in jiffies. */
>  	unsigned long emitted_jiffies;
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> new file mode 100644
> index 000000000000..44985d600f96
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> @@ -0,0 +1,137 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#include "i915_vma_snapshot.h"
> +#include "i915_vma_types.h"
> +#include "i915_vma.h"
> +
> +/**
> + * i915_vma_snapshot_init - Initialize a struct i915_vma_snapshot from
> + * a struct i915_vma.
> + * @vsnap: The i915_vma_snapshot to init.
> + * @vma: A struct i915_vma used to initialize @vsnap.
> + * @name: Name associated with the snapshot. The character pointer needs to
> + * stay alive over the lifetime of the snapshot.
> + */
> +void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
> +			    struct i915_vma *vma,
> +			    const char *name)
> +{
> +	if (!i915_vma_is_pinned(vma))
> +		assert_object_held(vma->obj);
> +
> +	vsnap->name = name;
> +	vsnap->size = vma->size;
> +	vsnap->obj_size = vma->obj->base.size;
> +	vsnap->gtt_offset = vma->node.start;
> +	vsnap->gtt_size = vma->node.size;
> +	vsnap->page_sizes = vma->page_sizes.gtt;
> +	vsnap->pages = vma->pages;
> +	vsnap->pages_rsgt = NULL;
> +	vsnap->mr = NULL;
> +	if (vma->obj->mm.rsgt)
> +		vsnap->pages_rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
> +	if (vma->obj->mm.region)
> +		vsnap->mr = intel_memory_region_get(vma->obj->mm.region);
> +	kref_init(&vsnap->kref);
> +	vsnap->vma_resource = &vma->active;
> +	vsnap->onstack = false;
> +	vsnap->present = true;
> +}
> +
> +/**
> + * i915_vma_snapshot_init_onstack - Initialize a struct i915_vma_snapshot from
> + * a struct i915_vma, but avoid kfreeing it on last put.
> + * @vsnap: The i915_vma_snapshot to init.
> + * @vma: A struct i915_vma used to initialize @vsnap.
> + * @name: Name associated with the snapshot. The character pointer needs to
> + * stay alive over the lifetime of the snapshot.
> + */
> +void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
> +				    struct i915_vma *vma,
> +				    const char *name)
> +{
> +	i915_vma_snapshot_init(vsnap, vma, name);
> +	vsnap->onstack = true;
> +}
> +
> +static void vma_snapshot_release(struct kref *ref)
> +{
> +	struct i915_vma_snapshot *vsnap =
> +		container_of(ref, typeof(*vsnap), kref);
> +
> +	vsnap->present = false;
> +	if (vsnap->mr)
> +		intel_memory_region_put(vsnap->mr);
> +	if (vsnap->pages_rsgt)
> +		i915_refct_sgt_put(vsnap->pages_rsgt);
> +	if (!vsnap->onstack)
> +		kfree(vsnap);
> +}
> +
> +/**
> + * i915_vma_snapshot_put - Put an i915_vma_snapshot pointer reference
> + * @vsnap: The pointer reference
> + */
> +void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap)
> +{
> +	kref_put(&vsnap->kref, vma_snapshot_release);
> +}
> +
> +/**
> + * i915_vma_snapshot_put_onstack - Put an onstack i915_vma_snapshot pointer
> + * reference and verify that the structure is released
> + * @vsnap: The pointer reference
> + *
> + * This function is intended to be paired with i915_vma_snapshot_init_onstack()
> + * and should be called before exiting the scope that declared @vsnap, or
> + * before freeing the structure that embeds it, to verify that all references
> + * have been released.
> + */
> +void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
> +{
> +	if (!kref_put(&vsnap->kref, vma_snapshot_release))
> +		GEM_BUG_ON(1);
> +}
> +
> +/**
> + * i915_vma_snapshot_resource_pin - Temporarily block the memory the
> + * vma snapshot is pointing to from being released.
> + * @vsnap: The vma snapshot.
> + * @lockdep_cookie: Pointer to bool needed for lockdep support. This needs
> + * to be passed to the paired i915_vma_snapshot_resource_unpin.
> + *
> + * This function will temporarily try to hold up a fence or similar structure
> + * and will therefore enter a fence signaling critical section.
> + *
> + * Return: true if we succeeded in blocking the memory from being released,
> + * false otherwise.
> + */
> +bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
> +				    bool *lockdep_cookie)
> +{
> +	bool pinned = i915_active_acquire_if_busy(vsnap->vma_resource);
> +
> +	if (pinned)
> +		*lockdep_cookie = dma_fence_begin_signalling();
> +
> +	return pinned;
> +}
> +
> +/**
> + * i915_vma_snapshot_resource_unpin - Unblock vma snapshot memory from
> + * being released.
> + * @vsnap: The vma snapshot.
> + * @lockdep_cookie: Cookie returned from matching i915_vma_snapshot_resource_pin().
> + *
> + * Might leave a fence signalling critical section and signal a fence.
> + */
> +void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
> +				      bool lockdep_cookie)
> +{
> +	dma_fence_end_signalling(lockdep_cookie);
> +
> +	return i915_active_release(vsnap->vma_resource);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> new file mode 100644
> index 000000000000..940581df4622
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> @@ -0,0 +1,112 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +#ifndef _I915_VMA_SNAPSHOT_H_
> +#define _I915_VMA_SNAPSHOT_H_
> +
> +#include <linux/kref.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +
> +struct i915_active;
> +struct i915_refct_sgt;
> +struct i915_vma;
> +struct intel_memory_region;
> +struct sg_table;
> +
> +/**
> + * DOC: Simple utilities for snapshotting GPU vma metadata, later used for
> + * error capture. We use a separate header for this to avoid issues due to
> + * recursive header includes.
> + */
> +
> +/**
> + * struct i915_vma_snapshot - Snapshot of vma metadata.
> + * @size: The vma size in bytes.
> + * @obj_size: The size of the underlying object in bytes.
> + * @gtt_offset: The gtt offset the vma is bound to.
> + * @gtt_size: The size in bytes allocated for the vma in the GTT.
> + * @pages: The struct sg_table pointing to the pages bound.
> + * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
> + * @mr: The memory region that the bound pages belong to.
> + * @kref: Reference for this structure.
> + * @vma_resource: FIXME: A means to keep the unbind fence from signaling.
> + * Temporarily while we have only sync unbinds, and still use the vma
> + * active, we use that. With async unbinding we need a signaling refcount
> + * for the unbind fence.
> + * @page_sizes: The vma GTT page sizes information.
> + * @onstack: Whether the structure shouldn't be freed on final put.
> + * @present: Whether the structure is present and initialized.
> + */
> +struct i915_vma_snapshot {
> +	const char *name;
> +	size_t size;
> +	size_t obj_size;
> +	size_t gtt_offset;
> +	size_t gtt_size;
> +	struct sg_table *pages;
> +	struct i915_refct_sgt *pages_rsgt;
> +	struct intel_memory_region *mr;
> +	struct kref kref;
> +	struct i915_active *vma_resource;
> +	u32 page_sizes;
> +	bool onstack:1;
> +	bool present:1;
> +};
> +
> +void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
> +			    struct i915_vma *vma,
> +			    const char *name);
> +
> +void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
> +				    struct i915_vma *vma,
> +				    const char *name);
> +
> +void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap);
> +
> +void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap);
> +
> +bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
> +				    bool *lockdep_cookie);
> +
> +void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
> +				      bool lockdep_cookie);
> +
> +/**
> + * i915_vma_snapshot_alloc - Allocate a struct i915_vma_snapshot
> + * @gfp: Allocation mode.
> + *
> + * Return: A pointer to a struct i915_vma_snapshot if successful.
> + * NULL otherwise.
> + */
> +static inline struct i915_vma_snapshot *i915_vma_snapshot_alloc(gfp_t gfp)
> +{
> +	return kmalloc(sizeof(struct i915_vma_snapshot), gfp);
> +}
> +
> +/**
> + * i915_vma_snapshot_get - Take a reference on a struct i915_vma_snapshot
> + *
> + * Return: A pointer to a struct i915_vma_snapshot.
> + */
> +static inline struct i915_vma_snapshot *
> +i915_vma_snapshot_get(struct i915_vma_snapshot *vsnap)
> +{
> +	kref_get(&vsnap->kref);
> +	return vsnap;
> +}
> +
> +/**
> + * i915_vma_snapshot_present - Whether a struct i915_vma_snapshot is
> + * present and initialized.
> + *
> + * Return: true if present and initialized; false otherwise.
> + */
> +static inline bool
> +i915_vma_snapshot_present(const struct i915_vma_snapshot *vsnap)
> +{
> +	return vsnap && vsnap->present;
> +}
> +
> +#endif
> -- 
> 2.31.1
> 

end of thread, other threads:[~2021-11-29  9:11 UTC | newest]

Thread overview: 18+ messages
2021-11-08 17:45 [PATCH v6 0/4] drm/i915: Prepare error capture for asynchronous migration Thomas Hellström
2021-11-08 17:45 ` [Intel-gfx] " Thomas Hellström
2021-11-08 17:45 ` [PATCH v6 1/4] drm/i915: Avoid allocating a page array for the gpu coredump Thomas Hellström
2021-11-08 17:45   ` [Intel-gfx] " Thomas Hellström
2021-11-22 15:08   ` Ramalingam C
2021-11-22 15:08     ` [Intel-gfx] " Ramalingam C
2021-11-08 17:45 ` [PATCH v6 2/4] drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code Thomas Hellström
2021-11-08 17:45   ` [Intel-gfx] " Thomas Hellström
2021-11-22 16:50   ` Ramalingam C
2021-11-08 17:45 ` [PATCH v6 3/4] drm/i915: Update error capture code to avoid using the current vma state Thomas Hellström
2021-11-08 17:45   ` [Intel-gfx] " Thomas Hellström
2021-11-29  9:13   ` Ramalingam C
2021-11-29  9:13     ` [Intel-gfx] " Ramalingam C
2021-11-08 17:45 ` [PATCH v6 4/4] drm/i915: Initial introduction of vma resources Thomas Hellström
2021-11-08 17:45   ` [Intel-gfx] " Thomas Hellström
2021-11-08 18:29 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Prepare error capture for asynchronous migration (rev2) Patchwork
2021-11-08 19:02 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-11-08 20:21 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
