All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915/request: fix early tracepoints
@ 2021-09-03 11:24 ` Matthew Auld
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Auld @ 2021-09-03 11:24 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Michael Mason, Daniel Vetter

Currently we blow up in trace_dma_fence_init, when calling into
get_driver_name or get_timeline_name, since both the engine and context
might be NULL(or contain some garbage address) in the case of newly
allocated slab objects via the request ctor. Note that we also use
SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
freed, but delay freeing the underlying page by an RCU grace period.
With this scheme requests can be re-allocated, at the same time as they
are also being read by some lockless RCU lookup mechanism.

One possible fix, since we don't yet have a fully initialised request
when in the ctor, is just setting the context/engine as NULL and adding
some extra handling in get_driver_name etc. And since the ctor is only
called for new slab objects(i.e allocate new page and call the ctor for
each object) it's safe to reset the context/engine prior to calling into
dma_fence_init, since we can be certain that no one is doing an RCU
lookup which might depend on peeking at the engine/context, like in
active_engine(), since the object can't yet be externally visible.

In the recycled case(which might also be externally visible) the request
refcount always transitions from 0->1 after we set the context/engine
etc, which should ensure it's valid to dereference the engine for
example, when doing an RCU list-walk, so long as we can also increment
the refcount first. If the refcount is already zero, then the request is
considered complete/released.  If it's non-zero, then the request might
be in the process of being re-allocated, or potentially still in flight,
however after successfully incrementing the refcount, it's possible to
carefully inspect the request state, to determine if the request is
still what we were looking for. Note that all externally visible
requests returned to the cache must have zero refcount.

An alternative fix then is to instead move the dma_fence_init out from
the request ctor. Originally this was how it was done, but it was moved
in:

commit 855e39e65cfc33a73724f1cc644ffc5754864a20
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Feb 3 09:41:48 2020 +0000

    drm/i915: Initialise basic fence before acquiring seqno

where it looks like intel_timeline_get_seqno() relied on some of the
rq->fence state, but that is no longer the case since:

commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Mar 23 16:49:50 2021 +0100

    drm/i915: Do not share hwsp across contexts any more, v8.

intel_timeline_get_seqno() could also be cleaned up slightly by dropping
the request argument.

Moving dma_fence_init back out of the ctor, should ensure we have enough
of the request initialised in case of trace_dma_fence_init.
Functionally this should be the same, and is effectively what we were
already open coding before, except now we also assign the fence->lock
and fence->ops, but since these are invariant for recycled
requests(which might be externally visible), and will therefore already
hold the same value, it shouldn't matter. We still leave the
spin_lock_init() in the ctor, since we can't re-init the rq->lock in
case it is already held.

Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Michael Mason <michael.w.mason@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_request.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ce446716d092..79da5eca60af 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg)
 	i915_sw_fence_init(&rq->submit, submit_notify);
 	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
 
-	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0);
-
 	rq->capture_list = NULL;
 
 	init_llist_head(&rq->execute_cb);
@@ -905,17 +903,12 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	rq->ring = ce->ring;
 	rq->execution_mask = ce->engine->mask;
 
-	kref_init(&rq->fence.refcount);
-	rq->fence.flags = 0;
-	rq->fence.error = 0;
-	INIT_LIST_HEAD(&rq->fence.cb_list);
-
 	ret = intel_timeline_get_seqno(tl, rq, &seqno);
 	if (ret)
 		goto err_free;
 
-	rq->fence.context = tl->fence_context;
-	rq->fence.seqno = seqno;
+	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
+		       tl->fence_context, seqno);
 
 	RCU_INIT_POINTER(rq->timeline, tl);
 	rq->hwsp_seqno = tl->hwsp_seqno;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Intel-gfx] [PATCH] drm/i915/request: fix early tracepoints
@ 2021-09-03 11:24 ` Matthew Auld
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Auld @ 2021-09-03 11:24 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Michael Mason, Daniel Vetter

Currently we blow up in trace_dma_fence_init, when calling into
get_driver_name or get_timeline_name, since both the engine and context
might be NULL(or contain some garbage address) in the case of newly
allocated slab objects via the request ctor. Note that we also use
SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
freed, but delay freeing the underlying page by an RCU grace period.
With this scheme requests can be re-allocated, at the same time as they
are also being read by some lockless RCU lookup mechanism.

One possible fix, since we don't yet have a fully initialised request
when in the ctor, is just setting the context/engine as NULL and adding
some extra handling in get_driver_name etc. And since the ctor is only
called for new slab objects(i.e allocate new page and call the ctor for
each object) it's safe to reset the context/engine prior to calling into
dma_fence_init, since we can be certain that no one is doing an RCU
lookup which might depend on peeking at the engine/context, like in
active_engine(), since the object can't yet be externally visible.

In the recycled case(which might also be externally visible) the request
refcount always transitions from 0->1 after we set the context/engine
etc, which should ensure it's valid to dereference the engine for
example, when doing an RCU list-walk, so long as we can also increment
the refcount first. If the refcount is already zero, then the request is
considered complete/released.  If it's non-zero, then the request might
be in the process of being re-allocated, or potentially still in flight,
however after successfully incrementing the refcount, it's possible to
carefully inspect the request state, to determine if the request is
still what we were looking for. Note that all externally visible
requests returned to the cache must have zero refcount.

An alternative fix then is to instead move the dma_fence_init out from
the request ctor. Originally this was how it was done, but it was moved
in:

commit 855e39e65cfc33a73724f1cc644ffc5754864a20
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Feb 3 09:41:48 2020 +0000

    drm/i915: Initialise basic fence before acquiring seqno

where it looks like intel_timeline_get_seqno() relied on some of the
rq->fence state, but that is no longer the case since:

commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Mar 23 16:49:50 2021 +0100

    drm/i915: Do not share hwsp across contexts any more, v8.

intel_timeline_get_seqno() could also be cleaned up slightly by dropping
the request argument.

Moving dma_fence_init back out of the ctor, should ensure we have enough
of the request initialised in case of trace_dma_fence_init.
Functionally this should be the same, and is effectively what we were
already open coding before, except now we also assign the fence->lock
and fence->ops, but since these are invariant for recycled
requests(which might be externally visible), and will therefore already
hold the same value, it shouldn't matter. We still leave the
spin_lock_init() in the ctor, since we can't re-init the rq->lock in
case it is already held.

Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Michael Mason <michael.w.mason@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_request.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ce446716d092..79da5eca60af 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg)
 	i915_sw_fence_init(&rq->submit, submit_notify);
 	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
 
-	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0);
-
 	rq->capture_list = NULL;
 
 	init_llist_head(&rq->execute_cb);
@@ -905,17 +903,12 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	rq->ring = ce->ring;
 	rq->execution_mask = ce->engine->mask;
 
-	kref_init(&rq->fence.refcount);
-	rq->fence.flags = 0;
-	rq->fence.error = 0;
-	INIT_LIST_HEAD(&rq->fence.cb_list);
-
 	ret = intel_timeline_get_seqno(tl, rq, &seqno);
 	if (ret)
 		goto err_free;
 
-	rq->fence.context = tl->fence_context;
-	rq->fence.seqno = seqno;
+	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
+		       tl->fence_context, seqno);
 
 	RCU_INIT_POINTER(rq->timeline, tl);
 	rq->hwsp_seqno = tl->hwsp_seqno;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/request: fix early tracepoints
  2021-09-03 11:24 ` [Intel-gfx] " Matthew Auld
  (?)
@ 2021-09-03 11:42 ` Patchwork
  -1 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2021-09-03 11:42 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/request: fix early tracepoints
URL   : https://patchwork.freedesktop.org/series/94317/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
c82cbcd35cf9 drm/i915/request: fix early tracepoints
-:40: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")'
#40: 
commit 855e39e65cfc33a73724f1cc644ffc5754864a20

-:49: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 12ca695d2c1e ("drm/i915: Do not share hwsp across contexts any more, v8.")'
#49: 
commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc

total: 2 errors, 0 warnings, 0 checks, 27 lines checked



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/request: fix early tracepoints
  2021-09-03 11:24 ` [Intel-gfx] " Matthew Auld
  (?)
  (?)
@ 2021-09-03 12:11 ` Patchwork
  -1 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2021-09-03 12:11 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 2776 bytes --]

== Series Details ==

Series: drm/i915/request: fix early tracepoints
URL   : https://patchwork.freedesktop.org/series/94317/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10549 -> Patchwork_20951
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/index.html

Known issues
------------

  Here are the changes found in Patchwork_20951 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@cs-gfx:
    - fi-rkl-guc:         NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/fi-rkl-guc/igt@amdgpu/amd_basic@cs-gfx.html

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
    - fi-snb-2600:        NOTRUN -> [SKIP][2] ([fdo#109271]) +17 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/fi-snb-2600/igt@amdgpu/amd_cs_nop@sync-fork-compute0.html

  
#### Possible fixes ####

  * igt@core_hotunplug@unbind-rebind:
    - fi-rkl-guc:         [DMESG-WARN][3] ([i915#3925]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/fi-rkl-guc/igt@core_hotunplug@unbind-rebind.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/fi-rkl-guc/igt@core_hotunplug@unbind-rebind.html

  * igt@i915_selftest@live@hangcheck:
    - fi-snb-2600:        [INCOMPLETE][5] ([i915#3921]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/fi-snb-2600/igt@i915_selftest@live@hangcheck.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/fi-snb-2600/igt@i915_selftest@live@hangcheck.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [i915#3921]: https://gitlab.freedesktop.org/drm/intel/issues/3921
  [i915#3925]: https://gitlab.freedesktop.org/drm/intel/issues/3925


Participating hosts (46 -> 38)
------------------------------

  Missing    (8): fi-kbl-soraka bat-adls-5 fi-hsw-4200u bat-dg1-5 fi-bsw-cyan bat-adlp-4 fi-bdw-samus bat-jsl-1 


Build changes
-------------

  * Linux: CI_DRM_10549 -> Patchwork_20951

  CI-20190529: 20190529
  CI_DRM_10549: cd8e346a0b56ce66f94a9908c7a068bbee8f1829 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6197: 40888f97a6ad219f4ed48a1830d0ef3c9617d006 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20951: c82cbcd35cf9beef272fa314c267c0ced70adbc5 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c82cbcd35cf9 drm/i915/request: fix early tracepoints

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/index.html

[-- Attachment #2: Type: text/html, Size: 3466 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/request: fix early tracepoints
  2021-09-03 11:24 ` [Intel-gfx] " Matthew Auld
                   ` (2 preceding siblings ...)
  (?)
@ 2021-09-03 13:27 ` Patchwork
  -1 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2021-09-03 13:27 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 30262 bytes --]

== Series Details ==

Series: drm/i915/request: fix early tracepoints
URL   : https://patchwork.freedesktop.org/series/94317/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10549_full -> Patchwork_20951_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_20951_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@feature_discovery@display-2x:
    - shard-iclb:         NOTRUN -> [SKIP][1] ([i915#1839])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@feature_discovery@display-2x.html

  * igt@feature_discovery@display-3x:
    - shard-tglb:         NOTRUN -> [SKIP][2] ([i915#1839])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@feature_discovery@display-3x.html

  * igt@feature_discovery@display-4x:
    - shard-apl:          NOTRUN -> [SKIP][3] ([fdo#109271]) +209 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl2/igt@feature_discovery@display-4x.html

  * igt@gem_ctx_isolation@preservation-s3@vcs1:
    - shard-kbl:          [PASS][4] -> [DMESG-WARN][5] ([i915#180])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-kbl7/igt@gem_ctx_isolation@preservation-s3@vcs1.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-kbl7/igt@gem_ctx_isolation@preservation-s3@vcs1.html

  * igt@gem_ctx_persistence@legacy-engines-mixed-process:
    - shard-snb:          NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#1099]) +5 similar issues
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-snb7/igt@gem_ctx_persistence@legacy-engines-mixed-process.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-kbl:          [PASS][7] -> [FAIL][8] ([i915#2846])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-kbl4/igt@gem_exec_fair@basic-deadline.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-kbl6/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-pace@rcs0:
    - shard-kbl:          [PASS][9] -> [FAIL][10] ([i915#2842]) +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-kbl7/igt@gem_exec_fair@basic-pace@rcs0.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-kbl3/igt@gem_exec_fair@basic-pace@rcs0.html
    - shard-glk:          [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-glk9/igt@gem_exec_fair@basic-pace@rcs0.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@gem_exec_fair@basic-pace@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-tglb:         [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-tglb3/igt@gem_exec_fair@basic-pace@vecs0.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_params@secure-non-master:
    - shard-iclb:         NOTRUN -> [SKIP][15] ([fdo#112283])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@gem_exec_params@secure-non-master.html

  * igt@gem_exec_params@secure-non-root:
    - shard-tglb:         NOTRUN -> [SKIP][16] ([fdo#112283])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@gem_exec_params@secure-non-root.html

  * igt@gem_exec_schedule@u-semaphore-user:
    - shard-snb:          NOTRUN -> [SKIP][17] ([fdo#109271]) +223 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-snb7/igt@gem_exec_schedule@u-semaphore-user.html

  * igt@gem_exec_whisper@basic-fds-forked:
    - shard-glk:          [PASS][18] -> [DMESG-WARN][19] ([i915#118] / [i915#95])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-glk8/igt@gem_exec_whisper@basic-fds-forked.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk3/igt@gem_exec_whisper@basic-fds-forked.html

  * igt@gem_huc_copy@huc-copy:
    - shard-apl:          NOTRUN -> [SKIP][20] ([fdo#109271] / [i915#2190])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl1/igt@gem_huc_copy@huc-copy.html

  * igt@gem_mmap_gtt@cpuset-big-copy-xy:
    - shard-iclb:         [PASS][21] -> [FAIL][22] ([i915#2428])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-iclb8/igt@gem_mmap_gtt@cpuset-big-copy-xy.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb8/igt@gem_mmap_gtt@cpuset-big-copy-xy.html
    - shard-glk:          [PASS][23] -> [FAIL][24] ([i915#1888] / [i915#307])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-glk4/igt@gem_mmap_gtt@cpuset-big-copy-xy.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk6/igt@gem_mmap_gtt@cpuset-big-copy-xy.html

  * igt@gem_render_copy@yf-tiled-to-vebox-x-tiled:
    - shard-iclb:         NOTRUN -> [SKIP][25] ([i915#768])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@gem_render_copy@yf-tiled-to-vebox-x-tiled.html

  * igt@gem_softpin@evict-snoop:
    - shard-iclb:         NOTRUN -> [SKIP][26] ([fdo#109312])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@gem_softpin@evict-snoop.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-glk:          NOTRUN -> [SKIP][27] ([fdo#109271] / [i915#3323])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-snb:          NOTRUN -> [FAIL][28] ([i915#2724])
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-snb7/igt@gem_userptr_blits@vma-merge.html
    - shard-glk:          NOTRUN -> [FAIL][29] ([i915#3318])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@gem_userptr_blits@vma-merge.html

  * igt@gen7_exec_parse@basic-allowed:
    - shard-iclb:         NOTRUN -> [SKIP][30] ([fdo#109289])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@gen7_exec_parse@basic-allowed.html

  * igt@i915_pm_dc@dc3co-vpb-simulation:
    - shard-tglb:         NOTRUN -> [SKIP][31] ([i915#1904])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@i915_pm_dc@dc3co-vpb-simulation.html

  * igt@i915_pm_rc6_residency@rc6-fence:
    - shard-iclb:         NOTRUN -> [WARN][32] ([i915#1804] / [i915#2684])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@i915_pm_rc6_residency@rc6-fence.html

  * igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait:
    - shard-tglb:         NOTRUN -> [SKIP][33] ([fdo#111644] / [i915#1397] / [i915#2411])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait.html

  * igt@i915_selftest@live@hangcheck:
    - shard-snb:          [PASS][34] -> [INCOMPLETE][35] ([i915#3921])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-snb6/igt@i915_selftest@live@hangcheck.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-snb5/igt@i915_selftest@live@hangcheck.html

  * igt@kms_big_fb@linear-16bpp-rotate-270:
    - shard-tglb:         NOTRUN -> [SKIP][36] ([fdo#111614])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_big_fb@linear-16bpp-rotate-270.html

  * igt@kms_big_fb@x-tiled-64bpp-rotate-270:
    - shard-iclb:         NOTRUN -> [SKIP][37] ([fdo#110725] / [fdo#111614])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_big_fb@x-tiled-64bpp-rotate-270.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-0-hflip:
    - shard-apl:          NOTRUN -> [SKIP][38] ([fdo#109271] / [i915#3777]) +2 similar issues
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl2/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-0-hflip.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-0:
    - shard-tglb:         NOTRUN -> [SKIP][39] ([fdo#111615])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-0.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-hflip:
    - shard-iclb:         NOTRUN -> [SKIP][40] ([fdo#110723])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-hflip.html

  * igt@kms_big_joiner@invalid-modeset:
    - shard-iclb:         NOTRUN -> [SKIP][41] ([i915#2705])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb5/igt@kms_big_joiner@invalid-modeset.html

  * igt@kms_ccs@pipe-a-bad-aux-stride-y_tiled_gen12_rc_ccs_cc:
    - shard-iclb:         NOTRUN -> [SKIP][42] ([fdo#109278] / [i915#3886]) +5 similar issues
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_ccs@pipe-a-bad-aux-stride-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][43] ([i915#3689] / [i915#3886]) +1 similar issue
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-ccs-on-another-bo-y_tiled_gen12_rc_ccs_cc:
    - shard-apl:          NOTRUN -> [SKIP][44] ([fdo#109271] / [i915#3886]) +11 similar issues
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl6/igt@kms_ccs@pipe-c-ccs-on-another-bo-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-c-crc-primary-rotation-180-y_tiled_gen12_mc_ccs:
    - shard-glk:          NOTRUN -> [SKIP][45] ([fdo#109271] / [i915#3886]) +1 similar issue
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@kms_ccs@pipe-c-crc-primary-rotation-180-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-d-crc-sprite-planes-basic-y_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][46] ([i915#3689]) +3 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_ccs@pipe-d-crc-sprite-planes-basic-y_tiled_ccs.html

  * igt@kms_chamelium@dp-hpd-enable-disable-mode:
    - shard-tglb:         NOTRUN -> [SKIP][47] ([fdo#109284] / [fdo#111827]) +2 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_chamelium@dp-hpd-enable-disable-mode.html

  * igt@kms_chamelium@vga-hpd-without-ddc:
    - shard-snb:          NOTRUN -> [SKIP][48] ([fdo#109271] / [fdo#111827]) +11 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-snb2/igt@kms_chamelium@vga-hpd-without-ddc.html

  * igt@kms_color_chamelium@pipe-a-ctm-limited-range:
    - shard-apl:          NOTRUN -> [SKIP][49] ([fdo#109271] / [fdo#111827]) +18 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl2/igt@kms_color_chamelium@pipe-a-ctm-limited-range.html

  * igt@kms_color_chamelium@pipe-b-ctm-limited-range:
    - shard-glk:          NOTRUN -> [SKIP][50] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@kms_color_chamelium@pipe-b-ctm-limited-range.html

  * igt@kms_color_chamelium@pipe-invalid-gamma-lut-sizes:
    - shard-iclb:         NOTRUN -> [SKIP][51] ([fdo#109284] / [fdo#111827]) +3 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_color_chamelium@pipe-invalid-gamma-lut-sizes.html
    - shard-skl:          NOTRUN -> [SKIP][52] ([fdo#109271] / [fdo#111827])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-skl9/igt@kms_color_chamelium@pipe-invalid-gamma-lut-sizes.html

  * igt@kms_content_protection@dp-mst-type-0:
    - shard-iclb:         NOTRUN -> [SKIP][53] ([i915#3116])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_content_protection@dp-mst-type-0.html

  * igt@kms_content_protection@legacy:
    - shard-iclb:         NOTRUN -> [SKIP][54] ([fdo#109300] / [fdo#111066])
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb5/igt@kms_content_protection@legacy.html

  * igt@kms_content_protection@srm:
    - shard-apl:          NOTRUN -> [TIMEOUT][55] ([i915#1319])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl1/igt@kms_content_protection@srm.html

  * igt@kms_content_protection@uevent:
    - shard-apl:          NOTRUN -> [FAIL][56] ([i915#2105])
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl6/igt@kms_content_protection@uevent.html

  * igt@kms_cursor_crc@pipe-a-cursor-512x170-offscreen:
    - shard-iclb:         NOTRUN -> [SKIP][57] ([fdo#109278] / [fdo#109279])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_cursor_crc@pipe-a-cursor-512x170-offscreen.html

  * igt@kms_cursor_crc@pipe-a-cursor-alpha-opaque:
    - shard-apl:          NOTRUN -> [FAIL][58] ([i915#3444])
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl2/igt@kms_cursor_crc@pipe-a-cursor-alpha-opaque.html

  * igt@kms_cursor_crc@pipe-b-cursor-512x512-offscreen:
    - shard-tglb:         NOTRUN -> [SKIP][59] ([fdo#109279] / [i915#3359])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_cursor_crc@pipe-b-cursor-512x512-offscreen.html

  * igt@kms_cursor_crc@pipe-b-cursor-size-change:
    - shard-snb:          [PASS][60] -> [FAIL][61] ([i915#4024])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-snb2/igt@kms_cursor_crc@pipe-b-cursor-size-change.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-snb6/igt@kms_cursor_crc@pipe-b-cursor-size-change.html

  * igt@kms_cursor_crc@pipe-d-cursor-512x170-rapid-movement:
    - shard-tglb:         NOTRUN -> [SKIP][62] ([i915#3359]) +2 similar issues
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_cursor_crc@pipe-d-cursor-512x170-rapid-movement.html

  * igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions-varying-size:
    - shard-iclb:         NOTRUN -> [SKIP][63] ([fdo#109274] / [fdo#109278])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions-varying-size.html

  * igt@kms_cursor_legacy@pipe-d-torture-bo:
    - shard-glk:          NOTRUN -> [SKIP][64] ([fdo#109271] / [i915#533])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@kms_cursor_legacy@pipe-d-torture-bo.html

  * igt@kms_flip@2x-flip-vs-fences:
    - shard-iclb:         NOTRUN -> [SKIP][65] ([fdo#109274])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_flip@2x-flip-vs-fences.html

  * igt@kms_flip@2x-plain-flip-fb-recreate:
    - shard-tglb:         NOTRUN -> [SKIP][66] ([fdo#111825]) +10 similar issues
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_flip@2x-plain-flip-fb-recreate.html

  * igt@kms_flip@flip-vs-suspend-interruptible@c-dp1:
    - shard-apl:          [PASS][67] -> [DMESG-WARN][68] ([i915#180]) +1 similar issue
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-apl6/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl2/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html

  * igt@kms_flip@plain-flip-ts-check@b-edp1:
    - shard-skl:          [PASS][69] -> [FAIL][70] ([i915#2122])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-skl10/igt@kms_flip@plain-flip-ts-check@b-edp1.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-skl7/igt@kms_flip@plain-flip-ts-check@b-edp1.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs:
    - shard-apl:          NOTRUN -> [SKIP][71] ([fdo#109271] / [i915#2672])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl1/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-gtt:
    - shard-skl:          NOTRUN -> [SKIP][72] ([fdo#109271])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-skl9/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-gtt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-indfb-pgflip-blt:
    - shard-glk:          NOTRUN -> [SKIP][73] ([fdo#109271]) +36 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-indfb-pgflip-blt.html

  * igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-mmap-wc:
    - shard-iclb:         NOTRUN -> [SKIP][74] ([fdo#109280]) +10 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-mmap-wc.html

  * igt@kms_hdr@bpc-switch:
    - shard-skl:          [PASS][75] -> [FAIL][76] ([i915#1188])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-skl1/igt@kms_hdr@bpc-switch.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-skl2/igt@kms_hdr@bpc-switch.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-d-frame-sequence:
    - shard-iclb:         NOTRUN -> [SKIP][77] ([fdo#109278]) +8 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_pipe_crc_basic@read-crc-pipe-d-frame-sequence.html

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-max:
    - shard-glk:          NOTRUN -> [FAIL][78] ([fdo#108145] / [i915#265])
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-max.html

  * igt@kms_plane_alpha_blend@pipe-b-alpha-opaque-fb:
    - shard-apl:          NOTRUN -> [FAIL][79] ([fdo#108145] / [i915#265]) +1 similar issue
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl6/igt@kms_plane_alpha_blend@pipe-b-alpha-opaque-fb.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [PASS][80] -> [FAIL][81] ([fdo#108145] / [i915#265])
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-skl9/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-skl3/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_plane_lowres@pipe-a-tiling-none:
    - shard-tglb:         NOTRUN -> [SKIP][82] ([i915#3536]) +1 similar issue
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_plane_lowres@pipe-a-tiling-none.html

  * igt@kms_prime@basic-crc@first-to-second:
    - shard-iclb:         NOTRUN -> [SKIP][83] ([i915#1836])
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_prime@basic-crc@first-to-second.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4:
    - shard-iclb:         NOTRUN -> [SKIP][84] ([i915#658])
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-5:
    - shard-apl:          NOTRUN -> [SKIP][85] ([fdo#109271] / [i915#658]) +3 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl1/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-5.html

  * igt@kms_psr2_su@page_flip:
    - shard-glk:          NOTRUN -> [SKIP][86] ([fdo#109271] / [i915#658]) +1 similar issue
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@kms_psr2_su@page_flip.html

  * igt@kms_psr@psr2_sprite_mmap_gtt:
    - shard-iclb:         NOTRUN -> [SKIP][87] ([fdo#109441])
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_psr@psr2_sprite_mmap_gtt.html

  * igt@kms_psr@psr2_suspend:
    - shard-iclb:         [PASS][88] -> [SKIP][89] ([fdo#109441]) +1 similar issue
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-iclb2/igt@kms_psr@psr2_suspend.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb6/igt@kms_psr@psr2_suspend.html

  * igt@kms_vblank@pipe-d-wait-idle:
    - shard-apl:          NOTRUN -> [SKIP][90] ([fdo#109271] / [i915#533])
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl6/igt@kms_vblank@pipe-d-wait-idle.html

  * igt@kms_vrr@flip-dpms:
    - shard-iclb:         NOTRUN -> [SKIP][91] ([fdo#109502])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@kms_vrr@flip-dpms.html

  * igt@kms_writeback@writeback-invalid-parameters:
    - shard-apl:          NOTRUN -> [SKIP][92] ([fdo#109271] / [i915#2437])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl2/igt@kms_writeback@writeback-invalid-parameters.html

  * igt@kms_writeback@writeback-pixel-formats:
    - shard-tglb:         NOTRUN -> [SKIP][93] ([i915#2437])
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@kms_writeback@writeback-pixel-formats.html

  * igt@nouveau_crc@pipe-c-source-rg:
    - shard-iclb:         NOTRUN -> [SKIP][94] ([i915#2530])
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@nouveau_crc@pipe-c-source-rg.html

  * igt@perf@polling-small-buf:
    - shard-skl:          [PASS][95] -> [FAIL][96] ([i915#1722])
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-skl10/igt@perf@polling-small-buf.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-skl7/igt@perf@polling-small-buf.html

  * igt@prime_nv_api@i915_self_import_to_different_fd:
    - shard-tglb:         NOTRUN -> [SKIP][97] ([fdo#109291]) +1 similar issue
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb1/igt@prime_nv_api@i915_self_import_to_different_fd.html

  * igt@prime_udl:
    - shard-iclb:         NOTRUN -> [SKIP][98] ([fdo#109291])
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@prime_udl.html

  * igt@sysfs_clients@recycle-many:
    - shard-glk:          NOTRUN -> [SKIP][99] ([fdo#109271] / [i915#2994]) +1 similar issue
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@sysfs_clients@recycle-many.html

  * igt@sysfs_clients@sema-50:
    - shard-apl:          NOTRUN -> [SKIP][100] ([fdo#109271] / [i915#2994]) +4 similar issues
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl1/igt@sysfs_clients@sema-50.html

  * igt@sysfs_clients@split-10:
    - shard-iclb:         NOTRUN -> [SKIP][101] ([i915#2994])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb7/igt@sysfs_clients@split-10.html

  
#### Possible fixes ####

  * igt@fbdev@unaligned-write:
    - {shard-rkl}:        [SKIP][102] ([i915#2582]) -> [PASS][103]
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-1/igt@fbdev@unaligned-write.html
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@fbdev@unaligned-write.html

  * igt@feature_discovery@psr2:
    - {shard-rkl}:        [SKIP][104] ([i915#658]) -> [PASS][105]
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-1/igt@feature_discovery@psr2.html
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@feature_discovery@psr2.html

  * igt@gem_ctx_isolation@preservation-s3@vecs0:
    - shard-kbl:          [DMESG-WARN][106] ([i915#180]) -> [PASS][107]
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-kbl7/igt@gem_ctx_isolation@preservation-s3@vecs0.html
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-kbl7/igt@gem_ctx_isolation@preservation-s3@vecs0.html

  * igt@gem_ctx_persistence@legacy-engines-hang@render:
    - {shard-rkl}:        [FAIL][108] ([i915#2410]) -> [PASS][109]
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-5/igt@gem_ctx_persistence@legacy-engines-hang@render.html
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-2/igt@gem_ctx_persistence@legacy-engines-hang@render.html

  * igt@gem_eio@hibernate:
    - {shard-rkl}:        [FAIL][110] -> [PASS][111]
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-5/igt@gem_eio@hibernate.html
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@gem_eio@hibernate.html

  * igt@gem_eio@unwedge-stress:
    - shard-tglb:         [TIMEOUT][112] ([i915#2369] / [i915#3063] / [i915#3648]) -> [PASS][113]
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-tglb6/igt@gem_eio@unwedge-stress.html
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb7/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-tglb:         [FAIL][114] ([i915#2842]) -> [PASS][115]
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-tglb6/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-tglb3/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_mmap_gtt@cpuset-basic-small-copy-xy:
    - {shard-rkl}:        [FAIL][116] ([i915#307]) -> [PASS][117]
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-5/igt@gem_mmap_gtt@cpuset-basic-small-copy-xy.html
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-2/igt@gem_mmap_gtt@cpuset-basic-small-copy-xy.html

  * igt@gem_mmap_gtt@cpuset-big-copy-odd:
    - shard-iclb:         [FAIL][118] ([i915#307]) -> [PASS][119]
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-iclb3/igt@gem_mmap_gtt@cpuset-big-copy-odd.html
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-iclb1/igt@gem_mmap_gtt@cpuset-big-copy-odd.html

  * igt@gem_softpin@noreloc-s3:
    - shard-apl:          [DMESG-WARN][120] ([i915#180]) -> [PASS][121]
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-apl3/igt@gem_softpin@noreloc-s3.html
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-apl6/igt@gem_softpin@noreloc-s3.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-glk:          [DMESG-WARN][122] ([i915#1436] / [i915#716]) -> [PASS][123]
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-glk2/igt@gen9_exec_parse@allowed-all.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-glk5/igt@gen9_exec_parse@allowed-all.html

  * igt@i915_pm_backlight@basic-brightness:
    - {shard-rkl}:        [SKIP][124] ([i915#3012]) -> [PASS][125]
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-1/igt@i915_pm_backlight@basic-brightness.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@i915_pm_backlight@basic-brightness.html

  * igt@kms_big_fb@x-tiled-64bpp-rotate-180:
    - {shard-rkl}:        [SKIP][126] ([i915#3638]) -> [PASS][127] +5 similar issues
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-5/igt@kms_big_fb@x-tiled-64bpp-rotate-180.html
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@kms_big_fb@x-tiled-64bpp-rotate-180.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-async-flip:
    - {shard-rkl}:        [SKIP][128] ([i915#3721]) -> [PASS][129] +4 similar issues
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-5/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html

  * igt@kms_color@pipe-a-ctm-0-75:
    - {shard-rkl}:        [SKIP][130] ([i915#1149] / [i915#1849] / [i915#4070]) -> [PASS][131] +3 similar issues
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-5/igt@kms_color@pipe-a-ctm-0-75.html
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@kms_color@pipe-a-ctm-0-75.html

  * igt@kms_color@pipe-c-ctm-0-5:
    - shard-skl:          [DMESG-WARN][132] ([i915#1982]) -> [PASS][133]
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-skl8/igt@kms_color@pipe-c-ctm-0-5.html
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-skl9/igt@kms_color@pipe-c-ctm-0-5.html

  * igt@kms_concurrent@pipe-a:
    - {shard-rkl}:        [SKIP][134] ([i915#1845] / [i915#4070]) -> [PASS][135] +1 similar issue
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-5/igt@kms_concurrent@pipe-a.html
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@kms_concurrent@pipe-a.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x128-sliding:
    - {shard-rkl}:        [SKIP][136] ([fdo#112022] / [i915#4070]) -> [PASS][137] +8 similar issues
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-rkl-1/igt@kms_cursor_crc@pipe-b-cursor-128x128-sliding.html
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-rkl-6/igt@kms_cursor_crc@pipe-b-cursor-128x128-sliding.html

  * igt@kms_cursor_edge_walk@pipe-b-256x256-top-edge:
    - shard-snb:          [SKIP][138] ([fdo#109271]) -> [PASS][139]
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10549/shard-snb2/igt@kms_cursor_edge_walk@pipe-b-256x256-top-edge.html
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/shard-snb6/igt@kms_cursor_edge_walk@pipe-b-256x256-top-edge.html

  * igt@kms_cursor_legacy@cursora-vs-flipa-atomic:
    - {shard-rkl}:        [SKIP][140] ([fdo#111825] / [i915#4070]) -> [PASS][141] +4 similar issues
   [140]

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20951/index.html

[-- Attachment #2: Type: text/html, Size: 33740 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915/request: fix early tracepoints
  2021-09-03 11:24 ` [Intel-gfx] " Matthew Auld
@ 2021-09-08 17:38   ` Daniel Vetter
  -1 siblings, 0 replies; 9+ messages in thread
From: Daniel Vetter @ 2021-09-08 17:38 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, dri-devel, Michael Mason, Daniel Vetter

On Fri, Sep 03, 2021 at 12:24:05PM +0100, Matthew Auld wrote:
> Currently we blow up in trace_dma_fence_init, when calling into
> get_driver_name or get_timeline_name, since both the engine and context
> might be NULL(or contain some garbage address) in the case of newly
> allocated slab objects via the request ctor. Note that we also use
> SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
> freed, but delay freeing the underlying page by an RCU grace period.
> With this scheme requests can be re-allocated, at the same time as they
> are also being read by some lockless RCU lookup mechanism.
> 
> One possible fix, since we don't yet have a fully initialised request
> when in the ctor, is just setting the context/engine as NULL and adding
> some extra handling in get_driver_name etc. And since the ctor is only
> called for new slab objects(i.e allocate new page and call the ctor for
> each object) it's safe to reset the context/engine prior to calling into
> dma_fence_init, since we can be certain that no one is doing an RCU
> lookup which might depend on peeking at the engine/context, like in
> active_engine(), since the object can't yet be externally visible.
> 
> In the recycled case(which might also be externally visible) the request
> refcount always transitions from 0->1 after we set the context/engine
> etc, which should ensure it's valid to dereference the engine for
> example, when doing an RCU list-walk, so long as we can also increment
> the refcount first. If the refcount is already zero, then the request is
> considered complete/released.  If it's non-zero, then the request might
> be in the process of being re-allocated, or potentially still in flight,
> however after successfully incrementing the refcount, it's possible to
> carefully inspect the request state, to determine if the request is
> still what we were looking for. Note that all externally visible
> requests returned to the cache must have zero refcount.

The commit message here is a bit confusing, since you start out with
describing a solution that you're not actually implementing it. I usually
do this by putting alternate solutions at the bottom, starting with "An
alternate solution would be ..." or so.

And then closing with why we don't do that, here it would be that we do
no longer have a need for these partially set up i915_requests, and
therefore just reverting that complication is the simplest solution.

> An alternative fix then is to instead move the dma_fence_init out from
> the request ctor. Originally this was how it was done, but it was moved
> in:
> 
> commit 855e39e65cfc33a73724f1cc644ffc5754864a20
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Mon Feb 3 09:41:48 2020 +0000
> 
>     drm/i915: Initialise basic fence before acquiring seqno
> 
> where it looks like intel_timeline_get_seqno() relied on some of the
> rq->fence state, but that is no longer the case since:
> 
> commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc
> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Date:   Tue Mar 23 16:49:50 2021 +0100
> 
>     drm/i915: Do not share hwsp across contexts any more, v8.
> 
> intel_timeline_get_seqno() could also be cleaned up slightly by dropping
> the request argument.
> 
> Moving dma_fence_init back out of the ctor, should ensure we have enough
> of the request initialised in case of trace_dma_fence_init.
> Functionally this should be the same, and is effectively what we were
> already open coding before, except now we also assign the fence->lock
> and fence->ops, but since these are invariant for recycled
> requests(which might be externally visible), and will therefore already
> hold the same value, it shouldn't matter. We still leave the
> spin_lock_init() in the ctor, since we can't re-init the rq->lock in
> case it is already held.

Holding rq->lock without having a full reference to it sounds like really
bad taste. I think it would be good to have a (kerneldoc) comment next to
i915_request.lock about this, with a FIXME. But separate patch.

> Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Michael Mason <michael.w.mason@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>

With the commit message restructured a bit, and assuming this one actually
works:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

But I'm really not confident :-(
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_request.c | 11 ++---------
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index ce446716d092..79da5eca60af 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg)
>  	i915_sw_fence_init(&rq->submit, submit_notify);
>  	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
>  
> -	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0);
> -
>  	rq->capture_list = NULL;
>  
>  	init_llist_head(&rq->execute_cb);
> @@ -905,17 +903,12 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  	rq->ring = ce->ring;
>  	rq->execution_mask = ce->engine->mask;
>  
> -	kref_init(&rq->fence.refcount);
> -	rq->fence.flags = 0;
> -	rq->fence.error = 0;
> -	INIT_LIST_HEAD(&rq->fence.cb_list);
> -
>  	ret = intel_timeline_get_seqno(tl, rq, &seqno);
>  	if (ret)
>  		goto err_free;
>  
> -	rq->fence.context = tl->fence_context;
> -	rq->fence.seqno = seqno;
> +	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
> +		       tl->fence_context, seqno);
>  
>  	RCU_INIT_POINTER(rq->timeline, tl);
>  	rq->hwsp_seqno = tl->hwsp_seqno;
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915/request: fix early tracepoints
@ 2021-09-08 17:38   ` Daniel Vetter
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Vetter @ 2021-09-08 17:38 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx, dri-devel, Michael Mason, Daniel Vetter

On Fri, Sep 03, 2021 at 12:24:05PM +0100, Matthew Auld wrote:
> Currently we blow up in trace_dma_fence_init, when calling into
> get_driver_name or get_timeline_name, since both the engine and context
> might be NULL(or contain some garbage address) in the case of newly
> allocated slab objects via the request ctor. Note that we also use
> SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
> freed, but delay freeing the underlying page by an RCU grace period.
> With this scheme requests can be re-allocated, at the same time as they
> are also being read by some lockless RCU lookup mechanism.
> 
> One possible fix, since we don't yet have a fully initialised request
> when in the ctor, is just setting the context/engine as NULL and adding
> some extra handling in get_driver_name etc. And since the ctor is only
> called for new slab objects(i.e allocate new page and call the ctor for
> each object) it's safe to reset the context/engine prior to calling into
> dma_fence_init, since we can be certain that no one is doing an RCU
> lookup which might depend on peeking at the engine/context, like in
> active_engine(), since the object can't yet be externally visible.
> 
> In the recycled case(which might also be externally visible) the request
> refcount always transitions from 0->1 after we set the context/engine
> etc, which should ensure it's valid to dereference the engine for
> example, when doing an RCU list-walk, so long as we can also increment
> the refcount first. If the refcount is already zero, then the request is
> considered complete/released.  If it's non-zero, then the request might
> be in the process of being re-allocated, or potentially still in flight,
> however after successfully incrementing the refcount, it's possible to
> carefully inspect the request state, to determine if the request is
> still what we were looking for. Note that all externally visible
> requests returned to the cache must have zero refcount.

The commit message here is a bit confusing, since you start out with
describing a solution that you're not actually implementing it. I usually
do this by putting alternate solutions at the bottom, starting with "An
alternate solution would be ..." or so.

And then closing with why we don't do that, here it would be that we do
no longer have a need for these partially set up i915_requests, and
therefore just reverting that complication is the simplest solution.

> An alternative fix then is to instead move the dma_fence_init out from
> the request ctor. Originally this was how it was done, but it was moved
> in:
> 
> commit 855e39e65cfc33a73724f1cc644ffc5754864a20
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Mon Feb 3 09:41:48 2020 +0000
> 
>     drm/i915: Initialise basic fence before acquiring seqno
> 
> where it looks like intel_timeline_get_seqno() relied on some of the
> rq->fence state, but that is no longer the case since:
> 
> commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc
> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Date:   Tue Mar 23 16:49:50 2021 +0100
> 
>     drm/i915: Do not share hwsp across contexts any more, v8.
> 
> intel_timeline_get_seqno() could also be cleaned up slightly by dropping
> the request argument.
> 
> Moving dma_fence_init back out of the ctor, should ensure we have enough
> of the request initialised in case of trace_dma_fence_init.
> Functionally this should be the same, and is effectively what we were
> already open coding before, except now we also assign the fence->lock
> and fence->ops, but since these are invariant for recycled
> requests(which might be externally visible), and will therefore already
> hold the same value, it shouldn't matter. We still leave the
> spin_lock_init() in the ctor, since we can't re-init the rq->lock in
> case it is already held.

Holding rq->lock without having a full reference to it sounds like really
bad taste. I think it would be good to have a (kerneldoc) comment next to
i915_request.lock about this, with a FIXME. But separate patch.

> Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Michael Mason <michael.w.mason@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>

With the commit message restructured a bit, and assuming this one actually
works:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

But I'm really not confident :-(
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_request.c | 11 ++---------
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index ce446716d092..79da5eca60af 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg)
>  	i915_sw_fence_init(&rq->submit, submit_notify);
>  	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
>  
> -	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0);
> -
>  	rq->capture_list = NULL;
>  
>  	init_llist_head(&rq->execute_cb);
> @@ -905,17 +903,12 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  	rq->ring = ce->ring;
>  	rq->execution_mask = ce->engine->mask;
>  
> -	kref_init(&rq->fence.refcount);
> -	rq->fence.flags = 0;
> -	rq->fence.error = 0;
> -	INIT_LIST_HEAD(&rq->fence.cb_list);
> -
>  	ret = intel_timeline_get_seqno(tl, rq, &seqno);
>  	if (ret)
>  		goto err_free;
>  
> -	rq->fence.context = tl->fence_context;
> -	rq->fence.seqno = seqno;
> +	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
> +		       tl->fence_context, seqno);
>  
>  	RCU_INIT_POINTER(rq->timeline, tl);
>  	rq->hwsp_seqno = tl->hwsp_seqno;
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915/request: fix early tracepoints
  2021-09-08 17:38   ` [Intel-gfx] " Daniel Vetter
@ 2021-09-09 16:16     ` Matthew Auld
  -1 siblings, 0 replies; 9+ messages in thread
From: Matthew Auld @ 2021-09-09 16:16 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, dri-devel, Michael Mason

On 08/09/2021 18:38, Daniel Vetter wrote:
> On Fri, Sep 03, 2021 at 12:24:05PM +0100, Matthew Auld wrote:
>> Currently we blow up in trace_dma_fence_init, when calling into
>> get_driver_name or get_timeline_name, since both the engine and context
>> might be NULL(or contain some garbage address) in the case of newly
>> allocated slab objects via the request ctor. Note that we also use
>> SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
>> freed, but delay freeing the underlying page by an RCU grace period.
>> With this scheme requests can be re-allocated, at the same time as they
>> are also being read by some lockless RCU lookup mechanism.
>>
>> One possible fix, since we don't yet have a fully initialised request
>> when in the ctor, is just setting the context/engine as NULL and adding
>> some extra handling in get_driver_name etc. And since the ctor is only
>> called for new slab objects(i.e allocate new page and call the ctor for
>> each object) it's safe to reset the context/engine prior to calling into
>> dma_fence_init, since we can be certain that no one is doing an RCU
>> lookup which might depend on peeking at the engine/context, like in
>> active_engine(), since the object can't yet be externally visible.
>>
>> In the recycled case(which might also be externally visible) the request
>> refcount always transitions from 0->1 after we set the context/engine
>> etc, which should ensure it's valid to dereference the engine for
>> example, when doing an RCU list-walk, so long as we can also increment
>> the refcount first. If the refcount is already zero, then the request is
>> considered complete/released.  If it's non-zero, then the request might
>> be in the process of being re-allocated, or potentially still in flight,
>> however after successfully incrementing the refcount, it's possible to
>> carefully inspect the request state, to determine if the request is
>> still what we were looking for. Note that all externally visible
>> requests returned to the cache must have zero refcount.
> 
> The commit message here is a bit confusing, since you start out with
> describing a solution that you're not actually implementing it. I usually
> do this by putting alternate solutions at the bottom, starting with "An
> alternate solution would be ..." or so.
> 
> And then closing with why we don't do that, here it would be that we do
> no longer have a need for these partially set up i915_requests, and
> therefore just reverting that complication is the simplest solution.
> 
>> An alternative fix then is to instead move the dma_fence_init out from
>> the request ctor. Originally this was how it was done, but it was moved
>> in:
>>
>> commit 855e39e65cfc33a73724f1cc644ffc5754864a20
>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>> Date:   Mon Feb 3 09:41:48 2020 +0000
>>
>>      drm/i915: Initialise basic fence before acquiring seqno
>>
>> where it looks like intel_timeline_get_seqno() relied on some of the
>> rq->fence state, but that is no longer the case since:
>>
>> commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc
>> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Date:   Tue Mar 23 16:49:50 2021 +0100
>>
>>      drm/i915: Do not share hwsp across contexts any more, v8.
>>
>> intel_timeline_get_seqno() could also be cleaned up slightly by dropping
>> the request argument.
>>
>> Moving dma_fence_init back out of the ctor, should ensure we have enough
>> of the request initialised in case of trace_dma_fence_init.
>> Functionally this should be the same, and is effectively what we were
>> already open coding before, except now we also assign the fence->lock
>> and fence->ops, but since these are invariant for recycled
>> requests(which might be externally visible), and will therefore already
>> hold the same value, it shouldn't matter. We still leave the
>> spin_lock_init() in the ctor, since we can't re-init the rq->lock in
>> case it is already held.
> 
> Holding rq->lock without having a full reference to it sounds like really
> bad taste. I think it would be good to have a (kerneldoc) comment next to
> i915_request.lock about this, with a FIXME. But separate patch.

Sorry, I guess I just meant that we can't blindly move the lock_init() 
i.e if for some unknown reason we moved it out from the ctor then it 
needs to go before the ref transitions from 0->1. Touching rq->lock 
should require the full ref. I'll reword.

> 
>> Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Michael Mason <michael.w.mason@intel.com>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
> 
> With the commit message restructured a bit, and assuming this one actually
> works:
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> But I'm really not confident :-(
> -Daniel
> 
>> ---
>>   drivers/gpu/drm/i915/i915_request.c | 11 ++---------
>>   1 file changed, 2 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>> index ce446716d092..79da5eca60af 100644
>> --- a/drivers/gpu/drm/i915/i915_request.c
>> +++ b/drivers/gpu/drm/i915/i915_request.c
>> @@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg)
>>   	i915_sw_fence_init(&rq->submit, submit_notify);
>>   	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
>>   
>> -	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0);
>> -
>>   	rq->capture_list = NULL;
>>   
>>   	init_llist_head(&rq->execute_cb);
>> @@ -905,17 +903,12 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>>   	rq->ring = ce->ring;
>>   	rq->execution_mask = ce->engine->mask;
>>   
>> -	kref_init(&rq->fence.refcount);
>> -	rq->fence.flags = 0;
>> -	rq->fence.error = 0;
>> -	INIT_LIST_HEAD(&rq->fence.cb_list);
>> -
>>   	ret = intel_timeline_get_seqno(tl, rq, &seqno);
>>   	if (ret)
>>   		goto err_free;
>>   
>> -	rq->fence.context = tl->fence_context;
>> -	rq->fence.seqno = seqno;
>> +	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
>> +		       tl->fence_context, seqno);
>>   
>>   	RCU_INIT_POINTER(rq->timeline, tl);
>>   	rq->hwsp_seqno = tl->hwsp_seqno;
>> -- 
>> 2.26.3
>>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915/request: fix early tracepoints
@ 2021-09-09 16:16     ` Matthew Auld
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Auld @ 2021-09-09 16:16 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, dri-devel, Michael Mason

On 08/09/2021 18:38, Daniel Vetter wrote:
> On Fri, Sep 03, 2021 at 12:24:05PM +0100, Matthew Auld wrote:
>> Currently we blow up in trace_dma_fence_init, when calling into
>> get_driver_name or get_timeline_name, since both the engine and context
>> might be NULL(or contain some garbage address) in the case of newly
>> allocated slab objects via the request ctor. Note that we also use
>> SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
>> freed, but delay freeing the underlying page by an RCU grace period.
>> With this scheme requests can be re-allocated, at the same time as they
>> are also being read by some lockless RCU lookup mechanism.
>>
>> One possible fix, since we don't yet have a fully initialised request
>> when in the ctor, is just setting the context/engine as NULL and adding
>> some extra handling in get_driver_name etc. And since the ctor is only
>> called for new slab objects(i.e allocate new page and call the ctor for
>> each object) it's safe to reset the context/engine prior to calling into
>> dma_fence_init, since we can be certain that no one is doing an RCU
>> lookup which might depend on peeking at the engine/context, like in
>> active_engine(), since the object can't yet be externally visible.
>>
>> In the recycled case(which might also be externally visible) the request
>> refcount always transitions from 0->1 after we set the context/engine
>> etc, which should ensure it's valid to dereference the engine for
>> example, when doing an RCU list-walk, so long as we can also increment
>> the refcount first. If the refcount is already zero, then the request is
>> considered complete/released.  If it's non-zero, then the request might
>> be in the process of being re-allocated, or potentially still in flight,
>> however after successfully incrementing the refcount, it's possible to
>> carefully inspect the request state, to determine if the request is
>> still what we were looking for. Note that all externally visible
>> requests returned to the cache must have zero refcount.
> 
> The commit message here is a bit confusing, since you start out with
> describing a solution that you're not actually implementing it. I usually
> do this by putting alternate solutions at the bottom, starting with "An
> alternate solution would be ..." or so.
> 
> And then closing with why we don't do that, here it would be that we do
> no longer have a need for these partially set up i915_requests, and
> therefore just reverting that complication is the simplest solution.
> 
>> An alternative fix then is to instead move the dma_fence_init out from
>> the request ctor. Originally this was how it was done, but it was moved
>> in:
>>
>> commit 855e39e65cfc33a73724f1cc644ffc5754864a20
>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>> Date:   Mon Feb 3 09:41:48 2020 +0000
>>
>>      drm/i915: Initialise basic fence before acquiring seqno
>>
>> where it looks like intel_timeline_get_seqno() relied on some of the
>> rq->fence state, but that is no longer the case since:
>>
>> commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc
>> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Date:   Tue Mar 23 16:49:50 2021 +0100
>>
>>      drm/i915: Do not share hwsp across contexts any more, v8.
>>
>> intel_timeline_get_seqno() could also be cleaned up slightly by dropping
>> the request argument.
>>
>> Moving dma_fence_init back out of the ctor, should ensure we have enough
>> of the request initialised in case of trace_dma_fence_init.
>> Functionally this should be the same, and is effectively what we were
>> already open coding before, except now we also assign the fence->lock
>> and fence->ops, but since these are invariant for recycled
>> requests(which might be externally visible), and will therefore already
>> hold the same value, it shouldn't matter. We still leave the
>> spin_lock_init() in the ctor, since we can't re-init the rq->lock in
>> case it is already held.
> 
> Holding rq->lock without having a full reference to it sounds like really
> bad taste. I think it would be good to have a (kerneldoc) comment next to
> i915_request.lock about this, with a FIXME. But separate patch.

Sorry, I guess I just meant that we can't blindly move the lock_init() 
i.e if for some unknown reason we moved it out from the ctor then it 
needs to go before the ref transitions from 0->1. Touching rq->lock 
should require the full ref. I'll reword.

> 
>> Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno")
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Michael Mason <michael.w.mason@intel.com>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
> 
> With the commit message restructured a bit, and assuming this one actually
> works:
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> But I'm really not confident :-(
> -Daniel
> 
>> ---
>>   drivers/gpu/drm/i915/i915_request.c | 11 ++---------
>>   1 file changed, 2 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>> index ce446716d092..79da5eca60af 100644
>> --- a/drivers/gpu/drm/i915/i915_request.c
>> +++ b/drivers/gpu/drm/i915/i915_request.c
>> @@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg)
>>   	i915_sw_fence_init(&rq->submit, submit_notify);
>>   	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
>>   
>> -	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0);
>> -
>>   	rq->capture_list = NULL;
>>   
>>   	init_llist_head(&rq->execute_cb);
>> @@ -905,17 +903,12 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>>   	rq->ring = ce->ring;
>>   	rq->execution_mask = ce->engine->mask;
>>   
>> -	kref_init(&rq->fence.refcount);
>> -	rq->fence.flags = 0;
>> -	rq->fence.error = 0;
>> -	INIT_LIST_HEAD(&rq->fence.cb_list);
>> -
>>   	ret = intel_timeline_get_seqno(tl, rq, &seqno);
>>   	if (ret)
>>   		goto err_free;
>>   
>> -	rq->fence.context = tl->fence_context;
>> -	rq->fence.seqno = seqno;
>> +	dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
>> +		       tl->fence_context, seqno);
>>   
>>   	RCU_INIT_POINTER(rq->timeline, tl);
>>   	rq->hwsp_seqno = tl->hwsp_seqno;
>> -- 
>> 2.26.3
>>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-09-09 16:17 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-03 11:24 [PATCH] drm/i915/request: fix early tracepoints Matthew Auld
2021-09-03 11:24 ` [Intel-gfx] " Matthew Auld
2021-09-03 11:42 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2021-09-03 12:11 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-09-03 13:27 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2021-09-08 17:38 ` [PATCH] " Daniel Vetter
2021-09-08 17:38   ` [Intel-gfx] " Daniel Vetter
2021-09-09 16:16   ` Matthew Auld
2021-09-09 16:16     ` [Intel-gfx] " Matthew Auld

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.