All of lore.kernel.org
* [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
@ 2020-01-20 17:57 Chris Wilson
  2020-01-20 19:47 ` Tvrtko Ursulin
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Chris Wilson @ 2020-01-20 17:57 UTC (permalink / raw)
  To: intel-gfx

Keep the rq->fence.flags consistent with the status of the
rq->sched.link, and clear the associated bits when decoupling the link
on retirement (as we may wish to inspect those flags independent of
other state).

Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
References: https://gitlab.freedesktop.org/drm/intel/issues/997
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 9ed0d3bc7249..78a5f5d3c070 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
 		locked = engine;
 	}
 	list_del_init(&rq->sched.link);
+	clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+	clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
 	spin_unlock_irq(&locked->active.lock);
 }
 
-- 
2.25.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-20 17:57 [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link Chris Wilson
@ 2020-01-20 19:47 ` Tvrtko Ursulin
  2020-01-20 20:27   ` Chris Wilson
  2020-01-20 20:32   ` Chris Wilson
  2020-01-21  2:12 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
  2020-01-21 14:53 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  2 siblings, 2 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2020-01-20 19:47 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 20/01/2020 17:57, Chris Wilson wrote:
> Keep the rq->fence.flags consistent with the status of the
> rq->sched.link, and clear the associated bits when decoupling the link
> on retirement (as we may wish to inspect those flags independent of
> other state).
> 
> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
> References: https://gitlab.freedesktop.org/drm/intel/issues/997
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 9ed0d3bc7249..78a5f5d3c070 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
>   		locked = engine;
>   	}
>   	list_del_init(&rq->sched.link);
> +	clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);

This one I think can not be set in retirement. Or there is a path?

[comes back after writing the comment below]

Race between completion and hold: the request is put on hold, then 
completes just as it is un-held? It needs retire to happen at the right 
time, driven by ...? Is this it?

> +	clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);

This one I think indeed can race with completion.

Regards,

Tvrtko

>   	spin_unlock_irq(&locked->active.lock);
>   }
>   
> 


* Re: [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-20 19:47 ` Tvrtko Ursulin
@ 2020-01-20 20:27   ` Chris Wilson
  2020-01-21 17:33     ` Tvrtko Ursulin
  2020-01-20 20:32   ` Chris Wilson
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2020-01-20 20:27 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2020-01-20 19:47:08)
> 
> On 20/01/2020 17:57, Chris Wilson wrote:
> > Keep the rq->fence.flags consistent with the status of the
> > rq->sched.link, and clear the associated bits when decoupling the link
> > on retirement (as we may wish to inspect those flags independent of
> > other state).
> > 
> > Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
> > References: https://gitlab.freedesktop.org/drm/intel/issues/997
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 9ed0d3bc7249..78a5f5d3c070 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
> >               locked = engine;
> >       }
> >       list_del_init(&rq->sched.link);
> > +     clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> 
> This one I think can not be set in retirement. Or there is a path?

No, I don't think there's one for pqueue, it was just being consistent.
> 
> [comes back after writing the comment below]
> 
> Race between completion to hold puts the request on hold, then request 
> completes just as it is un-held? It needs retire to happen at the right 
> time, driven by ...? Is this it?

Yeah, but the clear one I was thinking about is 

static bool hold_request(const struct i915_request *rq)
{
        struct i915_dependency *p;

        /*
         * If one of our ancestors is on hold, we must also be on hold,
         * otherwise we will bypass it and execute before it.
         */
        list_for_each_entry(p, &rq->sched.signalers_list, signal_link) {
                const struct i915_request *s =
                        container_of(p->signaler, typeof(*s), sched);

                if (s->engine != rq->engine)
                        continue;

                if (i915_request_on_hold(s))
                        return true;
        }

        return false;
}

where we check the rq->fence.flags which holds stale information.

> 
> > +     clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
> 
> This one I think indeed can race with completion.

Clear both for consistency, caught out once, may be caught out again on
the other.
-Chris


* Re: [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-20 19:47 ` Tvrtko Ursulin
  2020-01-20 20:27   ` Chris Wilson
@ 2020-01-20 20:32   ` Chris Wilson
  2020-01-21 11:11     ` Tvrtko Ursulin
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2020-01-20 20:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2020-01-20 19:47:08)
> 
> On 20/01/2020 17:57, Chris Wilson wrote:
> > Keep the rq->fence.flags consistent with the status of the
> > rq->sched.link, and clear the associated bits when decoupling the link
> > on retirement (as we may wish to inspect those flags independent of
> > other state).
> > 
> > Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
> > References: https://gitlab.freedesktop.org/drm/intel/issues/997
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 9ed0d3bc7249..78a5f5d3c070 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
> >               locked = engine;
> >       }
> >       list_del_init(&rq->sched.link);
> > +     clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> 
> This one I think can not be set in retirement. Or there is a path?
> 
> [comes back after writing the comment below]
> 
> Race between completion to hold puts the request on hold, then request 
> completes just as it is un-held? It needs retire to happen at the right 
> time, driven by ...? Is this it?
> 
> > +     clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
> 
> This one I think indeed can race with completion.

Fwiw, this appears to be the active part of the trace

<0>[   56.132334]   <idle>-0       1.Ns1 52042407us : execlists_reset_finish: 0000:00:02.0 vcs1: depth->1
<0>[   56.132415] rs:main -428     0..s. 52042429us : process_csb: 0000:00:02.0 vcs1: cs-irq head=5, tail=1
<0>[   56.132501] rs:main -428     0..s. 52042432us : process_csb: 0000:00:02.0 vcs1: csb[0]: status=0x00000001:0x00000000
<0>[   56.132591] rs:main -428     0..s. 52042437us : trace_ports: 0000:00:02.0 vcs1: promote { b:6!, 0:0 }
<0>[   56.132676] rs:main -428     0..s. 52042442us : process_csb: 0000:00:02.0 vcs1: csb[1]: status=0x00000818:0x00000020
<0>[   56.132766] rs:main -428     0..s. 52042445us : trace_ports: 0000:00:02.0 vcs1: completed { b:6!, 0:0 }
<0>[   56.132860] kworker/-12      0.... 52042771us : i915_request_retire: 0000:00:02.0 vcs1: fence b:6, current 6
<0>[   56.132949] kworker/-65      1d..1 52044886us : execlists_unhold: 0000:00:02.0 vcs0: fence 27:2, current 2 hold release
<0>[   56.133041] ksoftirq-16      1..s. 52044912us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=2
<0>[   56.133118] ksoftirq-16      1d.s1 52044916us : __i915_request_submit: 0000:00:02.0 vcs0: fence 27:2, current 2
<0>[   56.133216] kworker/-65      1.... 52044946us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:14, current 14
<0>[   56.133314] kworker/-65      1d... 52044986us : __i915_request_commit: 0000:00:02.0 vcs0: fence 9:16, current 14
<0>[   56.133402] kworker/-65      1d... 52044993us : __engine_park: 0000:00:02.0 vcs0:
<0>[   56.133490] kworker/-65      1d..2 52045032us : __i915_request_submit: 0000:00:02.0 vcs0: fence 9:16, current 14
<0>[   56.133581] kworker/-65      1d..2 52045892us : trace_ports: 0000:00:02.0 vcs0: submit { 9:16, 0:0 }
<0>[   56.133664]   <idle>-0       0dNH2 52045932us : __intel_context_retire: 0000:00:02.0 vcs0: context:27 retire
<0>[   56.133751]   <idle>-0       0.Ns1 52045949us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=4
<0>[   56.133838]   <idle>-0       0.Ns1 52045950us : process_csb: 0000:00:02.0 vcs0: csb[3]: status=0x00000001:0x00000000
<0>[   56.133927]   <idle>-0       0.Ns1 52045951us : trace_ports: 0000:00:02.0 vcs0: promote { 9:16!, 0:0 }
<0>[   56.134013]   <idle>-0       0.Ns1 52045951us : process_csb: 0000:00:02.0 vcs0: csb[4]: status=0x00000818:0x00000060
<0>[   56.134102]   <idle>-0       0.Ns1 52045952us : trace_ports: 0000:00:02.0 vcs0: completed { 9:16!, 0:0 }
<0>[   56.134198] kworker/-12      0.... 52045960us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:16, current 16
<0>[   56.134285] kworker/-12      0.... 52045962us : __engine_park: 0000:00:02.0 vcs0:
<0>[   56.134361] kworker/-65      1d..1 52046503us : execlists_unhold: 0000:00:02.0 vcs1: fence 2a:2, current 1 hold release
<0>[   56.134427] ksoftirq-16      1..s. 52046510us : process_csb: 0000:00:02.0 vcs1: cs-irq head=1, tail=1

It doesn't look like there's overlap between the hold and retire. :|
-Chris


* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-20 17:57 [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link Chris Wilson
  2020-01-20 19:47 ` Tvrtko Ursulin
@ 2020-01-21  2:12 ` Patchwork
  2020-01-21 14:53 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  2 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2020-01-21  2:12 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Mark the removal of the i915_request from the sched.link
URL   : https://patchwork.freedesktop.org/series/72302/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7781 -> Patchwork_16176
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with Patchwork_16176 need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_16176, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_16176:

### IGT changes ###

#### Warnings ####

  * igt@gem_exec_parallel@contexts:
    - fi-byt-n2820:       [TIMEOUT][1] ([fdo#112271]) -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-byt-n2820/igt@gem_exec_parallel@contexts.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/fi-byt-n2820/igt@gem_exec_parallel@contexts.html

  
Known issues
------------

  Here are the changes found in Patchwork_16176 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-cfl-8700k:       [PASS][3] -> [DMESG-WARN][4] ([i915#889])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-cfl-8700k/igt@i915_module_load@reload-with-fault-injection.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/fi-cfl-8700k/igt@i915_module_load@reload-with-fault-injection.html

  
#### Possible fixes ####

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][5] ([fdo#111407]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Warnings ####

  * igt@i915_selftest@live_blt:
    - fi-hsw-4770r:       [DMESG-FAIL][7] ([i915#553] / [i915#725]) -> [DMESG-FAIL][8] ([i915#725])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-hsw-4770r/igt@i915_selftest@live_blt.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/fi-hsw-4770r/igt@i915_selftest@live_blt.html
    - fi-hsw-4770:        [DMESG-FAIL][9] ([i915#725]) -> [DMESG-FAIL][10] ([i915#553] / [i915#725])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-hsw-4770/igt@i915_selftest@live_blt.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/fi-hsw-4770/igt@i915_selftest@live_blt.html

  
  [fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#553]: https://gitlab.freedesktop.org/drm/intel/issues/553
  [i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725
  [i915#889]: https://gitlab.freedesktop.org/drm/intel/issues/889


Participating hosts (50 -> 38)
------------------------------

  Missing    (12): fi-ilk-m540 fi-ehl-1 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-gdg-551 fi-bsw-kefka fi-skl-lmem fi-snb-2600 fi-kbl-r 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7781 -> Patchwork_16176

  CI-20190529: 20190529
  CI_DRM_7781: 3f2b341ae1fde67f823aeb715c6f489affdef8b1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5374: 83c32e859202e43ff6a8cca162c76fcd90ad6e3b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16176: 985c9db0ca25cdafd8b69e29a6514066253be73d @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

985c9db0ca25 drm/i915: Mark the removal of the i915_request from the sched.link

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/index.html


* Re: [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-20 20:32   ` Chris Wilson
@ 2020-01-21 11:11     ` Tvrtko Ursulin
  2020-01-21 11:15       ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2020-01-21 11:11 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 20/01/2020 20:32, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-01-20 19:47:08)
>>
>> On 20/01/2020 17:57, Chris Wilson wrote:
>>> Keep the rq->fence.flags consistent with the status of the
>>> rq->sched.link, and clear the associated bits when decoupling the link
>>> on retirement (as we may wish to inspect those flags independent of
>>> other state).
>>>
>>> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
>>> References: https://gitlab.freedesktop.org/drm/intel/issues/997
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 9ed0d3bc7249..78a5f5d3c070 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
>>>                locked = engine;
>>>        }
>>>        list_del_init(&rq->sched.link);
>>> +     clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
>>
>> This one I think can not be set in retirement. Or there is a path?
>>
>> [comes back after writing the comment below]
>>
>> Race between completion to hold puts the request on hold, then request
>> completes just as it is un-held? It needs retire to happen at the right
>> time, driven by ...? Is this it?
>>
>>> +     clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
>>
>> This one I think indeed can race with completion.
> 
> Fwiw, this appears to be the active part of the trace
> 
> <0>[   56.132334]   <idle>-0       1.Ns1 52042407us : execlists_reset_finish: 0000:00:02.0 vcs1: depth->1
> <0>[   56.132415] rs:main -428     0..s. 52042429us : process_csb: 0000:00:02.0 vcs1: cs-irq head=5, tail=1
> <0>[   56.132501] rs:main -428     0..s. 52042432us : process_csb: 0000:00:02.0 vcs1: csb[0]: status=0x00000001:0x00000000
> <0>[   56.132591] rs:main -428     0..s. 52042437us : trace_ports: 0000:00:02.0 vcs1: promote { b:6!, 0:0 }
> <0>[   56.132676] rs:main -428     0..s. 52042442us : process_csb: 0000:00:02.0 vcs1: csb[1]: status=0x00000818:0x00000020
> <0>[   56.132766] rs:main -428     0..s. 52042445us : trace_ports: 0000:00:02.0 vcs1: completed { b:6!, 0:0 }
> <0>[   56.132860] kworker/-12      0.... 52042771us : i915_request_retire: 0000:00:02.0 vcs1: fence b:6, current 6
> <0>[   56.132949] kworker/-65      1d..1 52044886us : execlists_unhold: 0000:00:02.0 vcs0: fence 27:2, current 2 hold release
> <0>[   56.133041] ksoftirq-16      1..s. 52044912us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=2
> <0>[   56.133118] ksoftirq-16      1d.s1 52044916us : __i915_request_submit: 0000:00:02.0 vcs0: fence 27:2, current 2
> <0>[   56.133216] kworker/-65      1.... 52044946us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:14, current 14
> <0>[   56.133314] kworker/-65      1d... 52044986us : __i915_request_commit: 0000:00:02.0 vcs0: fence 9:16, current 14
> <0>[   56.133402] kworker/-65      1d... 52044993us : __engine_park: 0000:00:02.0 vcs0:
> <0>[   56.133490] kworker/-65      1d..2 52045032us : __i915_request_submit: 0000:00:02.0 vcs0: fence 9:16, current 14
> <0>[   56.133581] kworker/-65      1d..2 52045892us : trace_ports: 0000:00:02.0 vcs0: submit { 9:16, 0:0 }
> <0>[   56.133664]   <idle>-0       0dNH2 52045932us : __intel_context_retire: 0000:00:02.0 vcs0: context:27 retire
> <0>[   56.133751]   <idle>-0       0.Ns1 52045949us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=4
> <0>[   56.133838]   <idle>-0       0.Ns1 52045950us : process_csb: 0000:00:02.0 vcs0: csb[3]: status=0x00000001:0x00000000
> <0>[   56.133927]   <idle>-0       0.Ns1 52045951us : trace_ports: 0000:00:02.0 vcs0: promote { 9:16!, 0:0 }
> <0>[   56.134013]   <idle>-0       0.Ns1 52045951us : process_csb: 0000:00:02.0 vcs0: csb[4]: status=0x00000818:0x00000060
> <0>[   56.134102]   <idle>-0       0.Ns1 52045952us : trace_ports: 0000:00:02.0 vcs0: completed { 9:16!, 0:0 }
> <0>[   56.134198] kworker/-12      0.... 52045960us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:16, current 16
> <0>[   56.134285] kworker/-12      0.... 52045962us : __engine_park: 0000:00:02.0 vcs0:
> <0>[   56.134361] kworker/-65      1d..1 52046503us : execlists_unhold: 0000:00:02.0 vcs1: fence 2a:2, current 1 hold release
> <0>[   56.134427] ksoftirq-16      1..s. 52046510us : process_csb: 0000:00:02.0 vcs1: cs-irq head=1, tail=1
> 
> It doesn't look like there's overlap between the hold and retire. :|

Which bit is the race? I couldn't spot the same request being mentioned 
on two separate paths.

I mean I have no issues with the patch. Just trying to understand the 
possible races and whether or not there should be an assert for PQUEUE 
instead of clearing it on retire.

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-21 11:11     ` Tvrtko Ursulin
@ 2020-01-21 11:15       ` Chris Wilson
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2020-01-21 11:15 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2020-01-21 11:11:27)
> 
> On 20/01/2020 20:32, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-01-20 19:47:08)
> >>
> >> On 20/01/2020 17:57, Chris Wilson wrote:
> >>> Keep the rq->fence.flags consistent with the status of the
> >>> rq->sched.link, and clear the associated bits when decoupling the link
> >>> on retirement (as we may wish to inspect those flags independent of
> >>> other state).
> >>>
> >>> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
> >>> References: https://gitlab.freedesktop.org/drm/intel/issues/997
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_request.c | 2 ++
> >>>    1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>> index 9ed0d3bc7249..78a5f5d3c070 100644
> >>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>> @@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
> >>>                locked = engine;
> >>>        }
> >>>        list_del_init(&rq->sched.link);
> >>> +     clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
> >>
> >> This one I think can not be set in retirement. Or there is a path?
> >>
> >> [comes back after writing the comment below]
> >>
> >> Race between completion to hold puts the request on hold, then request
> >> completes just as it is un-held? It needs retire to happen at the right
> >> time, driven by ...? Is this it?
> >>
> >>> +     clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
> >>
> >> This one I think indeed can race with completion.
> > 
> > Fwiw, this appears to be the active part of the trace
> > 
> > <0>[   56.132334]   <idle>-0       1.Ns1 52042407us : execlists_reset_finish: 0000:00:02.0 vcs1: depth->1
> > <0>[   56.132415] rs:main -428     0..s. 52042429us : process_csb: 0000:00:02.0 vcs1: cs-irq head=5, tail=1
> > <0>[   56.132501] rs:main -428     0..s. 52042432us : process_csb: 0000:00:02.0 vcs1: csb[0]: status=0x00000001:0x00000000
> > <0>[   56.132591] rs:main -428     0..s. 52042437us : trace_ports: 0000:00:02.0 vcs1: promote { b:6!, 0:0 }
> > <0>[   56.132676] rs:main -428     0..s. 52042442us : process_csb: 0000:00:02.0 vcs1: csb[1]: status=0x00000818:0x00000020
> > <0>[   56.132766] rs:main -428     0..s. 52042445us : trace_ports: 0000:00:02.0 vcs1: completed { b:6!, 0:0 }
> > <0>[   56.132860] kworker/-12      0.... 52042771us : i915_request_retire: 0000:00:02.0 vcs1: fence b:6, current 6
> > <0>[   56.132949] kworker/-65      1d..1 52044886us : execlists_unhold: 0000:00:02.0 vcs0: fence 27:2, current 2 hold release
> > <0>[   56.133041] ksoftirq-16      1..s. 52044912us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=2
> > <0>[   56.133118] ksoftirq-16      1d.s1 52044916us : __i915_request_submit: 0000:00:02.0 vcs0: fence 27:2, current 2
> > <0>[   56.133216] kworker/-65      1.... 52044946us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:14, current 14
> > <0>[   56.133314] kworker/-65      1d... 52044986us : __i915_request_commit: 0000:00:02.0 vcs0: fence 9:16, current 14
> > <0>[   56.133402] kworker/-65      1d... 52044993us : __engine_park: 0000:00:02.0 vcs0:
> > <0>[   56.133490] kworker/-65      1d..2 52045032us : __i915_request_submit: 0000:00:02.0 vcs0: fence 9:16, current 14
> > <0>[   56.133581] kworker/-65      1d..2 52045892us : trace_ports: 0000:00:02.0 vcs0: submit { 9:16, 0:0 }
> > <0>[   56.133664]   <idle>-0       0dNH2 52045932us : __intel_context_retire: 0000:00:02.0 vcs0: context:27 retire
> > <0>[   56.133751]   <idle>-0       0.Ns1 52045949us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=4
> > <0>[   56.133838]   <idle>-0       0.Ns1 52045950us : process_csb: 0000:00:02.0 vcs0: csb[3]: status=0x00000001:0x00000000
> > <0>[   56.133927]   <idle>-0       0.Ns1 52045951us : trace_ports: 0000:00:02.0 vcs0: promote { 9:16!, 0:0 }
> > <0>[   56.134013]   <idle>-0       0.Ns1 52045951us : process_csb: 0000:00:02.0 vcs0: csb[4]: status=0x00000818:0x00000060
> > <0>[   56.134102]   <idle>-0       0.Ns1 52045952us : trace_ports: 0000:00:02.0 vcs0: completed { 9:16!, 0:0 }
> > <0>[   56.134198] kworker/-12      0.... 52045960us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:16, current 16
> > <0>[   56.134285] kworker/-12      0.... 52045962us : __engine_park: 0000:00:02.0 vcs0:
> > <0>[   56.134361] kworker/-65      1d..1 52046503us : execlists_unhold: 0000:00:02.0 vcs1: fence 2a:2, current 1 hold release
> > <0>[   56.134427] ksoftirq-16      1..s. 52046510us : process_csb: 0000:00:02.0 vcs1: cs-irq head=1, tail=1
> > 
> > It doesn't look like there's overlap between the hold and retire. :|
> 
> Which bit is the race? I couldn't spot the same request being mentioned 
> on two separate paths.

Neither could I. I don't think this patch is actually associated with
the bug, but I think the timing of the GEM_BUG_ON implicates the
execlists_hold() patch.

> I mean I have no issues with the patch. Just trying to understand the 
> possible races and whether or not there should be an assert for PQUEUE 
> instead of clearing it on retire.

It can be on the pqueue during retirement, we're just a bit more careful
during dequeuing.
-Chris


* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-20 17:57 [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link Chris Wilson
  2020-01-20 19:47 ` Tvrtko Ursulin
  2020-01-21  2:12 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
@ 2020-01-21 14:53 ` Patchwork
  2 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2020-01-21 14:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Mark the removal of the i915_request from the sched.link
URL   : https://patchwork.freedesktop.org/series/72302/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7781_full -> Patchwork_16176_full
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with Patchwork_16176_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_16176_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_16176_full:

### IGT changes ###

#### Warnings ####

  * igt@gem_exec_fence@basic-busy-all:
    - shard-skl:          [SKIP][1] ([fdo#109271]) -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl8/igt@gem_exec_fence@basic-busy-all.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl8/igt@gem_exec_fence@basic-busy-all.html

  
Known issues
------------

  Here are the changes found in Patchwork_16176_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@vcs1-clean:
    - shard-iclb:         [PASS][3] -> [SKIP][4] ([fdo#109276] / [fdo#112080])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb1/igt@gem_ctx_isolation@vcs1-clean.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb8/igt@gem_ctx_isolation@vcs1-clean.html

  * igt@gem_ctx_shared@q-smoketest-bsd1:
    - shard-tglb:         [PASS][5] -> [INCOMPLETE][6] ([fdo#111735] / [i915#472])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb2/igt@gem_ctx_shared@q-smoketest-bsd1.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb4/igt@gem_ctx_shared@q-smoketest-bsd1.html

  * igt@gem_eio@kms:
    - shard-tglb:         [PASS][7] -> [INCOMPLETE][8] ([i915#472] / [i915#476])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb1/igt@gem_eio@kms.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb7/igt@gem_eio@kms.html

  * igt@gem_exec_create@forked:
    - shard-apl:          [PASS][9] -> [TIMEOUT][10] ([fdo#112271] / [i915#940])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-apl6/igt@gem_exec_create@forked.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-apl3/igt@gem_exec_create@forked.html
    - shard-kbl:          [PASS][11] -> [TIMEOUT][12] ([fdo#112271] / [i915#940])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl2/igt@gem_exec_create@forked.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-kbl1/igt@gem_exec_create@forked.html

  * igt@gem_exec_schedule@preempt-other-bsd:
    - shard-iclb:         [PASS][13] -> [SKIP][14] ([fdo#112146]) +3 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb3/igt@gem_exec_schedule@preempt-other-bsd.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb2/igt@gem_exec_schedule@preempt-other-bsd.html

  * igt@gem_exec_suspend@basic-s3:
    - shard-kbl:          [PASS][15] -> [DMESG-WARN][16] ([i915#180]) +7 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl2/igt@gem_exec_suspend@basic-s3.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-kbl1/igt@gem_exec_suspend@basic-s3.html

  * igt@gem_exec_whisper@normal:
    - shard-tglb:         [PASS][17] -> [INCOMPLETE][18] ([i915#472]) +1 similar issue
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb7/igt@gem_exec_whisper@normal.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb7/igt@gem_exec_whisper@normal.html

  * igt@gem_persistent_relocs@forked-faulting-reloc-thrashing:
    - shard-kbl:          [PASS][19] -> [INCOMPLETE][20] ([fdo#103665] / [i915#530])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl4/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-kbl3/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html

  * igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive:
    - shard-glk:          [PASS][21] -> [TIMEOUT][22] ([fdo#112271] / [i915#530])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-glk6/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-glk3/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html

  * igt@gem_persistent_relocs@forked-interruptible-thrash-inactive:
    - shard-hsw:          [PASS][23] -> [TIMEOUT][24] ([fdo#112271] / [i915#530])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-hsw2/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-hsw2/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html

  * igt@gem_ppgtt@flink-and-close-vma-leak:
    - shard-apl:          [PASS][25] -> [FAIL][26] ([i915#644])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-apl6/igt@gem_ppgtt@flink-and-close-vma-leak.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-apl4/igt@gem_ppgtt@flink-and-close-vma-leak.html

  * igt@gem_sync@basic-all:
    - shard-tglb:         [PASS][27] -> [INCOMPLETE][28] ([i915#470] / [i915#472])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb1/igt@gem_sync@basic-all.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb3/igt@gem_sync@basic-all.html

  * igt@gem_tiled_blits@normal:
    - shard-hsw:          [PASS][29] -> [FAIL][30] ([i915#818])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-hsw4/igt@gem_tiled_blits@normal.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-hsw5/igt@gem_tiled_blits@normal.html

  * igt@i915_selftest@mock_requests:
    - shard-snb:          [PASS][31] -> [INCOMPLETE][32] ([i915#82])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-snb6/igt@i915_selftest@mock_requests.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-snb6/igt@i915_selftest@mock_requests.html
    - shard-skl:          [PASS][33] -> [INCOMPLETE][34] ([i915#198])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl4/igt@i915_selftest@mock_requests.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl4/igt@i915_selftest@mock_requests.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
    - shard-skl:          [PASS][35] -> [FAIL][36] ([i915#79])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl2/igt@kms_flip@flip-vs-expired-vblank-interruptible.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl7/igt@kms_flip@flip-vs-expired-vblank-interruptible.html

  * igt@kms_frontbuffer_tracking@fbcpsr-stridechange:
    - shard-tglb:         [PASS][37] -> [FAIL][38] ([i915#49]) +1 similar issue
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb5/igt@kms_frontbuffer_tracking@fbcpsr-stridechange.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb6/igt@kms_frontbuffer_tracking@fbcpsr-stridechange.html

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          [PASS][39] -> [FAIL][40] ([fdo#108145])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl2/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl1/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [PASS][41] -> [FAIL][42] ([fdo#108145] / [i915#265])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl10/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl4/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_render:
    - shard-iclb:         [PASS][43] -> [SKIP][44] ([fdo#109441]) +2 similar issues
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb2/igt@kms_psr@psr2_cursor_render.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb1/igt@kms_psr@psr2_cursor_render.html

  * igt@kms_vblank@pipe-b-ts-continuation-suspend:
    - shard-iclb:         [PASS][45] -> [DMESG-WARN][46] ([fdo#111764]) +1 similar issue
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb6/igt@kms_vblank@pipe-b-ts-continuation-suspend.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb8/igt@kms_vblank@pipe-b-ts-continuation-suspend.html

  * igt@perf_pmu@init-sema-vcs1:
    - shard-iclb:         [PASS][47] -> [SKIP][48] ([fdo#112080]) +2 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb1/igt@perf_pmu@init-sema-vcs1.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb8/igt@perf_pmu@init-sema-vcs1.html

  * igt@prime_busy@after-bsd2:
    - shard-iclb:         [PASS][49] -> [SKIP][50] ([fdo#109276]) +6 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb1/igt@prime_busy@after-bsd2.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb8/igt@prime_busy@after-bsd2.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@rcs0-s3:
    - shard-kbl:          [DMESG-WARN][51] ([i915#180]) -> [PASS][52] +4 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl1/igt@gem_ctx_isolation@rcs0-s3.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-kbl6/igt@gem_ctx_isolation@rcs0-s3.html
    - shard-apl:          [DMESG-WARN][53] ([i915#180]) -> [PASS][54] +3 similar issues
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-apl6/igt@gem_ctx_isolation@rcs0-s3.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-apl1/igt@gem_ctx_isolation@rcs0-s3.html

  * igt@gem_ctx_persistence@rcs0-mixed-process:
    - shard-skl:          [FAIL][55] ([i915#679]) -> [PASS][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl10/igt@gem_ctx_persistence@rcs0-mixed-process.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl3/igt@gem_ctx_persistence@rcs0-mixed-process.html

  * igt@gem_ctx_persistence@vcs1-queued:
    - shard-iclb:         [SKIP][57] ([fdo#109276] / [fdo#112080]) -> [PASS][58] +5 similar issues
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb3/igt@gem_ctx_persistence@vcs1-queued.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb4/igt@gem_ctx_persistence@vcs1-queued.html

  * igt@gem_exec_balancer@hang:
    - shard-iclb:         [INCOMPLETE][59] ([i915#140]) -> [PASS][60] +1 similar issue
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb1/igt@gem_exec_balancer@hang.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb8/igt@gem_exec_balancer@hang.html

  * igt@gem_exec_balancer@smoke:
    - shard-iclb:         [SKIP][61] ([fdo#110854]) -> [PASS][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb3/igt@gem_exec_balancer@smoke.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb4/igt@gem_exec_balancer@smoke.html

  * igt@gem_exec_gttfill@basic:
    - shard-tglb:         [INCOMPLETE][63] ([fdo#111593] / [i915#472]) -> [PASS][64] +1 similar issue
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb3/igt@gem_exec_gttfill@basic.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb4/igt@gem_exec_gttfill@basic.html

  * igt@gem_exec_nop@basic-series:
    - shard-tglb:         [INCOMPLETE][65] ([i915#472]) -> [PASS][66]
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb6/igt@gem_exec_nop@basic-series.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb8/igt@gem_exec_nop@basic-series.html

  * igt@gem_exec_schedule@wide-bsd:
    - shard-iclb:         [SKIP][67] ([fdo#112146]) -> [PASS][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb1/igt@gem_exec_schedule@wide-bsd.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb7/igt@gem_exec_schedule@wide-bsd.html

  * igt@gem_persistent_relocs@forked-faulting-reloc-thrashing:
    - shard-hsw:          [INCOMPLETE][69] ([i915#530] / [i915#61]) -> [PASS][70]
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-hsw5/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-hsw6/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html

  * igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive:
    - shard-kbl:          [TIMEOUT][71] ([fdo#112271] / [i915#530]) -> [PASS][72] +1 similar issue
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl3/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-kbl4/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html

  * igt@gem_pipe_control_store_loop@reused-buffer:
    - shard-hsw:          [FAIL][73] ([i915#874]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-hsw1/igt@gem_pipe_control_store_loop@reused-buffer.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-hsw4/igt@gem_pipe_control_store_loop@reused-buffer.html

  * igt@gem_ppgtt@flink-and-close-vma-leak:
    - shard-glk:          [FAIL][75] ([i915#644]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-glk7/igt@gem_ppgtt@flink-and-close-vma-leak.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-glk6/igt@gem_ppgtt@flink-and-close-vma-leak.html

  * igt@gem_sync@basic-each:
    - shard-tglb:         [INCOMPLETE][77] ([i915#472] / [i915#707]) -> [PASS][78] +1 similar issue
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb6/igt@gem_sync@basic-each.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb1/igt@gem_sync@basic-each.html

  * igt@i915_selftest@live_gem_contexts:
    - shard-tglb:         [INCOMPLETE][79] ([i915#455] / [i915#472]) -> [PASS][80]
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb3/igt@i915_selftest@live_gem_contexts.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb8/igt@i915_selftest@live_gem_contexts.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x42-sliding:
    - shard-skl:          [FAIL][81] ([i915#54]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl6/igt@kms_cursor_crc@pipe-b-cursor-128x42-sliding.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl6/igt@kms_cursor_crc@pipe-b-cursor-128x42-sliding.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-hsw:          [INCOMPLETE][83] ([i915#61]) -> [PASS][84] +1 similar issue
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-hsw2/igt@kms_flip@flip-vs-suspend.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-hsw5/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-pwrite:
    - shard-tglb:         [FAIL][85] ([i915#49]) -> [PASS][86] +2 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb1/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-pwrite.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb7/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-pwrite.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - shard-skl:          [INCOMPLETE][87] ([i915#69]) -> [PASS][88]
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl1/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl5/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  * igt@kms_psr2_su@frontbuffer:
    - shard-iclb:         [SKIP][89] ([fdo#109642] / [fdo#111068]) -> [PASS][90]
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb1/igt@kms_psr2_su@frontbuffer.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb2/igt@kms_psr2_su@frontbuffer.html

  * igt@kms_psr@psr2_sprite_plane_move:
    - shard-iclb:         [SKIP][91] ([fdo#109441]) -> [PASS][92] +1 similar issue
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb4/igt@kms_psr@psr2_sprite_plane_move.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb2/igt@kms_psr@psr2_sprite_plane_move.html

  * igt@kms_setmode@basic:
    - shard-apl:          [FAIL][93] ([i915#31]) -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-apl2/igt@kms_setmode@basic.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-apl8/igt@kms_setmode@basic.html

  * igt@perf_pmu@busy-no-semaphores-vcs1:
    - shard-iclb:         [SKIP][95] ([fdo#112080]) -> [PASS][96] +7 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb6/igt@perf_pmu@busy-no-semaphores-vcs1.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb1/igt@perf_pmu@busy-no-semaphores-vcs1.html

  * igt@prime_busy@hang-bsd2:
    - shard-iclb:         [SKIP][97] ([fdo#109276]) -> [PASS][98] +17 similar issues
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb6/igt@prime_busy@hang-bsd2.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb1/igt@prime_busy@hang-bsd2.html

  
#### Warnings ####

  * igt@gem_ctx_isolation@vcs1-nonpriv:
    - shard-iclb:         [SKIP][99] ([fdo#109276] / [fdo#112080]) -> [FAIL][100] ([IGT#28])
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-iclb7/igt@gem_ctx_isolation@vcs1-nonpriv.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-iclb4/igt@gem_ctx_isolation@vcs1-nonpriv.html

  * igt@gem_exec_fence@basic-busy-all:
    - shard-glk:          [SKIP][101] ([fdo#109271]) -> [INCOMPLETE][102] ([i915#58] / [k.org#198133])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-glk1/igt@gem_exec_fence@basic-busy-all.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-glk7/igt@gem_exec_fence@basic-busy-all.html
    - shard-apl:          [SKIP][103] ([fdo#109271]) -> [INCOMPLETE][104] ([fdo#103927])
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-apl7/igt@gem_exec_fence@basic-busy-all.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-apl1/igt@gem_exec_fence@basic-busy-all.html

  * igt@gem_tiled_blits@interruptible:
    - shard-hsw:          [FAIL][105] ([i915#818]) -> [FAIL][106] ([i915#694])
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-hsw7/igt@gem_tiled_blits@interruptible.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-hsw1/igt@gem_tiled_blits@interruptible.html

  * igt@gem_workarounds@suspend-resume-fd:
    - shard-kbl:          [DMESG-WARN][107] ([i915#180]) -> [DMESG-WARN][108] ([i915#56])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl1/igt@gem_workarounds@suspend-resume-fd.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-kbl6/igt@gem_workarounds@suspend-resume-fd.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-tglb:         [SKIP][109] ([i915#468]) -> [FAIL][110] ([i915#454])
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-tglb1/igt@i915_pm_dc@dc6-psr.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-tglb7/igt@i915_pm_dc@dc6-psr.html

  * igt@i915_pm_rpm@gem-idle:
    - shard-snb:          [SKIP][111] ([fdo#109271]) -> [INCOMPLETE][112] ([i915#82])
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-snb4/igt@i915_pm_rpm@gem-idle.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-snb1/igt@i915_pm_rpm@gem-idle.html

  * igt@runner@aborted:
    - shard-kbl:          ([FAIL][113], [FAIL][114]) ([i915#997]) -> [FAIL][115] ([i915#940])
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl6/igt@runner@aborted.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-kbl4/igt@runner@aborted.html
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-kbl1/igt@runner@aborted.html
    - shard-skl:          ([FAIL][116], [FAIL][117]) ([i915#69] / [i915#997]) -> [FAIL][118] ([i915#69] / [i915#873])
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl3/igt@runner@aborted.html
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/shard-skl8/igt@runner@aborted.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/shard-skl4/igt@runner@aborted.html

  
  [IGT#28]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/28
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109276]: https://bugs.freedesktop.org/show_bug.cgi?id=109276
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110854]: https://bugs.freedesktop.org/show_bug.cgi?id=110854
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593
  [fdo#111735]: https://bugs.freedesktop.org/show_bug.cgi?id=111735
  [fdo#111764]: https://bugs.freedesktop.org/show_bug.cgi?id=111764
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [fdo#112146]: https://bugs.freedesktop.org/show_bug.cgi?id=112146
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#140]: https://gitlab.freedesktop.org/drm/intel/issues/140
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#198]: https://gitlab.freedesktop.org/drm/intel/issues/198
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#31]: https://gitlab.freedesktop.org/drm/intel/issues/31
  [i915#454]: https://gitlab.freedesktop.org/drm/intel/issues/454
  [i915#455]: https://gitlab.freedesktop.org/drm/intel/issues/455
  [i915#468]: https://gitlab.freedesktop.org/drm/intel/issues/468
  [i915#470]: https://gitlab.freedesktop.org/drm/intel/issues/470
  [i915#472]: https://gitlab.freedesktop.org/drm/intel/issues/472
  [i915#476]: https://gitlab.freedesktop.org/drm/intel/issues/476
  [i915#49]: https://gitlab.freedesktop.org/drm/intel/issues/49
  [i915#530]: https://gitlab.freedesktop.org/drm/intel/issues/530
  [i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
  [i915#56]: https://gitlab.freedesktop.org/drm/intel/issues/56
  [i915#58]: https://gitlab.freedesktop.org/drm/intel/issues/58
  [i915#61]: https://gitlab.freedesktop.org/drm/intel/issues/61
  [i915#644]: https://gitlab.freedesktop.org/drm/intel/issues/644
  [i915#679]: https://gitlab.freedesktop.org/drm/intel/issues/679
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#694]: https://gitlab.freedesktop.org/drm/intel/issues/694
  [i915#707]: https://gitlab.freedesktop.org/drm/intel/issues/707
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#818]: https://gitlab.freedesktop.org/drm/intel/issues/818
  [i915#82]: https://gitlab.freedesktop.org/drm/intel/issues/82
  [i915#873]: https://gitlab.freedesktop.org/drm/intel/issues/873
  [i915#874]: https://gitlab.freedesktop.org/drm/intel/issues/874
  [i915#940]: https://gitlab.freedesktop.org/drm/intel/issues/940
  [i915#997]: https://gitlab.freedesktop.org/drm/intel/issues/997
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7781 -> Patchwork_16176

  CI-20190529: 20190529
  CI_DRM_7781: 3f2b341ae1fde67f823aeb715c6f489affdef8b1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5374: 83c32e859202e43ff6a8cca162c76fcd90ad6e3b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16176: 985c9db0ca25cdafd8b69e29a6514066253be73d @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16176/index.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
  2020-01-20 20:27   ` Chris Wilson
@ 2020-01-21 17:33     ` Tvrtko Ursulin
  0 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2020-01-21 17:33 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 20/01/2020 20:27, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-01-20 19:47:08)
>>
>> On 20/01/2020 17:57, Chris Wilson wrote:
>>> Keep the rq->fence.flags consistent with the status of the
>>> rq->sched.link, and clear the associated bits when decoupling the link
>>> on retirement (as we may wish to inspect those flags independent of
>>> other state).
>>>
>>> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
>>> References: https://gitlab.freedesktop.org/drm/intel/issues/997
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 9ed0d3bc7249..78a5f5d3c070 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
>>>                locked = engine;
>>>        }
>>>        list_del_init(&rq->sched.link);
>>> +     clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
>>
>> This one I think can not be set in retirement. Or there is a path?
> 
> No, I don't think there's one for pqueue, it was just being consistent.
>>
>> [comes back after writing the comment below]
>>
>> Race between completion to hold puts the request on hold, then request
>> completes just as it is un-held? It needs retire to happen at the right
>> time, driven by ...? Is this it?
> 
> Yeah, but the clear one I was thinking about is
> 
> static bool hold_request(const struct i915_request *rq)
> {
>          struct i915_dependency *p;
> 
>          /*
>           * If one of our ancestors is on hold, we must also be on hold,
>           * otherwise we will bypass it and execute before it.
>           */
>          list_for_each_entry(p, &rq->sched.signalers_list, signal_link) {
>                  const struct i915_request *s =
>                          container_of(p->signaler, typeof(*s), sched);
> 
>                  if (s->engine != rq->engine)
>                          continue;
> 
>                  if (i915_request_on_hold(s))
>                          return true;
>          }
> 
>          return false;
> }
> 
> where we check the rq->fence.flags which holds stale information.

i915_request_on_hold? Why would that hold stale information?
 
>>
>>> +     clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
>>
>> This one I think indeed can race with completion.
> 
> Clear both for consistency, caught out once, may be caught out again on
> the other.

From the bug we have this:

<0>[   61.788268] gem_exec-2120    3.... 56070451us : i915_sched_node_reinit: i915_sched_node_reinit:400 GEM_BUG_ON(!list_empty(&node->link))
...
<2>[   61.793703] kernel BUG at drivers/gpu/drm/i915/i915_scheduler.c:400!
<4>[   61.793736] invalid opcode: 0000 [#1] PREEMPT SMP PTI
<4>[   61.793744] CPU: 0 PID: 2120 Comm: gem_exec_fence Tainted: G     U  W         5.5.0-rc6-CI-CI_DRM_7762+ #1
<4>[   61.793755] Hardware name:  /NUC7i5BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
<4>[   61.793822] RIP: 0010:i915_sched_node_reinit+0x14a/0x150 [i915]
<4>[   61.793831] Code: 00 48 c7 c2 10 27 79 a0 48 c7 c7 1f c2 69 a0 e8 0c 37 b0 e0 bf 01 00 00 00 e8 22 09 b0 e0 31 f6 bf 09 00 00 00 e8 86 48 a1 e0 <0f> 0b 0f 1f 40 00 48 8d 47 10 48 89 3f 48 89 7f 08 48 89 47 10 48
<4>[   61.793850] RSP: 0018:ffffc900020d3a58 EFLAGS: 00010296
<4>[   61.793857] RAX: ffff888261bed140 RBX: ffff88825c429140 RCX: 0000000000000006
<4>[   61.793865] RDX: 00000000000018dd RSI: 0000000000000000 RDI: 0000000000000009
<4>[   61.793873] RBP: ffff888272f58440 R08: 0000000000000000 R09: 0000000000000000
<4>[   61.793881] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88825c4292e8
<4>[   61.793889] R13: 0000000000000000 R14: ffffc900020d3dc0 R15: ffff888256a50068
<4>[   61.793898] FS:  00007f321ab41e40(0000) GS:ffff888276c00000(0000) knlGS:0000000000000000
<4>[   61.793907] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   61.793914] CR2: 00007f275c1470f8 CR3: 0000000260a28002 CR4: 00000000003606f0
<4>[   61.793922] Call Trace:
<4>[   61.793979]  __i915_request_create+0x1c6/0x5a0 [i915]
<4>[   61.794034]  i915_request_create+0x86/0x1c0 [i915]
<4>[   61.794085]  i915_gem_do_execbuffer+0xbf2/0x2500 [i915]

This would mean an in-use request was returned to the RCU slab cache, right? Do we expect to keep any rq->fence.flags across rq recycling? Perhaps we need a dma_fence_reinit or something with asserts similar to what you added in:

commit 67a3acaab7167157fb827595019eaf55df244824
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Nov 22 09:49:24 2019 +0000

    drm/i915: Use a ctor for TYPESAFE_BY_RCU i915_request

And I reviewed, although of course the substance mostly evaporated from my head by now.

Question seems to be how an on-list request could have been returned to the slab. Does hold need to take a reference if it races with parallel retire?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-01-21 17:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-20 17:57 [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link Chris Wilson
2020-01-20 19:47 ` Tvrtko Ursulin
2020-01-20 20:27   ` Chris Wilson
2020-01-21 17:33     ` Tvrtko Ursulin
2020-01-20 20:32   ` Chris Wilson
2020-01-21 11:11     ` Tvrtko Ursulin
2020-01-21 11:15       ` Chris Wilson
2020-01-21  2:12 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
2020-01-21 14:53 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
