intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
@ 2020-05-27  8:53 Tvrtko Ursulin
  2020-05-27  9:20 ` Chris Wilson
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2020-05-27  8:53 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Xiaogang Li, Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Our generic mechanism for specifying aligned request start,
I915_EXEC_FENCE_SUBMIT, has a bit too relaxed rules when it comes to the
actual use case of media scalability on Tigerlake.

While the submit fence can provide the aligned start, the  actual media
pipeline expects that execution remains in lockstep from then onwards.
Otherwise the hardware cannot handle well one of the pair getting
preempted leading to GPU hangs and corrupted media playback.

This puts us in a difficult position where the only visible solution,
which does not involve adding new uapi, is to give more meaning to the
generic submit fence. At least when used between the video engines.

The special semantics this patch applies in that case are two fold. First
is that both of the aligned pairs will be marked as non-preemptable and
second is ensuring both are not sharing ELSP ports with any other context.

Non-preemptable property will ensure that once the start has been aligned
the requests remain executing until completion.

Single port ELSP property will ensure there are no possible inversions
between different independent pairs of aligned requests.

Going forward we can think of introducing new uapi to request this
behaviour as a separate, more explicit flag, and at that point retire this
semi-generic special handling.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Xiaogang Li <xiaogang.li@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.h |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c     | 46 +++++++++++++++++++++----
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 07be021882cc..576d11c0cdd9 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -212,7 +212,7 @@ intel_context_force_single_submission(const struct intel_context *ce)
 static inline void
 intel_context_set_single_submission(struct intel_context *ce)
 {
-	__set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
+	set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
 }
 
 static inline bool
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3214a4ecc31a..88cf20fd92c8 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1734,8 +1734,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 
 static bool ctx_single_port_submission(const struct intel_context *ce)
 {
-	return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
-		intel_context_force_single_submission(ce));
+	return intel_context_force_single_submission(ce);
 }
 
 static bool can_merge_ctx(const struct intel_context *prev,
@@ -4929,9 +4928,41 @@ static void execlists_park(struct intel_engine_cs *engine)
 	cancel_timer(&engine->execlists.preempt);
 }
 
+static void
+mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
+{
+	/*
+	 * Give (temporary) special meaning to a pair requests with requested
+	 * aligned start along the video engines.
+	 *
+	 * They should be non-preemptable and have all ELSP ports to themselves
+	 * to avoid any deadlocks caused by inversions.
+	 *
+	 * Gen11+
+	 */
+	if (INTEL_GEN(rq->i915) < 11 ||
+	    rq->engine->class != VIDEO_DECODE_CLASS ||
+	    rq->engine->class != signal->engine->class)
+		return;
+
+	set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
+	set_bit(I915_FENCE_FLAG_NOPREEMPT, &signal->fence.flags);
+
+	intel_context_set_single_submission(rq->context);
+	intel_context_set_single_submission(signal->context);
+}
+
+static void
+execlists_bond_execute(struct i915_request *rq, struct dma_fence *signal)
+{
+	mark_bonded_pair(rq, to_request(signal));
+}
+
 void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = execlists_submit_request;
+	if (engine->class == VIDEO_DECODE_CLASS)
+		engine->bond_execute = execlists_bond_execute;
 	engine->schedule = i915_schedule;
 	engine->execlists.tasklet.func = execlists_submission_tasklet;
 
@@ -5602,15 +5633,18 @@ virtual_find_bond(struct virtual_engine *ve,
 }
 
 static void
-virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
+virtual_bond_execute(struct i915_request *rq, struct dma_fence *sigfence)
 {
 	struct virtual_engine *ve = to_virtual_engine(rq->engine);
+	struct i915_request *signal = to_request(sigfence);
 	intel_engine_mask_t allowed, exec;
 	struct ve_bond *bond;
 
-	allowed = ~to_request(signal)->engine->mask;
+	mark_bonded_pair(rq, signal);
+
+	allowed = ~signal->engine->mask;
 
-	bond = virtual_find_bond(ve, to_request(signal)->engine);
+	bond = virtual_find_bond(ve, signal->engine);
 	if (bond)
 		allowed &= bond->sibling_mask;
 
@@ -5620,7 +5654,7 @@ virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 		;
 
 	/* Prevent the master from being re-run on the bonded engines */
-	to_request(signal)->execution_mask &= ~allowed;
+	signal->execution_mask &= ~allowed;
 }
 
 struct intel_context *
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
  2020-05-27  8:53 [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests Tvrtko Ursulin
@ 2020-05-27  9:20 ` Chris Wilson
  2020-05-27  9:36   ` Tvrtko Ursulin
  2020-05-27  9:29 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2020-05-27  9:20 UTC (permalink / raw)
  To: Intel-gfx, Tvrtko Ursulin; +Cc: Xiaogang Li

Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Our generic mechanism for specifying aligned request start,
> I915_EXEC_FENCE_SUBMIT, has a bit too relaxed rules when it comes to the
> actual use case of media scalability on Tigerlake.
> 
> While the submit fence can provide the aligned start, the  actual media
> pipeline expects that execution remains in lockstep from then onwards.
> Otherwise the hardware cannot handle well one of the pair getting
> preempted leading to GPU hangs and corrupted media playback.
> 
> This puts us in a difficult position where the only visible solution,
> which does not involve adding new uapi, is to give more meaning to the
> generic submit fence. At least when used between the video engines.
> 
> The special semantics this patch applies in that case are two fold. First
> is that both of the aligned pairs will be marked as non-preemptable and
> second is ensuring both are not sharing ELSP ports with any other context.
> 
> Non-preemptable property will ensure that once the start has been aligned
> the requests remain executing until completion.
> 
> Single port ELSP property will ensure there are no possible inversions
> between different independent pairs of aligned requests.

Nak.

> Going forward we can think of introducing new uapi to request this
> behaviour as a separate, more explicit flag, and at that point retire this
> semi-generic special handling.

As CI will say, such behaviour will already need a new flag. Forcing
no-preemption should be a privileged flag, so I would expect some
acknowledge that this is a HW problem that we have to workaround.

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Suggested-by: Xiaogang Li <xiaogang.li@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/intel_context.h |  2 +-
>  drivers/gpu/drm/i915/gt/intel_lrc.c     | 46 +++++++++++++++++++++----
>  2 files changed, 41 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index 07be021882cc..576d11c0cdd9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -212,7 +212,7 @@ intel_context_force_single_submission(const struct intel_context *ce)
>  static inline void
>  intel_context_set_single_submission(struct intel_context *ce)
>  {
> -       __set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
> +       set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
>  }
>  
>  static inline bool
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 3214a4ecc31a..88cf20fd92c8 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1734,8 +1734,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  
>  static bool ctx_single_port_submission(const struct intel_context *ce)
>  {
> -       return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
> -               intel_context_force_single_submission(ce));
> +       return intel_context_force_single_submission(ce);
>  }
>  
>  static bool can_merge_ctx(const struct intel_context *prev,
> @@ -4929,9 +4928,41 @@ static void execlists_park(struct intel_engine_cs *engine)
>         cancel_timer(&engine->execlists.preempt);
>  }
>  
> +static void
> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
> +{
> +       /*
> +        * Give (temporary) special meaning to a pair requests with requested
> +        * aligned start along the video engines.
> +        *
> +        * They should be non-preemptable and have all ELSP ports to themselves
> +        * to avoid any deadlocks caused by inversions.

This sentence needs expanding, because you are implying that this is an
issue we need to address in the code. If there is a deadlock resulting
from incorrect submission ordering, that is a bug in the code.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Special handling for bonded requests
  2020-05-27  8:53 [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests Tvrtko Ursulin
  2020-05-27  9:20 ` Chris Wilson
@ 2020-05-27  9:29 ` Patchwork
  2020-05-27  9:50 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2020-05-27  9:29 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Special handling for bonded requests
URL   : https://patchwork.freedesktop.org/series/77688/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
10e39c0f7a03 drm/i915: Special handling for bonded requests
-:20: WARNING:TYPO_SPELLING: 'preemptable' may be misspelled - perhaps 'preemptible'?
#20: 
is that both of the aligned pairs will be marked as non-preemptable and

-:23: WARNING:TYPO_SPELLING: 'preemptable' may be misspelled - perhaps 'preemptible'?
#23: 
Non-preemptable property will ensure that once the start has been aligned

-:75: WARNING:TYPO_SPELLING: 'preemptable' may be misspelled - perhaps 'preemptible'?
#75: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:4938:
+	 * They should be non-preemptable and have all ELSP ports to themselves

total: 0 errors, 3 warnings, 0 checks, 87 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
  2020-05-27  9:20 ` Chris Wilson
@ 2020-05-27  9:36   ` Tvrtko Ursulin
  2020-05-27 10:05     ` Chris Wilson
  0 siblings, 1 reply; 12+ messages in thread
From: Tvrtko Ursulin @ 2020-05-27  9:36 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx; +Cc: Xiaogang Li


On 27/05/2020 10:20, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Our generic mechanism for specifying aligned request start,
>> I915_EXEC_FENCE_SUBMIT, has a bit too relaxed rules when it comes to the
>> actual use case of media scalability on Tigerlake.
>>
>> While the submit fence can provide the aligned start, the  actual media
>> pipeline expects that execution remains in lockstep from then onwards.
>> Otherwise the hardware cannot handle well one of the pair getting
>> preempted leading to GPU hangs and corrupted media playback.
>>
>> This puts us in a difficult position where the only visible solution,
>> which does not involve adding new uapi, is to give more meaning to the
>> generic submit fence. At least when used between the video engines.
>>
>> The special semantics this patch applies in that case are two fold. First
>> is that both of the aligned pairs will be marked as non-preemptable and
>> second is ensuring both are not sharing ELSP ports with any other context.
>>
>> Non-preemptable property will ensure that once the start has been aligned
>> the requests remain executing until completion.
>>
>> Single port ELSP property will ensure there are no possible inversions
>> between different independent pairs of aligned requests.
> 
> Nak.
> 
>> Going forward we can think of introducing new uapi to request this
>> behaviour as a separate, more explicit flag, and at that point retire this
>> semi-generic special handling.
> 
> As CI will say, such behaviour will already need a new flag. Forcing
> no-preemption should be a privileged flag, so I would expect some
> acknowledge that this is a HW problem that we have to workaround.

I don't know how hw exactly works so from my side it's only empirical.

I agree new flag is needed but we also need a fix ASAP for TGL. And I 
don't think we can add new uapi in a reasonable time frame here. We 
would need the ctx set_parallel extension with a dont-preempt flag and 
multi-batch execbuf. And a lot of work in media-driver which will not be 
ready for TGL. I don't think a flag at the I915_EXEC_FENCE_SUBMIT level 
is a solution.

>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Suggested-by: Xiaogang Li <xiaogang.li@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_context.h |  2 +-
>>   drivers/gpu/drm/i915/gt/intel_lrc.c     | 46 +++++++++++++++++++++----
>>   2 files changed, 41 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
>> index 07be021882cc..576d11c0cdd9 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_context.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
>> @@ -212,7 +212,7 @@ intel_context_force_single_submission(const struct intel_context *ce)
>>   static inline void
>>   intel_context_set_single_submission(struct intel_context *ce)
>>   {
>> -       __set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
>> +       set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
>>   }
>>   
>>   static inline bool
>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> index 3214a4ecc31a..88cf20fd92c8 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> @@ -1734,8 +1734,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>>   
>>   static bool ctx_single_port_submission(const struct intel_context *ce)
>>   {
>> -       return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
>> -               intel_context_force_single_submission(ce));
>> +       return intel_context_force_single_submission(ce);
>>   }
>>   
>>   static bool can_merge_ctx(const struct intel_context *prev,
>> @@ -4929,9 +4928,41 @@ static void execlists_park(struct intel_engine_cs *engine)
>>          cancel_timer(&engine->execlists.preempt);
>>   }
>>   
>> +static void
>> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
>> +{
>> +       /*
>> +        * Give (temporary) special meaning to a pair requests with requested
>> +        * aligned start along the video engines.
>> +        *
>> +        * They should be non-preemptable and have all ELSP ports to themselves
>> +        * to avoid any deadlocks caused by inversions.
> 
> This sentence needs expanding, because you are implying that this is an
> issue we need to address in the code. If there is a deadlock resulting
> from incorrect submission ordering, that is a bug in the code.

If we add no-preempt I think we have to have single elsp because there 
is no ordering between unrelated pairs and then elsp inversion certainly 
sounds like a deadlock.

ctx-A: vcs0 + vcs1
ctx-B: vcs0 + vcs1

There is no ordering between A and B but we'd have to pick same port for 
both pairs of A and B respectively. I don't see that we can do that 
since all four submissions are async.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Special handling for bonded requests
  2020-05-27  8:53 [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests Tvrtko Ursulin
  2020-05-27  9:20 ` Chris Wilson
  2020-05-27  9:29 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2020-05-27  9:50 ` Patchwork
  2020-05-27 10:52 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2020-05-27  9:50 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Special handling for bonded requests
URL   : https://patchwork.freedesktop.org/series/77688/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8541 -> Patchwork_17782
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/index.html

Known issues
------------

  Here are the changes found in Patchwork_17782 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live@coherency:
    - fi-gdg-551:         [PASS][1] -> [DMESG-FAIL][2] ([i915#1748])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/fi-gdg-551/igt@i915_selftest@live@coherency.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/fi-gdg-551/igt@i915_selftest@live@coherency.html

  * igt@kms_chamelium@hdmi-crc-fast:
    - fi-kbl-7500u:       [PASS][3] -> [DMESG-FAIL][4] ([i915#165])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/fi-kbl-7500u/igt@kms_chamelium@hdmi-crc-fast.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/fi-kbl-7500u/igt@kms_chamelium@hdmi-crc-fast.html

  
  [i915#165]: https://gitlab.freedesktop.org/drm/intel/issues/165
  [i915#1748]: https://gitlab.freedesktop.org/drm/intel/issues/1748


Participating hosts (50 -> 44)
------------------------------

  Additional (1): fi-kbl-7560u 
  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_8541 -> Patchwork_17782

  CI-20190529: 20190529
  CI_DRM_8541: 6f35fec9320a0ee5178075dd3d9df6507f55af68 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5680: f7e3772175c53f0c910f4513831791cb5bdcab04 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17782: 10e39c0f7a03edbe826ad25606a4d465b1817d6b @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

10e39c0f7a03 drm/i915: Special handling for bonded requests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
  2020-05-27  9:36   ` Tvrtko Ursulin
@ 2020-05-27 10:05     ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2020-05-27 10:05 UTC (permalink / raw)
  To: Intel-gfx, Tvrtko Ursulin; +Cc: Xiaogang Li

Quoting Tvrtko Ursulin (2020-05-27 10:36:31)
> 
> On 27/05/2020 10:20, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Our generic mechanism for specifying aligned request start,
> >> I915_EXEC_FENCE_SUBMIT, has a bit too relaxed rules when it comes to the
> >> actual use case of media scalability on Tigerlake.
> >>
> >> While the submit fence can provide the aligned start, the  actual media
> >> pipeline expects that execution remains in lockstep from then onwards.
> >> Otherwise the hardware cannot handle well one of the pair getting
> >> preempted leading to GPU hangs and corrupted media playback.
> >>
> >> This puts us in a difficult position where the only visible solution,
> >> which does not involve adding new uapi, is to give more meaning to the
> >> generic submit fence. At least when used between the video engines.
> >>
> >> The special semantics this patch applies in that case are two fold. First
> >> is that both of the aligned pairs will be marked as non-preemptable and
> >> second is ensuring both are not sharing ELSP ports with any other context.
> >>
> >> Non-preemptable property will ensure that once the start has been aligned
> >> the requests remain executing until completion.
> >>
> >> Single port ELSP property will ensure there are no possible inversions
> >> between different independent pairs of aligned requests.
> > 
> > Nak.
> > 
> >> Going forward we can think of introducing new uapi to request this
> >> behaviour as a separate, more explicit flag, and at that point retire this
> >> semi-generic special handling.
> > 
> > As CI will say, such behaviour will already need a new flag. Forcing
> > no-preemption should be a privileged flag, so I would expect some
> > acknowledge that this is a HW problem that we have to workaround.
> 
> I don't know how hw exactly works so from my side it's only empirical.
> 
> I agree new flag is needed but we also need a fix ASAP for TGL. And I 
> don't think we can add new uapi in a reasonable time frame here. We 
> would need the ctx set_parallel extension with a dont-preempt flag and 
> multi-batch execbuf. And a lot of work in media-driver which will not be 
> ready for TGL. I don't think a flag at the I915_EXEC_FENCE_SUBMIT level 
> is a solution.

I'm going to say the starting point is to remove the
I915_ENGINE_HAS_PREEMPTION flag from vcs*. That way the tests that
require preemption will skip. Sadly that's the level of our current API.

> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> Suggested-by: Xiaogang Li <xiaogang.li@intel.com>
> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> ---
> >>   drivers/gpu/drm/i915/gt/intel_context.h |  2 +-
> >>   drivers/gpu/drm/i915/gt/intel_lrc.c     | 46 +++++++++++++++++++++----
> >>   2 files changed, 41 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> >> index 07be021882cc..576d11c0cdd9 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> >> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> >> @@ -212,7 +212,7 @@ intel_context_force_single_submission(const struct intel_context *ce)
> >>   static inline void
> >>   intel_context_set_single_submission(struct intel_context *ce)
> >>   {
> >> -       __set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
> >> +       set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ce->flags);
> >>   }
> >>   
> >>   static inline bool
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> index 3214a4ecc31a..88cf20fd92c8 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> @@ -1734,8 +1734,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> >>   
> >>   static bool ctx_single_port_submission(const struct intel_context *ce)
> >>   {
> >> -       return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
> >> -               intel_context_force_single_submission(ce));
> >> +       return intel_context_force_single_submission(ce);
> >>   }
> >>   
> >>   static bool can_merge_ctx(const struct intel_context *prev,
> >> @@ -4929,9 +4928,41 @@ static void execlists_park(struct intel_engine_cs *engine)
> >>          cancel_timer(&engine->execlists.preempt);
> >>   }
> >>   
> >> +static void
> >> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
> >> +{
> >> +       /*
> >> +        * Give (temporary) special meaning to a pair requests with requested
> >> +        * aligned start along the video engines.
> >> +        *
> >> +        * They should be non-preemptable and have all ELSP ports to themselves
> >> +        * to avoid any deadlocks caused by inversions.
> > 
> > This sentence needs expanding, because you are implying that this is an
> > issue we need to address in the code. If there is a deadlock resulting
> > from incorrect submission ordering, that is a bug in the code.
> 
> If we add no-preempt I think we have to have single elsp because there 
> is no ordering between unrelated pairs and then elsp inversion certainly 
> sounds like a deadlock.

You don't have a deadlock between unrelated pairs :)

> 
> ctx-A: vcs0 + vcs1
> ctx-B: vcs0 + vcs1
> 
> There is no ordering between A and B but we'd have to pick same port for 
> both pairs of A and B respectively. I don't see that we can do that 
> since all four submissions are async.
> 
> Regards,
> 
> Tvrtko
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Special handling for bonded requests
  2020-05-27  8:53 [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2020-05-27  9:50 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2020-05-27 10:52 ` Patchwork
  2020-05-28  9:57 ` [Intel-gfx] [PATCH] " Chris Wilson
  2020-05-28 20:56 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Special handling for bonded requests (rev2) Patchwork
  5 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2020-05-27 10:52 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Special handling for bonded requests
URL   : https://patchwork.freedesktop.org/series/77688/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8541_full -> Patchwork_17782_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_17782_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17782_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_17782_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_eio@in-flight-contexts-10ms:
    - shard-snb:          [PASS][1] -> [TIMEOUT][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-snb6/igt@gem_eio@in-flight-contexts-10ms.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-snb2/igt@gem_eio@in-flight-contexts-10ms.html

  * igt@kms_plane_alpha_blend@pipe-b-alpha-basic:
    - shard-glk:          NOTRUN -> [TIMEOUT][3] +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-glk1/igt@kms_plane_alpha_blend@pipe-b-alpha-basic.html

  
#### Warnings ####

  * igt@runner@aborted:
    - shard-hsw:          [FAIL][4] ([i915#226]) -> ([FAIL][5], [FAIL][6])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-hsw8/igt@runner@aborted.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-hsw6/igt@runner@aborted.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-hsw4/igt@runner@aborted.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@gem_exec_fence@concurrent@vcs0}:
    - shard-iclb:         [PASS][7] -> [FAIL][8] +1 similar issue
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-iclb1/igt@gem_exec_fence@concurrent@vcs0.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-iclb8/igt@gem_exec_fence@concurrent@vcs0.html

  * {igt@gem_exec_fence@submit3@vcs0}:
    - shard-tglb:         [PASS][9] -> [INCOMPLETE][10] +2 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-tglb1/igt@gem_exec_fence@submit3@vcs0.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-tglb3/igt@gem_exec_fence@submit3@vcs0.html

  * {igt@gem_exec_fence@submit67@vcs0}:
    - shard-iclb:         [PASS][11] -> [INCOMPLETE][12] +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-iclb1/igt@gem_exec_fence@submit67@vcs0.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-iclb4/igt@gem_exec_fence@submit67@vcs0.html

  * {igt@gem_exec_schedule@submit-golden-slice@vcs0}:
    - shard-tglb:         [PASS][13] -> [FAIL][14] +2 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-tglb5/igt@gem_exec_schedule@submit-golden-slice@vcs0.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-tglb5/igt@gem_exec_schedule@submit-golden-slice@vcs0.html

  
Known issues
------------

  Here are the changes found in Patchwork_17782_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_cursor_crc@pipe-a-cursor-128x42-onscreen:
    - shard-kbl:          [PASS][15] -> [DMESG-WARN][16] ([i915#78])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-kbl1/igt@kms_cursor_crc@pipe-a-cursor-128x42-onscreen.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-kbl2/igt@kms_cursor_crc@pipe-a-cursor-128x42-onscreen.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
    - shard-kbl:          [PASS][17] -> [DMESG-WARN][18] ([i915#180]) +3 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-kbl7/igt@kms_cursor_crc@pipe-a-cursor-suspend.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-kbl6/igt@kms_cursor_crc@pipe-a-cursor-suspend.html

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy:
    - shard-hsw:          [PASS][19] -> [DMESG-WARN][20] ([i915#1927])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-hsw8/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-hsw4/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy.html

  * igt@kms_draw_crc@draw-method-rgb565-render-untiled:
    - shard-snb:          [PASS][21] -> [SKIP][22] ([fdo#109271])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-snb1/igt@kms_draw_crc@draw-method-rgb565-render-untiled.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-snb4/igt@kms_draw_crc@draw-method-rgb565-render-untiled.html

  * igt@kms_lease@lease-uevent:
    - shard-kbl:          [PASS][23] -> [DMESG-WARN][24] ([i915#165])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-kbl1/igt@kms_lease@lease-uevent.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-kbl2/igt@kms_lease@lease-uevent.html

  * igt@kms_plane@plane-panning-bottom-right-pipe-b-planes:
    - shard-skl:          [PASS][25] -> [FAIL][26] ([i915#1036])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl3/igt@kms_plane@plane-panning-bottom-right-pipe-b-planes.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl3/igt@kms_plane@plane-panning-bottom-right-pipe-b-planes.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes:
    - shard-kbl:          [PASS][27] -> [DMESG-WARN][28] ([i915#180] / [i915#93] / [i915#95])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-kbl4/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-kbl4/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [PASS][29] -> [FAIL][30] ([fdo#108145] / [i915#265])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl3/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl3/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_mmap_gtt:
    - shard-iclb:         [PASS][31] -> [SKIP][32] ([fdo#109441])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-iclb2/igt@kms_psr@psr2_cursor_mmap_gtt.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-iclb4/igt@kms_psr@psr2_cursor_mmap_gtt.html

  * igt@kms_vblank@pipe-c-ts-continuation-suspend:
    - shard-apl:          [PASS][33] -> [DMESG-WARN][34] ([i915#180]) +1 similar issue
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-apl8/igt@kms_vblank@pipe-c-ts-continuation-suspend.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-apl6/igt@kms_vblank@pipe-c-ts-continuation-suspend.html

  
#### Possible fixes ####

  * {igt@gem_exec_parallel@fds@bcs0}:
    - shard-iclb:         [INCOMPLETE][35] ([i915#1895]) -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-iclb2/igt@gem_exec_parallel@fds@bcs0.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-iclb1/igt@gem_exec_parallel@fds@bcs0.html

  * igt@gem_workarounds@suspend-resume:
    - shard-iclb:         [INCOMPLETE][37] ([i915#1185]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-iclb3/igt@gem_workarounds@suspend-resume.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-iclb1/igt@gem_workarounds@suspend-resume.html

  * igt@kms_big_fb@y-tiled-64bpp-rotate-180:
    - shard-glk:          [FAIL][39] ([i915#1119] / [i915#118] / [i915#95]) -> [PASS][40] +1 similar issue
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-glk8/igt@kms_big_fb@y-tiled-64bpp-rotate-180.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-glk7/igt@kms_big_fb@y-tiled-64bpp-rotate-180.html

  * igt@kms_cursor_crc@pipe-c-cursor-128x42-onscreen:
    - shard-skl:          [FAIL][41] ([i915#54]) -> [PASS][42]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl3/igt@kms_cursor_crc@pipe-c-cursor-128x42-onscreen.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl9/igt@kms_cursor_crc@pipe-c-cursor-128x42-onscreen.html

  * igt@kms_cursor_legacy@all-pipes-torture-move:
    - shard-hsw:          [DMESG-WARN][43] ([i915#128]) -> [PASS][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-hsw8/igt@kms_cursor_legacy@all-pipes-torture-move.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-hsw6/igt@kms_cursor_legacy@all-pipes-torture-move.html

  * igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic:
    - shard-apl:          [FAIL][45] ([i915#1181]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-apl8/igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-apl8/igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic.html

  * {igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1}:
    - shard-skl:          [FAIL][47] ([i915#79]) -> [PASS][48]
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl10/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl4/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html

  * {igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a2}:
    - shard-glk:          [FAIL][49] ([i915#79]) -> [PASS][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-glk9/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a2.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-glk9/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a2.html

  * {igt@kms_flip@flip-vs-suspend-interruptible@a-dp1}:
    - shard-kbl:          [DMESG-WARN][51] ([i915#180]) -> [PASS][52] +4 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-kbl4/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-kbl4/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html

  * {igt@kms_flip@flip-vs-suspend@a-edp1}:
    - shard-skl:          [INCOMPLETE][53] ([i915#198]) -> [PASS][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl8/igt@kms_flip@flip-vs-suspend@a-edp1.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl6/igt@kms_flip@flip-vs-suspend@a-edp1.html

  * {igt@kms_flip@plain-flip-fb-recreate-interruptible@c-dp1}:
    - shard-kbl:          [FAIL][55] ([i915#1928]) -> [PASS][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-kbl3/igt@kms_flip@plain-flip-fb-recreate-interruptible@c-dp1.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-kbl4/igt@kms_flip@plain-flip-fb-recreate-interruptible@c-dp1.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-skl:          [FAIL][57] ([i915#1188]) -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl6/igt@kms_hdr@bpc-switch-suspend.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl8/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [FAIL][59] ([fdo#108145] / [i915#265]) -> [PASS][60] +1 similar issue
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl7/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl5/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_psr@psr2_sprite_mmap_cpu:
    - shard-iclb:         [SKIP][61] ([fdo#109441]) -> [PASS][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-iclb7/igt@kms_psr@psr2_sprite_mmap_cpu.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_cpu.html

  * igt@kms_vblank@pipe-b-ts-continuation-suspend:
    - shard-apl:          [DMESG-WARN][63] ([i915#180]) -> [PASS][64] +1 similar issue
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-apl1/igt@kms_vblank@pipe-b-ts-continuation-suspend.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-apl1/igt@kms_vblank@pipe-b-ts-continuation-suspend.html

  * igt@kms_vblank@pipe-c-ts-continuation-suspend:
    - shard-skl:          [INCOMPLETE][65] ([i915#69]) -> [PASS][66]
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-skl2/igt@kms_vblank@pipe-c-ts-continuation-suspend.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-skl7/igt@kms_vblank@pipe-c-ts-continuation-suspend.html

  * {igt@perf@blocking-parameterized}:
    - shard-hsw:          [FAIL][67] ([i915#1542]) -> [PASS][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-hsw6/igt@perf@blocking-parameterized.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-hsw1/igt@perf@blocking-parameterized.html

  
#### Warnings ####

  * igt@kms_content_protection@atomic-dpms:
    - shard-apl:          [TIMEOUT][69] ([i915#1319]) -> [TIMEOUT][70] ([i915#1319] / [i915#1635])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-apl4/igt@kms_content_protection@atomic-dpms.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-apl8/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_content_protection@legacy:
    - shard-apl:          [TIMEOUT][71] ([i915#1319] / [i915#1635]) -> [FAIL][72] ([fdo#110321] / [fdo#110336])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-apl7/igt@kms_content_protection@legacy.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-apl2/igt@kms_content_protection@legacy.html

  * igt@kms_content_protection@lic:
    - shard-apl:          [TIMEOUT][73] ([i915#1319] / [i915#1635]) -> [FAIL][74] ([fdo#110321])
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-apl3/igt@kms_content_protection@lic.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-apl7/igt@kms_content_protection@lic.html

  * igt@kms_content_protection@srm:
    - shard-apl:          [FAIL][75] ([fdo#110321]) -> [TIMEOUT][76] ([i915#1319] / [i915#1635])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-apl8/igt@kms_content_protection@srm.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-apl8/igt@kms_content_protection@srm.html

  * igt@kms_cursor_legacy@cursora-vs-flipb-toggle:
    - shard-glk:          [DMESG-WARN][77] ([i915#1926]) -> [DMESG-FAIL][78] ([i915#1925])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-glk4/igt@kms_cursor_legacy@cursora-vs-flipb-toggle.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-glk4/igt@kms_cursor_legacy@cursora-vs-flipb-toggle.html

  * igt@kms_plane@plane-position-covered-pipe-d-planes:
    - shard-hsw:          [SKIP][79] ([fdo#109271]) -> [SKIP][80] ([fdo#109271] / [i915#1927])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-hsw1/igt@kms_plane@plane-position-covered-pipe-d-planes.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-hsw6/igt@kms_plane@plane-position-covered-pipe-d-planes.html

  * igt@kms_psr2_su@page_flip:
    - shard-iclb:         [SKIP][81] ([fdo#109642] / [fdo#111068]) -> [FAIL][82] ([i915#608])
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8541/shard-iclb3/igt@kms_psr2_su@page_flip.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/shard-iclb2/igt@kms_psr2_su@page_flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110321]: https://bugs.freedesktop.org/show_bug.cgi?id=110321
  [fdo#110336]: https://bugs.freedesktop.org/show_bug.cgi?id=110336
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [i915#1036]: https://gitlab.freedesktop.org/drm/intel/issues/1036
  [i915#1119]: https://gitlab.freedesktop.org/drm/intel/issues/1119
  [i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
  [i915#1181]: https://gitlab.freedesktop.org/drm/intel/issues/1181
  [i915#1185]: https://gitlab.freedesktop.org/drm/intel/issues/1185
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#128]: https://gitlab.freedesktop.org/drm/intel/issues/128
  [i915#1319]: https://gitlab.freedesktop.org/drm/intel/issues/1319
  [i915#1542]: https://gitlab.freedesktop.org/drm/intel/issues/1542
  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#165]: https://gitlab.freedesktop.org/drm/intel/issues/165
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1895]: https://gitlab.freedesktop.org/drm/intel/issues/1895
  [i915#1925]: https://gitlab.freedesktop.org/drm/intel/issues/1925
  [i915#1926]: https://gitlab.freedesktop.org/drm/intel/issues/1926
  [i915#1927]: https://gitlab.freedesktop.org/drm/intel/issues/1927
  [i915#1928]: https://gitlab.freedesktop.org/drm/intel/issues/1928
  [i915#1930]: https://gitlab.freedesktop.org/drm/intel/issues/1930
  [i915#198]: https://gitlab.freedesktop.org/drm/intel/issues/198
  [i915#226]: https://gitlab.freedesktop.org/drm/intel/issues/226
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
  [i915#608]: https://gitlab.freedesktop.org/drm/intel/issues/608
  [i915#61]: https://gitlab.freedesktop.org/drm/intel/issues/61
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#78]: https://gitlab.freedesktop.org/drm/intel/issues/78
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#93]: https://gitlab.freedesktop.org/drm/intel/issues/93
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (11 -> 10)
------------------------------

  Missing    (1): pig-icl-1065g7 


Build changes
-------------

  * Linux: CI_DRM_8541 -> Patchwork_17782

  CI-20190529: 20190529
  CI_DRM_8541: 6f35fec9320a0ee5178075dd3d9df6507f55af68 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5680: f7e3772175c53f0c910f4513831791cb5bdcab04 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17782: 10e39c0f7a03edbe826ad25606a4d465b1817d6b @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17782/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
  2020-05-27  8:53 [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2020-05-27 10:52 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
@ 2020-05-28  9:57 ` Chris Wilson
  2020-05-28 10:20   ` Tvrtko Ursulin
  2020-05-28 20:56 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Special handling for bonded requests (rev2) Patchwork
  5 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2020-05-28  9:57 UTC (permalink / raw)
  To: Intel-gfx, Tvrtko Ursulin; +Cc: Xiaogang Li

Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
> +static void
> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
> +{
> +       /*
> +        * Give (temporary) special meaning to a pair requests with requested
> +        * aligned start along the video engines.
> +        *
> +        * They should be non-preemptable and have all ELSP ports to themselves
> +        * to avoid any deadlocks caused by inversions.
> +        *
> +        * Gen11+
> +        */
> +       if (INTEL_GEN(rq->i915) < 11 ||
> +           rq->engine->class != VIDEO_DECODE_CLASS ||
> +           rq->engine->class != signal->engine->class)
> +               return;
> +
> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &signal->fence.flags);
> +
> +       intel_context_set_single_submission(rq->context);
> +       intel_context_set_single_submission(signal->context);

The thought that just popped into my head:

This can be after signal is already submitted into ELSP[1].
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
  2020-05-28  9:57 ` [Intel-gfx] [PATCH] " Chris Wilson
@ 2020-05-28 10:20   ` Tvrtko Ursulin
  2020-05-28 20:23     ` Chris Wilson
  0 siblings, 1 reply; 12+ messages in thread
From: Tvrtko Ursulin @ 2020-05-28 10:20 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx; +Cc: Xiaogang Li


On 28/05/2020 10:57, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
>> +static void
>> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
>> +{
>> +       /*
>> +        * Give (temporary) special meaning to a pair requests with requested
>> +        * aligned start along the video engines.
>> +        *
>> +        * They should be non-preemptable and have all ELSP ports to themselves
>> +        * to avoid any deadlocks caused by inversions.
>> +        *
>> +        * Gen11+
>> +        */
>> +       if (INTEL_GEN(rq->i915) < 11 ||
>> +           rq->engine->class != VIDEO_DECODE_CLASS ||
>> +           rq->engine->class != signal->engine->class)
>> +               return;
>> +
>> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
>> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &signal->fence.flags);
>> +
>> +       intel_context_set_single_submission(rq->context);
>> +       intel_context_set_single_submission(signal->context);
> 
> The thought that just popped into my head:
> 
> This can be after signal is already submitted into ELSP[1].

Yep I knew that but thought it would still work.

Master in vcs0 port1, slave in vcs1 port0 or queued.

If queued that means at worst case another bonded pair is running on 
same engines, so they should be able to complete.

If slave submitted to vcs1 port0 then it will stay there until whatever 
is in vcs0 port0 finishes and lets master in.

Do you see a possibility for things to go bad?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
  2020-05-28 10:20   ` Tvrtko Ursulin
@ 2020-05-28 20:23     ` Chris Wilson
  2020-05-29  7:25       ` Tvrtko Ursulin
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2020-05-28 20:23 UTC (permalink / raw)
  To: Intel-gfx, Tvrtko Ursulin; +Cc: Xiaogang Li

Quoting Tvrtko Ursulin (2020-05-28 11:20:07)
> 
> On 28/05/2020 10:57, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
> >> +static void
> >> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
> >> +{
> >> +       /*
> >> +        * Give (temporary) special meaning to a pair requests with requested
> >> +        * aligned start along the video engines.
> >> +        *
> >> +        * They should be non-preemptable and have all ELSP ports to themselves
> >> +        * to avoid any deadlocks caused by inversions.
> >> +        *
> >> +        * Gen11+
> >> +        */
> >> +       if (INTEL_GEN(rq->i915) < 11 ||
> >> +           rq->engine->class != VIDEO_DECODE_CLASS ||
> >> +           rq->engine->class != signal->engine->class)
> >> +               return;
> >> +
> >> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
> >> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &signal->fence.flags);
> >> +
> >> +       intel_context_set_single_submission(rq->context);
> >> +       intel_context_set_single_submission(signal->context);
> > 
> > The thought that just popped into my head:
> > 
> > This can be after signal is already submitted into ELSP[1].
> 
> Yep I knew that but thought it would still work.
> 
> Master in vcs0 port1, slave in vcs1 port0 or queued.
> 
> If queued that means at worst case another bonded pair is running on 
> same engines, so they should be able to complete.
> 
> If slave submitted to vcs1 port0 then it will stay there until whatever 
> is in vcs0 port0 finishes and lets master in.
> 
> Do you see a possibility for things to go bad?

Because the master is already submitted in port1, the bond can go into
port0. Then a second bond turns up for the master in port0, and we're
back at square one.

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 37855ae8f8b3..698608e11df3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2502,6 +2502,7 @@ static void eb_request_add(struct i915_execbuffer *eb)
        lockdep_unpin_lock(&tl->mutex, rq->cookie);
 
        trace_i915_request_add(rq);
+       set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
 
        prev = __i915_request_commit(rq);

Will do the trick.

(Plus fixing up the rules for assert_pending_valid).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Special handling for bonded requests (rev2)
  2020-05-27  8:53 [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2020-05-28  9:57 ` [Intel-gfx] [PATCH] " Chris Wilson
@ 2020-05-28 20:56 ` Patchwork
  5 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2020-05-28 20:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Special handling for bonded requests (rev2)
URL   : https://patchwork.freedesktop.org/series/77688/
State : failure

== Summary ==

Applying: drm/i915: Special handling for bonded requests
error: sha1 information is lacking or useless (drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 drm/i915: Special handling for bonded requests
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests
  2020-05-28 20:23     ` Chris Wilson
@ 2020-05-29  7:25       ` Tvrtko Ursulin
  0 siblings, 0 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2020-05-29  7:25 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx; +Cc: Xiaogang Li


On 28/05/2020 21:23, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-05-28 11:20:07)
>>
>> On 28/05/2020 10:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
>>>> +static void
>>>> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
>>>> +{
>>>> +       /*
>>>> +        * Give (temporary) special meaning to a pair requests with requested
>>>> +        * aligned start along the video engines.
>>>> +        *
>>>> +        * They should be non-preemptable and have all ELSP ports to themselves
>>>> +        * to avoid any deadlocks caused by inversions.
>>>> +        *
>>>> +        * Gen11+
>>>> +        */
>>>> +       if (INTEL_GEN(rq->i915) < 11 ||
>>>> +           rq->engine->class != VIDEO_DECODE_CLASS ||
>>>> +           rq->engine->class != signal->engine->class)
>>>> +               return;
>>>> +
>>>> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
>>>> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &signal->fence.flags);
>>>> +
>>>> +       intel_context_set_single_submission(rq->context);
>>>> +       intel_context_set_single_submission(signal->context);
>>>
>>> The thought that just popped into my head:
>>>
>>> This can be after signal is already submitted into ELSP[1].
>>
>> Yep I knew that but thought it would still work.
>>
>> Master in vcs0 port1, slave in vcs1 port0 or queued.
>>
>> If queued that means at worst case another bonded pair is running on
>> same engines, so they should be able to complete.
>>
>> If slave submitted to vcs1 port0 then it will stay there until whatever
>> is in vcs0 port0 finishes and lets master in.
>>
>> Do you see a possibility for things to go bad?
> 
> Because the master is already submitted in port1, the bond can go into
> port0. Then a second bond turns up for the master in port0, and we're
> back at square one.
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 37855ae8f8b3..698608e11df3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2502,6 +2502,7 @@ static void eb_request_add(struct i915_execbuffer *eb)
>          lockdep_unpin_lock(&tl->mutex, rq->cookie);
>   
>          trace_i915_request_add(rq);
> +       set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
>   
>          prev = __i915_request_commit(rq);
> 
> Will do the trick.
> 
> (Plus fixing up the rules for assert_pending_valid).

Hmm yes, my logic was flawed by missing to see the async disconnect 
between master and slave submission on both ends. That's why Xiaogang 
was saying slaves must not have no-preempt set... But sentinel on 
everything? Or just everything vcs and gen11+?

So if we indeed had slave preemptible the deadlock would have been 
avoided I think, but can the media pipeline handle that is the question.

Another question is that it sounds it could be possible to work around 
this in userspace, combined with this patch (original thread), if a 
fence was used to block master until slave is submitted.

  split_fence = sw_fence_create()
  execbuf(master, in_fence = split_fence) = master_fence
  execbuf(slave, submit_fence = master_fence)
  sw_fence_advance(split_fence)

That would make sure the single port and no-preempt properties are 
applied before either master or slave can enter elsp.

Sounds tempting to try, thoughts?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-05-29  7:25 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-27  8:53 [Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests Tvrtko Ursulin
2020-05-27  9:20 ` Chris Wilson
2020-05-27  9:36   ` Tvrtko Ursulin
2020-05-27 10:05     ` Chris Wilson
2020-05-27  9:29 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2020-05-27  9:50 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-05-27 10:52 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2020-05-28  9:57 ` [Intel-gfx] [PATCH] " Chris Wilson
2020-05-28 10:20   ` Tvrtko Ursulin
2020-05-28 20:23     ` Chris Wilson
2020-05-29  7:25       ` Tvrtko Ursulin
2020-05-28 20:56 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Special handling for bonded requests (rev2) Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).