All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
@ 2017-01-11 12:14 Chris Wilson
  2017-01-11 12:56 ` ✗ Fi.CI.BAT: failure for " Patchwork
  2017-01-11 12:58 ` [PATCH] " Joonas Lahtinen
  0 siblings, 2 replies; 7+ messages in thread
From: Chris Wilson @ 2017-01-11 12:14 UTC (permalink / raw)
  To: intel-gfx

When switching between contexts using the aliasing_ppgtt, the VM is
shared. We don't need to reload the PD registers unless they are dirty.

Martin Peres reported an issue that looks like corruption between
Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
Rearrange switch_context to load the aliasing ppgtt on first use").
Switching between the same mm (the aliasing_ppgtt is used for all
contexts in this case) should be a nop, but appears to trigger some
side-effects in the context switch. However, as we know the switch
is redundant in this case, we can skip it and continue to ignore the
issue until somebody feels strong enough to investigate full-ppgtt on
gen7 again!

Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
Reported-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Martin Peres <martin.peres@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index ed31133b3ce3..86426c1a9534 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -728,10 +728,10 @@ static inline bool skip_rcs_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static bool
-needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
-		  struct intel_engine_cs *engine,
-		  struct i915_gem_context *to)
+needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt, struct intel_engine_cs *engine)
 {
+	struct i915_hw_ppgtt *last_ppgtt;
+
 	if (!ppgtt)
 		return false;
 
@@ -740,7 +740,9 @@ needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
 		return true;
 
 	/* Same context without new entries, skip */
-	if (engine->legacy_active_context == to &&
+	last_ppgtt =
+		engine->legacy_active_context->ppgtt ?: engine->i915->mm.aliasing_ppgtt;
+	if (last_ppgtt == ppgtt &&
 	    !(intel_engine_flag(engine) & ppgtt->pd_dirty_rings))
 		return false;
 
@@ -784,7 +786,7 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 	if (skip_rcs_switch(ppgtt, engine, to))
 		return 0;
 
-	if (needs_pd_load_pre(ppgtt, engine, to)) {
+	if (needs_pd_load_pre(ppgtt, engine)) {
 		/* Older GENs and non render rings still want the load first,
 		 * "PP_DCLV followed by PP_DIR_BASE register through Load
 		 * Register Immediate commands in Ring Buffer before submitting
@@ -881,7 +883,7 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 		struct i915_hw_ppgtt *ppgtt =
 			to->ppgtt ?: req->i915->mm.aliasing_ppgtt;
 
-		if (needs_pd_load_pre(ppgtt, engine, to)) {
+		if (needs_pd_load_pre(ppgtt, engine)) {
 			int ret;
 
 			trace_switch_mm(engine, to);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  2017-01-11 12:14 [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt Chris Wilson
@ 2017-01-11 12:56 ` Patchwork
  2017-01-11 13:01   ` Chris Wilson
  2017-01-11 12:58 ` [PATCH] " Joonas Lahtinen
  1 sibling, 1 reply; 7+ messages in thread
From: Patchwork @ 2017-01-11 12:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
URL   : https://patchwork.freedesktop.org/series/17823/
State : failure

== Summary ==

Series 17823v1 drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/

Test drv_hangman:
        Subgroup error-state-basic:
                pass       -> TIMEOUT    (fi-hsw-4770)
Test gem_close_race:
        Subgroup basic-process:
                pass       -> INCOMPLETE (fi-hsw-4770)
        Subgroup basic-threads:
                pass       -> INCOMPLETE (fi-hsw-4770r)
Test gem_ctx_basic:
                pass       -> INCOMPLETE (fi-ivb-3520m)

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:14   pass:12   dwarn:0   dfail:0   fail:0   skip:0  
fi-hsw-4770r     total:15   pass:14   dwarn:0   dfail:0   fail:0   skip:0  
fi-ivb-3520m     total:18   pass:17   dwarn:0   dfail:0   fail:0   skip:0  
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
f58497f drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3478/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  2017-01-11 12:14 [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt Chris Wilson
  2017-01-11 12:56 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2017-01-11 12:58 ` Joonas Lahtinen
  2017-01-11 15:35   ` Martin Peres
  1 sibling, 1 reply; 7+ messages in thread
From: Joonas Lahtinen @ 2017-01-11 12:58 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On ke, 2017-01-11 at 12:14 +0000, Chris Wilson wrote:
> When switching between contexts using the aliasing_ppgtt, the VM is
> shared. We don't need to reload the PD registers unless they are dirty.
> 
> Martin Peres reported an issue that looks like corruption between
> Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
> Rearrange switch_context to load the aliasing ppgtt on first use").
> Switching between the same mm (the aliasing_ppgtt is used for all
> contexts in this case) should be a nop, but appears to trigger some
> side-effects in the context switch. However, as we know the switch
> is redundant in this case, we can skip it and continue to ignore the
> issue until somebody feels strong enough to investigate full-ppgtt on
> gen7 again!
> 
> Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
> Reported-by: Martin Peres <martin.peres@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Martin Peres <martin.peres@linux.intel.com>

Code looks good, could use the T-b's to verify.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  2017-01-11 12:56 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2017-01-11 13:01   ` Chris Wilson
  2017-01-11 13:38     ` Chris Wilson
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2017-01-11 13:01 UTC (permalink / raw)
  To: intel-gfx

On Wed, Jan 11, 2017 at 12:56:03PM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> URL   : https://patchwork.freedesktop.org/series/17823/
> State : failure
> 
> == Summary ==
> 
> Series 17823v1 drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/
> 
> Test drv_hangman:
>         Subgroup error-state-basic:
>                 pass       -> TIMEOUT    (fi-hsw-4770)
> Test gem_close_race:
>         Subgroup basic-process:
>                 pass       -> INCOMPLETE (fi-hsw-4770)
>         Subgroup basic-threads:
>                 pass       -> INCOMPLETE (fi-hsw-4770r)
> Test gem_ctx_basic:
>                 pass       -> INCOMPLETE (fi-ivb-3520m)

Ooh. That is suitably scary that there is something wrong going here.
Still think the patch is sane by itself, so suspecting there is
something not meeting the eye here.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  2017-01-11 13:01   ` Chris Wilson
@ 2017-01-11 13:38     ` Chris Wilson
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2017-01-11 13:38 UTC (permalink / raw)
  To: intel-gfx

On Wed, Jan 11, 2017 at 01:01:18PM +0000, Chris Wilson wrote:
> On Wed, Jan 11, 2017 at 12:56:03PM -0000, Patchwork wrote:
> > == Series Details ==
> > 
> > Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> > URL   : https://patchwork.freedesktop.org/series/17823/
> > State : failure
> > 
> > == Summary ==
> > 
> > Series 17823v1 drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> > https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/
> > 
> > Test drv_hangman:
> >         Subgroup error-state-basic:
> >                 pass       -> TIMEOUT    (fi-hsw-4770)
> > Test gem_close_race:
> >         Subgroup basic-process:
> >                 pass       -> INCOMPLETE (fi-hsw-4770)
> >         Subgroup basic-threads:
> >                 pass       -> INCOMPLETE (fi-hsw-4770r)
> > Test gem_ctx_basic:
> >                 pass       -> INCOMPLETE (fi-ivb-3520m)
> 
> Ooh. That is suitably scary that there is something wrong going here.
> Still think the patch is sane by itself, so suspecting there is
> something not meeting the eye here.

To further demonstrate the bizarreness, they are all *CPU* lockups.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  2017-01-11 12:58 ` [PATCH] " Joonas Lahtinen
@ 2017-01-11 15:35   ` Martin Peres
  2017-01-11 15:43     ` Chris Wilson
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Peres @ 2017-01-11 15:35 UTC (permalink / raw)
  To: Joonas Lahtinen, Chris Wilson, intel-gfx

On 11/01/17 14:58, Joonas Lahtinen wrote:
> On ke, 2017-01-11 at 12:14 +0000, Chris Wilson wrote:
>> When switching between contexts using the aliasing_ppgtt, the VM is
>> shared. We don't need to reload the PD registers unless they are dirty.
>>
>> Martin Peres reported an issue that looks like corruption between
>> Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
>> Rearrange switch_context to load the aliasing ppgtt on first use").
>> Switching between the same mm (the aliasing_ppgtt is used for all
>> contexts in this case) should be a nop, but appears to trigger some
>> side-effects in the context switch. However, as we know the switch
>> is redundant in this case, we can skip it and continue to ignore the
>> issue until somebody feels strong enough to investigate full-ppgtt on
>> gen7 again!
>>
>> Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
>> Reported-by: Martin Peres <martin.peres@linux.intel.com>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Martin Peres <martin.peres@linux.intel.com>
>
> Code looks good, could use the T-b's to verify.
>
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>
> Regards, Joonas
>

https://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-mupuf&id=cfe8f1043b45896af23e4a978020fe20e90c5072 
was actually the commit that massively improved the corruption I was 
seeing in one benchmark while this patch had no visible impact.

However, my problem was that i915.enable_ppgtt=2 was set in 
/etc/modprobe.d/... and I had completely forgotten about it.

So yeah, now you know that f9326be5f1d3 massively broke enable_ppgtt=2, 
but not sure what you want to do about it.

There is no hurry though, as the defaults are sane.

Sorry for the noise everyone, I hope that my painful manual bisects will 
be useful if someone wants to make the second mode work :)

Martin
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  2017-01-11 15:35   ` Martin Peres
@ 2017-01-11 15:43     ` Chris Wilson
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2017-01-11 15:43 UTC (permalink / raw)
  To: Martin Peres; +Cc: intel-gfx

On Wed, Jan 11, 2017 at 05:35:08PM +0200, Martin Peres wrote:
> On 11/01/17 14:58, Joonas Lahtinen wrote:
> >On ke, 2017-01-11 at 12:14 +0000, Chris Wilson wrote:
> >>When switching between contexts using the aliasing_ppgtt, the VM is
> >>shared. We don't need to reload the PD registers unless they are dirty.
> >>
> >>Martin Peres reported an issue that looks like corruption between
> >>Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
> >>Rearrange switch_context to load the aliasing ppgtt on first use").
> >>Switching between the same mm (the aliasing_ppgtt is used for all
> >>contexts in this case) should be a nop, but appears to trigger some
> >>side-effects in the context switch. However, as we know the switch
> >>is redundant in this case, we can skip it and continue to ignore the
> >>issue until somebody feels strong enough to investigate full-ppgtt on
> >>gen7 again!
> >>
> >>Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
> >>Reported-by: Martin Peres <martin.peres@linux.intel.com>
> >>Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>Cc: Martin Peres <martin.peres@linux.intel.com>
> >
> >Code looks good, could use the T-b's to verify.
> >
> >Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >
> >Regards, Joonas
> >
> 
> https://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-mupuf&id=cfe8f1043b45896af23e4a978020fe20e90c5072
> was actually the commit that massively improved the corruption I was
> seeing in one benchmark while this patch had no visible impact.
> 
> However, my problem was that i915.enable_ppgtt=2 was set in
> /etc/modprobe.d/... and I had completely forgotten about it.
> 
> So yeah, now you know that f9326be5f1d3 massively broke
> enable_ppgtt=2, but not sure what you want to do about it.
> 
> There is no hurry though, as the defaults are sane.
> 
> Sorry for the noise everyone, I hope that my painful manual bisects
> will be useful if someone wants to make the second mode work :)

The information is very useful, I've added that the symptoms have only
been seen with full-ppgtt. I don't see any reason to apply this patch now,
so will send it along with the series playing with i915_gem_gtt.c once
the kselftests for it are in.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-01-11 15:43 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-11 12:14 [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt Chris Wilson
2017-01-11 12:56 ` ✗ Fi.CI.BAT: failure for " Patchwork
2017-01-11 13:01   ` Chris Wilson
2017-01-11 13:38     ` Chris Wilson
2017-01-11 12:58 ` [PATCH] " Joonas Lahtinen
2017-01-11 15:35   ` Martin Peres
2017-01-11 15:43     ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.