* [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
@ 2019-09-12  7:09 Chris Wilson
  2019-09-12  7:09 ` [PATCH 2/2] drm/i915/execlists: Ensure the context is reloaded after a GPU reset Chris Wilson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Chris Wilson @ 2019-09-12  7:09 UTC (permalink / raw)
  To: intel-gfx

After a GPU reset, we need to drain all the CS events so that we have an
accurate picture of the execlists state at the time of the reset. Be
paranoid and force a read of the CSB write pointer from memory.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3d83c7e0d9de..61a38a4ccbca 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	struct i915_request *rq;
 	u32 *regs;
 
+	mb(); /* paranoia: read the CSB pointers from after the reset */
+	clflush(execlists->csb_write);
+	mb();
+
 	process_csb(engine); /* drain preemption events */
 
 	/* Following the reset, we need to reload the CSB read/write pointers */
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 2/2] drm/i915/execlists: Ensure the context is reloaded after a GPU reset
  2019-09-12  7:09 [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset Chris Wilson
@ 2019-09-12  7:09 ` Chris Wilson
  2019-09-12  7:51 ` [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset Mika Kuoppala
  2019-09-12  9:07 ` ✗ Fi.CI.BUILD: failure for series starting with [1/2] " Patchwork
  2 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2019-09-12  7:09 UTC (permalink / raw)
  To: intel-gfx

After we manipulate the context to allow replay after a GPU reset, force
that context to be reloaded. This should be a layer of paranoia, for if
the GPU was reset, the context will no longer be resident!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 61a38a4ccbca..40b479d0ca5d 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2921,6 +2921,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	intel_ring_update_space(ce->ring);
 	__execlists_reset_reg_state(ce, engine);
 	__execlists_update_reg_state(ce, engine);
+	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
 	__context_pin_release(ce);
 
 unwind:
-- 
2.23.0


* Re: [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
  2019-09-12  7:09 [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset Chris Wilson
  2019-09-12  7:09 ` [PATCH 2/2] drm/i915/execlists: Ensure the context is reloaded after a GPU reset Chris Wilson
@ 2019-09-12  7:51 ` Mika Kuoppala
  2019-09-12  8:04   ` Chris Wilson
  2019-09-12  9:07 ` ✗ Fi.CI.BUILD: failure for series starting with [1/2] " Patchwork
  2 siblings, 1 reply; 8+ messages in thread
From: Mika Kuoppala @ 2019-09-12  7:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> After a GPU reset, we need to drain all the CS events so that we have an
> accurate picture of the execlists state at the time of the reset. Be
> paranoid and force a read of the CSB write pointer from memory.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 3d83c7e0d9de..61a38a4ccbca 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>  	struct i915_request *rq;
>  	u32 *regs;
>  
> +	mb(); /* paranoia: read the CSB pointers from after the reset */
> +	clflush(execlists->csb_write);
> +	mb();
> +

We know there is always a cost. We do invalidate the csb
on each pass of process_csb().

Add csb_write into invalidate_csb_entries() along
with mbs. Rename it to invalidate_csb and use it
always?

By doing so, we could probably throw out the rmb() at
the start of process_csb() as we would have invalidated
the write pointer along with the entries we read,
on the previous pass.

-Mika


>  	process_csb(engine); /* drain preemption events */
>  
>  	/* Following the reset, we need to reload the CSB read/write pointers */
> -- 
> 2.23.0

* Re: [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
  2019-09-12  7:51 ` [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset Mika Kuoppala
@ 2019-09-12  8:04   ` Chris Wilson
  2019-09-12  8:27     ` Mika Kuoppala
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2019-09-12  8:04 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2019-09-12 08:51:38)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > After a GPU reset, we need to drain all the CS events so that we have an
> > accurate picture of the execlists state at the time of the reset. Be
> > paranoid and force a read of the CSB write pointer from memory.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 3d83c7e0d9de..61a38a4ccbca 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
> >       struct i915_request *rq;
> >       u32 *regs;
> >  
> > +     mb(); /* paranoia: read the CSB pointers from after the reset */
> > +     clflush(execlists->csb_write);
> > +     mb();
> > +
> 
> We know there is always a cost. We do invalidate the csb
> on each pass on process_csb.
> 
> Add csb_write in to invalidate_csb entries along
> with mbs. Rename it to invalidate_csb and use it
> always?
> 
> By doing so, we could prolly throw out the rmb() at
> the start of the process_csb as we would have invalidated
> the write pointer along with the entries we read,
> on previous pass.

No. That rmb is essential for the read ordering at that moment in time.

All I have in mind here is a delay, not really a barrier per se, just
this is a nice way of saying no speculation either.
-Chris
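
The read ordering discussed above (load the write pointer first, then make
sure the entry loads cannot be satisfied ahead of it) can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
i915 code: struct csb_ring, drain_csb() and CSB_ENTRIES are hypothetical
names, and a C11 acquire fence stands in for the kernel's rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define CSB_ENTRIES 6 /* hypothetical ring size for this sketch */

struct csb_ring {
	_Atomic uint8_t write;         /* last slot filled by the producer */
	uint32_t entries[CSB_ENTRIES]; /* event payloads */
	uint8_t head;                  /* last slot read by the consumer */
};

/*
 * Copy every entry between head and the published write pointer into
 * out[] and return how many were copied.
 */
static unsigned int drain_csb(struct csb_ring *ring, uint32_t *out)
{
	unsigned int count = 0;
	uint8_t head = ring->head;
	uint8_t tail = atomic_load_explicit(&ring->write, memory_order_relaxed);

	assert(tail < CSB_ENTRIES);
	if (head == tail)
		return 0; /* nothing new was published */

	/*
	 * Stand-in for rmb(): the entry loads below must not be reordered
	 * before the load of the write pointer above.
	 */
	atomic_thread_fence(memory_order_acquire);

	do {
		if (++head == CSB_ENTRIES)
			head = 0; /* wrap around the ring */
		out[count++] = ring->entries[head];
	} while (head != tail);

	ring->head = head;
	return count;
}
```

Without the fence, a consumer could observe the new write pointer while still
reading stale entry contents, which is why the fence sits between the two
loads rather than at the end of the pass.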

* Re: [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
  2019-09-12  8:04   ` Chris Wilson
@ 2019-09-12  8:27     ` Mika Kuoppala
  2019-09-12  8:38       ` Chris Wilson
  0 siblings, 1 reply; 8+ messages in thread
From: Mika Kuoppala @ 2019-09-12  8:27 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2019-09-12 08:51:38)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > After a GPU reset, we need to drain all the CS events so that we have an
>> > accurate picture of the execlists state at the time of the reset. Be
>> > paranoid and force a read of the CSB write pointer from memory.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
>> >  1 file changed, 4 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > index 3d83c7e0d9de..61a38a4ccbca 100644
>> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>> >       struct i915_request *rq;
>> >       u32 *regs;
>> >  
>> > +     mb(); /* paranoia: read the CSB pointers from after the reset */
>> > +     clflush(execlists->csb_write);
>> > +     mb();
>> > +
>> 
>> We know there is always a cost. We do invalidate the csb
>> on each pass on process_csb.
>> 
>> Add csb_write in to invalidate_csb entries along
>> with mbs. Rename it to invalidate_csb and use it
>> always?
>> 
>> By doing so, we could prolly throw out the rmb() at
>> the start of the process_csb as we would have invalidated
>> the write pointer along with the entries we read,
>> on previous pass.
>
> No. That rmb is essential for the read ordering at that moment in time.

Ah yes indeed it is. head vs entries coherency.

>
> All I have in mind here is a delay, not really a barrier per se, just
> this is a nice way of saying no speculation either.

Forgetting the rmb(), there is a similar pattern of mb()+flush
elsewhere. Just saw the proliferation and opportunity to converge.

But syncing with the hardware on moment of reset, this should
do.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

* Re: [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
  2019-09-12  8:27     ` Mika Kuoppala
@ 2019-09-12  8:38       ` Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2019-09-12  8:38 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2019-09-12 09:27:56)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Mika Kuoppala (2019-09-12 08:51:38)
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> 
> >> > After a GPU reset, we need to drain all the CS events so that we have an
> >> > accurate picture of the execlists state at the time of the reset. Be
> >> > paranoid and force a read of the CSB write pointer from memory.
> >> >
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
> >> >  1 file changed, 4 insertions(+)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> > index 3d83c7e0d9de..61a38a4ccbca 100644
> >> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
> >> >       struct i915_request *rq;
> >> >       u32 *regs;
> >> >  
> >> > +     mb(); /* paranoia: read the CSB pointers from after the reset */
> >> > +     clflush(execlists->csb_write);
> >> > +     mb();
> >> > +
> >> 
> >> We know there is always a cost. We do invalidate the csb
> >> on each pass on process_csb.
> >> 
> >> Add csb_write in to invalidate_csb entries along
> >> with mbs. Rename it to invalidate_csb and use it
> >> always?
> >> 
> >> By doing so, we could prolly throw out the rmb() at
> >> the start of the process_csb as we would have invalidated
> >> the write pointer along with the entries we read,
> >> on previous pass.
> >
> > No. That rmb is essential for the read ordering at that moment in time.
> 
> Ah yes indeed it is. head vs entries coherency.
> 
> >
> > All I have in mind here is a delay, not really a barrier per se, just
> > this is a nice way of saying no speculation either.
> 
> Forgetting the rmb(), there is a similar pattern of mb()+flush
> elsewhere. Just saw the proliferation and opportunity to converge.

I understood. I think your barrier-less w/a works pretty well and I
haven't yet poked a hole in how I think it works ;)

> But syncing with the hardware on moment of reset, this should
> do.

I looked at reusing invalidate_csb_entries() and I think the key part
here is that we do want to invalidate the execlists->csb_write itself,
so a subtly different location/reason (not sure if it's the same
cacheline or the neighbouring one).
-Chris
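
The two points above (flush the line holding the write pointer between full
barriers, and the question of whether it shares a cacheline with the
entries) can be tried out in userspace. This is a hedged sketch, not the
kernel code: same_cacheline() and reread_after_flush() are hypothetical
helpers, the 64-byte line size is an assumption, and the SSE2 intrinsics
_mm_mfence()/_mm_clflush() stand in for the kernel's mb()/clflush(). On
builds without SSE2 it falls back to a plain fence and does not flush.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h> /* _mm_mfence(), _mm_clflush() */
#endif

#define CACHELINE_BYTES 64 /* assumed cache line size */

/* True if both addresses fall within the same cache line. */
static bool same_cacheline(const void *a, const void *b)
{
	return (uintptr_t)a / CACHELINE_BYTES ==
	       (uintptr_t)b / CACHELINE_BYTES;
}

/* Evict the line holding *ptr between two full barriers, then read it. */
static uint32_t reread_after_flush(const uint32_t *ptr)
{
#ifdef __SSE2__
	_mm_mfence();     /* complete earlier loads and stores first */
	_mm_clflush(ptr); /* drop the (possibly stale) cached line */
	_mm_mfence();     /* the flush must finish before the read below */
#else
	/* No flush available in this build: ordering only. */
	atomic_thread_fence(memory_order_seq_cst);
#endif
	return *(const volatile uint32_t *)ptr;
}
```

If same_cacheline() were true for the write pointer and the entries being
invalidated, a single flush would cover both; if not, the write pointer
needs its own flush, which is the distinction drawn above.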

* ✗ Fi.CI.BUILD: failure for series starting with [1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
  2019-09-12  7:09 [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset Chris Wilson
  2019-09-12  7:09 ` [PATCH 2/2] drm/i915/execlists: Ensure the context is reloaded after a GPU reset Chris Wilson
  2019-09-12  7:51 ` [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset Mika Kuoppala
@ 2019-09-12  9:07 ` Patchwork
  2 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2019-09-12  9:07 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
URL   : https://patchwork.freedesktop.org/series/66579/
State : failure

== Summary ==

Applying: drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
Applying: drm/i915/execlists: Ensure the context is reloaded after a GPU reset
error: sha1 information is lacking or useless (drivers/gpu/drm/i915/gt/intel_lrc.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0002 drm/i915/execlists: Ensure the context is reloaded after a GPU reset
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


* [PATCH 1/2] drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset
@ 2019-09-12  9:29 Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2019-09-12  9:29 UTC (permalink / raw)
  To: intel-gfx

After a GPU reset, we need to drain all the CS events so that we have an
accurate picture of the execlists state at the time of the reset. Be
paranoid and force a read of the CSB write pointer from memory.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index dcdf7cf66e7e..dbc90da2341a 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2359,6 +2359,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	struct i915_request *rq;
 	u32 *regs;
 
+	mb(); /* paranoia: read the CSB pointers from after the reset */
+	clflush(execlists->csb_write);
+	mb();
+
 	process_csb(engine); /* drain preemption events */
 
 	/* Following the reset, we need to reload the CSB read/write pointers */
-- 
2.23.0


