All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 10:25 ` Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2019-11-28 10:25 UTC (permalink / raw)
  To: intel-gfx

We have a case of a mysteriously absent pulse, so dump the engine
details to see if we can find out what happened to it.

References: https://bugs.freedesktop.org/show_bug.cgi?id=112405
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c           | 2 ++
 drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 8f6e353caa66..df3369c3f330 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1479,6 +1479,8 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		drm_printf(m, "*** WEDGED ***\n");
 
 	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
+	drm_printf(m, "\tBarriers?: %s\n",
+		   yesno(!llist_empty(&engine->barrier_tasks)));
 
 	rcu_read_lock();
 	rq = READ_ONCE(engine->heartbeat.systole);
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
index f665a0e23c61..0b1148cf3f61 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
@@ -91,6 +91,11 @@ static int __live_idle_pulse(struct intel_engine_cs *engine,
 	GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
 
 	if (intel_gt_retire_requests_timeout(engine->gt, HZ / 5)) {
+		struct drm_printer m = drm_err_printer("pulse");
+
+		pr_err("%s: no heartbeat pulse?\n", engine->name);
+		intel_engine_dump(engine, &m, "%s", engine->name);
+
 		err = -ETIME;
 		goto out;
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Intel-gfx] [PATCH] drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 10:25 ` Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2019-11-28 10:25 UTC (permalink / raw)
  To: intel-gfx

We have a case of a mysteriously absent pulse, so dump the engine
details to see if we can find out what happened to it.

References: https://bugs.freedesktop.org/show_bug.cgi?id=112405
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c           | 2 ++
 drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 8f6e353caa66..df3369c3f330 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1479,6 +1479,8 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		drm_printf(m, "*** WEDGED ***\n");
 
 	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
+	drm_printf(m, "\tBarriers?: %s\n",
+		   yesno(!llist_empty(&engine->barrier_tasks)));
 
 	rcu_read_lock();
 	rq = READ_ONCE(engine->heartbeat.systole);
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
index f665a0e23c61..0b1148cf3f61 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
@@ -91,6 +91,11 @@ static int __live_idle_pulse(struct intel_engine_cs *engine,
 	GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
 
 	if (intel_gt_retire_requests_timeout(engine->gt, HZ / 5)) {
+		struct drm_printer m = drm_err_printer("pulse");
+
+		pr_err("%s: no heartbeat pulse?\n", engine->name);
+		intel_engine_dump(engine, &m, "%s", engine->name);
+
 		err = -ETIME;
 		goto out;
 	}
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 11:27   ` Mika Kuoppala
  0 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2019-11-28 11:27 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> We have a case of a mysteriously absent pulse, so dump the engine
> details to see if we can find out what happened to it.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=112405
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c           | 2 ++
>  drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 +++++
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 8f6e353caa66..df3369c3f330 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1479,6 +1479,8 @@ void intel_engine_dump(struct intel_engine_cs *engine,
>  		drm_printf(m, "*** WEDGED ***\n");
>  
>  	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
> +	drm_printf(m, "\tBarriers?: %s\n",
> +		   yesno(!llist_empty(&engine->barrier_tasks)));

A random thought arise when reading this and it was that
should we do a memory barrier before dump. But there
would be no pairing and it could actually hide something
more sinister.

It is important to know if and where and why did we drop
our stethoscope.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>  
>  	rcu_read_lock();
>  	rq = READ_ONCE(engine->heartbeat.systole);
> diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> index f665a0e23c61..0b1148cf3f61 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> @@ -91,6 +91,11 @@ static int __live_idle_pulse(struct intel_engine_cs *engine,
>  	GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
>  
>  	if (intel_gt_retire_requests_timeout(engine->gt, HZ / 5)) {
> +		struct drm_printer m = drm_err_printer("pulse");
> +
> +		pr_err("%s: no heartbeat pulse?\n", engine->name);
> +		intel_engine_dump(engine, &m, "%s", engine->name);
> +
>  		err = -ETIME;
>  		goto out;
>  	}
> -- 
> 2.24.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 11:27   ` Mika Kuoppala
  0 siblings, 0 replies; 8+ messages in thread
From: Mika Kuoppala @ 2019-11-28 11:27 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> We have a case of a mysteriously absent pulse, so dump the engine
> details to see if we can find out what happened to it.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=112405
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c           | 2 ++
>  drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 +++++
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 8f6e353caa66..df3369c3f330 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1479,6 +1479,8 @@ void intel_engine_dump(struct intel_engine_cs *engine,
>  		drm_printf(m, "*** WEDGED ***\n");
>  
>  	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
> +	drm_printf(m, "\tBarriers?: %s\n",
> +		   yesno(!llist_empty(&engine->barrier_tasks)));

A random thought arise when reading this and it was that
should we do a memory barrier before dump. But there
would be no pairing and it could actually hide something
more sinister.

It is important to know if and where and why did we drop
our stethoscope.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>  
>  	rcu_read_lock();
>  	rq = READ_ONCE(engine->heartbeat.systole);
> diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> index f665a0e23c61..0b1148cf3f61 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> @@ -91,6 +91,11 @@ static int __live_idle_pulse(struct intel_engine_cs *engine,
>  	GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
>  
>  	if (intel_gt_retire_requests_timeout(engine->gt, HZ / 5)) {
> +		struct drm_printer m = drm_err_printer("pulse");
> +
> +		pr_err("%s: no heartbeat pulse?\n", engine->name);
> +		intel_engine_dump(engine, &m, "%s", engine->name);
> +
>  		err = -ETIME;
>  		goto out;
>  	}
> -- 
> 2.24.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 11:31     ` Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2019-11-28 11:31 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2019-11-28 11:27:30)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > We have a case of a mysteriously absent pulse, so dump the engine
> > details to see if we can find out what happened to it.
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=112405
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine_cs.c           | 2 ++
> >  drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 +++++
> >  2 files changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 8f6e353caa66..df3369c3f330 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -1479,6 +1479,8 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> >               drm_printf(m, "*** WEDGED ***\n");
> >  
> >       drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
> > +     drm_printf(m, "\tBarriers?: %s\n",
> > +                yesno(!llist_empty(&engine->barrier_tasks)));
> 
> A random thought arise when reading this and it was that
> should we do a memory barrier before dump. But there
> would be no pairing and it could actually hide something
> more sinister.

Yeah, exactly who do we want to serialise with -- and we will always
need a snapshot just in case we screw that up :)

It's debug; as async as we can without oopsing, and keep on staring
until it starts to make sense.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 11:31     ` Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2019-11-28 11:31 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2019-11-28 11:27:30)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > We have a case of a mysteriously absent pulse, so dump the engine
> > details to see if we can find out what happened to it.
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=112405
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine_cs.c           | 2 ++
> >  drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 +++++
> >  2 files changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 8f6e353caa66..df3369c3f330 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -1479,6 +1479,8 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> >               drm_printf(m, "*** WEDGED ***\n");
> >  
> >       drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
> > +     drm_printf(m, "\tBarriers?: %s\n",
> > +                yesno(!llist_empty(&engine->barrier_tasks)));
> 
> A random thought arise when reading this and it was that
> should we do a memory barrier before dump. But there
> would be no pairing and it could actually hide something
> more sinister.

Yeah, exactly who do we want to serialise with -- and we will always
need a snapshot just in case we screw that up :)

It's debug; as async as we can without oopsing, and keep on staring
until it starts to make sense.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 14:04   ` Patchwork
  0 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2019-11-28 14:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/selftests: Try to show where the pulse went
URL   : https://patchwork.freedesktop.org/series/70155/
State : failure

== Summary ==

Applying: drm/i915/selftests: Try to show where the pulse went
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gt/intel_engine_cs.c
M	drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/selftests: Try to show where the pulse went
@ 2019-11-28 14:04   ` Patchwork
  0 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2019-11-28 14:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/selftests: Try to show where the pulse went
URL   : https://patchwork.freedesktop.org/series/70155/
State : failure

== Summary ==

Applying: drm/i915/selftests: Try to show where the pulse went
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gt/intel_engine_cs.c
M	drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-11-28 14:04 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-28 10:25 [PATCH] drm/i915/selftests: Try to show where the pulse went Chris Wilson
2019-11-28 10:25 ` [Intel-gfx] " Chris Wilson
2019-11-28 11:27 ` Mika Kuoppala
2019-11-28 11:27   ` [Intel-gfx] " Mika Kuoppala
2019-11-28 11:31   ` Chris Wilson
2019-11-28 11:31     ` [Intel-gfx] " Chris Wilson
2019-11-28 14:04 ` ✗ Fi.CI.BAT: failure for " Patchwork
2019-11-28 14:04   ` [Intel-gfx] " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.