From: Daniel Vetter <daniel@ffwll.ch>
To: Matthew Brost <matthew.brost@intel.com>
Cc: jason.ekstrand@intel.com, daniel.vetter@intel.com,
	intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [RFC PATCH 74/97] drm/i915/guc: Capture error state on context reset
Date: Tue, 11 May 2021 19:45:55 +0200
Message-ID: <YJrC092WXSgvXNP1@phenom.ffwll.local>
In-Reply-To: <20210511171230.GA363@sdutt-i7>

On Tue, May 11, 2021 at 10:12:32AM -0700, Matthew Brost wrote:
> On Tue, May 11, 2021 at 06:28:25PM +0200, Daniel Vetter wrote:
> > On Thu, May 06, 2021 at 12:14:28PM -0700, Matthew Brost wrote:
> > > GuC notifies us of an engine reset only once the reset has completed,
> > > meaning GuC has potentially cleared any HW state we may have been
> > > interested in capturing. GuC resumes scheduling on the engine
> > > post-reset, as the resets are meant to be transparent, further
> > > muddling our error state.
> > > 
> > > There is ongoing work to define an API for a GuC debug state dump. The
> > > suggestion for now is to manually disable FW initiated resets in cases
> > > where debug state is needed.
> > > 
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > 
> > This looks a bit backwards to me:
> > 
> 
> Definitely a bit hacky, but this patch does the best it can to capture the
> error.
> 
> > - I figured we should capture error state when we get the G2H, in which
> >   case I hope we know which offending context got shot.
> >
> 
> We know which context was shot based on the G2H. See 'hung_ce' in this patch.

Ah, maybe I should read more. It would be good to have comments on how the
locking works here; around reset in particular it tends to be tricky.
Ideally as comments on the data structs/members.
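
As an illustration of the kind of member comment being asked for, here is a
minimal userspace sketch of the hung_ce handoff. The model_* names are
hypothetical stand-ins, not i915 code, and the atomic exchange is an
assumption of the sketch: the patch itself uses a plain store in
intel_engine_set_hung_context() and a separate get/clear pair in
capture_engine().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_context. */
struct model_context {
	int id;
};

/* Hypothetical stand-in for struct intel_engine_cs. */
struct model_engine {
	/*
	 * hung_ce: context blamed for the most recent reset.
	 *
	 * Writer: the reset notification handler, before it triggers an
	 *         error capture.
	 * Reader: the capture path, which takes the pointer exactly once.
	 *
	 * Assumption of this sketch: the handoff is a single atomic
	 * exchange, so two capture paths can never both see the pointer.
	 */
	_Atomic(struct model_context *) hung_ce;
};

static void model_engine_init(struct model_engine *engine)
{
	assert(engine != NULL);
	atomic_init(&engine->hung_ce, (struct model_context *)NULL);
}

/* Publish the offending context for the next capture. */
static void model_engine_set_hung_context(struct model_engine *engine,
					  struct model_context *ce)
{
	atomic_store(&engine->hung_ce, ce);
}

/* Return the published context, if any, and clear the slot in one step. */
static struct model_context *
model_engine_take_hung_context(struct model_engine *engine)
{
	return atomic_exchange(&engine->hung_ce,
			       (struct model_context *)NULL);
}
```

Whether the real field needs anything stronger than the plain accesses in
the patch depends on which paths can race with the G2H handler, which is
exactly what the requested comments should spell out.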

> 
> > - For now we're missing the hw state, but we should still be able to
> >   capture the buffers userspace wants us to capture. So that could be
> >   wired up already?
> 
> Which buffers exactly? We dump all buffers associated with the context.

There's an opt-in list that userspace can set in execbuf. Maybe that's the
one you mean.
-Daniel
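
To make the opt-in list concrete, a rough userspace model follows. It
assumes each object in the execbuf list carries a flags word with one
"capture this on error" bit; the struct, the flag name and the helper are
hypothetical and deliberately do not use the real i915 uAPI header (the
real per-object flag is believed to be EXEC_OBJECT_CAPTURE, but treat that
spelling as an assumption here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit: userspace opts this object into error capture. */
#define MODEL_OBJECT_CAPTURE (1u << 0)

/* Hypothetical, trimmed-down stand-in for one execbuf object entry. */
struct model_exec_object {
	uint32_t handle;
	uint32_t flags;
};

/*
 * Copy the handles of all opted-in objects into out[], preserving
 * submission order. Writes at most max_out handles and returns the
 * number written.
 */
static size_t
model_collect_capture_list(const struct model_exec_object *objs, size_t count,
			   uint32_t *out, size_t max_out)
{
	size_t n = 0;

	for (size_t i = 0; i < count && n < max_out; i++) {
		if (objs[i].flags & MODEL_OBJECT_CAPTURE)
			out[n++] = objs[i].handle;
	}

	return n;
}
```

The point of the model is only that this list is chosen by userspace per
submission, independent of whatever the driver decides to dump for the
context.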

> 
> > 
> > But yeah, register state capturing needs support from GuC fw.
> >
> > I think this is a big enough miss in GuC features that we should list it
> > on the rfc as a thing to fix.
> 
> Agree this needs to be fixed.
> 
> Matt
> 
> > -Daniel
> > 
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_context.c       | 20 +++++++++++
> > >  drivers/gpu/drm/i915/gt/intel_context.h       |  3 ++
> > >  drivers/gpu/drm/i915/gt/intel_engine.h        | 21 ++++++++++-
> > >  drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 11 ++++--
> > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  2 ++
> > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 35 +++++++++----------
> > >  drivers/gpu/drm/i915/i915_gpu_error.c         | 25 ++++++++++---
> > >  7 files changed, 91 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> > > index 2f01437056a8..3fe7794b2bfd 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > > @@ -514,6 +514,26 @@ struct i915_request *intel_context_create_request(struct intel_context *ce)
> > >  	return rq;
> > >  }
> > >  
> > > +struct i915_request *intel_context_find_active_request(struct intel_context *ce)
> > > +{
> > > +	struct i915_request *rq, *active = NULL;
> > > +	unsigned long flags;
> > > +
> > > +	GEM_BUG_ON(!intel_engine_uses_guc(ce->engine));
> > > +
> > > +	spin_lock_irqsave(&ce->guc_active.lock, flags);
> > > +	list_for_each_entry_reverse(rq, &ce->guc_active.requests,
> > > +				    sched.link) {
> > > +		if (i915_request_completed(rq))
> > > +			break;
> > > +
> > > +		active = rq;
> > > +	}
> > > +	spin_unlock_irqrestore(&ce->guc_active.lock, flags);
> > > +
> > > +	return active;
> > > +}
> > > +
> > >  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> > >  #include "selftest_context.c"
> > >  #endif
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> > > index 9b211ca5ecc7..d2b499ed8a05 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > > @@ -195,6 +195,9 @@ int intel_context_prepare_remote_request(struct intel_context *ce,
> > >  
> > >  struct i915_request *intel_context_create_request(struct intel_context *ce);
> > >  
> > > +struct i915_request *
> > > +intel_context_find_active_request(struct intel_context *ce);
> > > +
> > >  static inline struct intel_ring *__intel_context_ring_size(u64 sz)
> > >  {
> > >  	return u64_to_ptr(struct intel_ring, sz);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> > > index 3321d0917a99..bb94963a9fa2 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> > > @@ -242,7 +242,7 @@ ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine,
> > >  				   ktime_t *now);
> > >  
> > >  struct i915_request *
> > > -intel_engine_find_active_request(struct intel_engine_cs *engine);
> > > +intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine);
> > >  
> > >  u32 intel_engine_context_size(struct intel_gt *gt, u8 class);
> > >  
> > > @@ -316,4 +316,23 @@ intel_engine_get_sibling(struct intel_engine_cs *engine, unsigned int sibling)
> > >  	return engine->cops->get_sibling(engine, sibling);
> > >  }
> > >  
> > > +static inline void
> > > +intel_engine_set_hung_context(struct intel_engine_cs *engine,
> > > +			      struct intel_context *ce)
> > > +{
> > > +	engine->hung_ce = ce;
> > > +}
> > > +
> > > +static inline void
> > > +intel_engine_clear_hung_context(struct intel_engine_cs *engine)
> > > +{
> > > +	intel_engine_set_hung_context(engine, NULL);
> > > +}
> > > +
> > > +static inline struct intel_context *
> > > +intel_engine_get_hung_context(struct intel_engine_cs *engine)
> > > +{
> > > +	return engine->hung_ce;
> > > +}
> > > +
> > >  #endif /* _INTEL_RINGBUFFER_H_ */
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > index 10300db1c9a6..ad3987289f09 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > @@ -1727,7 +1727,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> > >  	drm_printf(m, "\tRequests:\n");
> > >  
> > >  	spin_lock_irqsave(&engine->sched_engine->lock, flags);
> > > -	rq = intel_engine_find_active_request(engine);
> > > +	rq = intel_engine_execlist_find_hung_request(engine);
> > >  	if (rq) {
> > >  		struct intel_timeline *tl = get_timeline(rq);
> > >  
> > > @@ -1838,10 +1838,17 @@ static bool match_ring(struct i915_request *rq)
> > >  }
> > >  
> > >  struct i915_request *
> > > -intel_engine_find_active_request(struct intel_engine_cs *engine)
> > > +intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine)
> > >  {
> > >  	struct i915_request *request, *active = NULL;
> > >  
> > > +	/*
> > > +	 * This search does not work in GuC submission mode. However, the GuC
> > > +	 * will report the hanging context directly to the driver itself. So
> > > +	 * the driver should never get here when in GuC mode.
> > > +	 */
> > > +	GEM_BUG_ON(intel_uc_uses_guc_submission(&engine->gt->uc));
> > > +
> > >  	/*
> > >  	 * We are called by the error capture, reset and to dump engine
> > >  	 * state at random points in time. In particular, note that neither is
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > index b84562b2708b..bba53e3b39b9 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > @@ -304,6 +304,8 @@ struct intel_engine_cs {
> > >  	/* keep a request in reserve for a [pm] barrier under oom */
> > >  	struct i915_request *request_pool;
> > >  
> > > +	struct intel_context *hung_ce;
> > > +
> > >  	struct llist_head barrier_tasks;
> > >  
> > >  	struct intel_context *kernel_context; /* pinned */
> > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > index 22f17a055b21..6b3b74e50b31 100644
> > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > @@ -726,24 +726,6 @@ __unwind_incomplete_requests(struct intel_context *ce)
> > >  	spin_unlock_irqrestore(&sched_engine->lock, flags);
> > >  }
> > >  
> > > -static struct i915_request *context_find_active_request(struct intel_context *ce)
> > > -{
> > > -	struct i915_request *rq, *active = NULL;
> > > -	unsigned long flags;
> > > -
> > > -	spin_lock_irqsave(&ce->guc_active.lock, flags);
> > > -	list_for_each_entry_reverse(rq, &ce->guc_active.requests,
> > > -				    sched.link) {
> > > -		if (i915_request_completed(rq))
> > > -			break;
> > > -
> > > -		active = rq;
> > > -	}
> > > -	spin_unlock_irqrestore(&ce->guc_active.lock, flags);
> > > -
> > > -	return active;
> > > -}
> > > -
> > >  static void __guc_reset_context(struct intel_context *ce, bool stalled)
> > >  {
> > >  	struct i915_request *rq;
> > > @@ -757,7 +739,7 @@ static void __guc_reset_context(struct intel_context *ce, bool stalled)
> > >  	 */
> > >  	clr_context_enabled(ce);
> > >  
> > > -	rq = context_find_active_request(ce);
> > > +	rq = intel_context_find_active_request(ce);
> > >  	if (!rq) {
> > >  		head = ce->ring->tail;
> > >  		stalled = false;
> > > @@ -2192,6 +2174,20 @@ int intel_guc_sched_done_process_msg(struct intel_guc *guc,
> > >  	return 0;
> > >  }
> > >  
> > > +static void capture_error_state(struct intel_guc *guc,
> > > +				struct intel_context *ce)
> > > +{
> > > +	struct intel_gt *gt = guc_to_gt(guc);
> > > +	struct drm_i915_private *i915 = gt->i915;
> > > +	struct intel_engine_cs *engine = __context_to_physical_engine(ce);
> > > +	intel_wakeref_t wakeref;
> > > +
> > > +	intel_engine_set_hung_context(engine, ce);
> > > +	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
> > > +		i915_capture_error_state(gt, engine->mask);
> > > +	atomic_inc(&i915->gpu_error.reset_engine_count[engine->uabi_class]);
> > > +}
> > > +
> > >  static void guc_context_replay(struct intel_context *ce)
> > >  {
> > >  	struct i915_sched_engine *sched_engine = ce->engine->sched_engine;
> > > @@ -2204,6 +2200,7 @@ static void guc_handle_context_reset(struct intel_guc *guc,
> > >  				     struct intel_context *ce)
> > >  {
> > >  	trace_intel_context_reset(ce);
> > > +	capture_error_state(guc, ce);
> > >  	guc_context_replay(ce);
> > >  }
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > index 3352f56bcf63..825bdfe44225 100644
> > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > @@ -1435,20 +1435,37 @@ capture_engine(struct intel_engine_cs *engine,
> > >  {
> > >  	struct intel_engine_capture_vma *capture = NULL;
> > >  	struct intel_engine_coredump *ee;
> > > -	struct i915_request *rq;
> > > +	struct intel_context *ce;
> > > +	struct i915_request *rq = NULL;
> > >  	unsigned long flags;
> > >  
> > >  	ee = intel_engine_coredump_alloc(engine, GFP_KERNEL);
> > >  	if (!ee)
> > >  		return NULL;
> > >  
> > > -	spin_lock_irqsave(&engine->sched_engine->lock, flags);
> > > -	rq = intel_engine_find_active_request(engine);
> > > +	ce = intel_engine_get_hung_context(engine);
> > > +	if (ce) {
> > > +		intel_engine_clear_hung_context(engine);
> > > +		rq = intel_context_find_active_request(ce);
> > > +		if (!rq || !i915_request_started(rq))
> > > +			goto no_request_capture;
> > > +	} else {
> > > +		/*
> > > +		 * Getting here with GuC enabled means it is a forced error capture
> > > +		 * with no actual hang. So, no need to attempt the execlist search.
> > > +		 */
> > > +		if (!intel_uc_uses_guc_submission(&engine->gt->uc)) {
> > > +			spin_lock_irqsave(&engine->sched_engine->lock, flags);
> > > +			rq = intel_engine_execlist_find_hung_request(engine);
> > > +			spin_unlock_irqrestore(&engine->sched_engine->lock,
> > > +					       flags);
> > > +		}
> > > +	}
> > >  	if (rq)
> > >  		capture = intel_engine_coredump_add_request(ee, rq,
> > >  							    ATOMIC_MAYFAIL);
> > > -	spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
> > >  	if (!capture) {
> > > +no_request_capture:
> > >  		kfree(ee);
> > >  		return NULL;
> > >  	}
> > > -- 
> > > 2.28.0
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
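
For readers skimming the quoted diff: the reverse walk in
intel_context_find_active_request() can be modeled in userspace as below.
This is a sketch under assumptions, not the driver code: the request list
is replaced by an array ordered oldest to newest, the spinlock is omitted,
and the model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct i915_request. */
struct model_request {
	int seqno;
	bool completed;
};

/*
 * Walk from the newest request back towards the oldest and stop at the
 * first completed one. The last incomplete request seen before stopping
 * is the oldest request still outstanding, i.e. the one to blame.
 * Returns NULL when the array is empty or the newest request has
 * already completed.
 */
static const struct model_request *
model_find_active_request(const struct model_request *reqs, size_t count)
{
	const struct model_request *active = NULL;

	for (size_t i = count; i-- > 0; ) {
		if (reqs[i].completed)
			break;

		active = &reqs[i];
	}

	return active;
}
```

With requests 1 and 2 completed and 3 and 4 still pending, the walk visits
4, then 3, stops at 2, and returns request 3.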

2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-24 11:00   ` Michal Wajdeczko
2021-05-24 11:00     ` [Intel-gfx] " Michal Wajdeczko
2021-05-06 19:13 ` [RFC PATCH 31/97] drm/i915/guc: Early initialization of GuC send registers Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-26 20:28   ` Matthew Brost
2021-05-26 20:28     ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 32/97] drm/i915: Introduce i915_sched_engine object Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-11 15:18   ` Daniel Vetter
2021-05-11 15:18     ` [Intel-gfx] " Daniel Vetter
2021-05-11 17:56     ` Matthew Brost
2021-05-11 17:56       ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 33/97] drm/i915: Engine relative MMIO Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-25  9:05   ` Tvrtko Ursulin
2021-05-25  9:05     ` Tvrtko Ursulin
2021-05-06 19:13 ` [RFC PATCH 34/97] drm/i915/guc: Use guc_class instead of engine_class in fw interface Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-26 20:41   ` Matthew Brost
2021-05-26 20:41     ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 35/97] drm/i915/guc: Improve error message for unsolicited CT response Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-24 11:59   ` Michal Wajdeczko
2021-05-24 11:59     ` [Intel-gfx] " Michal Wajdeczko
2021-05-25 17:32     ` Matthew Brost
2021-05-25 17:32       ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 36/97] drm/i915/guc: Add non blocking CTB send function Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-24 12:21   ` Michal Wajdeczko
2021-05-24 12:21     ` [Intel-gfx] " Michal Wajdeczko
2021-05-25 17:30     ` Matthew Brost
2021-05-25 17:30       ` [Intel-gfx] " Matthew Brost
2021-05-25  9:21   ` Tvrtko Ursulin
2021-05-25  9:21     ` Tvrtko Ursulin
2021-05-25 17:21     ` Matthew Brost
2021-05-25 17:21       ` Matthew Brost
2021-05-26  8:57       ` Tvrtko Ursulin
2021-05-26  8:57         ` Tvrtko Ursulin
2021-05-26 18:10         ` Matthew Brost
2021-05-26 18:10           ` Matthew Brost
2021-05-27 10:02           ` Tvrtko Ursulin
2021-05-27 10:02             ` Tvrtko Ursulin
2021-05-27 14:35             ` Matthew Brost
2021-05-27 14:35               ` Matthew Brost
2021-05-27 15:11               ` Tvrtko Ursulin
2021-05-27 15:11                 ` Tvrtko Ursulin
2021-06-07 17:31                 ` Matthew Brost
2021-06-07 17:31                   ` Matthew Brost
2021-06-08  8:39                   ` Tvrtko Ursulin
2021-06-08  8:39                     ` Tvrtko Ursulin
2021-06-08  8:46                     ` Daniel Vetter
2021-06-08  8:46                       ` Daniel Vetter
2021-06-09 23:10                       ` Matthew Brost
2021-06-09 23:10                         ` Matthew Brost
2021-06-10 15:27                         ` Daniel Vetter
2021-06-10 15:27                           ` Daniel Vetter
2021-06-24 16:38                           ` Matthew Brost
2021-06-24 16:38                             ` Matthew Brost
2021-06-24 17:25                             ` Daniel Vetter
2021-06-24 17:25                               ` Daniel Vetter
2021-06-09 13:58                     ` Michal Wajdeczko
2021-06-09 13:58                       ` Michal Wajdeczko
2021-06-09 23:05                       ` Matthew Brost
2021-06-09 23:05                         ` Matthew Brost
2021-06-09 14:14                   ` Michal Wajdeczko
2021-06-09 14:14                     ` Michal Wajdeczko
2021-06-09 23:13                     ` Matthew Brost
2021-06-09 23:13                       ` Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 37/97] drm/i915/guc: Add stall timer to " Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-24 12:58   ` Michal Wajdeczko
2021-05-24 12:58     ` [Intel-gfx] " Michal Wajdeczko
2021-05-24 18:35     ` Matthew Brost
2021-05-24 18:35       ` [Intel-gfx] " Matthew Brost
2021-05-25 14:15       ` Michal Wajdeczko
2021-05-25 14:15         ` [Intel-gfx] " Michal Wajdeczko
2021-05-25 16:54         ` Matthew Brost
2021-05-25 16:54           ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 38/97] drm/i915/guc: Optimize CTB writes and reads Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-24 13:31   ` Michal Wajdeczko
2021-05-24 13:31     ` [Intel-gfx] " Michal Wajdeczko
2021-05-25 17:39     ` Matthew Brost
2021-05-25 17:39       ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 39/97] drm/i915/guc: Increase size of CTB buffers Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-24 13:43   ` Michal Wajdeczko
2021-05-24 13:43     ` Michal Wajdeczko
2021-05-24 18:40     ` Matthew Brost
2021-05-24 18:40       ` Matthew Brost
2021-05-25  9:24   ` Tvrtko Ursulin
2021-05-25  9:24     ` Tvrtko Ursulin
2021-05-25 17:15     ` Matthew Brost
2021-05-25 17:15       ` Matthew Brost
2021-05-26  9:30       ` Tvrtko Ursulin
2021-05-26  9:30         ` Tvrtko Ursulin
2021-05-26 18:20         ` Matthew Brost
2021-05-26 18:20           ` Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 40/97] drm/i915/guc: Module load failure test for CT buffer creation Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-24 13:45   ` Michal Wajdeczko
2021-05-24 13:45     ` [Intel-gfx] " Michal Wajdeczko
2021-05-06 19:13 ` [RFC PATCH 41/97] drm/i915/guc: Add new GuC interface defines and structures Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 42/97] drm/i915/guc: Remove GuC stage descriptor, add lrc descriptor Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 43/97] drm/i915/guc: Add lrc descriptor context lookup array Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-11 15:26   ` Daniel Vetter
2021-05-11 15:26     ` [Intel-gfx] " Daniel Vetter
2021-05-11 17:01     ` Matthew Brost
2021-05-11 17:01       ` [Intel-gfx] " Matthew Brost
2021-05-11 17:43       ` Daniel Vetter
2021-05-11 17:43         ` [Intel-gfx] " Daniel Vetter
2021-05-11 19:34         ` Matthew Brost
2021-05-11 19:34           ` [Intel-gfx] " Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 44/97] drm/i915/guc: Implement GuC submission tasklet Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-25  9:43   ` Tvrtko Ursulin
2021-05-25  9:43     ` Tvrtko Ursulin
2021-05-25 17:10     ` Matthew Brost
2021-05-25 17:10       ` Matthew Brost
2021-05-06 19:13 ` [RFC PATCH 45/97] drm/i915/guc: Add bypass tasklet submission path to GuC Matthew Brost
2021-05-06 19:13   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 46/97] drm/i915/guc: Implement GuC context operations for new inteface Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-29 20:32   ` Michal Wajdeczko
2021-05-29 20:32     ` [Intel-gfx] " Michal Wajdeczko
2021-05-06 19:14 ` [RFC PATCH 47/97] drm/i915/guc: Insert fence on context when deregistering Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 48/97] drm/i915/guc: Defer context unpin until scheduling is disabled Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 49/97] drm/i915/guc: Disable engine barriers with GuC during unpin Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-11 15:37   ` Daniel Vetter
2021-05-11 15:37     ` [Intel-gfx] " Daniel Vetter
2021-05-11 16:31     ` Matthew Brost
2021-05-11 16:31       ` [Intel-gfx] " Matthew Brost
2021-05-26 10:26   ` Tvrtko Ursulin
2021-05-26 10:26     ` Tvrtko Ursulin
2021-05-06 19:14 ` [RFC PATCH 50/97] drm/i915/guc: Extend deregistration fence to schedule disable Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 51/97] drm/i915: Disable preempt busywait when using GuC scheduling Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 52/97] drm/i915/guc: Ensure request ordering via completion fences Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 53/97] drm/i915/guc: Disable semaphores when using GuC scheduling Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-25  9:52   ` Tvrtko Ursulin
2021-05-25  9:52     ` Tvrtko Ursulin
2021-05-25 17:01     ` Matthew Brost
2021-05-25 17:01       ` Matthew Brost
2021-05-26  9:25       ` Tvrtko Ursulin
2021-05-26  9:25         ` Tvrtko Ursulin
2021-05-26 18:15         ` Matthew Brost
2021-05-26 18:15           ` Matthew Brost
2021-05-27  8:41           ` Tvrtko Ursulin
2021-05-27  8:41             ` Tvrtko Ursulin
2021-05-27 14:38             ` Matthew Brost
2021-05-27 14:38               ` Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 54/97] drm/i915/guc: Ensure G2H response has space in buffer Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 55/97] drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-07  5:56   ` kernel test robot
2021-05-25 10:06   ` Tvrtko Ursulin
2021-05-25 10:06     ` Tvrtko Ursulin
2021-05-25 17:07     ` Matthew Brost
2021-05-25 17:07       ` Matthew Brost
2021-05-26  9:21       ` Tvrtko Ursulin
2021-05-26  9:21         ` Tvrtko Ursulin
2021-05-26 18:18         ` Matthew Brost
2021-05-26 18:18           ` Matthew Brost
2021-05-27  9:02           ` Tvrtko Ursulin
2021-05-27  9:02             ` Tvrtko Ursulin
2021-05-27 14:37             ` Matthew Brost
2021-05-27 14:37               ` Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 56/97] drm/i915/guc: Update GuC debugfs to support new GuC Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 57/97] drm/i915/guc: Add several request trace points Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 58/97] drm/i915: Add intel_context tracing Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 59/97] drm/i915/guc: GuC virtual engines Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 60/97] drm/i915: Track 'serial' counts for " Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-25 10:16   ` Tvrtko Ursulin
2021-05-25 10:16     ` Tvrtko Ursulin
2021-05-25 17:52     ` Matthew Brost
2021-05-25 17:52       ` Matthew Brost
2021-05-26  8:40       ` Tvrtko Ursulin
2021-05-26  8:40         ` Tvrtko Ursulin
2021-05-26 18:45         ` John Harrison
2021-05-26 18:45           ` John Harrison
2021-05-27  8:53           ` Tvrtko Ursulin
2021-05-27  8:53             ` Tvrtko Ursulin
2021-05-27 17:01             ` John Harrison
2021-05-27 17:01               ` John Harrison
2021-06-01  9:31               ` Tvrtko Ursulin
2021-06-01  9:31                 ` Tvrtko Ursulin
2021-06-02  1:20                 ` John Harrison
2021-06-02  1:20                   ` John Harrison
2021-06-02 12:04                   ` Tvrtko Ursulin
2021-06-02 12:04                     ` Tvrtko Ursulin
2021-06-02 12:09   ` Tvrtko Ursulin
2021-06-02 12:09     ` Tvrtko Ursulin
2021-05-06 19:14 ` [RFC PATCH 61/97] drm/i915: Hold reference to intel_context over life of i915_request Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-06-02 12:18   ` Tvrtko Ursulin
2021-06-02 12:18     ` Tvrtko Ursulin
2021-05-06 19:14 ` [RFC PATCH 62/97] drm/i915/guc: Disable bonding extension with GuC submission Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 63/97] drm/i915/guc: Direct all breadcrumbs for a class to single breadcrumbs Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-06-02 13:31   ` Tvrtko Ursulin
2021-06-02 13:31     ` Tvrtko Ursulin
2021-05-06 19:14 ` [RFC PATCH 64/97] drm/i915/guc: Reset implementation for new GuC interface Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-06-02 14:33   ` Tvrtko Ursulin
2021-06-02 14:33     ` Tvrtko Ursulin
2021-06-04  3:17     ` Matthew Brost
2021-06-04  3:17       ` Matthew Brost
2021-06-04  8:16       ` Daniel Vetter
2021-06-04  8:16         ` Daniel Vetter
2021-06-04 18:02         ` Matthew Brost
2021-06-04 18:02           ` Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 65/97] drm/i915: Reset GPU immediately if submission is disabled Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-06-02 14:36   ` Tvrtko Ursulin
2021-06-02 14:36     ` Tvrtko Ursulin
2021-05-06 19:14 ` [RFC PATCH 66/97] drm/i915/guc: Add disable interrupts to guc sanitize Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-11  8:16   ` [drm/i915/guc] 07336fb545: WARNING:at_drivers/gpu/drm/i915/gt/uc/intel_uc.c:#__uc_sanitize[i915] kernel test robot
2021-05-11  8:16     ` kernel test robot
2021-05-11  8:16     ` [Intel-gfx] " kernel test robot
2021-05-11  8:16     ` kernel test robot
2021-05-06 19:14 ` [RFC PATCH 67/97] drm/i915/guc: Suspend/resume implementation for new interface Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 68/97] drm/i915/guc: Handle context reset notification Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-11 16:25   ` Daniel Vetter
2021-05-11 16:25     ` Daniel Vetter
2021-05-06 19:14 ` [RFC PATCH 69/97] drm/i915/guc: Handle engine reset failure notification Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 70/97] drm/i915/guc: Enable the timer expired interrupt for GuC Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 71/97] drm/i915/guc: Provide mmio list to be saved/restored on engine reset Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 72/97] drm/i915/guc: Don't complain about reset races Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 73/97] drm/i915/guc: Enable GuC engine reset Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 74/97] drm/i915/guc: Capture error state on context reset Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-11 16:28   ` Daniel Vetter
2021-05-11 16:28     ` Daniel Vetter
2021-05-11 17:12     ` Matthew Brost
2021-05-11 17:12       ` Matthew Brost
2021-05-11 17:45       ` Daniel Vetter [this message]
2021-05-11 17:45         ` Daniel Vetter
2021-05-06 19:14 ` [RFC PATCH 75/97] drm/i915/guc: Fix for error capture after full GPU reset with GuC Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 76/97] drm/i915/guc: Hook GuC scheduling policies up Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 77/97] drm/i915/guc: Connect reset modparam updates to GuC policy flags Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 78/97] drm/i915/guc: Include scheduling policies in the debugfs state dump Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 79/97] drm/i915/guc: Don't call ring_is_idle in GuC submission Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 80/97] drm/i915/guc: Implement banned contexts for " Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 81/97] drm/i915/guc: Allow flexible number of context ids Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 82/97] drm/i915/guc: Connect the number of guc_ids to debugfs Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 83/97] drm/i915/guc: Don't return -EAGAIN to user when guc_ids exhausted Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-07  6:06   ` kernel test robot
2021-05-06 19:14 ` [RFC PATCH 84/97] drm/i915/guc: Don't allow requests not ready to consume all guc_ids Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 85/97] drm/i915/guc: Introduce guc_submit_engine object Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 86/97] drm/i915/guc: Add golden context to GuC ADS Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 87/97] drm/i915/guc: Implement GuC priority management Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 88/97] drm/i915/guc: Support request cancellation Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 89/97] drm/i915/guc: Check return of __xa_store when registering a context Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 90/97] drm/i915/guc: Non-static lrc descriptor registration buffer Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 91/97] drm/i915/guc: Take GT PM ref when deregistering context Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 92/97] drm/i915: Add GT PM delayed worker Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 93/97] drm/i915/guc: Take engine PM when a context is pinned with GuC submission Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 94/97] drm/i915/guc: Don't call switch_to_kernel_context " Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 95/97] drm/i915/guc: Selftest for GuC flow control Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 96/97] drm/i915/guc: Update GuC documentation Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-06 19:14 ` [RFC PATCH 97/97] drm/i915/guc: Unblock GuC submission on Gen11+ Matthew Brost
2021-05-06 19:14   ` [Intel-gfx] " Matthew Brost
2021-05-09 17:12 ` [RFC PATCH 00/97] Basic GuC submission support in the i915 Martin Peres
2021-05-09 17:12   ` [Intel-gfx] " Martin Peres
2021-05-09 23:11   ` Jason Ekstrand
2021-05-09 23:11     ` [Intel-gfx] " Jason Ekstrand
2021-05-10 13:55     ` Martin Peres
2021-05-10 13:55       ` [Intel-gfx] " Martin Peres
2021-05-10 16:25       ` Jason Ekstrand
2021-05-10 16:25         ` [Intel-gfx] " Jason Ekstrand
2021-05-11  8:01         ` Martin Peres
2021-05-11  8:01           ` [Intel-gfx] " Martin Peres
2021-05-10 16:33       ` Daniel Vetter
2021-05-10 16:33         ` [Intel-gfx] " Daniel Vetter
2021-05-10 18:30         ` Francisco Jerez
2021-05-10 18:30           ` Francisco Jerez
2021-05-11  8:06         ` Martin Peres
2021-05-11  8:06           ` [Intel-gfx] " Martin Peres
2021-05-11 15:26           ` Bloomfield, Jon
2021-05-11 15:26             ` [Intel-gfx] " Bloomfield, Jon
2021-05-11 16:39             ` Matthew Brost
2021-05-11 16:39               ` [Intel-gfx] " Matthew Brost
2021-05-12  6:26               ` Martin Peres
2021-05-12  6:26                 ` [Intel-gfx] " Martin Peres
2021-05-14 16:31                 ` Jason Ekstrand
2021-05-14 16:31                   ` [Intel-gfx] " Jason Ekstrand
2021-05-25 15:37                   ` Alex Deucher
2021-05-25 15:37                     ` [Intel-gfx] " Alex Deucher
2021-05-11  2:58     ` Dixit, Ashutosh
2021-05-11  2:58       ` [Intel-gfx] " Dixit, Ashutosh
2021-05-11  7:47       ` Martin Peres
2021-05-11  7:47         ` [Intel-gfx] " Martin Peres
2021-05-14 11:11 ` Tvrtko Ursulin
2021-05-14 11:11   ` Tvrtko Ursulin
2021-05-14 16:36   ` Jason Ekstrand
2021-05-14 16:36     ` Jason Ekstrand
2021-05-14 16:46     ` Matthew Brost
2021-05-14 16:46       ` Matthew Brost
2021-05-14 16:41   ` Matthew Brost
2021-05-14 16:41     ` Matthew Brost
2021-05-25 10:32 ` Tvrtko Ursulin
2021-05-25 10:32   ` Tvrtko Ursulin
2021-05-25 16:45   ` Matthew Brost
2021-05-25 16:45     ` Matthew Brost
2021-06-02 15:27     ` Tvrtko Ursulin
2021-06-02 15:27       ` Tvrtko Ursulin
2021-06-02 18:57       ` Daniel Vetter
2021-06-02 18:57         ` Daniel Vetter
2021-06-03  3:41         ` Matthew Brost
2021-06-03  3:41           ` Matthew Brost
2021-06-03  4:47           ` Daniel Vetter
2021-06-03  4:47             ` Daniel Vetter
2021-06-03  9:49             ` Tvrtko Ursulin
2021-06-03  9:49               ` Tvrtko Ursulin
2021-06-03 10:52           ` Tvrtko Ursulin
2021-06-03 10:52             ` Tvrtko Ursulin
2021-06-03  4:10       ` Matthew Brost
2021-06-03  4:10         ` Matthew Brost
2021-06-03  8:51         ` Tvrtko Ursulin
2021-06-03  8:51           ` Tvrtko Ursulin
2021-06-03 16:34           ` Matthew Brost
2021-06-03 16:34             ` Matthew Brost
