From: Matthew Brost <matthew.brost@intel.com>
To: <intel-gfx@lists.freedesktop.org>, <dri-devel@lists.freedesktop.org>
Cc: matthew.brost@intel.com, tvrtko.ursulin@intel.com, daniele.ceraolospurio@intel.com, jason.ekstrand@intel.com, jon.bloomfield@intel.com, daniel.vetter@intel.com, john.c.harrison@intel.com
Subject: [RFC PATCH 75/97] drm/i915/guc: Fix for error capture after full GPU reset with GuC
Date: Thu, 6 May 2021 12:14:29 -0700
Message-ID: <20210506191451.77768-76-matthew.brost@intel.com>
In-Reply-To: <20210506191451.77768-1-matthew.brost@intel.com>

From: John Harrison <John.C.Harrison@Intel.com>

In the case of a full GPU reset (e.g. because GuC has died or because
GuC's hang detection has been disabled), the driver can't rely on GuC
reporting the guilty context. Instead, the driver needs to scan all
active contexts and find one that is currently executing, as per the
execlist mode behaviour. In GuC mode, this scan is different to
execlist mode as the active request list is handled very differently.

Similarly, the request state dump in debugfs needs to be handled
differently when in GuC submission mode.

Also refactored some of the request scanning code to avoid duplication
across the multiple code paths that are now replicating it.

Signed-off-by: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine.h        |   3 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 139 ++++++++++++------
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   8 +
 drivers/gpu/drm/i915/gt/intel_reset.c         |   2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   2 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  67 +++++++++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.h |   3 +
 drivers/gpu/drm/i915/i915_request.c           |  41 ++++++
 drivers/gpu/drm/i915/i915_request.h           |  11 ++
 9 files changed, 229 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index bb94963a9fa2..2e69be3bb1cf 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -237,6 +237,9 @@ __printf(3, 4)
 void intel_engine_dump(struct intel_engine_cs *engine,
 		       struct drm_printer *m,
 		       const char *header, ...);
+void intel_engine_dump_active_requests(struct list_head *requests,
+				       struct i915_request *hung_rq,
+				       struct drm_printer *m);

 ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine,
 				   ktime_t *now);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index ad3987289f09..e34a61600c8c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1680,6 +1680,97 @@ static void print_properties(struct intel_engine_cs *engine,
 			   read_ul(&engine->defaults, p->offset));
 }

+static void engine_dump_request(struct i915_request *rq, struct drm_printer *m, const char *msg)
+{
+	struct intel_timeline *tl = get_timeline(rq);
+
+	i915_request_show(m, rq, msg, 0);
+
+	drm_printf(m, "\t\tring->start: 0x%08x\n",
+		   i915_ggtt_offset(rq->ring->vma));
+	drm_printf(m, "\t\tring->head: 0x%08x\n",
+		   rq->ring->head);
+	drm_printf(m, "\t\tring->tail: 0x%08x\n",
+		   rq->ring->tail);
+	drm_printf(m, "\t\tring->emit: 0x%08x\n",
+		   rq->ring->emit);
+	drm_printf(m, "\t\tring->space: 0x%08x\n",
+		   rq->ring->space);
+
+	if (tl) {
+		drm_printf(m, "\t\tring->hwsp: 0x%08x\n",
+			   tl->hwsp_offset);
+		intel_timeline_put(tl);
+	}
+
+	print_request_ring(m, rq);
+
+	if (rq->context->lrc_reg_state) {
+		drm_printf(m, "Logical Ring Context:\n");
+		hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
+	}
+}
+
+void intel_engine_dump_active_requests(struct list_head *requests,
+				       struct i915_request *hung_rq,
+				       struct drm_printer *m)
+{
+	struct i915_request *rq;
+	const char *msg;
+	enum i915_request_state state;
+
+	list_for_each_entry(rq, requests, sched.link) {
+		if (rq == hung_rq)
+			continue;
+
+		state = i915_test_request_state(rq);
+		if (state < I915_REQUEST_QUEUED)
+			continue;
+
+		if (state == I915_REQUEST_ACTIVE)
+			msg = "\t\tactive on engine";
+		else
+			msg = "\t\tactive in queue";
+
+		engine_dump_request(rq, m, msg);
+	}
+}
+
+static void engine_dump_active_requests(struct intel_engine_cs *engine, struct drm_printer *m)
+{
+	struct i915_request *hung_rq = NULL;
+	struct intel_context *ce;
+	bool guc;
+
+	/*
+	 * No need for an engine->irq_seqno_barrier() before the seqno reads.
+	 * The GPU is still running so requests are still executing and any
+	 * hardware reads will be out of date by the time they are reported.
+	 * But the intention here is just to report an instantaneous snapshot
+	 * so that's fine.
+	 */
+	lockdep_assert_held(&engine->sched_engine->lock);
+
+	drm_printf(m, "\tRequests:\n");
+
+	guc = intel_uc_uses_guc_submission(&engine->gt->uc);
+	if (guc) {
+		ce = intel_engine_get_hung_context(engine);
+		if (ce)
+			hung_rq = intel_context_find_active_request(ce);
+	} else
+		hung_rq = intel_engine_execlist_find_hung_request(engine);
+
+	if (hung_rq)
+		engine_dump_request(hung_rq, m, "\t\thung");
+
+	if (guc)
+		intel_guc_dump_active_requests(engine, hung_rq, m);
+	else
+		intel_engine_dump_active_requests(&engine->sched_engine->requests,
+						  hung_rq, m);
+}
+
 void intel_engine_dump(struct intel_engine_cs *engine,
 		       struct drm_printer *m,
 		       const char *header, ...)
@@ -1724,39 +1815,9 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		   i915_reset_count(error));
 	print_properties(engine, m);

-	drm_printf(m, "\tRequests:\n");
-
 	spin_lock_irqsave(&engine->sched_engine->lock, flags);
-	rq = intel_engine_execlist_find_hung_request(engine);
-	if (rq) {
-		struct intel_timeline *tl = get_timeline(rq);
-
-		i915_request_show(m, rq, "\t\tactive ", 0);
-
-		drm_printf(m, "\t\tring->start: 0x%08x\n",
-			   i915_ggtt_offset(rq->ring->vma));
-		drm_printf(m, "\t\tring->head: 0x%08x\n",
-			   rq->ring->head);
-		drm_printf(m, "\t\tring->tail: 0x%08x\n",
-			   rq->ring->tail);
-		drm_printf(m, "\t\tring->emit: 0x%08x\n",
-			   rq->ring->emit);
-		drm_printf(m, "\t\tring->space: 0x%08x\n",
-			   rq->ring->space);
-
-		if (tl) {
-			drm_printf(m, "\t\tring->hwsp: 0x%08x\n",
-				   tl->hwsp_offset);
-			intel_timeline_put(tl);
-		}
-
-		print_request_ring(m, rq);
+	engine_dump_active_requests(engine, m);

-		if (rq->context->lrc_reg_state) {
-			drm_printf(m, "Logical Ring Context:\n");
-			hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
-		}
-	}
 	drm_printf(m, "\tOn hold?: %lu\n",
 		   list_count(&engine->sched_engine->hold));
 	spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
@@ -1830,13 +1891,6 @@ intel_engine_create_virtual(struct intel_engine_cs **siblings,
 	return siblings[0]->cops->create_virtual(siblings, count);
 }

-static bool match_ring(struct i915_request *rq)
-{
-	u32 ring = ENGINE_READ(rq->engine, RING_START);
-
-	return ring == i915_ggtt_offset(rq->ring->vma);
-}
-
 struct i915_request *
 intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine)
 {
@@ -1880,14 +1934,7 @@ intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine)

 	list_for_each_entry(request, &engine->sched_engine->requests,
 			    sched.link) {
-		if (__i915_request_is_complete(request))
-			continue;
-
-		if (!__i915_request_has_started(request))
-			continue;
-
-		/* More than one preemptible request may match! */
-		if (!match_ring(request))
+		if (i915_test_request_state(request) != I915_REQUEST_ACTIVE)
 			continue;

 		active = request;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index a8495364d906..f0768824de6f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -90,6 +90,14 @@ reset_engine(struct intel_engine_cs *engine, struct i915_request *rq)
 	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
 		show_heartbeat(rq, engine);

+	if (intel_engine_uses_guc(engine))
+		/*
+		 * GuC itself is toast or GuC's hang detection
+		 * is disabled. Either way, need to find the
+		 * hang culprit manually.
+		 */
+		intel_guc_find_hung_context(engine);
+
 	intel_gt_handle_error(engine->gt, engine->mask,
 			      I915_ERROR_CAPTURE,
 			      "stopped heartbeat on %s",
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index ce3ef26ffe2d..c35c4b529ce5 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -156,7 +156,7 @@ void __i915_request_reset(struct i915_request *rq, bool guilty)
 	if (guilty) {
 		i915_request_set_error_once(rq, -EIO);
 		__i915_request_skip(rq);
-		if (mark_guilty(rq))
+		if (mark_guilty(rq) && !intel_engine_uses_guc(rq->engine))
 			skip_context(rq);
 	} else {
 		i915_request_set_error_once(rq, -EAGAIN);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 097687937cec..10b48b9f7603 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -269,6 +269,8 @@ int intel_guc_context_reset_process_msg(struct intel_guc *guc,
 int intel_guc_engine_failure_process_msg(struct intel_guc *guc,
 					 const u32 *msg, u32 len);

+void intel_guc_find_hung_context(struct intel_engine_cs *engine);
+
 void intel_guc_submission_reset_prepare(struct intel_guc *guc);
 void intel_guc_submission_reset(struct intel_guc *guc, bool stalled);
 void intel_guc_submission_reset_finish(struct intel_guc *guc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 6b3b74e50b31..ad3d2326a81d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2267,6 +2267,73 @@ int intel_guc_engine_failure_process_msg(struct intel_guc *guc,
 	return 0;
 }

+void intel_guc_find_hung_context(struct intel_engine_cs *engine)
+{
+	struct intel_guc *guc = &engine->gt->uc.guc;
+	struct intel_context *ce;
+	struct i915_request *rq;
+	unsigned long index;
+
+	/* Reset called during driver load? GuC not yet initialised! */
+	if (unlikely(!guc_submission_initialized(guc)))
+		return;
+
+	xa_for_each(&guc->context_lookup, index, ce) {
+		if (!intel_context_is_pinned(ce))
+			continue;
+
+		if (intel_engine_is_virtual(ce->engine)) {
+			if (!(ce->engine->mask & engine->mask))
+				continue;
+		} else {
+			if (ce->engine != engine)
+				continue;
+		}
+
+		list_for_each_entry(rq, &ce->guc_active.requests, sched.link) {
+			if (i915_test_request_state(rq) != I915_REQUEST_ACTIVE)
+				continue;
+
+			intel_engine_set_hung_context(engine, ce);
+
+			/* Can only cope with one hang at a time... */
+			return;
+		}
+	}
+}
+
+void intel_guc_dump_active_requests(struct intel_engine_cs *engine,
+				    struct i915_request *hung_rq,
+				    struct drm_printer *m)
+{
+	struct intel_guc *guc = &engine->gt->uc.guc;
+	struct intel_context *ce;
+	unsigned long index;
+	unsigned long flags;
+
+	/* Reset called during driver load? GuC not yet initialised! */
+	if (unlikely(!guc_submission_initialized(guc)))
+		return;
+
+	xa_for_each(&guc->context_lookup, index, ce) {
+		if (!intel_context_is_pinned(ce))
+			continue;
+
+		if (intel_engine_is_virtual(ce->engine)) {
+			if (!(ce->engine->mask & engine->mask))
+				continue;
+		} else {
+			if (ce->engine != engine)
+				continue;
+		}
+
+		spin_lock_irqsave(&ce->guc_active.lock, flags);
+		intel_engine_dump_active_requests(&ce->guc_active.requests,
+						  hung_rq, m);
+		spin_unlock_irqrestore(&ce->guc_active.lock, flags);
+	}
+}
+
 void intel_guc_log_submission_info(struct intel_guc *guc,
 				   struct drm_printer *p)
 {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
index b9b9f0f60f91..a2a3fad72be1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
@@ -24,6 +24,9 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine);
 void intel_guc_log_submission_info(struct intel_guc *guc,
 				   struct drm_printer *p);
 void intel_guc_log_context_info(struct intel_guc *guc, struct drm_printer *p);
+void intel_guc_dump_active_requests(struct intel_engine_cs *engine,
+				    struct i915_request *hung_rq,
+				    struct drm_printer *m);

 bool intel_guc_virtual_engine_has_heartbeat(const struct intel_engine_cs *ve);

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 4855cf7ebe21..ef9eb91ec84c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -2076,6 +2076,47 @@ void i915_request_show(struct drm_printer *m,
 		   name);
 }

+static bool engine_match_ring(struct intel_engine_cs *engine, struct i915_request *rq)
+{
+	u32 ring = ENGINE_READ(engine, RING_START);
+
+	return ring == i915_ggtt_offset(rq->ring->vma);
+}
+
+static bool match_ring(struct i915_request *rq)
+{
+	struct intel_engine_cs *engine;
+	bool found;
+	int i;
+
+	if (!intel_engine_is_virtual(rq->engine))
+		return engine_match_ring(rq->engine, rq);
+
+	found = false;
+	i = 0;
+	while ((engine = intel_engine_get_sibling(rq->engine, i++))) {
+		found = engine_match_ring(engine, rq);
+		if (found)
+			break;
+	}
+
+	return found;
+}
+
+enum i915_request_state i915_test_request_state(struct i915_request *rq)
+{
+	if (i915_request_completed(rq))
+		return I915_REQUEST_COMPLETE;
+
+	if (!i915_request_started(rq))
+		return I915_REQUEST_PENDING;
+
+	if (match_ring(rq))
+		return I915_REQUEST_ACTIVE;
+
+	return I915_REQUEST_QUEUED;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_request.c"
 #include "selftests/i915_request.c"
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index bcc6340c505e..f98385f72782 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -651,4 +651,15 @@ i915_request_active_engine(struct i915_request *rq,

 void i915_request_notify_execute_cb_imm(struct i915_request *rq);

+enum i915_request_state
+{
+	I915_REQUEST_UNKNOWN = 0,
+	I915_REQUEST_COMPLETE,
+	I915_REQUEST_PENDING,
+	I915_REQUEST_QUEUED,
+	I915_REQUEST_ACTIVE,
+};
+
+enum i915_request_state i915_test_request_state(struct i915_request *rq);
+
 #endif /* I915_REQUEST_H */
-- 
2.28.0
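
The request-state classification that the patch factors out, and the way the scan and the dump both lean on it, can be modelled in user space. What follows is only a minimal sketch and not driver code: `struct mock_request`, `test_request_state()`, `find_hung()` and `count_dumped()` are hypothetical names, and the three boolean flags are assumed stand-ins for the seqno checks and the RING_START comparison that the real i915_test_request_state() performs. The state ordering matches the enum added to i915_request.h, which is what lets the dump skip everything below QUEUED with a single comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same ordering as enum i915_request_state in the patch. */
enum request_state {
	REQUEST_UNKNOWN = 0,
	REQUEST_COMPLETE,
	REQUEST_PENDING,
	REQUEST_QUEUED,
	REQUEST_ACTIVE,
};

/*
 * Hypothetical stand-in for struct i915_request. The flags replace
 * i915_request_completed(), i915_request_started() and match_ring().
 */
struct mock_request {
	bool completed;
	bool started;
	bool ring_matches;
};

/* Classify a request with the same decision order as the patch. */
static enum request_state test_request_state(const struct mock_request *rq)
{
	if (rq->completed)
		return REQUEST_COMPLETE;

	if (!rq->started)
		return REQUEST_PENDING;

	if (rq->ring_matches)
		return REQUEST_ACTIVE;

	return REQUEST_QUEUED;
}

/*
 * Return the index of the first ACTIVE request, or -1 if there is none.
 * This mirrors the "only one hang at a time" early return of the
 * GuC-mode scan.
 */
static int find_hung(const struct mock_request *rqs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (test_request_state(&rqs[i]) == REQUEST_ACTIVE)
			return (int)i;
	}

	return -1;
}

/*
 * Count the requests the dump loop would print: the hung request is
 * reported separately, and anything below QUEUED is skipped.
 */
static size_t count_dumped(const struct mock_request *rqs, size_t count,
			   int hung)
{
	size_t dumped = 0;

	for (size_t i = 0; i < count; i++) {
		if ((int)i == hung)
			continue;

		if (test_request_state(&rqs[i]) < REQUEST_QUEUED)
			continue;

		dumped++;
	}

	return dumped;
}
```

Given one completed, one executing, one started-but-not-on-the-ring and one not-yet-started request, this model picks the executing request as hung and leaves only the started-but-not-on-the-ring one for the "active in queue" dump.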
19:13 ` [RFC PATCH 32/97] drm/i915: Introduce i915_sched_engine object Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-11 15:18 ` Daniel Vetter 2021-05-11 15:18 ` [Intel-gfx] " Daniel Vetter 2021-05-11 17:56 ` Matthew Brost 2021-05-11 17:56 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 33/97] drm/i915: Engine relative MMIO Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-25 9:05 ` Tvrtko Ursulin 2021-05-25 9:05 ` Tvrtko Ursulin 2021-05-06 19:13 ` [RFC PATCH 34/97] drm/i915/guc: Use guc_class instead of engine_class in fw interface Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-26 20:41 ` Matthew Brost 2021-05-26 20:41 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 35/97] drm/i915/guc: Improve error message for unsolicited CT response Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-24 11:59 ` Michal Wajdeczko 2021-05-24 11:59 ` [Intel-gfx] " Michal Wajdeczko 2021-05-25 17:32 ` Matthew Brost 2021-05-25 17:32 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 36/97] drm/i915/guc: Add non blocking CTB send function Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-24 12:21 ` Michal Wajdeczko 2021-05-24 12:21 ` [Intel-gfx] " Michal Wajdeczko 2021-05-25 17:30 ` Matthew Brost 2021-05-25 17:30 ` [Intel-gfx] " Matthew Brost 2021-05-25 9:21 ` Tvrtko Ursulin 2021-05-25 9:21 ` Tvrtko Ursulin 2021-05-25 17:21 ` Matthew Brost 2021-05-25 17:21 ` Matthew Brost 2021-05-26 8:57 ` Tvrtko Ursulin 2021-05-26 8:57 ` Tvrtko Ursulin 2021-05-26 18:10 ` Matthew Brost 2021-05-26 18:10 ` Matthew Brost 2021-05-27 10:02 ` Tvrtko Ursulin 2021-05-27 10:02 ` Tvrtko Ursulin 2021-05-27 14:35 ` Matthew Brost 2021-05-27 14:35 ` Matthew Brost 2021-05-27 15:11 ` Tvrtko Ursulin 2021-05-27 15:11 ` Tvrtko Ursulin 2021-06-07 17:31 ` Matthew Brost 2021-06-07 17:31 ` Matthew Brost 2021-06-08 8:39 ` Tvrtko Ursulin 2021-06-08 8:39 ` Tvrtko Ursulin 2021-06-08 8:46 ` Daniel 
Vetter 2021-06-08 8:46 ` Daniel Vetter 2021-06-09 23:10 ` Matthew Brost 2021-06-09 23:10 ` Matthew Brost 2021-06-10 15:27 ` Daniel Vetter 2021-06-10 15:27 ` Daniel Vetter 2021-06-24 16:38 ` Matthew Brost 2021-06-24 16:38 ` Matthew Brost 2021-06-24 17:25 ` Daniel Vetter 2021-06-24 17:25 ` Daniel Vetter 2021-06-09 13:58 ` Michal Wajdeczko 2021-06-09 13:58 ` Michal Wajdeczko 2021-06-09 23:05 ` Matthew Brost 2021-06-09 23:05 ` Matthew Brost 2021-06-09 14:14 ` Michal Wajdeczko 2021-06-09 14:14 ` Michal Wajdeczko 2021-06-09 23:13 ` Matthew Brost 2021-06-09 23:13 ` Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 37/97] drm/i915/guc: Add stall timer to " Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-24 12:58 ` Michal Wajdeczko 2021-05-24 12:58 ` [Intel-gfx] " Michal Wajdeczko 2021-05-24 18:35 ` Matthew Brost 2021-05-24 18:35 ` [Intel-gfx] " Matthew Brost 2021-05-25 14:15 ` Michal Wajdeczko 2021-05-25 14:15 ` [Intel-gfx] " Michal Wajdeczko 2021-05-25 16:54 ` Matthew Brost 2021-05-25 16:54 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 38/97] drm/i915/guc: Optimize CTB writes and reads Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-24 13:31 ` Michal Wajdeczko 2021-05-24 13:31 ` [Intel-gfx] " Michal Wajdeczko 2021-05-25 17:39 ` Matthew Brost 2021-05-25 17:39 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 39/97] drm/i915/guc: Increase size of CTB buffers Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-24 13:43 ` Michal Wajdeczko 2021-05-24 13:43 ` Michal Wajdeczko 2021-05-24 18:40 ` Matthew Brost 2021-05-24 18:40 ` Matthew Brost 2021-05-25 9:24 ` Tvrtko Ursulin 2021-05-25 9:24 ` Tvrtko Ursulin 2021-05-25 17:15 ` Matthew Brost 2021-05-25 17:15 ` Matthew Brost 2021-05-26 9:30 ` Tvrtko Ursulin 2021-05-26 9:30 ` Tvrtko Ursulin 2021-05-26 18:20 ` Matthew Brost 2021-05-26 18:20 ` Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 40/97] drm/i915/guc: Module load failure test for CT buffer creation 
Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-24 13:45 ` Michal Wajdeczko 2021-05-24 13:45 ` [Intel-gfx] " Michal Wajdeczko 2021-05-06 19:13 ` [RFC PATCH 41/97] drm/i915/guc: Add new GuC interface defines and structures Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 42/97] drm/i915/guc: Remove GuC stage descriptor, add lrc descriptor Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 43/97] drm/i915/guc: Add lrc descriptor context lookup array Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-11 15:26 ` Daniel Vetter 2021-05-11 15:26 ` [Intel-gfx] " Daniel Vetter 2021-05-11 17:01 ` Matthew Brost 2021-05-11 17:01 ` [Intel-gfx] " Matthew Brost 2021-05-11 17:43 ` Daniel Vetter 2021-05-11 17:43 ` [Intel-gfx] " Daniel Vetter 2021-05-11 19:34 ` Matthew Brost 2021-05-11 19:34 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 44/97] drm/i915/guc: Implement GuC submission tasklet Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-25 9:43 ` Tvrtko Ursulin 2021-05-25 9:43 ` Tvrtko Ursulin 2021-05-25 17:10 ` Matthew Brost 2021-05-25 17:10 ` Matthew Brost 2021-05-06 19:13 ` [RFC PATCH 45/97] drm/i915/guc: Add bypass tasklet submission path to GuC Matthew Brost 2021-05-06 19:13 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 46/97] drm/i915/guc: Implement GuC context operations for new inteface Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-29 20:32 ` Michal Wajdeczko 2021-05-29 20:32 ` [Intel-gfx] " Michal Wajdeczko 2021-05-06 19:14 ` [RFC PATCH 47/97] drm/i915/guc: Insert fence on context when deregistering Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 48/97] drm/i915/guc: Defer context unpin until scheduling is disabled Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 49/97] drm/i915/guc: Disable engine 
barriers with GuC during unpin Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-11 15:37 ` Daniel Vetter 2021-05-11 15:37 ` [Intel-gfx] " Daniel Vetter 2021-05-11 16:31 ` Matthew Brost 2021-05-11 16:31 ` [Intel-gfx] " Matthew Brost 2021-05-26 10:26 ` Tvrtko Ursulin 2021-05-26 10:26 ` Tvrtko Ursulin 2021-05-06 19:14 ` [RFC PATCH 50/97] drm/i915/guc: Extend deregistration fence to schedule disable Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 51/97] drm/i915: Disable preempt busywait when using GuC scheduling Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 52/97] drm/i915/guc: Ensure request ordering via completion fences Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 53/97] drm/i915/guc: Disable semaphores when using GuC scheduling Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-25 9:52 ` Tvrtko Ursulin 2021-05-25 9:52 ` Tvrtko Ursulin 2021-05-25 17:01 ` Matthew Brost 2021-05-25 17:01 ` Matthew Brost 2021-05-26 9:25 ` Tvrtko Ursulin 2021-05-26 9:25 ` Tvrtko Ursulin 2021-05-26 18:15 ` Matthew Brost 2021-05-26 18:15 ` Matthew Brost 2021-05-27 8:41 ` Tvrtko Ursulin 2021-05-27 8:41 ` Tvrtko Ursulin 2021-05-27 14:38 ` Matthew Brost 2021-05-27 14:38 ` Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 54/97] drm/i915/guc: Ensure G2H response has space in buffer Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 55/97] drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-07 5:56 ` kernel test robot 2021-05-25 10:06 ` Tvrtko Ursulin 2021-05-25 10:06 ` Tvrtko Ursulin 2021-05-25 17:07 ` Matthew Brost 2021-05-25 17:07 ` Matthew Brost 2021-05-26 9:21 ` Tvrtko Ursulin 2021-05-26 9:21 ` Tvrtko Ursulin 2021-05-26 18:18 ` Matthew Brost 2021-05-26 18:18 ` Matthew Brost 2021-05-27 9:02 ` Tvrtko Ursulin 
2021-05-27 9:02 ` Tvrtko Ursulin 2021-05-27 14:37 ` Matthew Brost 2021-05-27 14:37 ` Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 56/97] drm/i915/guc: Update GuC debugfs to support new GuC Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 57/97] drm/i915/guc: Add several request trace points Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 58/97] drm/i915: Add intel_context tracing Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 59/97] drm/i915/guc: GuC virtual engines Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 60/97] drm/i915: Track 'serial' counts for " Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-25 10:16 ` Tvrtko Ursulin 2021-05-25 10:16 ` Tvrtko Ursulin 2021-05-25 17:52 ` Matthew Brost 2021-05-25 17:52 ` Matthew Brost 2021-05-26 8:40 ` Tvrtko Ursulin 2021-05-26 8:40 ` Tvrtko Ursulin 2021-05-26 18:45 ` John Harrison 2021-05-26 18:45 ` John Harrison 2021-05-27 8:53 ` Tvrtko Ursulin 2021-05-27 8:53 ` Tvrtko Ursulin 2021-05-27 17:01 ` John Harrison 2021-05-27 17:01 ` John Harrison 2021-06-01 9:31 ` Tvrtko Ursulin 2021-06-01 9:31 ` Tvrtko Ursulin 2021-06-02 1:20 ` John Harrison 2021-06-02 1:20 ` John Harrison 2021-06-02 12:04 ` Tvrtko Ursulin 2021-06-02 12:04 ` Tvrtko Ursulin 2021-06-02 12:09 ` Tvrtko Ursulin 2021-06-02 12:09 ` Tvrtko Ursulin 2021-05-06 19:14 ` [RFC PATCH 61/97] drm/i915: Hold reference to intel_context over life of i915_request Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-06-02 12:18 ` Tvrtko Ursulin 2021-06-02 12:18 ` Tvrtko Ursulin 2021-05-06 19:14 ` [RFC PATCH 62/97] drm/i915/guc: Disable bonding extension with GuC submission Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 63/97] drm/i915/guc: Direct all breadcrumbs for a class to single breadcrumbs Matthew Brost 2021-05-06 19:14 ` 
[Intel-gfx] " Matthew Brost 2021-06-02 13:31 ` Tvrtko Ursulin 2021-06-02 13:31 ` Tvrtko Ursulin 2021-05-06 19:14 ` [RFC PATCH 64/97] drm/i915/guc: Reset implementation for new GuC interface Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-06-02 14:33 ` Tvrtko Ursulin 2021-06-02 14:33 ` Tvrtko Ursulin 2021-06-04 3:17 ` Matthew Brost 2021-06-04 3:17 ` Matthew Brost 2021-06-04 8:16 ` Daniel Vetter 2021-06-04 8:16 ` Daniel Vetter 2021-06-04 18:02 ` Matthew Brost 2021-06-04 18:02 ` Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 65/97] drm/i915: Reset GPU immediately if submission is disabled Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-06-02 14:36 ` Tvrtko Ursulin 2021-06-02 14:36 ` Tvrtko Ursulin 2021-05-06 19:14 ` [RFC PATCH 66/97] drm/i915/guc: Add disable interrupts to guc sanitize Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-11 8:16 ` [drm/i915/guc] 07336fb545: WARNING:at_drivers/gpu/drm/i915/gt/uc/intel_uc.c:#__uc_sanitize[i915] kernel test robot 2021-05-11 8:16 ` kernel test robot 2021-05-11 8:16 ` [Intel-gfx] " kernel test robot 2021-05-11 8:16 ` kernel test robot 2021-05-06 19:14 ` [RFC PATCH 67/97] drm/i915/guc: Suspend/resume implementation for new interface Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 68/97] drm/i915/guc: Handle context reset notification Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-11 16:25 ` Daniel Vetter 2021-05-11 16:25 ` Daniel Vetter 2021-05-06 19:14 ` [RFC PATCH 69/97] drm/i915/guc: Handle engine reset failure notification Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 70/97] drm/i915/guc: Enable the timer expired interrupt for GuC Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 71/97] drm/i915/guc: Provide mmio list to be saved/restored on engine reset Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 
2021-05-06 19:14 ` [RFC PATCH 72/97] drm/i915/guc: Don't complain about reset races Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 73/97] drm/i915/guc: Enable GuC engine reset Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 74/97] drm/i915/guc: Capture error state on context reset Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-11 16:28 ` Daniel Vetter 2021-05-11 16:28 ` Daniel Vetter 2021-05-11 17:12 ` Matthew Brost 2021-05-11 17:12 ` Matthew Brost 2021-05-11 17:45 ` Daniel Vetter 2021-05-11 17:45 ` Daniel Vetter 2021-05-06 19:14 ` Matthew Brost [this message] 2021-05-06 19:14 ` [Intel-gfx] [RFC PATCH 75/97] drm/i915/guc: Fix for error capture after full GPU reset with GuC Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 76/97] drm/i915/guc: Hook GuC scheduling policies up Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 77/97] drm/i915/guc: Connect reset modparam updates to GuC policy flags Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 78/97] drm/i915/guc: Include scheduling policies in the debugfs state dump Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 79/97] drm/i915/guc: Don't call ring_is_idle in GuC submission Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 80/97] drm/i915/guc: Implement banned contexts for " Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 81/97] drm/i915/guc: Allow flexible number of context ids Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 82/97] drm/i915/guc: Connect the number of guc_ids to debugfs Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 83/97] drm/i915/guc: Don't return -EAGAIN to user when guc_ids exhausted Matthew Brost 
2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-07 6:06 ` kernel test robot 2021-05-06 19:14 ` [RFC PATCH 84/97] drm/i915/guc: Don't allow requests not ready to consume all guc_ids Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 85/97] drm/i915/guc: Introduce guc_submit_engine object Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 86/97] drm/i915/guc: Add golden context to GuC ADS Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 87/97] drm/i915/guc: Implement GuC priority management Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 88/97] drm/i915/guc: Support request cancellation Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 89/97] drm/i915/guc: Check return of __xa_store when registering a context Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 90/97] drm/i915/guc: Non-static lrc descriptor registration buffer Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 91/97] drm/i915/guc: Take GT PM ref when deregistering context Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 92/97] drm/i915: Add GT PM delayed worker Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 93/97] drm/i915/guc: Take engine PM when a context is pinned with GuC submission Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 94/97] drm/i915/guc: Don't call switch_to_kernel_context " Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 95/97] drm/i915/guc: Selftest for GuC flow control Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 96/97] drm/i915/guc: Update GuC documentation Matthew Brost 
2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-06 19:14 ` [RFC PATCH 97/97] drm/i915/guc: Unblock GuC submission on Gen11+ Matthew Brost 2021-05-06 19:14 ` [Intel-gfx] " Matthew Brost 2021-05-09 17:12 ` [RFC PATCH 00/97] Basic GuC submission support in the i915 Martin Peres 2021-05-09 17:12 ` [Intel-gfx] " Martin Peres 2021-05-09 23:11 ` Jason Ekstrand 2021-05-09 23:11 ` [Intel-gfx] " Jason Ekstrand 2021-05-10 13:55 ` Martin Peres 2021-05-10 13:55 ` [Intel-gfx] " Martin Peres 2021-05-10 16:25 ` Jason Ekstrand 2021-05-10 16:25 ` [Intel-gfx] " Jason Ekstrand 2021-05-11 8:01 ` Martin Peres 2021-05-11 8:01 ` [Intel-gfx] " Martin Peres 2021-05-10 16:33 ` Daniel Vetter 2021-05-10 16:33 ` [Intel-gfx] " Daniel Vetter 2021-05-10 18:30 ` Francisco Jerez 2021-05-10 18:30 ` Francisco Jerez 2021-05-11 8:06 ` Martin Peres 2021-05-11 8:06 ` [Intel-gfx] " Martin Peres 2021-05-11 15:26 ` Bloomfield, Jon 2021-05-11 15:26 ` [Intel-gfx] " Bloomfield, Jon 2021-05-11 16:39 ` Matthew Brost 2021-05-11 16:39 ` [Intel-gfx] " Matthew Brost 2021-05-12 6:26 ` Martin Peres 2021-05-12 6:26 ` [Intel-gfx] " Martin Peres 2021-05-14 16:31 ` Jason Ekstrand 2021-05-14 16:31 ` [Intel-gfx] " Jason Ekstrand 2021-05-25 15:37 ` Alex Deucher 2021-05-25 15:37 ` [Intel-gfx] " Alex Deucher 2021-05-11 2:58 ` Dixit, Ashutosh 2021-05-11 2:58 ` [Intel-gfx] " Dixit, Ashutosh 2021-05-11 7:47 ` Martin Peres 2021-05-11 7:47 ` [Intel-gfx] " Martin Peres 2021-05-14 11:11 ` Tvrtko Ursulin 2021-05-14 11:11 ` Tvrtko Ursulin 2021-05-14 16:36 ` Jason Ekstrand 2021-05-14 16:36 ` Jason Ekstrand 2021-05-14 16:46 ` Matthew Brost 2021-05-14 16:46 ` Matthew Brost 2021-05-14 16:41 ` Matthew Brost 2021-05-14 16:41 ` Matthew Brost 2021-05-25 10:32 ` Tvrtko Ursulin 2021-05-25 10:32 ` Tvrtko Ursulin 2021-05-25 16:45 ` Matthew Brost 2021-05-25 16:45 ` Matthew Brost 2021-06-02 15:27 ` Tvrtko Ursulin 2021-06-02 15:27 ` Tvrtko Ursulin 2021-06-02 18:57 ` Daniel Vetter 2021-06-02 18:57 ` Daniel Vetter 2021-06-03 3:41 ` Matthew Brost 
2021-06-03 3:41 ` Matthew Brost 2021-06-03 4:47 ` Daniel Vetter 2021-06-03 4:47 ` Daniel Vetter 2021-06-03 9:49 ` Tvrtko Ursulin 2021-06-03 9:49 ` Tvrtko Ursulin 2021-06-03 10:52 ` Tvrtko Ursulin 2021-06-03 10:52 ` Tvrtko Ursulin 2021-06-03 4:10 ` Matthew Brost 2021-06-03 4:10 ` Matthew Brost 2021-06-03 8:51 ` Tvrtko Ursulin 2021-06-03 8:51 ` Tvrtko Ursulin 2021-06-03 16:34 ` Matthew Brost 2021-06-03 16:34 ` Matthew Brost