From: Matthew Brost <matthew.brost@intel.com>
To: <intel-gfx@lists.freedesktop.org>, <dri-devel@lists.freedesktop.org>
Cc: daniele.ceraolospurio@intel.com, john.c.harrison@intel.com
Subject: [PATCH 02/33] drm/i915/guc: Make hangcheck work with GuC virtual engines
Date: Mon, 26 Jul 2021 17:23:17 -0700
Message-ID: <20210727002348.97202-3-matthew.brost@intel.com>
In-Reply-To: <20210727002348.97202-1-matthew.brost@intel.com>

From: John Harrison <John.C.Harrison@Intel.com>

The serial number tracking of engines happens at the backend of
request submission and was expecting to only be given physical
engines. However, in GuC submission mode, the decomposition of virtual
to physical engines does not happen in i915. Instead, requests are
submitted to their virtual engine mask all the way through to the
hardware (i.e. to GuC). This would mean that the heartbeat code
thinks the physical engines are idle because the serial number is not
incrementing, which in turn means hangcheck does not work for GuC
virtual engines.

This patch updates the tracking to decompose virtual engines into
their physical constituents and tracks the request against each. This
is not entirely accurate as the GuC will only be issuing the request
to one physical engine. However, it is the best that i915 can do given
that it has no knowledge of the GuC's scheduling decisions.

The downside of this is that all physical engines constituting a GuC
virtual engine will be periodically unparked (even while just a single
context is executing) in order to be pinged with a heartbeat request.
However, the power and performance cost of this is not expected to be
measurable (due to the low frequency of heartbeat pulses) and it is
considered an easier option than trying to make changes to GuC
firmware.

v2:
 (Tvrtko)
  - Update commit message
  - Have default behavior if no vfunc present

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h      |  2 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 10 ++++++++++
 drivers/gpu/drm/i915/i915_request.c               |  6 +++++-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 3f308a920b50..75a34cd3f1c2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -382,6 +382,8 @@ struct intel_engine_cs {
 	void		(*park)(struct intel_engine_cs *engine);
 	void		(*unpark)(struct intel_engine_cs *engine);
 
+	void		(*bump_serial)(struct intel_engine_cs *engine);
+
 	void		(*set_default_submission)(struct intel_engine_cs *engine);
 
 	const struct intel_context_ops *cops;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 8b3ae5f65cd5..6b08221df143 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1492,6 +1492,15 @@ static void guc_release(struct intel_engine_cs *engine)
 	lrc_fini_wa_ctx(engine);
 }
 
+static void virtual_guc_bump_serial(struct intel_engine_cs *engine)
+{
+	struct intel_engine_cs *e;
+	intel_engine_mask_t tmp, mask = engine->mask;
+
+	for_each_engine_masked(e, engine->gt, mask, tmp)
+		e->serial++;
+}
+
 static void guc_default_vfuncs(struct intel_engine_cs *engine)
 {
 	/* Default vfuncs which can be overridden by each engine. */
@@ -1835,6 +1844,7 @@ guc_create_virtual(struct intel_engine_cs **siblings, unsigned int count)
 
 	ve->base.cops = &virtual_guc_context_ops;
 	ve->base.request_alloc = guc_request_alloc;
+	ve->base.bump_serial = virtual_guc_bump_serial;
 
 	ve->base.submit_request = guc_submit_request;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 6594cb2f8ebd..39a21d96577e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -669,7 +669,11 @@ bool __i915_request_submit(struct i915_request *request)
 				     request->ring->vaddr + request->postfix);
 
 	trace_i915_request_execute(request);
-	engine->serial++;
+	if (engine->bump_serial)
+		engine->bump_serial(engine);
+	else
+		engine->serial++;
+
 	result = true;
 
 	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
-- 
2.28.0
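
Illustrative note (not part of the patch): the decomposition described in
the commit message can be sketched outside the kernel as below. This is a
minimal model under stated assumptions, not the i915 code. `struct engine`,
`NUM_PHYSICAL`, the `physical` table pointer (standing in for `engine->gt`),
`virtual_bump_serial()` and `submit()` are all hypothetical names made up
for this sketch, and a plain bit loop stands in for for_each_engine_masked().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PHYSICAL 4	/* hypothetical engine count for this sketch */

struct engine {
	uint32_t mask;			/* one bit per physical engine covered */
	unsigned int serial;		/* bumped on submission; heartbeat compares it */
	struct engine *physical;	/* table of physical engines (stand-in for engine->gt) */
	void (*bump_serial)(struct engine *engine);	/* optional override */
};

/* Virtual engine: bump the serial of every physical engine in the mask. */
static void virtual_bump_serial(struct engine *ve)
{
	for (unsigned int i = 0; i < NUM_PHYSICAL; i++)
		if (ve->mask & (1u << i))
			ve->physical[i].serial++;
}

/* Submission backend: use the override if present, else the default. */
static void submit(struct engine *engine)
{
	assert(engine != NULL);

	if (engine->bump_serial)
		engine->bump_serial(engine);
	else
		engine->serial++;
}
```

In this model, submitting to a virtual engine with mask 0x6 bumps the
serial of physical engines 1 and 2 and leaves the others untouched, while
submitting to a physical engine with no override takes the default path.
That matches the trade-off stated above: every sibling looks busy to the
heartbeat even though only one of them actually runs the request.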