archive mirror
 help / color / mirror / Atom feed
From: John Harrison <>
To: Daniel Vetter <>, Matthew Brost <>
Cc: <>, <>
Subject: Re: [Intel-gfx] [PATCH 0/1] Fix gem_ctx_persistence failures with GuC submission
Date: Tue, 17 Aug 2021 17:08:02 -0700	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <YRIe8jEI+0TLreAI@phenom.ffwll.local>

On 8/9/2021 23:38, Daniel Vetter wrote:
> On Wed, Jul 28, 2021 at 05:33:59PM -0700, Matthew Brost wrote:
>> Should fix below failures with GuC submission for the following tests:
>> gem_exec_balancer --r noheartbeat
>> gem_ctx_persistence --r heartbeat-close
>> Not going to fix:
>> gem_ctx_persistence --r heartbeat-many
>> gem_ctx_persistence --r heartbeat-stop
> After looking at that big thread and being very confused: Are we fixing an
> actual use-case here, or is this another case of blindly following igts
> tests just because they exist?
My understanding is that this is established behaviour and therefore 
must be maintained because the UAPI (whether documented or not) is 
inviolate. Therefore IGTs have been written to validate this past 
behaviour and now we must conform to the IGTs in order to keep the 
existing behaviour unchanged.

Whether anybody actually makes use of this behaviour or not is another 
matter entirely. I am certainly not aware of any vital use case. Others 
might have more recollection. I do know that we tell the UMD teams to 
explicitly disable persistence on every context they create.

> I'm leaning towards that we should stall on this, and first document what
> exactly is the actual intention behind all this, and then fix up the tests
I'm not sure there ever was an 'intention'. The rumour I heard way back 
when was that persistence was a bug on earlier platforms (or possibly we 
didn't have hardware support for doing engine resets?). But once the bug 
was realised (or the hardware support was added), it was too late to 
change the default behaviour because existing kernel behaviour must 
never change on pain of painful things. Thus the persistence flag was 
added so that people could opt out of the broken, leaky behaviour and 
have their contexts clean up properly.

Feel free to document what you believe should be the behaviour from a 
software architect point of view. Any documentation I produce is 
basically going to be created by reverse engineering the existing code. 
That is the only 'spec' that I am aware of and as I keep saying, I 
personally think it is a totally broken concept that should just be removed.

> to match (if needed). And only then fix up GuC to match whatever we
> actually want to do.
I also still maintain there is no 'fix up the GuC'. This is not 
behaviour we should be adding to a hardware scheduler. It is behaviour 
that should be implemented at the front end not the back end. If we 
absolutely need to do this then we need to do it solely at the context 
management level not at the back end submission level. And the solution 
should work by default on any submission back end.


> -Daniel
>> As the above tests change the heartbeat value to 0 (off) after the
>> context is closed and we have no way to detect that with GuC submission
>> unless we keep a list of closed but running contexts which seems like
>> overkill for a non-real world use case. We likely should just skip these
>> tests with GuC submission.
>> Signed-off-by: Matthew Brost <>
>> Matthew Brost (1):
>>    drm/i915: Check if engine has heartbeat when closing a context
>>   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  5 +++--
>>   drivers/gpu/drm/i915/gt/intel_context_types.h |  2 ++
>>   drivers/gpu/drm/i915/gt/intel_engine.h        | 21 ++-----------------
>>   .../drm/i915/gt/intel_execlists_submission.c  | 14 +++++++++++++
>>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  6 +++++-
>>   .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  2 --
>>   6 files changed, 26 insertions(+), 24 deletions(-)
>> -- 
>> 2.28.0

  reply	other threads:[~2021-08-18  0:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-29  0:33 Matthew Brost
2021-07-29  0:34 ` [Intel-gfx] [PATCH 1/1] drm/i915: Check if engine has heartbeat when closing a context Matthew Brost
2021-07-30  0:13   ` John Harrison
2021-07-30  9:49     ` Tvrtko Ursulin
2021-07-30 18:13       ` John Harrison
2021-08-02  9:40         ` Tvrtko Ursulin
2021-08-06 18:00           ` John Harrison
2021-08-06 19:46             ` Daniel Vetter
2021-08-09 23:12               ` John Harrison
2021-08-10  6:36                 ` Daniel Vetter
2021-08-18  0:28                   ` John Harrison
2021-08-18  9:26                     ` Daniel Vetter
2021-07-30 18:13       ` Matthew Brost
2021-07-29  2:08 ` [Intel-gfx] ✓ Fi.CI.BAT: success for Fix gem_ctx_persistence failures with GuC submission Patchwork
2021-07-29  7:30 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2021-08-10  6:38 ` [Intel-gfx] [PATCH 0/1] " Daniel Vetter
2021-08-18  0:08   ` John Harrison [this message]
2021-08-18  9:49     ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \
    --subject='Re: [Intel-gfx] [PATCH 0/1] Fix gem_ctx_persistence failures with GuC submission' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).