All of lore.kernel.org
 help / color / mirror / Atom feed
From: Salvatore Bonaccorso <carnil@debian.org>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>, stable@vger.kernel.org
Subject: Re: [PATCH v2] drm/i915/gt: Schedule request retirement when timeline idles
Date: Mon, 13 Apr 2020 16:37:28 +0200	[thread overview]
Message-ID: <20200413143728.GA709872@eldamar.local> (raw)
In-Reply-To: <20200412075607.GA2709076@kroah.com>

Hi,

[note not involved in the respective patch, but noticed your question
and saw the context]

On Sun, Apr 12, 2020 at 09:56:07AM +0200, Greg KH wrote:
> On Sun, Apr 12, 2020 at 12:48:51AM -0700, Sultan Alsawaf wrote:
> > On Tue, Apr 07, 2020 at 01:43:45PM -0700, Sultan Alsawaf wrote:
> > > From: Chris Wilson <chris@chris-wilson.co.uk>
> > > 
> > > The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> > > corruption WA") is that it disables RC6 while Skylake (and friends) is
> > > active, and we do not consider the GPU idle until all outstanding
> > > requests have been retired and the engine switched over to the kernel
> > > context. If userspace is idle, this task falls onto our background idle
> > > worker, which only runs roughly once a second, meaning that userspace has
> > > to have been idle for a couple of seconds before we enable RC6 again.
> > > Naturally, this causes us to consume considerably more energy than
> > > before as powersaving is effectively disabled while a display server
> > > (here's looking at you Xorg) is running.
> > > 
> > > As execlists will get a completion event as each context is completed,
> > > we can use this interrupt to queue a retire worker bound to this engine
> > > to cleanup idle timelines. We will then immediately notice the idle
> > > engine (without userspace intervention or the aid of the background
> > > retire worker) and start parking the GPU. Thus during light workloads,
> > > we will do much more work to idle the GPU faster...  Hopefully with
> > > commensurate power saving!
> > > 
> > > v2: Watch context completions and only look at those local to the engine
> > > when retiring to reduce the amount of excess work we perform.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> > > References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > > References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
> > > (cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
> > > 
> > > For the backport to 5.4, struct_mutex needs to be held while retiring so
> > > that retirement doesn't race with vma destruction.
> > > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
> > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
> > >  drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
> > >  drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
> > >  .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
> > >  drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
> > >  drivers/gpu/drm/i915/i915_request.h           |  4 +
> > >  7 files changed, 101 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > index 4ce8626b140e..f732a2177cd0 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > @@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
> > >  	intel_engine_init_hangcheck(engine);
> > >  	intel_engine_init_cmd_parser(engine);
> > >  	intel_engine_init__pm(engine);
> > > +	intel_engine_init_retire(engine);
> > >  
> > >  	intel_engine_pool_init(&engine->pool);
> > >  
> > > @@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
> > >  
> > >  	cleanup_status_page(engine);
> > >  
> > > +	intel_engine_fini_retire(engine);
> > >  	intel_engine_pool_fini(&engine->pool);
> > >  	intel_engine_fini_breadcrumbs(engine);
> > >  	intel_engine_cleanup_cmd_parser(engine);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > index c77c9518c58b..1eb7189f88e1 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > @@ -471,6 +471,14 @@ struct intel_engine_cs {
> > >  
> > >  	struct intel_engine_execlists execlists;
> > >  
> > > +	/*
> > > +	 * Keep track of completed timelines on this engine for early
> > > +	 * retirement with the goal of quickly enabling powersaving as
> > > +	 * soon as the engine is idle.
> > > +	 */
> > > +	struct intel_timeline *retire;
> > > +	struct work_struct retire_work;
> > > +
> > >  	/* status_notifier: list of callbacks for context-switch changes */
> > >  	struct atomic_notifier_head context_status_notifier;
> > >  
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > > index 66f6d1a897f2..a1538c8f7922 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > > @@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
> > >  {
> > >  	struct intel_context * const ce = rq->hw_context;
> > >  
> > > +	/*
> > > +	 * If we have just completed this context, the engine may now be
> > > +	 * idle and we want to re-enter powersaving.
> > > +	 */
> > > +	if (list_is_last(&rq->link, &ce->timeline->requests) &&
> > > +	    i915_request_completed(rq))
> > > +		intel_engine_add_retire(engine, ce->timeline);
> > > +
> > >  	intel_engine_context_out(engine);
> > >  	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> > >  	intel_gt_pm_put(engine->gt);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > > index 9cb01d9828f1..63515e3caaf2 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > > @@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
> > >  {
> > >  	GEM_BUG_ON(atomic_read(&timeline->pin_count));
> > >  	GEM_BUG_ON(!list_empty(&timeline->requests));
> > > +	GEM_BUG_ON(timeline->retire);
> > >  
> > >  	if (timeline->hwsp_cacheline)
> > >  		cacheline_free(timeline->hwsp_cacheline);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > > index 2b1baf2fcc8e..adf2d14ef647 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > > @@ -65,6 +65,9 @@ struct intel_timeline {
> > >  	 */
> > >  	struct i915_active_request last_request;
> > >  
> > > +	/** A chain of completed timelines ready for early retirement. */
> > > +	struct intel_timeline *retire;
> > > +
> > >  	/**
> > >  	 * We track the most recent seqno that we wait on in every context so
> > >  	 * that we only have to emit a new await and dependency on a more
> > > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > > index 0d39038898d4..35457fda350c 100644
> > > --- a/drivers/gpu/drm/i915/i915_request.c
> > > +++ b/drivers/gpu/drm/i915/i915_request.c
> > > @@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
> > >  			break;
> > >  }
> > >  
> > > +static void engine_retire(struct work_struct *work)
> > > +{
> > > +	struct intel_engine_cs *engine =
> > > +		container_of(work, typeof(*engine), retire_work);
> > > +	struct intel_timeline *tl = xchg(&engine->retire, NULL);
> > > +
> > > +	do {
> > > +		struct intel_timeline *next = xchg(&tl->retire, NULL);
> > > +
> > > +		/*
> > > +		 * Our goal here is to retire _idle_ timelines as soon as
> > > +		 * possible (as they are idle, we do not expect userspace
> > > +		 * to be cleaning up anytime soon).
> > > +		 *
> > > +		 * If the timeline is currently locked, either it is being
> > > +		 * retired elsewhere or about to be!
> > > +		 */
> > > +		mutex_lock(&engine->i915->drm.struct_mutex);
> > > +		if (mutex_trylock(&tl->mutex)) {
> > > +			retire_requests(tl);
> > > +			mutex_unlock(&tl->mutex);
> > > +		}
> > > +		mutex_unlock(&engine->i915->drm.struct_mutex);
> > > +		intel_timeline_put(tl);
> > > +
> > > +		GEM_BUG_ON(!next);
> > > +		tl = ptr_mask_bits(next, 1);
> > > +	} while (tl);
> > > +}
> > > +
> > > +static bool add_retire(struct intel_engine_cs *engine,
> > > +		       struct intel_timeline *tl)
> > > +{
> > > +	struct intel_timeline *first;
> > > +
> > > +	/*
> > > +	 * We open-code a llist here to include the additional tag [BIT(0)]
> > > +	 * so that we know when the timeline is already on a
> > > +	 * retirement queue: either this engine or another.
> > > +	 *
> > > +	 * However, we rely on that a timeline can only be active on a single
> > > +	 * engine at any one time and that add_retire() is called before the
> > > +	 * engine releases the timeline and transferred to another to retire.
> > > +	 */
> > > +
> > > +	if (READ_ONCE(tl->retire)) /* already queued */
> > > +		return false;
> > > +
> > > +	intel_timeline_get(tl);
> > > +	first = READ_ONCE(engine->retire);
> > > +	do
> > > +		tl->retire = ptr_pack_bits(first, 1, 1);
> > > +	while (!try_cmpxchg(&engine->retire, &first, tl));
> > > +
> > > +	return !first;
> > > +}
> > > +
> > > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > > +			     struct intel_timeline *tl)
> > > +{
> > > +	if (add_retire(engine, tl))
> > > +		schedule_work(&engine->retire_work);
> > > +}
> > > +
> > > +void intel_engine_init_retire(struct intel_engine_cs *engine)
> > > +{
> > > +	INIT_WORK(&engine->retire_work, engine_retire);
> > > +}
> > > +
> > > +void intel_engine_fini_retire(struct intel_engine_cs *engine)
> > > +{
> > > +	flush_work(&engine->retire_work);
> > > +	GEM_BUG_ON(engine->retire);
> > > +}
> > > +
> > >  static noinline struct i915_request *
> > >  request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
> > >  {
> > > diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> > > index 3a3e7bbf19ff..40caa2f3f4a4 100644
> > > --- a/drivers/gpu/drm/i915/i915_request.h
> > > +++ b/drivers/gpu/drm/i915/i915_request.h
> > > @@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
> > >  }
> > >  
> > >  bool i915_retire_requests(struct drm_i915_private *i915);
> > > +void intel_engine_init_retire(struct intel_engine_cs *engine);
> > > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > > +			     struct intel_timeline *tl);
> > > +void intel_engine_fini_retire(struct intel_engine_cs *engine);
> > >  
> > >  #endif /* I915_REQUEST_H */
> > > -- 
> > > 2.26.0
> > > 
> > 
> > Hi Greg,
> > 
> > Could you queue this one up as well?
> 
> What is the git commit id of this in Linus's tree?

This appears to be 4f88f8747fa43c97c3b3712d8d87295ea757cc51 upstream,
but there is as well this comment added:

> For the backport to 5.4, struct_mutex needs to be held while retiring so
> that retirement doesn't race with vma destruction.
> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

Regards,
Salvatore

  reply	other threads:[~2020-04-13 14:37 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-30  3:30 [PATCH 0/2] A couple of important fixes for i915 on 5.4 Sultan Alsawaf
2020-03-30  3:30 ` [Intel-gfx] " Sultan Alsawaf
2020-03-30  3:30 ` Sultan Alsawaf
2020-03-30  3:30 ` [PATCH 1/2] drm/i915: Disable Panel Self Refresh (PSR) by default Sultan Alsawaf
2020-03-30  8:51   ` Greg KH
2020-03-30 15:50     ` Sultan Alsawaf
2020-03-30  3:30 ` [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles Sultan Alsawaf
2020-03-30  3:30   ` [Intel-gfx] " Sultan Alsawaf
2020-03-30  3:30   ` Sultan Alsawaf
2020-03-30 15:56   ` Sultan Alsawaf
2020-03-30 15:56     ` [Intel-gfx] " Sultan Alsawaf
2020-03-30 15:56     ` Sultan Alsawaf
2020-03-30 16:27   ` Chris Wilson
2020-03-30 16:35     ` Sultan Alsawaf
2020-03-30 16:45       ` Chris Wilson
2020-04-07 20:28         ` Sultan Alsawaf
2020-04-07 20:43           ` [PATCH v2] " Sultan Alsawaf
2020-04-12  7:48             ` Sultan Alsawaf
2020-04-12  7:56               ` Greg KH
2020-04-13 14:37                 ` Salvatore Bonaccorso [this message]
2020-04-13 15:23                 ` Sultan Alsawaf
2020-04-13 15:29                   ` [PATCH v3] " Sultan Alsawaf
2020-04-17  9:38                     ` Sultan Alsawaf
2020-04-18 10:29                       ` Greg KH
2020-04-18 10:30                     ` Greg KH
2020-03-30  8:51 ` [PATCH 0/2] A couple of important fixes for i915 on 5.4 Greg KH
2020-03-30 15:55   ` Sultan Alsawaf
2020-03-30 15:55     ` [Intel-gfx] " Sultan Alsawaf
2020-03-30 15:55     ` Sultan Alsawaf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200413143728.GA709872@eldamar.local \
    --to=carnil@debian.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=stable@vger.kernel.org \
    --cc=sultan@kerneltoast.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.