All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] A couple of important fixes for i915 on 5.4
@ 2020-03-30  3:30 ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30  3:30 UTC (permalink / raw)
  To: stable; +Cc: Sultan Alsawaf

From: Sultan Alsawaf <sultan@kerneltoast.com>

Hi,

This patchset contains fixes for two pesky i915 bugs that exist in 5.4.

The first bug, fixed by
"drm/i915/gt: Schedule request retirement when timeline idles" upstream, is very
high power consumption by i915 hardware due to an old commit that broke the RC6
power state for the iGPU on Skylake+. On my laptop, a Dell Precision 5540 with
an i9-9880H, the idle power consumption of my laptop with this commit goes down
from 10 W to 2 W, saving a massive 8 W of power. On other Skylake+ laptops I
tested, this commit reduced idle power consumption by at least a few watts. The
difference in power consumption can be observed by running `powertop` while
disconnected from the charger, or by using the intel-undervolt tool [1] and
running `intel-undervolt measure` which doesn't require being disconnected from
the charger. The psys measurement from intel-undervolt is the one of interest.

Backporting this commit was not trivial due to i915's high rate of churn, but
the backport didn't require changing any code from the original commit; it only
required moving code around and not altering some #includes.

The second bug causes severe graphical corruption and flickering on laptops
which support an esoteric power-saving mechanism called Panel Self Refresh
(PSR). Enabled by default in 5.0, PSR causes graphical corruption to the point
of unusability on many Dell laptops made within the past few years, since they
typically support PSR. A bug was filed with Intel a while ago for this with more
information [2]. I suspect most the community hasn't been affected by this bug
because ThinkPads and many other laptops I checked didn't support PSR. As of
writing, the issues I observed with PSR are fixed in Intel's drm-tip branch, but
i915 suffers from so much churn that backporting the individual PSR fixes is
infeasible (and an Intel engineer attested to this, saying that the PSR fixes in
drm-tip wouldn't be backported [3]). Disabling PSR by default brings us back to
pre-5.0 behavior, and from my tests with functional PSR in the drm-tip kernel,
I didn't observe any reduction in power consumption with PSR enabled, so there
isn't much lost from turning it off. Also, Ubuntu now ships with PSR disabled by
default in its affected kernels, so there is distro support behind this change.

Thanks,
Sultan

[1] https://github.com/kitsunyan/intel-undervolt
[2] https://gitlab.freedesktop.org/drm/intel/issues/425
[3] https://gitlab.freedesktop.org/drm/intel/issues/425#note_416130

Chris Wilson (1):
  drm/i915/gt: Schedule request retirement when timeline idles

Sultan Alsawaf (1):
  drm/i915: Disable Panel Self Refresh (PSR) by default

 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_params.c            |  3 +-
 drivers/gpu/drm/i915/i915_params.h            |  2 +-
 drivers/gpu/drm/i915/i915_request.c           | 73 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 9 files changed, 101 insertions(+), 3 deletions(-)

-- 
2.26.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 0/2] A couple of important fixes for i915 on 5.4
@ 2020-03-30  3:30 ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30  3:30 UTC (permalink / raw)
  To: stable; +Cc: Sultan Alsawaf

From: Sultan Alsawaf <sultan@kerneltoast.com>

Hi,

This patchset contains fixes for two pesky i915 bugs that exist in 5.4.

The first bug, fixed by
"drm/i915/gt: Schedule request retirement when timeline idles" upstream, is very
high power consumption by i915 hardware due to an old commit that broke the RC6
power state for the iGPU on Skylake+. On my laptop, a Dell Precision 5540 with
an i9-9880H, the idle power consumption of my laptop with this commit goes down
from 10 W to 2 W, saving a massive 8 W of power. On other Skylake+ laptops I
tested, this commit reduced idle power consumption by at least a few watts. The
difference in power consumption can be observed by running `powertop` while
disconnected from the charger, or by using the intel-undervolt tool [1] and
running `intel-undervolt measure` which doesn't require being disconnected from
the charger. The psys measurement from intel-undervolt is the one of interest.

Backporting this commit was not trivial due to i915's high rate of churn, but
the backport didn't require changing any code from the original commit; it only
required moving code around and not altering some #includes.

The second bug causes severe graphical corruption and flickering on laptops
which support an esoteric power-saving mechanism called Panel Self Refresh
(PSR). Enabled by default in 5.0, PSR causes graphical corruption to the point
of unusability on many Dell laptops made within the past few years, since they
typically support PSR. A bug was filed with Intel a while ago for this with more
information [2]. I suspect most the community hasn't been affected by this bug
because ThinkPads and many other laptops I checked didn't support PSR. As of
writing, the issues I observed with PSR are fixed in Intel's drm-tip branch, but
i915 suffers from so much churn that backporting the individual PSR fixes is
infeasible (and an Intel engineer attested to this, saying that the PSR fixes in
drm-tip wouldn't be backported [3]). Disabling PSR by default brings us back to
pre-5.0 behavior, and from my tests with functional PSR in the drm-tip kernel,
I didn't observe any reduction in power consumption with PSR enabled, so there
isn't much lost from turning it off. Also, Ubuntu now ships with PSR disabled by
default in its affected kernels, so there is distro support behind this change.

Thanks,
Sultan

[1] https://github.com/kitsunyan/intel-undervolt
[2] https://gitlab.freedesktop.org/drm/intel/issues/425
[3] https://gitlab.freedesktop.org/drm/intel/issues/425#note_416130

Chris Wilson (1):
  drm/i915/gt: Schedule request retirement when timeline idles

Sultan Alsawaf (1):
  drm/i915: Disable Panel Self Refresh (PSR) by default

 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_params.c            |  3 +-
 drivers/gpu/drm/i915/i915_params.h            |  2 +-
 drivers/gpu/drm/i915/i915_request.c           | 73 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 9 files changed, 101 insertions(+), 3 deletions(-)

-- 
2.26.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Intel-gfx] [PATCH 0/2] A couple of important fixes for i915 on 5.4
@ 2020-03-30  3:30 ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30  3:30 UTC (permalink / raw)
  To: stable; +Cc: Sultan Alsawaf

From: Sultan Alsawaf <sultan@kerneltoast.com>

Hi,

This patchset contains fixes for two pesky i915 bugs that exist in 5.4.

The first bug, fixed by
"drm/i915/gt: Schedule request retirement when timeline idles" upstream, is very
high power consumption by i915 hardware due to an old commit that broke the RC6
power state for the iGPU on Skylake+. On my laptop, a Dell Precision 5540 with
an i9-9880H, the idle power consumption of my laptop with this commit goes down
from 10 W to 2 W, saving a massive 8 W of power. On other Skylake+ laptops I
tested, this commit reduced idle power consumption by at least a few watts. The
difference in power consumption can be observed by running `powertop` while
disconnected from the charger, or by using the intel-undervolt tool [1] and
running `intel-undervolt measure` which doesn't require being disconnected from
the charger. The psys measurement from intel-undervolt is the one of interest.

Backporting this commit was not trivial due to i915's high rate of churn, but
the backport didn't require changing any code from the original commit; it only
required moving code around and not altering some #includes.

The second bug causes severe graphical corruption and flickering on laptops
which support an esoteric power-saving mechanism called Panel Self Refresh
(PSR). Enabled by default in 5.0, PSR causes graphical corruption to the point
of unusability on many Dell laptops made within the past few years, since they
typically support PSR. A bug was filed with Intel a while ago for this with more
information [2]. I suspect most the community hasn't been affected by this bug
because ThinkPads and many other laptops I checked didn't support PSR. As of
writing, the issues I observed with PSR are fixed in Intel's drm-tip branch, but
i915 suffers from so much churn that backporting the individual PSR fixes is
infeasible (and an Intel engineer attested to this, saying that the PSR fixes in
drm-tip wouldn't be backported [3]). Disabling PSR by default brings us back to
pre-5.0 behavior, and from my tests with functional PSR in the drm-tip kernel,
I didn't observe any reduction in power consumption with PSR enabled, so there
isn't much lost from turning it off. Also, Ubuntu now ships with PSR disabled by
default in its affected kernels, so there is distro support behind this change.

Thanks,
Sultan

[1] https://github.com/kitsunyan/intel-undervolt
[2] https://gitlab.freedesktop.org/drm/intel/issues/425
[3] https://gitlab.freedesktop.org/drm/intel/issues/425#note_416130

Chris Wilson (1):
  drm/i915/gt: Schedule request retirement when timeline idles

Sultan Alsawaf (1):
  drm/i915: Disable Panel Self Refresh (PSR) by default

 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_params.c            |  3 +-
 drivers/gpu/drm/i915/i915_params.h            |  2 +-
 drivers/gpu/drm/i915/i915_request.c           | 73 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 9 files changed, 101 insertions(+), 3 deletions(-)

-- 
2.26.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] drm/i915: Disable Panel Self Refresh (PSR) by default
  2020-03-30  3:30 ` Sultan Alsawaf
  (?)
  (?)
@ 2020-03-30  3:30 ` Sultan Alsawaf
  2020-03-30  8:51   ` Greg KH
  -1 siblings, 1 reply; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30  3:30 UTC (permalink / raw)
  To: stable; +Cc: Sultan Alsawaf

From: Sultan Alsawaf <sultan@kerneltoast.com>

On all Dell laptops with panels and chipsets that support PSR (an
esoteric power-saving mechanism), both PSR1 and PSR2 cause flickering
and graphical glitches, sometimes to the point of making the laptop
unusable. Many laptops don't support PSR so it isn't known if PSR works
correctly on any consumer hardware as of 5.4. PSR was enabled by default
in 5.0 for capable hardware, so this patch just restores the previous
functionality of PSR being disabled by default.

More info is available on the freedesktop bug:
https://gitlab.freedesktop.org/drm/intel/issues/425

Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
 drivers/gpu/drm/i915/i915_params.c | 3 +--
 drivers/gpu/drm/i915/i915_params.h | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 296452f9efe4..4a0d37f7cf49 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -84,8 +84,7 @@ i915_param_named_unsafe(enable_hangcheck, bool, 0600,
 
 i915_param_named_unsafe(enable_psr, int, 0600,
 	"Enable PSR "
-	"(0=disabled, 1=enabled) "
-	"Default: -1 (use per-chip default)");
+	"(-1=use per-chip default, 0=disabled [default], 1=enabled) ");
 
 i915_param_named_unsafe(force_probe, charp, 0400,
 	"Force probe the driver for specified devices. "
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index d29ade3b7de6..715888811593 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -50,7 +50,7 @@ struct drm_printer;
 	param(int, vbt_sdvo_panel_type, -1) \
 	param(int, enable_dc, -1) \
 	param(int, enable_fbc, -1) \
-	param(int, enable_psr, -1) \
+	param(int, enable_psr, 0) \
 	param(int, disable_power_well, -1) \
 	param(int, enable_ips, 1) \
 	param(int, invert_brightness, 0) \
-- 
2.26.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-03-30  3:30 ` Sultan Alsawaf
  (?)
@ 2020-03-30  3:30   ` Sultan Alsawaf
  -1 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30  3:30 UTC (permalink / raw)
  To: stable; +Cc: Chris Wilson, Tvrtko Ursulin, Joonas Lahtinen, Sultan Alsawaf

From: Chris Wilson <chris@chris-wilson.co.uk>

commit 311770173fac27845a3a83e2c16100a54d308f72 upstream.

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster...  Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_request.c           | 73 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 7 files changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ce8626b140e..f732a2177cd0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_hangcheck(engine);
 	intel_engine_init_cmd_parser(engine);
 	intel_engine_init__pm(engine);
+	intel_engine_init_retire(engine);
 
 	intel_engine_pool_init(&engine->pool);
 
@@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 
 	cleanup_status_page(engine);
 
+	intel_engine_fini_retire(engine);
 	intel_engine_pool_fini(&engine->pool);
 	intel_engine_fini_breadcrumbs(engine);
 	intel_engine_cleanup_cmd_parser(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c77c9518c58b..1eb7189f88e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -471,6 +471,14 @@ struct intel_engine_cs {
 
 	struct intel_engine_execlists execlists;
 
+	/*
+	 * Keep track of completed timelines on this engine for early
+	 * retirement with the goal of quickly enabling powersaving as
+	 * soon as the engine is idle.
+	 */
+	struct intel_timeline *retire;
+	struct work_struct retire_work;
+
 	/* status_notifier: list of callbacks for context-switch changes */
 	struct atomic_notifier_head context_status_notifier;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 66f6d1a897f2..a1538c8f7922 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
 {
 	struct intel_context * const ce = rq->hw_context;
 
+	/*
+	 * If we have just completed this context, the engine may now be
+	 * idle and we want to re-enter powersaving.
+	 */
+	if (list_is_last(&rq->link, &ce->timeline->requests) &&
+	    i915_request_completed(rq))
+		intel_engine_add_retire(engine, ce->timeline);
+
 	intel_engine_context_out(engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put(engine->gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 9cb01d9828f1..63515e3caaf2 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
 {
 	GEM_BUG_ON(atomic_read(&timeline->pin_count));
 	GEM_BUG_ON(!list_empty(&timeline->requests));
+	GEM_BUG_ON(timeline->retire);
 
 	if (timeline->hwsp_cacheline)
 		cacheline_free(timeline->hwsp_cacheline);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 2b1baf2fcc8e..adf2d14ef647 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -65,6 +65,9 @@ struct intel_timeline {
 	 */
 	struct i915_active_request last_request;
 
+	/** A chain of completed timelines ready for early retirement. */
+	struct intel_timeline *retire;
+
 	/**
 	 * We track the most recent seqno that we wait on in every context so
 	 * that we only have to emit a new await and dependency on a more
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0d39038898d4..4d59cfca9eae 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -600,6 +600,79 @@ static void retire_requests(struct intel_timeline *tl)
 			break;
 }
 
+static void engine_retire(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), retire_work);
+	struct intel_timeline *tl = xchg(&engine->retire, NULL);
+
+	do {
+		struct intel_timeline *next = xchg(&tl->retire, NULL);
+
+		/*
+		 * Our goal here is to retire _idle_ timelines as soon as
+		 * possible (as they are idle, we do not expect userspace
+		 * to be cleaning up anytime soon).
+		 *
+		 * If the timeline is currently locked, either it is being
+		 * retired elsewhere or about to be!
+		 */
+		if (mutex_trylock(&tl->mutex)) {
+			retire_requests(tl);
+			mutex_unlock(&tl->mutex);
+		}
+		intel_timeline_put(tl);
+
+		GEM_BUG_ON(!next);
+		tl = ptr_mask_bits(next, 1);
+	} while (tl);
+}
+
+static bool add_retire(struct intel_engine_cs *engine,
+		       struct intel_timeline *tl)
+{
+	struct intel_timeline *first;
+
+	/*
+	 * We open-code a llist here to include the additional tag [BIT(0)]
+	 * so that we know when the timeline is already on a
+	 * retirement queue: either this engine or another.
+	 *
+	 * However, we rely on that a timeline can only be active on a single
+	 * engine at any one time and that add_retire() is called before the
+	 * engine releases the timeline and transferred to another to retire.
+	 */
+
+	if (READ_ONCE(tl->retire)) /* already queued */
+		return false;
+
+	intel_timeline_get(tl);
+	first = READ_ONCE(engine->retire);
+	do
+		tl->retire = ptr_pack_bits(first, 1, 1);
+	while (!try_cmpxchg(&engine->retire, &first, tl));
+
+	return !first;
+}
+
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl)
+{
+	if (add_retire(engine, tl))
+		schedule_work(&engine->retire_work);
+}
+
+void intel_engine_init_retire(struct intel_engine_cs *engine)
+{
+	INIT_WORK(&engine->retire_work, engine_retire);
+}
+
+void intel_engine_fini_retire(struct intel_engine_cs *engine)
+{
+	flush_work(&engine->retire_work);
+	GEM_BUG_ON(engine->retire);
+}
+
 static noinline struct i915_request *
 request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
 {
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 3a3e7bbf19ff..40caa2f3f4a4 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
 }
 
 bool i915_retire_requests(struct drm_i915_private *i915);
+void intel_engine_init_retire(struct intel_engine_cs *engine);
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl);
+void intel_engine_fini_retire(struct intel_engine_cs *engine);
 
 #endif /* I915_REQUEST_H */
-- 
2.26.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
@ 2020-03-30  3:30   ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30  3:30 UTC (permalink / raw)
  To: stable; +Cc: Sultan Alsawaf, Tvrtko Ursulin, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

commit 311770173fac27845a3a83e2c16100a54d308f72 upstream.

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster...  Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_request.c           | 73 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 7 files changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ce8626b140e..f732a2177cd0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_hangcheck(engine);
 	intel_engine_init_cmd_parser(engine);
 	intel_engine_init__pm(engine);
+	intel_engine_init_retire(engine);
 
 	intel_engine_pool_init(&engine->pool);
 
@@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 
 	cleanup_status_page(engine);
 
+	intel_engine_fini_retire(engine);
 	intel_engine_pool_fini(&engine->pool);
 	intel_engine_fini_breadcrumbs(engine);
 	intel_engine_cleanup_cmd_parser(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c77c9518c58b..1eb7189f88e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -471,6 +471,14 @@ struct intel_engine_cs {
 
 	struct intel_engine_execlists execlists;
 
+	/*
+	 * Keep track of completed timelines on this engine for early
+	 * retirement with the goal of quickly enabling powersaving as
+	 * soon as the engine is idle.
+	 */
+	struct intel_timeline *retire;
+	struct work_struct retire_work;
+
 	/* status_notifier: list of callbacks for context-switch changes */
 	struct atomic_notifier_head context_status_notifier;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 66f6d1a897f2..a1538c8f7922 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
 {
 	struct intel_context * const ce = rq->hw_context;
 
+	/*
+	 * If we have just completed this context, the engine may now be
+	 * idle and we want to re-enter powersaving.
+	 */
+	if (list_is_last(&rq->link, &ce->timeline->requests) &&
+	    i915_request_completed(rq))
+		intel_engine_add_retire(engine, ce->timeline);
+
 	intel_engine_context_out(engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put(engine->gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 9cb01d9828f1..63515e3caaf2 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
 {
 	GEM_BUG_ON(atomic_read(&timeline->pin_count));
 	GEM_BUG_ON(!list_empty(&timeline->requests));
+	GEM_BUG_ON(timeline->retire);
 
 	if (timeline->hwsp_cacheline)
 		cacheline_free(timeline->hwsp_cacheline);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 2b1baf2fcc8e..adf2d14ef647 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -65,6 +65,9 @@ struct intel_timeline {
 	 */
 	struct i915_active_request last_request;
 
+	/** A chain of completed timelines ready for early retirement. */
+	struct intel_timeline *retire;
+
 	/**
 	 * We track the most recent seqno that we wait on in every context so
 	 * that we only have to emit a new await and dependency on a more
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0d39038898d4..4d59cfca9eae 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -600,6 +600,79 @@ static void retire_requests(struct intel_timeline *tl)
 			break;
 }
 
+static void engine_retire(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), retire_work);
+	struct intel_timeline *tl = xchg(&engine->retire, NULL);
+
+	do {
+		struct intel_timeline *next = xchg(&tl->retire, NULL);
+
+		/*
+		 * Our goal here is to retire _idle_ timelines as soon as
+		 * possible (as they are idle, we do not expect userspace
+		 * to be cleaning up anytime soon).
+		 *
+		 * If the timeline is currently locked, either it is being
+		 * retired elsewhere or about to be!
+		 */
+		if (mutex_trylock(&tl->mutex)) {
+			retire_requests(tl);
+			mutex_unlock(&tl->mutex);
+		}
+		intel_timeline_put(tl);
+
+		GEM_BUG_ON(!next);
+		tl = ptr_mask_bits(next, 1);
+	} while (tl);
+}
+
+static bool add_retire(struct intel_engine_cs *engine,
+		       struct intel_timeline *tl)
+{
+	struct intel_timeline *first;
+
+	/*
+	 * We open-code a llist here to include the additional tag [BIT(0)]
+	 * so that we know when the timeline is already on a
+	 * retirement queue: either this engine or another.
+	 *
+	 * However, we rely on that a timeline can only be active on a single
+	 * engine at any one time and that add_retire() is called before the
+	 * engine releases the timeline and transferred to another to retire.
+	 */
+
+	if (READ_ONCE(tl->retire)) /* already queued */
+		return false;
+
+	intel_timeline_get(tl);
+	first = READ_ONCE(engine->retire);
+	do
+		tl->retire = ptr_pack_bits(first, 1, 1);
+	while (!try_cmpxchg(&engine->retire, &first, tl));
+
+	return !first;
+}
+
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl)
+{
+	if (add_retire(engine, tl))
+		schedule_work(&engine->retire_work);
+}
+
+void intel_engine_init_retire(struct intel_engine_cs *engine)
+{
+	INIT_WORK(&engine->retire_work, engine_retire);
+}
+
+void intel_engine_fini_retire(struct intel_engine_cs *engine)
+{
+	flush_work(&engine->retire_work);
+	GEM_BUG_ON(engine->retire);
+}
+
 static noinline struct i915_request *
 request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
 {
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 3a3e7bbf19ff..40caa2f3f4a4 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
 }
 
 bool i915_retire_requests(struct drm_i915_private *i915);
+void intel_engine_init_retire(struct intel_engine_cs *engine);
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl);
+void intel_engine_fini_retire(struct intel_engine_cs *engine);
 
 #endif /* I915_REQUEST_H */
-- 
2.26.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Intel-gfx] [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
@ 2020-03-30  3:30   ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30  3:30 UTC (permalink / raw)
  To: stable; +Cc: Sultan Alsawaf, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

commit 311770173fac27845a3a83e2c16100a54d308f72 upstream.

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster...  Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_request.c           | 73 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 7 files changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ce8626b140e..f732a2177cd0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_hangcheck(engine);
 	intel_engine_init_cmd_parser(engine);
 	intel_engine_init__pm(engine);
+	intel_engine_init_retire(engine);
 
 	intel_engine_pool_init(&engine->pool);
 
@@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 
 	cleanup_status_page(engine);
 
+	intel_engine_fini_retire(engine);
 	intel_engine_pool_fini(&engine->pool);
 	intel_engine_fini_breadcrumbs(engine);
 	intel_engine_cleanup_cmd_parser(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c77c9518c58b..1eb7189f88e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -471,6 +471,14 @@ struct intel_engine_cs {
 
 	struct intel_engine_execlists execlists;
 
+	/*
+	 * Keep track of completed timelines on this engine for early
+	 * retirement with the goal of quickly enabling powersaving as
+	 * soon as the engine is idle.
+	 */
+	struct intel_timeline *retire;
+	struct work_struct retire_work;
+
 	/* status_notifier: list of callbacks for context-switch changes */
 	struct atomic_notifier_head context_status_notifier;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 66f6d1a897f2..a1538c8f7922 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
 {
 	struct intel_context * const ce = rq->hw_context;
 
+	/*
+	 * If we have just completed this context, the engine may now be
+	 * idle and we want to re-enter powersaving.
+	 */
+	if (list_is_last(&rq->link, &ce->timeline->requests) &&
+	    i915_request_completed(rq))
+		intel_engine_add_retire(engine, ce->timeline);
+
 	intel_engine_context_out(engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put(engine->gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 9cb01d9828f1..63515e3caaf2 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
 {
 	GEM_BUG_ON(atomic_read(&timeline->pin_count));
 	GEM_BUG_ON(!list_empty(&timeline->requests));
+	GEM_BUG_ON(timeline->retire);
 
 	if (timeline->hwsp_cacheline)
 		cacheline_free(timeline->hwsp_cacheline);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 2b1baf2fcc8e..adf2d14ef647 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -65,6 +65,9 @@ struct intel_timeline {
 	 */
 	struct i915_active_request last_request;
 
+	/** A chain of completed timelines ready for early retirement. */
+	struct intel_timeline *retire;
+
 	/**
 	 * We track the most recent seqno that we wait on in every context so
 	 * that we only have to emit a new await and dependency on a more
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0d39038898d4..4d59cfca9eae 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -600,6 +600,79 @@ static void retire_requests(struct intel_timeline *tl)
 			break;
 }
 
+static void engine_retire(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), retire_work);
+	struct intel_timeline *tl = xchg(&engine->retire, NULL);
+
+	do {
+		struct intel_timeline *next = xchg(&tl->retire, NULL);
+
+		/*
+		 * Our goal here is to retire _idle_ timelines as soon as
+		 * possible (as they are idle, we do not expect userspace
+		 * to be cleaning up anytime soon).
+		 *
+		 * If the timeline is currently locked, either it is being
+		 * retired elsewhere or about to be!
+		 */
+		if (mutex_trylock(&tl->mutex)) {
+			retire_requests(tl);
+			mutex_unlock(&tl->mutex);
+		}
+		intel_timeline_put(tl);
+
+		GEM_BUG_ON(!next);
+		tl = ptr_mask_bits(next, 1);
+	} while (tl);
+}
+
+static bool add_retire(struct intel_engine_cs *engine,
+		       struct intel_timeline *tl)
+{
+	struct intel_timeline *first;
+
+	/*
+	 * We open-code a llist here to include the additional tag [BIT(0)]
+	 * so that we know when the timeline is already on a
+	 * retirement queue: either this engine or another.
+	 *
+	 * However, we rely on that a timeline can only be active on a single
+	 * engine at any one time and that add_retire() is called before the
+	 * engine releases the timeline and transferred to another to retire.
+	 */
+
+	if (READ_ONCE(tl->retire)) /* already queued */
+		return false;
+
+	intel_timeline_get(tl);
+	first = READ_ONCE(engine->retire);
+	do
+		tl->retire = ptr_pack_bits(first, 1, 1);
+	while (!try_cmpxchg(&engine->retire, &first, tl));
+
+	return !first;
+}
+
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl)
+{
+	if (add_retire(engine, tl))
+		schedule_work(&engine->retire_work);
+}
+
+void intel_engine_init_retire(struct intel_engine_cs *engine)
+{
+	INIT_WORK(&engine->retire_work, engine_retire);
+}
+
+void intel_engine_fini_retire(struct intel_engine_cs *engine)
+{
+	flush_work(&engine->retire_work);
+	GEM_BUG_ON(engine->retire);
+}
+
 static noinline struct i915_request *
 request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
 {
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 3a3e7bbf19ff..40caa2f3f4a4 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
 }
 
 bool i915_retire_requests(struct drm_i915_private *i915);
+void intel_engine_init_retire(struct intel_engine_cs *engine);
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl);
+void intel_engine_fini_retire(struct intel_engine_cs *engine);
 
 #endif /* I915_REQUEST_H */
-- 
2.26.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] drm/i915: Disable Panel Self Refresh (PSR) by default
  2020-03-30  3:30 ` [PATCH 1/2] drm/i915: Disable Panel Self Refresh (PSR) by default Sultan Alsawaf
@ 2020-03-30  8:51   ` Greg KH
  2020-03-30 15:50     ` Sultan Alsawaf
  0 siblings, 1 reply; 29+ messages in thread
From: Greg KH @ 2020-03-30  8:51 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: stable

On Sun, Mar 29, 2020 at 08:30:56PM -0700, Sultan Alsawaf wrote:
> From: Sultan Alsawaf <sultan@kerneltoast.com>
> 
> On all Dell laptops with panels and chipsets that support PSR (an
> esoteric power-saving mechanism), both PSR1 and PSR2 cause flickering
> and graphical glitches, sometimes to the point of making the laptop
> unusable. Many laptops don't support PSR so it isn't known if PSR works
> correctly on any consumer hardware as of 5.4. PSR was enabled by default
> in 5.0 for capable hardware, so this patch just restores the previous
> functionality of PSR being disabled by default.
> 
> More info is available on the freedesktop bug:
> https://gitlab.freedesktop.org/drm/intel/issues/425
> 
> Cc: <stable@vger.kernel.org> # 5.4.x
> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> ---
>  drivers/gpu/drm/i915/i915_params.c | 3 +--
>  drivers/gpu/drm/i915/i915_params.h | 2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)

What is the git commit id of this patch in Linus's tree?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] A couple of important fixes for i915 on 5.4
  2020-03-30  3:30 ` Sultan Alsawaf
                   ` (3 preceding siblings ...)
  (?)
@ 2020-03-30  8:51 ` Greg KH
  2020-03-30 15:55     ` Sultan Alsawaf
  -1 siblings, 1 reply; 29+ messages in thread
From: Greg KH @ 2020-03-30  8:51 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: stable

On Sun, Mar 29, 2020 at 08:30:55PM -0700, Sultan Alsawaf wrote:
> From: Sultan Alsawaf <sultan@kerneltoast.com>
> 
> Hi,
> 
> This patchset contains fixes for two pesky i915 bugs that exist in 5.4.

Any reason you didn't also cc: the developers involved in these patches,
and maintainers?

Please resend and do that.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] drm/i915: Disable Panel Self Refresh (PSR) by default
  2020-03-30  8:51   ` Greg KH
@ 2020-03-30 15:50     ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 15:50 UTC (permalink / raw)
  To: Greg KH; +Cc: stable

On Mon, Mar 30, 2020 at 10:51:04AM +0200, Greg KH wrote:
> On Sun, Mar 29, 2020 at 08:30:56PM -0700, Sultan Alsawaf wrote:
> > From: Sultan Alsawaf <sultan@kerneltoast.com>
> > 
> > On all Dell laptops with panels and chipsets that support PSR (an
> > esoteric power-saving mechanism), both PSR1 and PSR2 cause flickering
> > and graphical glitches, sometimes to the point of making the laptop
> > unusable. Many laptops don't support PSR so it isn't known if PSR works
> > correctly on any consumer hardware as of 5.4. PSR was enabled by default
> > in 5.0 for capable hardware, so this patch just restores the previous
> > functionality of PSR being disabled by default.
> > 
> > More info is available on the freedesktop bug:
> > https://gitlab.freedesktop.org/drm/intel/issues/425
> > 
> > Cc: <stable@vger.kernel.org> # 5.4.x
> > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> > ---
> >  drivers/gpu/drm/i915/i915_params.c | 3 +--
> >  drivers/gpu/drm/i915/i915_params.h | 2 +-
> >  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> What is the git commit id of this patch in Linus's tree?
> 
> thanks,
> 
> greg k-h

This isn't in Linus' tree. I know for a fact that 5.4 has broken PSR support
that'll never be fixed, but I wasn't sure if PSR was still broken in 5.6.

However, I tested 5.6 for a bit right now and PSR is still broken. And to top it
off, I managed to reproduce some PSR glitches with the drm-tip branch, so
there's currently no kernel in existence with functional PSR. :)

I will send this patch for inclusion in Linus' tree, so please disregard it for
now.

Thanks,
Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] A couple of important fixes for i915 on 5.4
  2020-03-30  8:51 ` [PATCH 0/2] A couple of important fixes for i915 on 5.4 Greg KH
  2020-03-30 15:55     ` Sultan Alsawaf
@ 2020-03-30 15:55     ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 15:55 UTC (permalink / raw)
  To: Greg KH
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, David Airlie,
	Daniel Vetter, Chris Wilson, Tvrtko Ursulin, Mika Kuoppala,
	Daniele Ceraolo Spurio, Matthew Auld, intel-gfx, dri-devel,
	linux-kernel

On Mon, Mar 30, 2020 at 10:51:28AM +0200, Greg KH wrote:
> On Sun, Mar 29, 2020 at 08:30:55PM -0700, Sultan Alsawaf wrote:
> > From: Sultan Alsawaf <sultan@kerneltoast.com>
> > 
> > Hi,
> > 
> > This patchset contains fixes for two pesky i915 bugs that exist in 5.4.
> 
> Any reason you didn't also cc: the developers involved in these patches,
> and maintainers?
> 
> Please resend and do that.
> 
> thanks,
> 
> greg k-h

CC'd. Sorry about that.

Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] A couple of important fixes for i915 on 5.4
@ 2020-03-30 15:55     ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 15:55 UTC (permalink / raw)
  To: Greg KH
  Cc: dri-devel, Tvrtko Ursulin, David Airlie, Mika Kuoppala,
	intel-gfx, linux-kernel, Chris Wilson, Daniele Ceraolo Spurio,
	Matthew Auld, Rodrigo Vivi

On Mon, Mar 30, 2020 at 10:51:28AM +0200, Greg KH wrote:
> On Sun, Mar 29, 2020 at 08:30:55PM -0700, Sultan Alsawaf wrote:
> > From: Sultan Alsawaf <sultan@kerneltoast.com>
> > 
> > Hi,
> > 
> > This patchset contains fixes for two pesky i915 bugs that exist in 5.4.
> 
> Any reason you didn't also cc: the developers involved in these patches,
> and maintainers?
> 
> Please resend and do that.
> 
> thanks,
> 
> greg k-h

CC'd. Sorry about that.

Sultan
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Intel-gfx] [PATCH 0/2] A couple of important fixes for i915 on 5.4
@ 2020-03-30 15:55     ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 15:55 UTC (permalink / raw)
  To: Greg KH
  Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, Chris Wilson,
	Matthew Auld

On Mon, Mar 30, 2020 at 10:51:28AM +0200, Greg KH wrote:
> On Sun, Mar 29, 2020 at 08:30:55PM -0700, Sultan Alsawaf wrote:
> > From: Sultan Alsawaf <sultan@kerneltoast.com>
> > 
> > Hi,
> > 
> > This patchset contains fixes for two pesky i915 bugs that exist in 5.4.
> 
> Any reason you didn't also cc: the developers involved in these patches,
> and maintainers?
> 
> Please resend and do that.
> 
> thanks,
> 
> greg k-h

CC'd. Sorry about that.

Sultan
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-03-30  3:30   ` Sultan Alsawaf
  (?)
@ 2020-03-30 15:56     ` Sultan Alsawaf
  -1 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 15:56 UTC (permalink / raw)
  To: stable
  Cc: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, David Airlie,
	Daniel Vetter, Chris Wilson, Tvrtko Ursulin, Mika Kuoppala,
	Daniele Ceraolo Spurio, Matthew Auld, intel-gfx, dri-devel,
	linux-kernel

CC'ing relevant folks. My apologies.

Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
@ 2020-03-30 15:56     ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 15:56 UTC (permalink / raw)
  To: stable
  Cc: dri-devel, Tvrtko Ursulin, David Airlie, Mika Kuoppala,
	intel-gfx, linux-kernel, Chris Wilson, Daniele Ceraolo Spurio,
	Matthew Auld, Rodrigo Vivi

CC'ing relevant folks. My apologies.

Sultan
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Intel-gfx] [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
@ 2020-03-30 15:56     ` Sultan Alsawaf
  0 siblings, 0 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 15:56 UTC (permalink / raw)
  To: stable
  Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, Chris Wilson,
	Matthew Auld

CC'ing relevant folks. My apologies.

Sultan
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-03-30  3:30   ` Sultan Alsawaf
                     ` (2 preceding siblings ...)
  (?)
@ 2020-03-30 16:27   ` Chris Wilson
  2020-03-30 16:35     ` Sultan Alsawaf
  -1 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2020-03-30 16:27 UTC (permalink / raw)
  To: Sultan Alsawaf, stable; +Cc: Tvrtko Ursulin, Joonas Lahtinen, Sultan Alsawaf

Quoting Sultan Alsawaf (2020-03-30 04:30:57)
> +static void engine_retire(struct work_struct *work)
> +{
> +       struct intel_engine_cs *engine =
> +               container_of(work, typeof(*engine), retire_work);
> +       struct intel_timeline *tl = xchg(&engine->retire, NULL);
> +
> +       do {
> +               struct intel_timeline *next = xchg(&tl->retire, NULL);
> +
> +               /*
> +                * Our goal here is to retire _idle_ timelines as soon as
> +                * possible (as they are idle, we do not expect userspace
> +                * to be cleaning up anytime soon).
> +                *
> +                * If the timeline is currently locked, either it is being
> +                * retired elsewhere or about to be!
> +                */

iirc the backport requires the retirement to be wrapped in struct_mutex

mutex_lock(&engine->i915->drm.struct_mutex);

> +               if (mutex_trylock(&tl->mutex)) {
> +                       retire_requests(tl);
> +                       mutex_unlock(&tl->mutex);
> +               }

mutex_unlock(&engine->i915->drm.struct_mutex);

Now the question is whether that was for 5.3 or 5.4. I think 5.3 is
where the changes were more severe and where we switch to the 4.19
approach.

> +               intel_timeline_put(tl);
> +
> +               GEM_BUG_ON(!next);
> +               tl = ptr_mask_bits(next, 1);
> +       } while (tl);
> +}

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-03-30 16:27   ` Chris Wilson
@ 2020-03-30 16:35     ` Sultan Alsawaf
  2020-03-30 16:45       ` Chris Wilson
  0 siblings, 1 reply; 29+ messages in thread
From: Sultan Alsawaf @ 2020-03-30 16:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: stable, Tvrtko Ursulin, Joonas Lahtinen

On Mon, Mar 30, 2020 at 05:27:28PM +0100, Chris Wilson wrote:
> Quoting Sultan Alsawaf (2020-03-30 04:30:57)
> > +static void engine_retire(struct work_struct *work)
> > +{
> > +       struct intel_engine_cs *engine =
> > +               container_of(work, typeof(*engine), retire_work);
> > +       struct intel_timeline *tl = xchg(&engine->retire, NULL);
> > +
> > +       do {
> > +               struct intel_timeline *next = xchg(&tl->retire, NULL);
> > +
> > +               /*
> > +                * Our goal here is to retire _idle_ timelines as soon as
> > +                * possible (as they are idle, we do not expect userspace
> > +                * to be cleaning up anytime soon).
> > +                *
> > +                * If the timeline is currently locked, either it is being
> > +                * retired elsewhere or about to be!
> > +                */
> 
> iirc the backport requires the retirement to be wrapped in struct_mutex
> 
> mutex_lock(&engine->i915->drm.struct_mutex);
> 
> > +               if (mutex_trylock(&tl->mutex)) {
> > +                       retire_requests(tl);
> > +                       mutex_unlock(&tl->mutex);
> > +               }
> 
> mutex_unlock(&engine->i915->drm.struct_mutex);
> 
> Now the question is whether that was for 5.3 or 5.4. I think 5.3 is
> where the changes were more severe and where we switch to the 4.19
> approach.
> 
> > +               intel_timeline_put(tl);
> > +
> > +               GEM_BUG_ON(!next);
> > +               tl = ptr_mask_bits(next, 1);
> > +       } while (tl);
> > +}

In 5.4, the existing retirement instances don't hold
`&engine->i915->drm.struct_mutex`, however it is held in 5.3. So it looks like
this is fine as-is for 5.4.

Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-03-30 16:35     ` Sultan Alsawaf
@ 2020-03-30 16:45       ` Chris Wilson
  2020-04-07 20:28         ` Sultan Alsawaf
  0 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2020-03-30 16:45 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: stable, Tvrtko Ursulin, Joonas Lahtinen

Quoting Sultan Alsawaf (2020-03-30 17:35:14)
> On Mon, Mar 30, 2020 at 05:27:28PM +0100, Chris Wilson wrote:
> > Quoting Sultan Alsawaf (2020-03-30 04:30:57)
> > > +static void engine_retire(struct work_struct *work)
> > > +{
> > > +       struct intel_engine_cs *engine =
> > > +               container_of(work, typeof(*engine), retire_work);
> > > +       struct intel_timeline *tl = xchg(&engine->retire, NULL);
> > > +
> > > +       do {
> > > +               struct intel_timeline *next = xchg(&tl->retire, NULL);
> > > +
> > > +               /*
> > > +                * Our goal here is to retire _idle_ timelines as soon as
> > > +                * possible (as they are idle, we do not expect userspace
> > > +                * to be cleaning up anytime soon).
> > > +                *
> > > +                * If the timeline is currently locked, either it is being
> > > +                * retired elsewhere or about to be!
> > > +                */
> > 
> > iirc the backport requires the retirement to be wrapped in struct_mutex
> > 
> > mutex_lock(&engine->i915->drm.struct_mutex);
> > 
> > > +               if (mutex_trylock(&tl->mutex)) {
> > > +                       retire_requests(tl);
> > > +                       mutex_unlock(&tl->mutex);
> > > +               }
> > 
> > mutex_unlock(&engine->i915->drm.struct_mutex);
> > 
> > Now the question is whether that was for 5.3 or 5.4. I think 5.3 is
> > where the changes were more severe and where we switch to the 4.19
> > approach.
> > 
> > > +               intel_timeline_put(tl);
> > > +
> > > +               GEM_BUG_ON(!next);
> > > +               tl = ptr_mask_bits(next, 1);
> > > +       } while (tl);
> > > +}
> 
> In 5.4, the existing retirement instances don't hold
> `&engine->i915->drm.struct_mutex`, however it is held in 5.3. So it looks like
> this is fine as-is for 5.4.

git agrees.

commit e5dadff4b09376e8ed92ecc0c12f1b9b3b1fbd19
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Aug 15 21:57:09 2019 +0100

    drm/i915: Protect request retirement with timeline->mutex

is in v5.4, so we should be safe to retire without the struct_mutex.
-Chris

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-03-30 16:45       ` Chris Wilson
@ 2020-04-07 20:28         ` Sultan Alsawaf
  2020-04-07 20:43           ` [PATCH v2] " Sultan Alsawaf
  0 siblings, 1 reply; 29+ messages in thread
From: Sultan Alsawaf @ 2020-04-07 20:28 UTC (permalink / raw)
  To: Chris Wilson; +Cc: stable, Tvrtko Ursulin, Joonas Lahtinen

On Mon, Mar 30, 2020 at 05:45:25PM +0100, Chris Wilson wrote:
> Quoting Sultan Alsawaf (2020-03-30 17:35:14)
> > On Mon, Mar 30, 2020 at 05:27:28PM +0100, Chris Wilson wrote:
> > > Quoting Sultan Alsawaf (2020-03-30 04:30:57)
> > > > +static void engine_retire(struct work_struct *work)
> > > > +{
> > > > +       struct intel_engine_cs *engine =
> > > > +               container_of(work, typeof(*engine), retire_work);
> > > > +       struct intel_timeline *tl = xchg(&engine->retire, NULL);
> > > > +
> > > > +       do {
> > > > +               struct intel_timeline *next = xchg(&tl->retire, NULL);
> > > > +
> > > > +               /*
> > > > +                * Our goal here is to retire _idle_ timelines as soon as
> > > > +                * possible (as they are idle, we do not expect userspace
> > > > +                * to be cleaning up anytime soon).
> > > > +                *
> > > > +                * If the timeline is currently locked, either it is being
> > > > +                * retired elsewhere or about to be!
> > > > +                */
> > > 
> > > iirc the backport requires the retirement to be wrapped in struct_mutex
> > > 
> > > mutex_lock(&engine->i915->drm.struct_mutex);
> > > 
> > > > +               if (mutex_trylock(&tl->mutex)) {
> > > > +                       retire_requests(tl);
> > > > +                       mutex_unlock(&tl->mutex);
> > > > +               }
> > > 
> > > mutex_unlock(&engine->i915->drm.struct_mutex);
> > > 
> > > Now the question is whether that was for 5.3 or 5.4. I think 5.3 is
> > > where the changes were more severe and where we switch to the 4.19
> > > approach.
> > > 
> > > > +               intel_timeline_put(tl);
> > > > +
> > > > +               GEM_BUG_ON(!next);
> > > > +               tl = ptr_mask_bits(next, 1);
> > > > +       } while (tl);
> > > > +}
> > 
> > In 5.4, the existing retirement instances don't hold
> > `&engine->i915->drm.struct_mutex`, however it is held in 5.3. So it looks like
> > this is fine as-is for 5.4.
> 
> git agrees.
> 
> commit e5dadff4b09376e8ed92ecc0c12f1b9b3b1fbd19
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Thu Aug 15 21:57:09 2019 +0100
> 
>     drm/i915: Protect request retirement with timeline->mutex
> 
> is in v5.4, so we should be safe to retire without the struct_mutex.
> -Chris

No, you were right; `&engine->i915->drm.struct_mutex` needs to be held in 5.4,
otherwise retirement races with vma destruction. I'll send an updated patch.

Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-07 20:28         ` Sultan Alsawaf
@ 2020-04-07 20:43           ` Sultan Alsawaf
  2020-04-12  7:48             ` Sultan Alsawaf
  0 siblings, 1 reply; 29+ messages in thread
From: Sultan Alsawaf @ 2020-04-07 20:43 UTC (permalink / raw)
  To: stable; +Cc: Chris Wilson, Tvrtko Ursulin, Joonas Lahtinen, Sultan Alsawaf

From: Chris Wilson <chris@chris-wilson.co.uk>

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster...  Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)

For the backport to 5.4, struct_mutex needs to be held while retiring so
that retirement doesn't race with vma destruction.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 7 files changed, 101 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ce8626b140e..f732a2177cd0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_hangcheck(engine);
 	intel_engine_init_cmd_parser(engine);
 	intel_engine_init__pm(engine);
+	intel_engine_init_retire(engine);
 
 	intel_engine_pool_init(&engine->pool);
 
@@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 
 	cleanup_status_page(engine);
 
+	intel_engine_fini_retire(engine);
 	intel_engine_pool_fini(&engine->pool);
 	intel_engine_fini_breadcrumbs(engine);
 	intel_engine_cleanup_cmd_parser(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c77c9518c58b..1eb7189f88e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -471,6 +471,14 @@ struct intel_engine_cs {
 
 	struct intel_engine_execlists execlists;
 
+	/*
+	 * Keep track of completed timelines on this engine for early
+	 * retirement with the goal of quickly enabling powersaving as
+	 * soon as the engine is idle.
+	 */
+	struct intel_timeline *retire;
+	struct work_struct retire_work;
+
 	/* status_notifier: list of callbacks for context-switch changes */
 	struct atomic_notifier_head context_status_notifier;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 66f6d1a897f2..a1538c8f7922 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
 {
 	struct intel_context * const ce = rq->hw_context;
 
+	/*
+	 * If we have just completed this context, the engine may now be
+	 * idle and we want to re-enter powersaving.
+	 */
+	if (list_is_last(&rq->link, &ce->timeline->requests) &&
+	    i915_request_completed(rq))
+		intel_engine_add_retire(engine, ce->timeline);
+
 	intel_engine_context_out(engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put(engine->gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 9cb01d9828f1..63515e3caaf2 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
 {
 	GEM_BUG_ON(atomic_read(&timeline->pin_count));
 	GEM_BUG_ON(!list_empty(&timeline->requests));
+	GEM_BUG_ON(timeline->retire);
 
 	if (timeline->hwsp_cacheline)
 		cacheline_free(timeline->hwsp_cacheline);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 2b1baf2fcc8e..adf2d14ef647 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -65,6 +65,9 @@ struct intel_timeline {
 	 */
 	struct i915_active_request last_request;
 
+	/** A chain of completed timelines ready for early retirement. */
+	struct intel_timeline *retire;
+
 	/**
 	 * We track the most recent seqno that we wait on in every context so
 	 * that we only have to emit a new await and dependency on a more
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0d39038898d4..35457fda350c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
 			break;
 }
 
+static void engine_retire(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), retire_work);
+	struct intel_timeline *tl = xchg(&engine->retire, NULL);
+
+	do {
+		struct intel_timeline *next = xchg(&tl->retire, NULL);
+
+		/*
+		 * Our goal here is to retire _idle_ timelines as soon as
+		 * possible (as they are idle, we do not expect userspace
+		 * to be cleaning up anytime soon).
+		 *
+		 * If the timeline is currently locked, either it is being
+		 * retired elsewhere or about to be!
+		 */
+		mutex_lock(&engine->i915->drm.struct_mutex);
+		if (mutex_trylock(&tl->mutex)) {
+			retire_requests(tl);
+			mutex_unlock(&tl->mutex);
+		}
+		mutex_unlock(&engine->i915->drm.struct_mutex);
+		intel_timeline_put(tl);
+
+		GEM_BUG_ON(!next);
+		tl = ptr_mask_bits(next, 1);
+	} while (tl);
+}
+
+static bool add_retire(struct intel_engine_cs *engine,
+		       struct intel_timeline *tl)
+{
+	struct intel_timeline *first;
+
+	/*
+	 * We open-code a llist here to include the additional tag [BIT(0)]
+	 * so that we know when the timeline is already on a
+	 * retirement queue: either this engine or another.
+	 *
+	 * However, we rely on that a timeline can only be active on a single
+	 * engine at any one time and that add_retire() is called before the
+	 * engine releases the timeline and transferred to another to retire.
+	 */
+
+	if (READ_ONCE(tl->retire)) /* already queued */
+		return false;
+
+	intel_timeline_get(tl);
+	first = READ_ONCE(engine->retire);
+	do
+		tl->retire = ptr_pack_bits(first, 1, 1);
+	while (!try_cmpxchg(&engine->retire, &first, tl));
+
+	return !first;
+}
+
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl)
+{
+	if (add_retire(engine, tl))
+		schedule_work(&engine->retire_work);
+}
+
+void intel_engine_init_retire(struct intel_engine_cs *engine)
+{
+	INIT_WORK(&engine->retire_work, engine_retire);
+}
+
+void intel_engine_fini_retire(struct intel_engine_cs *engine)
+{
+	flush_work(&engine->retire_work);
+	GEM_BUG_ON(engine->retire);
+}
+
 static noinline struct i915_request *
 request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
 {
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 3a3e7bbf19ff..40caa2f3f4a4 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
 }
 
 bool i915_retire_requests(struct drm_i915_private *i915);
+void intel_engine_init_retire(struct intel_engine_cs *engine);
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl);
+void intel_engine_fini_retire(struct intel_engine_cs *engine);
 
 #endif /* I915_REQUEST_H */
-- 
2.26.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-07 20:43           ` [PATCH v2] " Sultan Alsawaf
@ 2020-04-12  7:48             ` Sultan Alsawaf
  2020-04-12  7:56               ` Greg KH
  0 siblings, 1 reply; 29+ messages in thread
From: Sultan Alsawaf @ 2020-04-12  7:48 UTC (permalink / raw)
  To: gregkh; +Cc: stable

On Tue, Apr 07, 2020 at 01:43:45PM -0700, Sultan Alsawaf wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> corruption WA") is that it disables RC6 while Skylake (and friends) is
> active, and we do not consider the GPU idle until all outstanding
> requests have been retired and the engine switched over to the kernel
> context. If userspace is idle, this task falls onto our background idle
> worker, which only runs roughly once a second, meaning that userspace has
> to have been idle for a couple of seconds before we enable RC6 again.
> Naturally, this causes us to consume considerably more energy than
> before as powersaving is effectively disabled while a display server
> (here's looking at you Xorg) is running.
> 
> As execlists will get a completion event as each context is completed,
> we can use this interrupt to queue a retire worker bound to this engine
> to cleanup idle timelines. We will then immediately notice the idle
> engine (without userspace intervention or the aid of the background
> retire worker) and start parking the GPU. Thus during light workloads,
> we will do much more work to idle the GPU faster...  Hopefully with
> commensurate power saving!
> 
> v2: Watch context completions and only look at those local to the engine
> when retiring to reduce the amount of excess work we perform.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
> (cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
> 
> For the backport to 5.4, struct_mutex needs to be held while retiring so
> that retirement doesn't race with vma destruction.
> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
>  drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
>  drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
>  .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
>  drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
>  drivers/gpu/drm/i915/i915_request.h           |  4 +
>  7 files changed, 101 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 4ce8626b140e..f732a2177cd0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
>  	intel_engine_init_hangcheck(engine);
>  	intel_engine_init_cmd_parser(engine);
>  	intel_engine_init__pm(engine);
> +	intel_engine_init_retire(engine);
>  
>  	intel_engine_pool_init(&engine->pool);
>  
> @@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
>  
>  	cleanup_status_page(engine);
>  
> +	intel_engine_fini_retire(engine);
>  	intel_engine_pool_fini(&engine->pool);
>  	intel_engine_fini_breadcrumbs(engine);
>  	intel_engine_cleanup_cmd_parser(engine);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index c77c9518c58b..1eb7189f88e1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -471,6 +471,14 @@ struct intel_engine_cs {
>  
>  	struct intel_engine_execlists execlists;
>  
> +	/*
> +	 * Keep track of completed timelines on this engine for early
> +	 * retirement with the goal of quickly enabling powersaving as
> +	 * soon as the engine is idle.
> +	 */
> +	struct intel_timeline *retire;
> +	struct work_struct retire_work;
> +
>  	/* status_notifier: list of callbacks for context-switch changes */
>  	struct atomic_notifier_head context_status_notifier;
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 66f6d1a897f2..a1538c8f7922 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
>  {
>  	struct intel_context * const ce = rq->hw_context;
>  
> +	/*
> +	 * If we have just completed this context, the engine may now be
> +	 * idle and we want to re-enter powersaving.
> +	 */
> +	if (list_is_last(&rq->link, &ce->timeline->requests) &&
> +	    i915_request_completed(rq))
> +		intel_engine_add_retire(engine, ce->timeline);
> +
>  	intel_engine_context_out(engine);
>  	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
>  	intel_gt_pm_put(engine->gt);
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> index 9cb01d9828f1..63515e3caaf2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> @@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
>  {
>  	GEM_BUG_ON(atomic_read(&timeline->pin_count));
>  	GEM_BUG_ON(!list_empty(&timeline->requests));
> +	GEM_BUG_ON(timeline->retire);
>  
>  	if (timeline->hwsp_cacheline)
>  		cacheline_free(timeline->hwsp_cacheline);
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> index 2b1baf2fcc8e..adf2d14ef647 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> @@ -65,6 +65,9 @@ struct intel_timeline {
>  	 */
>  	struct i915_active_request last_request;
>  
> +	/** A chain of completed timelines ready for early retirement. */
> +	struct intel_timeline *retire;
> +
>  	/**
>  	 * We track the most recent seqno that we wait on in every context so
>  	 * that we only have to emit a new await and dependency on a more
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 0d39038898d4..35457fda350c 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
>  			break;
>  }
>  
> +static void engine_retire(struct work_struct *work)
> +{
> +	struct intel_engine_cs *engine =
> +		container_of(work, typeof(*engine), retire_work);
> +	struct intel_timeline *tl = xchg(&engine->retire, NULL);
> +
> +	do {
> +		struct intel_timeline *next = xchg(&tl->retire, NULL);
> +
> +		/*
> +		 * Our goal here is to retire _idle_ timelines as soon as
> +		 * possible (as they are idle, we do not expect userspace
> +		 * to be cleaning up anytime soon).
> +		 *
> +		 * If the timeline is currently locked, either it is being
> +		 * retired elsewhere or about to be!
> +		 */
> +		mutex_lock(&engine->i915->drm.struct_mutex);
> +		if (mutex_trylock(&tl->mutex)) {
> +			retire_requests(tl);
> +			mutex_unlock(&tl->mutex);
> +		}
> +		mutex_unlock(&engine->i915->drm.struct_mutex);
> +		intel_timeline_put(tl);
> +
> +		GEM_BUG_ON(!next);
> +		tl = ptr_mask_bits(next, 1);
> +	} while (tl);
> +}
> +
> +static bool add_retire(struct intel_engine_cs *engine,
> +		       struct intel_timeline *tl)
> +{
> +	struct intel_timeline *first;
> +
> +	/*
> +	 * We open-code a llist here to include the additional tag [BIT(0)]
> +	 * so that we know when the timeline is already on a
> +	 * retirement queue: either this engine or another.
> +	 *
> +	 * However, we rely on that a timeline can only be active on a single
> +	 * engine at any one time and that add_retire() is called before the
> +	 * engine releases the timeline and transferred to another to retire.
> +	 */
> +
> +	if (READ_ONCE(tl->retire)) /* already queued */
> +		return false;
> +
> +	intel_timeline_get(tl);
> +	first = READ_ONCE(engine->retire);
> +	do
> +		tl->retire = ptr_pack_bits(first, 1, 1);
> +	while (!try_cmpxchg(&engine->retire, &first, tl));
> +
> +	return !first;
> +}
> +
> +void intel_engine_add_retire(struct intel_engine_cs *engine,
> +			     struct intel_timeline *tl)
> +{
> +	if (add_retire(engine, tl))
> +		schedule_work(&engine->retire_work);
> +}
> +
> +void intel_engine_init_retire(struct intel_engine_cs *engine)
> +{
> +	INIT_WORK(&engine->retire_work, engine_retire);
> +}
> +
> +void intel_engine_fini_retire(struct intel_engine_cs *engine)
> +{
> +	flush_work(&engine->retire_work);
> +	GEM_BUG_ON(engine->retire);
> +}
> +
>  static noinline struct i915_request *
>  request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 3a3e7bbf19ff..40caa2f3f4a4 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
>  }
>  
>  bool i915_retire_requests(struct drm_i915_private *i915);
> +void intel_engine_init_retire(struct intel_engine_cs *engine);
> +void intel_engine_add_retire(struct intel_engine_cs *engine,
> +			     struct intel_timeline *tl);
> +void intel_engine_fini_retire(struct intel_engine_cs *engine);
>  
>  #endif /* I915_REQUEST_H */
> -- 
> 2.26.0
> 

Hi Greg,

Could you queue this one up as well?

Thanks,
Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-12  7:48             ` Sultan Alsawaf
@ 2020-04-12  7:56               ` Greg KH
  2020-04-13 14:37                 ` Salvatore Bonaccorso
  2020-04-13 15:23                 ` Sultan Alsawaf
  0 siblings, 2 replies; 29+ messages in thread
From: Greg KH @ 2020-04-12  7:56 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: stable

On Sun, Apr 12, 2020 at 12:48:51AM -0700, Sultan Alsawaf wrote:
> On Tue, Apr 07, 2020 at 01:43:45PM -0700, Sultan Alsawaf wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> > corruption WA") is that it disables RC6 while Skylake (and friends) is
> > active, and we do not consider the GPU idle until all outstanding
> > requests have been retired and the engine switched over to the kernel
> > context. If userspace is idle, this task falls onto our background idle
> > worker, which only runs roughly once a second, meaning that userspace has
> > to have been idle for a couple of seconds before we enable RC6 again.
> > Naturally, this causes us to consume considerably more energy than
> > before as powersaving is effectively disabled while a display server
> > (here's looking at you Xorg) is running.
> > 
> > As execlists will get a completion event as each context is completed,
> > we can use this interrupt to queue a retire worker bound to this engine
> > to cleanup idle timelines. We will then immediately notice the idle
> > engine (without userspace intervention or the aid of the background
> > retire worker) and start parking the GPU. Thus during light workloads,
> > we will do much more work to idle the GPU faster...  Hopefully with
> > commensurate power saving!
> > 
> > v2: Watch context completions and only look at those local to the engine
> > when retiring to reduce the amount of excess work we perform.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> > References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
> > (cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
> > 
> > For the backport to 5.4, struct_mutex needs to be held while retiring so
> > that retirement doesn't race with vma destruction.
> > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
> >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
> >  drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
> >  drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
> >  .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
> >  drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
> >  drivers/gpu/drm/i915/i915_request.h           |  4 +
> >  7 files changed, 101 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 4ce8626b140e..f732a2177cd0 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
> >  	intel_engine_init_hangcheck(engine);
> >  	intel_engine_init_cmd_parser(engine);
> >  	intel_engine_init__pm(engine);
> > +	intel_engine_init_retire(engine);
> >  
> >  	intel_engine_pool_init(&engine->pool);
> >  
> > @@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
> >  
> >  	cleanup_status_page(engine);
> >  
> > +	intel_engine_fini_retire(engine);
> >  	intel_engine_pool_fini(&engine->pool);
> >  	intel_engine_fini_breadcrumbs(engine);
> >  	intel_engine_cleanup_cmd_parser(engine);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index c77c9518c58b..1eb7189f88e1 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -471,6 +471,14 @@ struct intel_engine_cs {
> >  
> >  	struct intel_engine_execlists execlists;
> >  
> > +	/*
> > +	 * Keep track of completed timelines on this engine for early
> > +	 * retirement with the goal of quickly enabling powersaving as
> > +	 * soon as the engine is idle.
> > +	 */
> > +	struct intel_timeline *retire;
> > +	struct work_struct retire_work;
> > +
> >  	/* status_notifier: list of callbacks for context-switch changes */
> >  	struct atomic_notifier_head context_status_notifier;
> >  
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 66f6d1a897f2..a1538c8f7922 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
> >  {
> >  	struct intel_context * const ce = rq->hw_context;
> >  
> > +	/*
> > +	 * If we have just completed this context, the engine may now be
> > +	 * idle and we want to re-enter powersaving.
> > +	 */
> > +	if (list_is_last(&rq->link, &ce->timeline->requests) &&
> > +	    i915_request_completed(rq))
> > +		intel_engine_add_retire(engine, ce->timeline);
> > +
> >  	intel_engine_context_out(engine);
> >  	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> >  	intel_gt_pm_put(engine->gt);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > index 9cb01d9828f1..63515e3caaf2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > @@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
> >  {
> >  	GEM_BUG_ON(atomic_read(&timeline->pin_count));
> >  	GEM_BUG_ON(!list_empty(&timeline->requests));
> > +	GEM_BUG_ON(timeline->retire);
> >  
> >  	if (timeline->hwsp_cacheline)
> >  		cacheline_free(timeline->hwsp_cacheline);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > index 2b1baf2fcc8e..adf2d14ef647 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > @@ -65,6 +65,9 @@ struct intel_timeline {
> >  	 */
> >  	struct i915_active_request last_request;
> >  
> > +	/** A chain of completed timelines ready for early retirement. */
> > +	struct intel_timeline *retire;
> > +
> >  	/**
> >  	 * We track the most recent seqno that we wait on in every context so
> >  	 * that we only have to emit a new await and dependency on a more
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 0d39038898d4..35457fda350c 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
> >  			break;
> >  }
> >  
> > +static void engine_retire(struct work_struct *work)
> > +{
> > +	struct intel_engine_cs *engine =
> > +		container_of(work, typeof(*engine), retire_work);
> > +	struct intel_timeline *tl = xchg(&engine->retire, NULL);
> > +
> > +	do {
> > +		struct intel_timeline *next = xchg(&tl->retire, NULL);
> > +
> > +		/*
> > +		 * Our goal here is to retire _idle_ timelines as soon as
> > +		 * possible (as they are idle, we do not expect userspace
> > +		 * to be cleaning up anytime soon).
> > +		 *
> > +		 * If the timeline is currently locked, either it is being
> > +		 * retired elsewhere or about to be!
> > +		 */
> > +		mutex_lock(&engine->i915->drm.struct_mutex);
> > +		if (mutex_trylock(&tl->mutex)) {
> > +			retire_requests(tl);
> > +			mutex_unlock(&tl->mutex);
> > +		}
> > +		mutex_unlock(&engine->i915->drm.struct_mutex);
> > +		intel_timeline_put(tl);
> > +
> > +		GEM_BUG_ON(!next);
> > +		tl = ptr_mask_bits(next, 1);
> > +	} while (tl);
> > +}
> > +
> > +static bool add_retire(struct intel_engine_cs *engine,
> > +		       struct intel_timeline *tl)
> > +{
> > +	struct intel_timeline *first;
> > +
> > +	/*
> > +	 * We open-code a llist here to include the additional tag [BIT(0)]
> > +	 * so that we know when the timeline is already on a
> > +	 * retirement queue: either this engine or another.
> > +	 *
> > +	 * However, we rely on that a timeline can only be active on a single
> > +	 * engine at any one time and that add_retire() is called before the
> > +	 * engine releases the timeline and transferred to another to retire.
> > +	 */
> > +
> > +	if (READ_ONCE(tl->retire)) /* already queued */
> > +		return false;
> > +
> > +	intel_timeline_get(tl);
> > +	first = READ_ONCE(engine->retire);
> > +	do
> > +		tl->retire = ptr_pack_bits(first, 1, 1);
> > +	while (!try_cmpxchg(&engine->retire, &first, tl));
> > +
> > +	return !first;
> > +}
> > +
> > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > +			     struct intel_timeline *tl)
> > +{
> > +	if (add_retire(engine, tl))
> > +		schedule_work(&engine->retire_work);
> > +}
> > +
> > +void intel_engine_init_retire(struct intel_engine_cs *engine)
> > +{
> > +	INIT_WORK(&engine->retire_work, engine_retire);
> > +}
> > +
> > +void intel_engine_fini_retire(struct intel_engine_cs *engine)
> > +{
> > +	flush_work(&engine->retire_work);
> > +	GEM_BUG_ON(engine->retire);
> > +}
> > +
> >  static noinline struct i915_request *
> >  request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
> >  {
> > diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> > index 3a3e7bbf19ff..40caa2f3f4a4 100644
> > --- a/drivers/gpu/drm/i915/i915_request.h
> > +++ b/drivers/gpu/drm/i915/i915_request.h
> > @@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
> >  }
> >  
> >  bool i915_retire_requests(struct drm_i915_private *i915);
> > +void intel_engine_init_retire(struct intel_engine_cs *engine);
> > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > +			     struct intel_timeline *tl);
> > +void intel_engine_fini_retire(struct intel_engine_cs *engine);
> >  
> >  #endif /* I915_REQUEST_H */
> > -- 
> > 2.26.0
> > 
> 
> Hi Greg,
> 
> Could you queue this one up as well?

What is the git commit id of this in Linus's tree?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-12  7:56               ` Greg KH
@ 2020-04-13 14:37                 ` Salvatore Bonaccorso
  2020-04-13 15:23                 ` Sultan Alsawaf
  1 sibling, 0 replies; 29+ messages in thread
From: Salvatore Bonaccorso @ 2020-04-13 14:37 UTC (permalink / raw)
  To: Greg KH; +Cc: Sultan Alsawaf, stable

Hi,

[note not involved in the respective patch, but noticed your question
and saw the context]

On Sun, Apr 12, 2020 at 09:56:07AM +0200, Greg KH wrote:
> On Sun, Apr 12, 2020 at 12:48:51AM -0700, Sultan Alsawaf wrote:
> > On Tue, Apr 07, 2020 at 01:43:45PM -0700, Sultan Alsawaf wrote:
> > > From: Chris Wilson <chris@chris-wilson.co.uk>
> > > 
> > > The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> > > corruption WA") is that it disables RC6 while Skylake (and friends) is
> > > active, and we do not consider the GPU idle until all outstanding
> > > requests have been retired and the engine switched over to the kernel
> > > context. If userspace is idle, this task falls onto our background idle
> > > worker, which only runs roughly once a second, meaning that userspace has
> > > to have been idle for a couple of seconds before we enable RC6 again.
> > > Naturally, this causes us to consume considerably more energy than
> > > before as powersaving is effectively disabled while a display server
> > > (here's looking at you Xorg) is running.
> > > 
> > > As execlists will get a completion event as each context is completed,
> > > we can use this interrupt to queue a retire worker bound to this engine
> > > to cleanup idle timelines. We will then immediately notice the idle
> > > engine (without userspace intervention or the aid of the background
> > > retire worker) and start parking the GPU. Thus during light workloads,
> > > we will do much more work to idle the GPU faster...  Hopefully with
> > > commensurate power saving!
> > > 
> > > v2: Watch context completions and only look at those local to the engine
> > > when retiring to reduce the amount of excess work we perform.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> > > References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > > References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
> > > (cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
> > > 
> > > For the backport to 5.4, struct_mutex needs to be held while retiring so
> > > that retirement doesn't race with vma destruction.
> > > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
> > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
> > >  drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
> > >  drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
> > >  .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
> > >  drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
> > >  drivers/gpu/drm/i915/i915_request.h           |  4 +
> > >  7 files changed, 101 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > index 4ce8626b140e..f732a2177cd0 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > @@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
> > >  	intel_engine_init_hangcheck(engine);
> > >  	intel_engine_init_cmd_parser(engine);
> > >  	intel_engine_init__pm(engine);
> > > +	intel_engine_init_retire(engine);
> > >  
> > >  	intel_engine_pool_init(&engine->pool);
> > >  
> > > @@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
> > >  
> > >  	cleanup_status_page(engine);
> > >  
> > > +	intel_engine_fini_retire(engine);
> > >  	intel_engine_pool_fini(&engine->pool);
> > >  	intel_engine_fini_breadcrumbs(engine);
> > >  	intel_engine_cleanup_cmd_parser(engine);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > index c77c9518c58b..1eb7189f88e1 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > @@ -471,6 +471,14 @@ struct intel_engine_cs {
> > >  
> > >  	struct intel_engine_execlists execlists;
> > >  
> > > +	/*
> > > +	 * Keep track of completed timelines on this engine for early
> > > +	 * retirement with the goal of quickly enabling powersaving as
> > > +	 * soon as the engine is idle.
> > > +	 */
> > > +	struct intel_timeline *retire;
> > > +	struct work_struct retire_work;
> > > +
> > >  	/* status_notifier: list of callbacks for context-switch changes */
> > >  	struct atomic_notifier_head context_status_notifier;
> > >  
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > > index 66f6d1a897f2..a1538c8f7922 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > > @@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
> > >  {
> > >  	struct intel_context * const ce = rq->hw_context;
> > >  
> > > +	/*
> > > +	 * If we have just completed this context, the engine may now be
> > > +	 * idle and we want to re-enter powersaving.
> > > +	 */
> > > +	if (list_is_last(&rq->link, &ce->timeline->requests) &&
> > > +	    i915_request_completed(rq))
> > > +		intel_engine_add_retire(engine, ce->timeline);
> > > +
> > >  	intel_engine_context_out(engine);
> > >  	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> > >  	intel_gt_pm_put(engine->gt);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > > index 9cb01d9828f1..63515e3caaf2 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > > @@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
> > >  {
> > >  	GEM_BUG_ON(atomic_read(&timeline->pin_count));
> > >  	GEM_BUG_ON(!list_empty(&timeline->requests));
> > > +	GEM_BUG_ON(timeline->retire);
> > >  
> > >  	if (timeline->hwsp_cacheline)
> > >  		cacheline_free(timeline->hwsp_cacheline);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > > index 2b1baf2fcc8e..adf2d14ef647 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > > @@ -65,6 +65,9 @@ struct intel_timeline {
> > >  	 */
> > >  	struct i915_active_request last_request;
> > >  
> > > +	/** A chain of completed timelines ready for early retirement. */
> > > +	struct intel_timeline *retire;
> > > +
> > >  	/**
> > >  	 * We track the most recent seqno that we wait on in every context so
> > >  	 * that we only have to emit a new await and dependency on a more
> > > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > > index 0d39038898d4..35457fda350c 100644
> > > --- a/drivers/gpu/drm/i915/i915_request.c
> > > +++ b/drivers/gpu/drm/i915/i915_request.c
> > > @@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
> > >  			break;
> > >  }
> > >  
> > > +static void engine_retire(struct work_struct *work)
> > > +{
> > > +	struct intel_engine_cs *engine =
> > > +		container_of(work, typeof(*engine), retire_work);
> > > +	struct intel_timeline *tl = xchg(&engine->retire, NULL);
> > > +
> > > +	do {
> > > +		struct intel_timeline *next = xchg(&tl->retire, NULL);
> > > +
> > > +		/*
> > > +		 * Our goal here is to retire _idle_ timelines as soon as
> > > +		 * possible (as they are idle, we do not expect userspace
> > > +		 * to be cleaning up anytime soon).
> > > +		 *
> > > +		 * If the timeline is currently locked, either it is being
> > > +		 * retired elsewhere or about to be!
> > > +		 */
> > > +		mutex_lock(&engine->i915->drm.struct_mutex);
> > > +		if (mutex_trylock(&tl->mutex)) {
> > > +			retire_requests(tl);
> > > +			mutex_unlock(&tl->mutex);
> > > +		}
> > > +		mutex_unlock(&engine->i915->drm.struct_mutex);
> > > +		intel_timeline_put(tl);
> > > +
> > > +		GEM_BUG_ON(!next);
> > > +		tl = ptr_mask_bits(next, 1);
> > > +	} while (tl);
> > > +}
> > > +
> > > +static bool add_retire(struct intel_engine_cs *engine,
> > > +		       struct intel_timeline *tl)
> > > +{
> > > +	struct intel_timeline *first;
> > > +
> > > +	/*
> > > +	 * We open-code a llist here to include the additional tag [BIT(0)]
> > > +	 * so that we know when the timeline is already on a
> > > +	 * retirement queue: either this engine or another.
> > > +	 *
> > > +	 * However, we rely on that a timeline can only be active on a single
> > > +	 * engine at any one time and that add_retire() is called before the
> > > +	 * engine releases the timeline and transferred to another to retire.
> > > +	 */
> > > +
> > > +	if (READ_ONCE(tl->retire)) /* already queued */
> > > +		return false;
> > > +
> > > +	intel_timeline_get(tl);
> > > +	first = READ_ONCE(engine->retire);
> > > +	do
> > > +		tl->retire = ptr_pack_bits(first, 1, 1);
> > > +	while (!try_cmpxchg(&engine->retire, &first, tl));
> > > +
> > > +	return !first;
> > > +}
> > > +
> > > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > > +			     struct intel_timeline *tl)
> > > +{
> > > +	if (add_retire(engine, tl))
> > > +		schedule_work(&engine->retire_work);
> > > +}
> > > +
> > > +void intel_engine_init_retire(struct intel_engine_cs *engine)
> > > +{
> > > +	INIT_WORK(&engine->retire_work, engine_retire);
> > > +}
> > > +
> > > +void intel_engine_fini_retire(struct intel_engine_cs *engine)
> > > +{
> > > +	flush_work(&engine->retire_work);
> > > +	GEM_BUG_ON(engine->retire);
> > > +}
> > > +
> > >  static noinline struct i915_request *
> > >  request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
> > >  {
> > > diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> > > index 3a3e7bbf19ff..40caa2f3f4a4 100644
> > > --- a/drivers/gpu/drm/i915/i915_request.h
> > > +++ b/drivers/gpu/drm/i915/i915_request.h
> > > @@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
> > >  }
> > >  
> > >  bool i915_retire_requests(struct drm_i915_private *i915);
> > > +void intel_engine_init_retire(struct intel_engine_cs *engine);
> > > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > > +			     struct intel_timeline *tl);
> > > +void intel_engine_fini_retire(struct intel_engine_cs *engine);
> > >  
> > >  #endif /* I915_REQUEST_H */
> > > -- 
> > > 2.26.0
> > > 
> > 
> > Hi Greg,
> > 
> > Could you queue this one up as well?
> 
> What is the git commit id of this in Linus's tree?

This appears to be 4f88f8747fa43c97c3b3712d8d87295ea757cc51 upstream,
but there is as well this comment added:

> For the backport to 5.4, struct_mutex needs to be held while retiring so
> that retirement doesn't race with vma destruction.
> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

Regards,
Salvatore

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-12  7:56               ` Greg KH
  2020-04-13 14:37                 ` Salvatore Bonaccorso
@ 2020-04-13 15:23                 ` Sultan Alsawaf
  2020-04-13 15:29                   ` [PATCH v3] " Sultan Alsawaf
  1 sibling, 1 reply; 29+ messages in thread
From: Sultan Alsawaf @ 2020-04-13 15:23 UTC (permalink / raw)
  To: Greg KH; +Cc: stable

On Sun, Apr 12, 2020 at 09:56:07AM +0200, Greg KH wrote:
> On Sun, Apr 12, 2020 at 12:48:51AM -0700, Sultan Alsawaf wrote:
> > Hi Greg,
> > 
> > Could you queue this one up as well?
> 
> What is the git commit id of this in Linus's tree?

Sorry, I botched the commit message formatting. I'll send a patch with the
correct message formatting now.

Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-13 15:23                 ` Sultan Alsawaf
@ 2020-04-13 15:29                   ` Sultan Alsawaf
  2020-04-17  9:38                     ` Sultan Alsawaf
  2020-04-18 10:30                     ` Greg KH
  0 siblings, 2 replies; 29+ messages in thread
From: Sultan Alsawaf @ 2020-04-13 15:29 UTC (permalink / raw)
  To: gregkh; +Cc: stable

From: Chris Wilson <chris@chris-wilson.co.uk>

commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51 upstream.

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster...  Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
[Sultan: for the backport to 5.4, struct_mutex needs to be held while
 retiring so that retirement doesn't race with vma destruction.]
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
 drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
 drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.h           |  4 +
 7 files changed, 101 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ce8626b140e..f732a2177cd0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_hangcheck(engine);
 	intel_engine_init_cmd_parser(engine);
 	intel_engine_init__pm(engine);
+	intel_engine_init_retire(engine);
 
 	intel_engine_pool_init(&engine->pool);
 
@@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 
 	cleanup_status_page(engine);
 
+	intel_engine_fini_retire(engine);
 	intel_engine_pool_fini(&engine->pool);
 	intel_engine_fini_breadcrumbs(engine);
 	intel_engine_cleanup_cmd_parser(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c77c9518c58b..1eb7189f88e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -471,6 +471,14 @@ struct intel_engine_cs {
 
 	struct intel_engine_execlists execlists;
 
+	/*
+	 * Keep track of completed timelines on this engine for early
+	 * retirement with the goal of quickly enabling powersaving as
+	 * soon as the engine is idle.
+	 */
+	struct intel_timeline *retire;
+	struct work_struct retire_work;
+
 	/* status_notifier: list of callbacks for context-switch changes */
 	struct atomic_notifier_head context_status_notifier;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 66f6d1a897f2..a1538c8f7922 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
 {
 	struct intel_context * const ce = rq->hw_context;
 
+	/*
+	 * If we have just completed this context, the engine may now be
+	 * idle and we want to re-enter powersaving.
+	 */
+	if (list_is_last(&rq->link, &ce->timeline->requests) &&
+	    i915_request_completed(rq))
+		intel_engine_add_retire(engine, ce->timeline);
+
 	intel_engine_context_out(engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put(engine->gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 9cb01d9828f1..63515e3caaf2 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
 {
 	GEM_BUG_ON(atomic_read(&timeline->pin_count));
 	GEM_BUG_ON(!list_empty(&timeline->requests));
+	GEM_BUG_ON(timeline->retire);
 
 	if (timeline->hwsp_cacheline)
 		cacheline_free(timeline->hwsp_cacheline);
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 2b1baf2fcc8e..adf2d14ef647 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -65,6 +65,9 @@ struct intel_timeline {
 	 */
 	struct i915_active_request last_request;
 
+	/** A chain of completed timelines ready for early retirement. */
+	struct intel_timeline *retire;
+
 	/**
 	 * We track the most recent seqno that we wait on in every context so
 	 * that we only have to emit a new await and dependency on a more
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0d39038898d4..35457fda350c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
 			break;
 }
 
+static void engine_retire(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), retire_work);
+	struct intel_timeline *tl = xchg(&engine->retire, NULL);
+
+	do {
+		struct intel_timeline *next = xchg(&tl->retire, NULL);
+
+		/*
+		 * Our goal here is to retire _idle_ timelines as soon as
+		 * possible (as they are idle, we do not expect userspace
+		 * to be cleaning up anytime soon).
+		 *
+		 * If the timeline is currently locked, either it is being
+		 * retired elsewhere or about to be!
+		 */
+		mutex_lock(&engine->i915->drm.struct_mutex);
+		if (mutex_trylock(&tl->mutex)) {
+			retire_requests(tl);
+			mutex_unlock(&tl->mutex);
+		}
+		mutex_unlock(&engine->i915->drm.struct_mutex);
+		intel_timeline_put(tl);
+
+		GEM_BUG_ON(!next);
+		tl = ptr_mask_bits(next, 1);
+	} while (tl);
+}
+
+static bool add_retire(struct intel_engine_cs *engine,
+		       struct intel_timeline *tl)
+{
+	struct intel_timeline *first;
+
+	/*
+	 * We open-code a llist here to include the additional tag [BIT(0)]
+	 * so that we know when the timeline is already on a
+	 * retirement queue: either this engine or another.
+	 *
+	 * However, we rely on that a timeline can only be active on a single
+	 * engine at any one time and that add_retire() is called before the
+	 * engine releases the timeline and transferred to another to retire.
+	 */
+
+	if (READ_ONCE(tl->retire)) /* already queued */
+		return false;
+
+	intel_timeline_get(tl);
+	first = READ_ONCE(engine->retire);
+	do
+		tl->retire = ptr_pack_bits(first, 1, 1);
+	while (!try_cmpxchg(&engine->retire, &first, tl));
+
+	return !first;
+}
+
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl)
+{
+	if (add_retire(engine, tl))
+		schedule_work(&engine->retire_work);
+}
+
+void intel_engine_init_retire(struct intel_engine_cs *engine)
+{
+	INIT_WORK(&engine->retire_work, engine_retire);
+}
+
+void intel_engine_fini_retire(struct intel_engine_cs *engine)
+{
+	flush_work(&engine->retire_work);
+	GEM_BUG_ON(engine->retire);
+}
+
 static noinline struct i915_request *
 request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
 {
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 3a3e7bbf19ff..40caa2f3f4a4 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
 }
 
 bool i915_retire_requests(struct drm_i915_private *i915);
+void intel_engine_init_retire(struct intel_engine_cs *engine);
+void intel_engine_add_retire(struct intel_engine_cs *engine,
+			     struct intel_timeline *tl);
+void intel_engine_fini_retire(struct intel_engine_cs *engine);
 
 #endif /* I915_REQUEST_H */
-- 
2.26.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-13 15:29                   ` [PATCH v3] " Sultan Alsawaf
@ 2020-04-17  9:38                     ` Sultan Alsawaf
  2020-04-18 10:29                       ` Greg KH
  2020-04-18 10:30                     ` Greg KH
  1 sibling, 1 reply; 29+ messages in thread
From: Sultan Alsawaf @ 2020-04-17  9:38 UTC (permalink / raw)
  To: gregkh; +Cc: stable

On Mon, Apr 13, 2020 at 08:29:30AM -0700, Sultan Alsawaf wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51 upstream.
> 
> The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> corruption WA") is that it disables RC6 while Skylake (and friends) is
> active, and we do not consider the GPU idle until all outstanding
> requests have been retired and the engine switched over to the kernel
> context. If userspace is idle, this task falls onto our background idle
> worker, which only runs roughly once a second, meaning that userspace has
> to have been idle for a couple of seconds before we enable RC6 again.
> Naturally, this causes us to consume considerably more energy than
> before as powersaving is effectively disabled while a display server
> (here's looking at you Xorg) is running.
> 
> As execlists will get a completion event as each context is completed,
> we can use this interrupt to queue a retire worker bound to this engine
> to cleanup idle timelines. We will then immediately notice the idle
> engine (without userspace intervention or the aid of the background
> retire worker) and start parking the GPU. Thus during light workloads,
> we will do much more work to idle the GPU faster...  Hopefully with
> commensurate power saving!
> 
> v2: Watch context completions and only look at those local to the engine
> when retiring to reduce the amount of excess work we perform.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
> [Sultan: for the backport to 5.4, struct_mutex needs to be held while
>  retiring so that retirement doesn't race with vma destruction.]
> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
>  drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
>  drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
>  .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
>  drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
>  drivers/gpu/drm/i915/i915_request.h           |  4 +
>  7 files changed, 101 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 4ce8626b140e..f732a2177cd0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
>  	intel_engine_init_hangcheck(engine);
>  	intel_engine_init_cmd_parser(engine);
>  	intel_engine_init__pm(engine);
> +	intel_engine_init_retire(engine);
>  
>  	intel_engine_pool_init(&engine->pool);
>  
> @@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
>  
>  	cleanup_status_page(engine);
>  
> +	intel_engine_fini_retire(engine);
>  	intel_engine_pool_fini(&engine->pool);
>  	intel_engine_fini_breadcrumbs(engine);
>  	intel_engine_cleanup_cmd_parser(engine);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index c77c9518c58b..1eb7189f88e1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -471,6 +471,14 @@ struct intel_engine_cs {
>  
>  	struct intel_engine_execlists execlists;
>  
> +	/*
> +	 * Keep track of completed timelines on this engine for early
> +	 * retirement with the goal of quickly enabling powersaving as
> +	 * soon as the engine is idle.
> +	 */
> +	struct intel_timeline *retire;
> +	struct work_struct retire_work;
> +
>  	/* status_notifier: list of callbacks for context-switch changes */
>  	struct atomic_notifier_head context_status_notifier;
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 66f6d1a897f2..a1538c8f7922 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
>  {
>  	struct intel_context * const ce = rq->hw_context;
>  
> +	/*
> +	 * If we have just completed this context, the engine may now be
> +	 * idle and we want to re-enter powersaving.
> +	 */
> +	if (list_is_last(&rq->link, &ce->timeline->requests) &&
> +	    i915_request_completed(rq))
> +		intel_engine_add_retire(engine, ce->timeline);
> +
>  	intel_engine_context_out(engine);
>  	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
>  	intel_gt_pm_put(engine->gt);
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> index 9cb01d9828f1..63515e3caaf2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> @@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
>  {
>  	GEM_BUG_ON(atomic_read(&timeline->pin_count));
>  	GEM_BUG_ON(!list_empty(&timeline->requests));
> +	GEM_BUG_ON(timeline->retire);
>  
>  	if (timeline->hwsp_cacheline)
>  		cacheline_free(timeline->hwsp_cacheline);
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> index 2b1baf2fcc8e..adf2d14ef647 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> @@ -65,6 +65,9 @@ struct intel_timeline {
>  	 */
>  	struct i915_active_request last_request;
>  
> +	/** A chain of completed timelines ready for early retirement. */
> +	struct intel_timeline *retire;
> +
>  	/**
>  	 * We track the most recent seqno that we wait on in every context so
>  	 * that we only have to emit a new await and dependency on a more
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 0d39038898d4..35457fda350c 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
>  			break;
>  }
>  
> +static void engine_retire(struct work_struct *work)
> +{
> +	struct intel_engine_cs *engine =
> +		container_of(work, typeof(*engine), retire_work);
> +	struct intel_timeline *tl = xchg(&engine->retire, NULL);
> +
> +	do {
> +		struct intel_timeline *next = xchg(&tl->retire, NULL);
> +
> +		/*
> +		 * Our goal here is to retire _idle_ timelines as soon as
> +		 * possible (as they are idle, we do not expect userspace
> +		 * to be cleaning up anytime soon).
> +		 *
> +		 * If the timeline is currently locked, either it is being
> +		 * retired elsewhere or about to be!
> +		 */
> +		mutex_lock(&engine->i915->drm.struct_mutex);
> +		if (mutex_trylock(&tl->mutex)) {
> +			retire_requests(tl);
> +			mutex_unlock(&tl->mutex);
> +		}
> +		mutex_unlock(&engine->i915->drm.struct_mutex);
> +		intel_timeline_put(tl);
> +
> +		GEM_BUG_ON(!next);
> +		tl = ptr_mask_bits(next, 1);
> +	} while (tl);
> +}
> +
> +static bool add_retire(struct intel_engine_cs *engine,
> +		       struct intel_timeline *tl)
> +{
> +	struct intel_timeline *first;
> +
> +	/*
> +	 * We open-code a llist here to include the additional tag [BIT(0)]
> +	 * so that we know when the timeline is already on a
> +	 * retirement queue: either this engine or another.
> +	 *
> +	 * However, we rely on that a timeline can only be active on a single
> +	 * engine at any one time and that add_retire() is called before the
> +	 * engine releases the timeline and transferred to another to retire.
> +	 */
> +
> +	if (READ_ONCE(tl->retire)) /* already queued */
> +		return false;
> +
> +	intel_timeline_get(tl);
> +	first = READ_ONCE(engine->retire);
> +	do
> +		tl->retire = ptr_pack_bits(first, 1, 1);
> +	while (!try_cmpxchg(&engine->retire, &first, tl));
> +
> +	return !first;
> +}
> +
> +void intel_engine_add_retire(struct intel_engine_cs *engine,
> +			     struct intel_timeline *tl)
> +{
> +	if (add_retire(engine, tl))
> +		schedule_work(&engine->retire_work);
> +}
> +
> +void intel_engine_init_retire(struct intel_engine_cs *engine)
> +{
> +	INIT_WORK(&engine->retire_work, engine_retire);
> +}
> +
> +void intel_engine_fini_retire(struct intel_engine_cs *engine)
> +{
> +	flush_work(&engine->retire_work);
> +	GEM_BUG_ON(engine->retire);
> +}
> +
>  static noinline struct i915_request *
>  request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 3a3e7bbf19ff..40caa2f3f4a4 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
>  }
>  
>  bool i915_retire_requests(struct drm_i915_private *i915);
> +void intel_engine_init_retire(struct intel_engine_cs *engine);
> +void intel_engine_add_retire(struct intel_engine_cs *engine,
> +			     struct intel_timeline *tl);
> +void intel_engine_fini_retire(struct intel_engine_cs *engine);
>  
>  #endif /* I915_REQUEST_H */
> -- 
> 2.26.0
> 

Greg,

Could you take another look at this please and let me know if looks good to
queue up?

Thanks,
Sultan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-17  9:38                     ` Sultan Alsawaf
@ 2020-04-18 10:29                       ` Greg KH
  0 siblings, 0 replies; 29+ messages in thread
From: Greg KH @ 2020-04-18 10:29 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: stable

On Fri, Apr 17, 2020 at 02:38:47AM -0700, Sultan Alsawaf wrote:
> On Mon, Apr 13, 2020 at 08:29:30AM -0700, Sultan Alsawaf wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51 upstream.
> > 
> > The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> > corruption WA") is that it disables RC6 while Skylake (and friends) is
> > active, and we do not consider the GPU idle until all outstanding
> > requests have been retired and the engine switched over to the kernel
> > context. If userspace is idle, this task falls onto our background idle
> > worker, which only runs roughly once a second, meaning that userspace has
> > to have been idle for a couple of seconds before we enable RC6 again.
> > Naturally, this causes us to consume considerably more energy than
> > before as powersaving is effectively disabled while a display server
> > (here's looking at you Xorg) is running.
> > 
> > As execlists will get a completion event as each context is completed,
> > we can use this interrupt to queue a retire worker bound to this engine
> > to cleanup idle timelines. We will then immediately notice the idle
> > engine (without userspace intervention or the aid of the background
> > retire worker) and start parking the GPU. Thus during light workloads,
> > we will do much more work to idle the GPU faster...  Hopefully with
> > commensurate power saving!
> > 
> > v2: Watch context completions and only look at those local to the engine
> > when retiring to reduce the amount of excess work we perform.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> > References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
> > [Sultan: for the backport to 5.4, struct_mutex needs to be held while
> >  retiring so that retirement doesn't race with vma destruction.]
> > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
> >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
> >  drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
> >  drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
> >  .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
> >  drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
> >  drivers/gpu/drm/i915/i915_request.h           |  4 +
> >  7 files changed, 101 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 4ce8626b140e..f732a2177cd0 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -600,6 +600,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
> >  	intel_engine_init_hangcheck(engine);
> >  	intel_engine_init_cmd_parser(engine);
> >  	intel_engine_init__pm(engine);
> > +	intel_engine_init_retire(engine);
> >  
> >  	intel_engine_pool_init(&engine->pool);
> >  
> > @@ -807,6 +808,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
> >  
> >  	cleanup_status_page(engine);
> >  
> > +	intel_engine_fini_retire(engine);
> >  	intel_engine_pool_fini(&engine->pool);
> >  	intel_engine_fini_breadcrumbs(engine);
> >  	intel_engine_cleanup_cmd_parser(engine);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index c77c9518c58b..1eb7189f88e1 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -471,6 +471,14 @@ struct intel_engine_cs {
> >  
> >  	struct intel_engine_execlists execlists;
> >  
> > +	/*
> > +	 * Keep track of completed timelines on this engine for early
> > +	 * retirement with the goal of quickly enabling powersaving as
> > +	 * soon as the engine is idle.
> > +	 */
> > +	struct intel_timeline *retire;
> > +	struct work_struct retire_work;
> > +
> >  	/* status_notifier: list of callbacks for context-switch changes */
> >  	struct atomic_notifier_head context_status_notifier;
> >  
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 66f6d1a897f2..a1538c8f7922 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -606,6 +606,14 @@ __execlists_schedule_out(struct i915_request *rq,
> >  {
> >  	struct intel_context * const ce = rq->hw_context;
> >  
> > +	/*
> > +	 * If we have just completed this context, the engine may now be
> > +	 * idle and we want to re-enter powersaving.
> > +	 */
> > +	if (list_is_last(&rq->link, &ce->timeline->requests) &&
> > +	    i915_request_completed(rq))
> > +		intel_engine_add_retire(engine, ce->timeline);
> > +
> >  	intel_engine_context_out(engine);
> >  	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> >  	intel_gt_pm_put(engine->gt);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > index 9cb01d9828f1..63515e3caaf2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > @@ -282,6 +282,7 @@ void intel_timeline_fini(struct intel_timeline *timeline)
> >  {
> >  	GEM_BUG_ON(atomic_read(&timeline->pin_count));
> >  	GEM_BUG_ON(!list_empty(&timeline->requests));
> > +	GEM_BUG_ON(timeline->retire);
> >  
> >  	if (timeline->hwsp_cacheline)
> >  		cacheline_free(timeline->hwsp_cacheline);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > index 2b1baf2fcc8e..adf2d14ef647 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> > @@ -65,6 +65,9 @@ struct intel_timeline {
> >  	 */
> >  	struct i915_active_request last_request;
> >  
> > +	/** A chain of completed timelines ready for early retirement. */
> > +	struct intel_timeline *retire;
> > +
> >  	/**
> >  	 * We track the most recent seqno that we wait on in every context so
> >  	 * that we only have to emit a new await and dependency on a more
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 0d39038898d4..35457fda350c 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -600,6 +600,81 @@ static void retire_requests(struct intel_timeline *tl)
> >  			break;
> >  }
> >  
> > +static void engine_retire(struct work_struct *work)
> > +{
> > +	struct intel_engine_cs *engine =
> > +		container_of(work, typeof(*engine), retire_work);
> > +	struct intel_timeline *tl = xchg(&engine->retire, NULL);
> > +
> > +	do {
> > +		struct intel_timeline *next = xchg(&tl->retire, NULL);
> > +
> > +		/*
> > +		 * Our goal here is to retire _idle_ timelines as soon as
> > +		 * possible (as they are idle, we do not expect userspace
> > +		 * to be cleaning up anytime soon).
> > +		 *
> > +		 * If the timeline is currently locked, either it is being
> > +		 * retired elsewhere or about to be!
> > +		 */
> > +		mutex_lock(&engine->i915->drm.struct_mutex);
> > +		if (mutex_trylock(&tl->mutex)) {
> > +			retire_requests(tl);
> > +			mutex_unlock(&tl->mutex);
> > +		}
> > +		mutex_unlock(&engine->i915->drm.struct_mutex);
> > +		intel_timeline_put(tl);
> > +
> > +		GEM_BUG_ON(!next);
> > +		tl = ptr_mask_bits(next, 1);
> > +	} while (tl);
> > +}
> > +
> > +static bool add_retire(struct intel_engine_cs *engine,
> > +		       struct intel_timeline *tl)
> > +{
> > +	struct intel_timeline *first;
> > +
> > +	/*
> > +	 * We open-code a llist here to include the additional tag [BIT(0)]
> > +	 * so that we know when the timeline is already on a
> > +	 * retirement queue: either this engine or another.
> > +	 *
> > +	 * However, we rely on that a timeline can only be active on a single
> > +	 * engine at any one time and that add_retire() is called before the
> > +	 * engine releases the timeline and transferred to another to retire.
> > +	 */
> > +
> > +	if (READ_ONCE(tl->retire)) /* already queued */
> > +		return false;
> > +
> > +	intel_timeline_get(tl);
> > +	first = READ_ONCE(engine->retire);
> > +	do
> > +		tl->retire = ptr_pack_bits(first, 1, 1);
> > +	while (!try_cmpxchg(&engine->retire, &first, tl));
> > +
> > +	return !first;
> > +}
> > +
> > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > +			     struct intel_timeline *tl)
> > +{
> > +	if (add_retire(engine, tl))
> > +		schedule_work(&engine->retire_work);
> > +}
> > +
> > +void intel_engine_init_retire(struct intel_engine_cs *engine)
> > +{
> > +	INIT_WORK(&engine->retire_work, engine_retire);
> > +}
> > +
> > +void intel_engine_fini_retire(struct intel_engine_cs *engine)
> > +{
> > +	flush_work(&engine->retire_work);
> > +	GEM_BUG_ON(engine->retire);
> > +}
> > +
> >  static noinline struct i915_request *
> >  request_alloc_slow(struct intel_timeline *tl, gfp_t gfp)
> >  {
> > diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> > index 3a3e7bbf19ff..40caa2f3f4a4 100644
> > --- a/drivers/gpu/drm/i915/i915_request.h
> > +++ b/drivers/gpu/drm/i915/i915_request.h
> > @@ -445,5 +445,9 @@ static inline bool i915_request_has_nopreempt(const struct i915_request *rq)
> >  }
> >  
> >  bool i915_retire_requests(struct drm_i915_private *i915);
> > +void intel_engine_init_retire(struct intel_engine_cs *engine);
> > +void intel_engine_add_retire(struct intel_engine_cs *engine,
> > +			     struct intel_timeline *tl);
> > +void intel_engine_fini_retire(struct intel_engine_cs *engine);
> >  
> >  #endif /* I915_REQUEST_H */
> > -- 
> > 2.26.0
> > 
> 
> Greg,
> 
> Could you take another look at this please and let me know if looks good to
> queue up?

I have no idea, I want an ack from the maintainers of this driver to say
it is ok, given these threads with no solid agreement.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3] drm/i915/gt: Schedule request retirement when timeline idles
  2020-04-13 15:29                   ` [PATCH v3] " Sultan Alsawaf
  2020-04-17  9:38                     ` Sultan Alsawaf
@ 2020-04-18 10:30                     ` Greg KH
  1 sibling, 0 replies; 29+ messages in thread
From: Greg KH @ 2020-04-18 10:30 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: stable

On Mon, Apr 13, 2020 at 08:29:30AM -0700, Sultan Alsawaf wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51 upstream.
> 
> The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> corruption WA") is that it disables RC6 while Skylake (and friends) is
> active, and we do not consider the GPU idle until all outstanding
> requests have been retired and the engine switched over to the kernel
> context. If userspace is idle, this task falls onto our background idle
> worker, which only runs roughly once a second, meaning that userspace has
> to have been idle for a couple of seconds before we enable RC6 again.
> Naturally, this causes us to consume considerably more energy than
> before as powersaving is effectively disabled while a display server
> (here's looking at you Xorg) is running.
> 
> As execlists will get a completion event as each context is completed,
> we can use this interrupt to queue a retire worker bound to this engine
> to cleanup idle timelines. We will then immediately notice the idle
> engine (without userspace intervention or the aid of the background
> retire worker) and start parking the GPU. Thus during light workloads,
> we will do much more work to idle the GPU faster...  Hopefully with
> commensurate power saving!
> 
> v2: Watch context completions and only look at those local to the engine
> when retiring to reduce the amount of excess work we perform.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
> [Sultan: for the backport to 5.4, struct_mutex needs to be held while
>  retiring so that retirement doesn't race with vma destruction.]
> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 ++
>  drivers/gpu/drm/i915/gt/intel_lrc.c           |  8 ++
>  drivers/gpu/drm/i915/gt/intel_timeline.c      |  1 +
>  .../gpu/drm/i915/gt/intel_timeline_types.h    |  3 +
>  drivers/gpu/drm/i915/i915_request.c           | 75 +++++++++++++++++++
>  drivers/gpu/drm/i915/i915_request.h           |  4 +
>  7 files changed, 101 insertions(+)

Why are you not cc:ing all of the relevant people on these patches?
That's not ok, I'm just going to drop all of these requests until that
happens, and I get an ack from the developers/maintainers involved.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2020-04-18 10:30 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-30  3:30 [PATCH 0/2] A couple of important fixes for i915 on 5.4 Sultan Alsawaf
2020-03-30  3:30 ` [Intel-gfx] " Sultan Alsawaf
2020-03-30  3:30 ` Sultan Alsawaf
2020-03-30  3:30 ` [PATCH 1/2] drm/i915: Disable Panel Self Refresh (PSR) by default Sultan Alsawaf
2020-03-30  8:51   ` Greg KH
2020-03-30 15:50     ` Sultan Alsawaf
2020-03-30  3:30 ` [PATCH 2/2] drm/i915/gt: Schedule request retirement when timeline idles Sultan Alsawaf
2020-03-30  3:30   ` [Intel-gfx] " Sultan Alsawaf
2020-03-30  3:30   ` Sultan Alsawaf
2020-03-30 15:56   ` Sultan Alsawaf
2020-03-30 15:56     ` [Intel-gfx] " Sultan Alsawaf
2020-03-30 15:56     ` Sultan Alsawaf
2020-03-30 16:27   ` Chris Wilson
2020-03-30 16:35     ` Sultan Alsawaf
2020-03-30 16:45       ` Chris Wilson
2020-04-07 20:28         ` Sultan Alsawaf
2020-04-07 20:43           ` [PATCH v2] " Sultan Alsawaf
2020-04-12  7:48             ` Sultan Alsawaf
2020-04-12  7:56               ` Greg KH
2020-04-13 14:37                 ` Salvatore Bonaccorso
2020-04-13 15:23                 ` Sultan Alsawaf
2020-04-13 15:29                   ` [PATCH v3] " Sultan Alsawaf
2020-04-17  9:38                     ` Sultan Alsawaf
2020-04-18 10:29                       ` Greg KH
2020-04-18 10:30                     ` Greg KH
2020-03-30  8:51 ` [PATCH 0/2] A couple of important fixes for i915 on 5.4 Greg KH
2020-03-30 15:55   ` Sultan Alsawaf
2020-03-30 15:55     ` [Intel-gfx] " Sultan Alsawaf
2020-03-30 15:55     ` Sultan Alsawaf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.