All of lore.kernel.org
 help / color / mirror / Atom feed
* patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1
@ 2018-02-13  7:38 Rodrigo Vivi
  2018-02-13  9:01 ` [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers Chris Wilson
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2018-02-13  7:38 UTC (permalink / raw)
  To: intel-gfx


These patches failed to cherry-pick on drm-intel-fixes

Please provide a backport or advise how to proceed.

fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers")
1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")

Thanks,
Rodrigo.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers
  2018-02-13  7:38 patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Rodrigo Vivi
@ 2018-02-13  9:01 ` Chris Wilson
  2018-02-14  1:00     ` Rodrigo Vivi
  2018-02-13  9:20 ` patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Tvrtko Ursulin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2018-02-13  9:01 UTC (permalink / raw)
  To: intel-gfx
  Cc: rodrigo.vivi, Chris Wilson, Tvrtko Ursulin, Joonas Lahtinen, stable

When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno, or it may match the cleared seqno with the
now zero rq->global_seqno. This confuses the signaler into action and
signaling the fence.

Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.12+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@chris-wilson.co.uk
(cherry picked from commit fd10e2ce9905030d922e179a8047a4d50daffd8e)
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index bd40fea16b4f..f54ddda9fdad 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -594,29 +594,16 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
 	spin_unlock_irq(&b->rb_lock);
 }
 
-static bool signal_valid(const struct drm_i915_gem_request *request)
-{
-	return intel_wait_check_request(&request->signaling.wait, request);
-}
-
 static bool signal_complete(const struct drm_i915_gem_request *request)
 {
 	if (!request)
 		return false;
 
-	/* If another process served as the bottom-half it may have already
-	 * signalled that this wait is already completed.
-	 */
-	if (intel_wait_complete(&request->signaling.wait))
-		return signal_valid(request);
-
-	/* Carefully check if the request is complete, giving time for the
+	/*
+	 * Carefully check if the request is complete, giving time for the
 	 * seqno to be visible or if the GPU hung.
 	 */
-	if (__i915_request_irq_complete(request))
-		return true;
-
-	return false;
+	return __i915_request_irq_complete(request);
 }
 
 static struct drm_i915_gem_request *to_signaler(struct rb_node *rb)
@@ -659,9 +646,13 @@ static int intel_breadcrumbs_signaler(void *arg)
 			request = i915_gem_request_get_rcu(request);
 		rcu_read_unlock();
 		if (signal_complete(request)) {
-			local_bh_disable();
-			dma_fence_signal(&request->fence);
-			local_bh_enable(); /* kick start the tasklets */
+			if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+				      &request->fence.flags)) {
+				local_bh_disable();
+				dma_fence_signal(&request->fence);
+				GEM_BUG_ON(!i915_gem_request_completed(request));
+				local_bh_enable(); /* kick start the tasklets */
+			}
 
 			spin_lock_irq(&b->rb_lock);
 
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1
  2018-02-13  7:38 patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Rodrigo Vivi
  2018-02-13  9:01 ` [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers Chris Wilson
@ 2018-02-13  9:20 ` Tvrtko Ursulin
  2018-02-13  9:38   ` Jani Nikula
  2018-02-13  9:29 ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
  2018-02-13  9:43 ` ✗ Fi.CI.BAT: failure for drm/i915/breadcrumbs: Ignore unsubmitted signalers (rev2) Patchwork
  3 siblings, 1 reply; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:20 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-gfx


On 13/02/18 07:38, Rodrigo Vivi wrote:
> 
> These patches failed to cherry-pick on drm-intel-fixes
> 
> Please provide a backport or advise how to proceed.
> 
> fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers")
> 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")


I'm building at the moment in order to test and then I'll send some 
patches. Plural because, since drm-intel-fixes includes:

commit 4900727d35bb20028f9bd83146ec4bf78afffe30
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jan 11 07:30:31 2018 +0000

     drm/i915/pmu: Reconstruct active state on starting busy-stats


We need to include a chain of two more fixups:

commit 99e48bf98dd036090b480a12c39e8b971731247e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jan 15 09:20:41 2018 +0000

     drm/i915: Lock out execlist tasklet while peeking inside for busy-stats


commit b2f78cda260bc6a1a2d382b1d85a29e69b5b3724
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Mon Feb 5 09:34:48 2018 +0000

     drm/i915/pmu: Fix PMU enable vs execlists tasklet race


And then the referenced fix, plus another fixup:

commit 1fe699e30113ed6f6e853ff44710d256072ea627
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Tue Feb 6 18:33:11 2018 +0000

     drm/i915/pmu: Fix sleep under atomic in RC6 readout

commit 05273c950a3c93c5f96be8807eaf24f2cc9f1c1e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 7 16:04:28 2018 +0000

     drm/i915/pmu: Fix building without CONFIG_PM


I'll send them when I test my rebases.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats
  2018-02-13  7:38 patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Rodrigo Vivi
  2018-02-13  9:01 ` [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers Chris Wilson
  2018-02-13  9:20 ` patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Tvrtko Ursulin
@ 2018-02-13  9:29 ` Tvrtko Ursulin
  2018-02-13  9:29   ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
                     ` (2 more replies)
  2018-02-13  9:43 ` ✗ Fi.CI.BAT: failure for drm/i915/breadcrumbs: Ignore unsubmitted signalers (rev2) Patchwork
  3 siblings, 3 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:29 UTC (permalink / raw)
  To: Intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

In order to prevent a race condition where we may end up overaccounting
the active state and leaving the busy-stats believing the GPU is 100%
busy, lock out the tasklet while we reconstruct the busy state. There is
no direct spinlock guard for the execlists->port[], so we need to
utilise tasklet_disable() as a synchronous barrier to prevent it, the
only writer to execlists->port[], from running at the same time as the
enable.

Fixes: 4900727d35bb ("drm/i915/pmu: Reconstruct active state on starting busy-stats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115092041.13509-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index acc661aa9c0c..fa960cfd2764 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1945,16 +1945,22 @@ intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
  */
 int intel_enable_engine_stats(struct intel_engine_cs *engine)
 {
+	struct intel_engine_execlists *execlists = &engine->execlists;
 	unsigned long flags;
+	int err = 0;
 
 	if (!intel_engine_supports_stats(engine))
 		return -ENODEV;
 
+	tasklet_disable(&execlists->tasklet);
 	spin_lock_irqsave(&engine->stats.lock, flags);
-	if (engine->stats.enabled == ~0)
-		goto busy;
+
+	if (unlikely(engine->stats.enabled == ~0)) {
+		err = -EBUSY;
+		goto unlock;
+	}
+
 	if (engine->stats.enabled++ == 0) {
-		struct intel_engine_execlists *execlists = &engine->execlists;
 		const struct execlist_port *port = execlists->port;
 		unsigned int num_ports = execlists_num_ports(execlists);
 
@@ -1969,14 +1975,12 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 		if (engine->stats.active)
 			engine->stats.start = engine->stats.enabled_at;
 	}
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
 
-	return 0;
-
-busy:
+unlock:
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	tasklet_enable(&execlists->tasklet);
 
-	return -EBUSY;
+	return err;
 }
 
 static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race
  2018-02-13  9:29 ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
@ 2018-02-13  9:29   ` Tvrtko Ursulin
  2018-02-13  9:29   ` [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout Tvrtko Ursulin
  2018-02-13  9:29   ` [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM Tvrtko Ursulin
  2 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Commit 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking
inside for busy-stats") added a tasklet_disable call in busy stats
enabling, but we failed to understand that the PMU enable callback runs
as an hard IRQ (IPI).

Consequence of this is that the PMU enable callback can interrupt the
execlists tasklet, and will then deadlock when it calls
intel_engine_stats_enable->tasklet_disable.

To fix this, I realized it is possible to move the engine stats enablement
and disablement to PMU event init and destroy hooks. This allows for much
simpler implementation since those hooks run in normal context (can
sleep).

v2: Extract engine_event_destroy. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats")
Testcase: igt/perf_pmu/enable-race-*
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205093448.13877-1-tvrtko.ursulin@linux.intel.com
---
 drivers/gpu/drm/i915/i915_pmu.c         | 125 +++++++++++++-------------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  14 ----
 2 files changed, 52 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 55a8a1e29424..337eaa6ede52 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -285,26 +285,41 @@ static u64 count_interrupts(struct drm_i915_private *i915)
 	return sum;
 }
 
-static void i915_pmu_event_destroy(struct perf_event *event)
+static void engine_event_destroy(struct perf_event *event)
 {
-	WARN_ON(event->parent);
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	struct intel_engine_cs *engine;
+
+	engine = intel_engine_lookup_user(i915,
+					  engine_event_class(event),
+					  engine_event_instance(event));
+	if (WARN_ON_ONCE(!engine))
+		return;
+
+	if (engine_event_sample(event) == I915_SAMPLE_BUSY &&
+	    intel_engine_supports_stats(engine))
+		intel_disable_engine_stats(engine);
 }
 
-static int engine_event_init(struct perf_event *event)
+static void i915_pmu_event_destroy(struct perf_event *event)
 {
-	struct drm_i915_private *i915 =
-		container_of(event->pmu, typeof(*i915), pmu.base);
+	WARN_ON(event->parent);
 
-	if (!intel_engine_lookup_user(i915, engine_event_class(event),
-				      engine_event_instance(event)))
-		return -ENODEV;
+	if (is_engine_event(event))
+		engine_event_destroy(event);
+}
 
-	switch (engine_event_sample(event)) {
+static int
+engine_event_status(struct intel_engine_cs *engine,
+		    enum drm_i915_pmu_engine_sample sample)
+{
+	switch (sample) {
 	case I915_SAMPLE_BUSY:
 	case I915_SAMPLE_WAIT:
 		break;
 	case I915_SAMPLE_SEMA:
-		if (INTEL_GEN(i915) < 6)
+		if (INTEL_GEN(engine->i915) < 6)
 			return -ENODEV;
 		break;
 	default:
@@ -314,6 +329,30 @@ static int engine_event_init(struct perf_event *event)
 	return 0;
 }
 
+static int engine_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	struct intel_engine_cs *engine;
+	u8 sample;
+	int ret;
+
+	engine = intel_engine_lookup_user(i915, engine_event_class(event),
+					  engine_event_instance(event));
+	if (!engine)
+		return -ENODEV;
+
+	sample = engine_event_sample(event);
+	ret = engine_event_status(engine, sample);
+	if (ret)
+		return ret;
+
+	if (sample == I915_SAMPLE_BUSY && intel_engine_supports_stats(engine))
+		ret = intel_enable_engine_stats(engine);
+
+	return ret;
+}
+
 static int i915_pmu_event_init(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -387,7 +426,7 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 		if (WARN_ON_ONCE(!engine)) {
 			/* Do nothing */
 		} else if (sample == I915_SAMPLE_BUSY &&
-			   engine->pmu.busy_stats) {
+			   intel_engine_supports_stats(engine)) {
 			val = ktime_to_ns(intel_engine_get_busy_time(engine));
 		} else {
 			val = engine->pmu.sample[sample].cur;
@@ -442,12 +481,6 @@ static void i915_pmu_event_read(struct perf_event *event)
 	local64_add(new - prev, &event->count);
 }
 
-static bool engine_needs_busy_stats(struct intel_engine_cs *engine)
-{
-	return intel_engine_supports_stats(engine) &&
-	       (engine->pmu.enable & BIT(I915_SAMPLE_BUSY));
-}
-
 static void i915_pmu_enable(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -487,21 +520,7 @@ static void i915_pmu_enable(struct perf_event *event)
 
 		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
 		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
-		if (engine->pmu.enable_count[sample]++ == 0) {
-			/*
-			 * Enable engine busy stats tracking if needed or
-			 * alternatively cancel the scheduled disable.
-			 *
-			 * If the delayed disable was pending, cancel it and
-			 * in this case do not enable since it already is.
-			 */
-			if (engine_needs_busy_stats(engine) &&
-			    !engine->pmu.busy_stats) {
-				engine->pmu.busy_stats = true;
-				if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
-					intel_enable_engine_stats(engine);
-			}
-		}
+		engine->pmu.enable_count[sample]++;
 	}
 
 	/*
@@ -514,14 +533,6 @@ static void i915_pmu_enable(struct perf_event *event)
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
 
-static void __disable_busy_stats(struct work_struct *work)
-{
-	struct intel_engine_cs *engine =
-	       container_of(work, typeof(*engine), pmu.disable_busy_stats.work);
-
-	intel_disable_engine_stats(engine);
-}
-
 static void i915_pmu_disable(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -545,26 +556,8 @@ static void i915_pmu_disable(struct perf_event *event)
 		 * Decrement the reference count and clear the enabled
 		 * bitmask when the last listener on an event goes away.
 		 */
-		if (--engine->pmu.enable_count[sample] == 0) {
+		if (--engine->pmu.enable_count[sample] == 0)
 			engine->pmu.enable &= ~BIT(sample);
-			if (!engine_needs_busy_stats(engine) &&
-			    engine->pmu.busy_stats) {
-				engine->pmu.busy_stats = false;
-				/*
-				 * We request a delayed disable to handle the
-				 * rapid on/off cycles on events, which can
-				 * happen when tools like perf stat start, in a
-				 * nicer way.
-				 *
-				 * In addition, this also helps with busy stats
-				 * accuracy with background CPU offline/online
-				 * migration events.
-				 */
-				queue_delayed_work(system_wq,
-						   &engine->pmu.disable_busy_stats,
-						   round_jiffies_up_relative(HZ));
-			}
-		}
 	}
 
 	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
@@ -797,8 +790,6 @@ static void i915_pmu_unregister_cpuhp_state(struct drm_i915_private *i915)
 
 void i915_pmu_register(struct drm_i915_private *i915)
 {
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
 	int ret;
 
 	if (INTEL_GEN(i915) <= 2) {
@@ -820,10 +811,6 @@ void i915_pmu_register(struct drm_i915_private *i915)
 	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	i915->pmu.timer.function = i915_sample;
 
-	for_each_engine(engine, i915, id)
-		INIT_DELAYED_WORK(&engine->pmu.disable_busy_stats,
-				  __disable_busy_stats);
-
 	ret = perf_pmu_register(&i915->pmu.base, "i915", -1);
 	if (ret)
 		goto err;
@@ -843,9 +830,6 @@ void i915_pmu_register(struct drm_i915_private *i915)
 
 void i915_pmu_unregister(struct drm_i915_private *i915)
 {
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
 	if (!i915->pmu.base.event_init)
 		return;
 
@@ -853,11 +837,6 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 
 	hrtimer_cancel(&i915->pmu.timer);
 
-	for_each_engine(engine, i915, id) {
-		GEM_BUG_ON(engine->pmu.busy_stats);
-		flush_delayed_work(&engine->pmu.disable_busy_stats);
-	}
-
 	i915_pmu_unregister_cpuhp_state(i915);
 
 	perf_pmu_unregister(&i915->pmu.base);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c5ff203e42d6..a0e7a6c2a57c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -366,20 +366,6 @@ struct intel_engine_cs {
 		 */
 #define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_SEMA + 1)
 		struct i915_pmu_sample sample[I915_ENGINE_SAMPLE_MAX];
-		/**
-		 * @busy_stats: Has enablement of engine stats tracking been
-		 * 		requested.
-		 */
-		bool busy_stats;
-		/**
-		 * @disable_busy_stats: Work item for busy stats disabling.
-		 *
-		 * Same as with @enable_busy_stats action, with the difference
-		 * that we delay it in case there are rapid enable-disable
-		 * actions, which can happen during tool startup (like perf
-		 * stat).
-		 */
-		struct delayed_work disable_busy_stats;
 	} pmu;
 
 	/*
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout
  2018-02-13  9:29 ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
  2018-02-13  9:29   ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
@ 2018-02-13  9:29   ` Tvrtko Ursulin
  2018-02-13  9:29   ` [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM Tvrtko Ursulin
  2 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:29 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We are not allowed to call intel_runtime_pm_get from the PMU counter read
callback since the former can sleep, and the latter is running under IRQ
context.

To workaround this, we record the last known RC6 and while runtime
suspended estimate its increase by querying the runtime PM core
timestamps.

Downside of this approach is that we can temporarily lose a chunk of RC6
time, from the last PMU read-out to runtime suspend entry, but that will
eventually catch up, once device comes back online and in the presence of
PMU queries.

Also, we have to be careful not to overshoot the RC6 estimate, so once
resumed after a period of approximation, we only update the counter once
it catches up. With the observation that RC6 is increasing while the
device is suspended, this should not pose a problem and can only cause
slight inaccuracies due clock base differences.

v2: Simplify by estimating on top of PM core counters. (Imre)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics")
Testcase: igt/perf_pmu/rc6-runtime-pm
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206183311.17924-1-tvrtko.ursulin@linux.intel.com
---
 drivers/gpu/drm/i915/i915_pmu.c | 93 ++++++++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_pmu.h |  6 +++
 2 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 337eaa6ede52..e13859aaa2a3 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -409,7 +409,81 @@ static int i915_pmu_event_init(struct perf_event *event)
 	return 0;
 }
 
-static u64 __i915_pmu_event_read(struct perf_event *event)
+static u64 get_rc6(struct drm_i915_private *i915, bool locked)
+{
+	unsigned long flags;
+	u64 val;
+
+	if (intel_runtime_pm_get_if_in_use(i915)) {
+		val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
+						   VLV_GT_RENDER_RC6 :
+						   GEN6_GT_GFX_RC6);
+
+		if (HAS_RC6p(i915))
+			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+
+		if (HAS_RC6pp(i915))
+			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+
+		intel_runtime_pm_put(i915);
+
+		/*
+		 * If we are coming back from being runtime suspended we must
+		 * be careful not to report a larger value than returned
+		 * previously.
+		 */
+
+		if (!locked)
+			spin_lock_irqsave(&i915->pmu.lock, flags);
+
+		if (val >= i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) {
+			i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0;
+			i915->pmu.sample[__I915_SAMPLE_RC6].cur = val;
+		} else {
+			val = i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur;
+		}
+
+		if (!locked)
+			spin_unlock_irqrestore(&i915->pmu.lock, flags);
+	} else {
+		struct pci_dev *pdev = i915->drm.pdev;
+		struct device *kdev = &pdev->dev;
+		unsigned long flags2;
+
+		/*
+		 * We are runtime suspended.
+		 *
+		 * Report the delta from when the device was suspended to now,
+		 * on top of the last known real value, as the approximated RC6
+		 * counter value.
+		 */
+		if (!locked)
+			spin_lock_irqsave(&i915->pmu.lock, flags);
+
+		spin_lock_irqsave(&kdev->power.lock, flags2);
+
+		if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
+			i915->pmu.suspended_jiffies_last =
+						kdev->power.suspended_jiffies;
+
+		val = kdev->power.suspended_jiffies -
+		      i915->pmu.suspended_jiffies_last;
+		val += jiffies - kdev->power.accounting_timestamp;
+
+		spin_unlock_irqrestore(&kdev->power.lock, flags2);
+
+		val = jiffies_to_nsecs(val);
+		val += i915->pmu.sample[__I915_SAMPLE_RC6].cur;
+		i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val;
+
+		if (!locked)
+			spin_unlock_irqrestore(&i915->pmu.lock, flags);
+	}
+
+	return val;
+}
+
+static u64 __i915_pmu_event_read(struct perf_event *event, bool locked)
 {
 	struct drm_i915_private *i915 =
 		container_of(event->pmu, typeof(*i915), pmu.base);
@@ -447,18 +521,7 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 			val = count_interrupts(i915);
 			break;
 		case I915_PMU_RC6_RESIDENCY:
-			intel_runtime_pm_get(i915);
-			val = intel_rc6_residency_ns(i915,
-						     IS_VALLEYVIEW(i915) ?
-						     VLV_GT_RENDER_RC6 :
-						     GEN6_GT_GFX_RC6);
-			if (HAS_RC6p(i915))
-				val += intel_rc6_residency_ns(i915,
-							      GEN6_GT_GFX_RC6p);
-			if (HAS_RC6pp(i915))
-				val += intel_rc6_residency_ns(i915,
-							      GEN6_GT_GFX_RC6pp);
-			intel_runtime_pm_put(i915);
+			val = get_rc6(i915, locked);
 			break;
 		}
 	}
@@ -473,7 +536,7 @@ static void i915_pmu_event_read(struct perf_event *event)
 
 again:
 	prev = local64_read(&hwc->prev_count);
-	new = __i915_pmu_event_read(event);
+	new = __i915_pmu_event_read(event, false);
 
 	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
 		goto again;
@@ -528,7 +591,7 @@ static void i915_pmu_enable(struct perf_event *event)
 	 * for all listeners. Even when the event was already enabled and has
 	 * an existing non-zero value.
 	 */
-	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event, true));
 
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index 40c154d13565..bb62df15afa4 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -27,6 +27,8 @@
 enum {
 	__I915_SAMPLE_FREQ_ACT = 0,
 	__I915_SAMPLE_FREQ_REQ,
+	__I915_SAMPLE_RC6,
+	__I915_SAMPLE_RC6_ESTIMATED,
 	__I915_NUM_PMU_SAMPLERS
 };
 
@@ -94,6 +96,10 @@ struct i915_pmu {
 	 * struct intel_engine_cs.
 	 */
 	struct i915_pmu_sample sample[__I915_NUM_PMU_SAMPLERS];
+	/**
+	 * @suspended_jiffies_last: Cached suspend time from PM core.
+	 */
+	unsigned long suspended_jiffies_last;
 };
 
 #ifdef CONFIG_PERF_EVENTS
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM
  2018-02-13  9:29 ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
  2018-02-13  9:29   ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
  2018-02-13  9:29   ` [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout Tvrtko Ursulin
@ 2018-02-13  9:29   ` Tvrtko Ursulin
  2 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:29 UTC (permalink / raw)
  To: Intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

As we peek inside struct device to query members guarded by CONFIG_PM,
so must be the code.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207160428.17015-1-chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/i915_pmu.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index e13859aaa2a3..0e9b98c32b62 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -409,22 +409,32 @@ static int i915_pmu_event_init(struct perf_event *event)
 	return 0;
 }
 
-static u64 get_rc6(struct drm_i915_private *i915, bool locked)
+static u64 __get_rc6(struct drm_i915_private *i915)
 {
-	unsigned long flags;
 	u64 val;
 
-	if (intel_runtime_pm_get_if_in_use(i915)) {
-		val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
-						   VLV_GT_RENDER_RC6 :
-						   GEN6_GT_GFX_RC6);
+	val = intel_rc6_residency_ns(i915,
+				     IS_VALLEYVIEW(i915) ?
+				     VLV_GT_RENDER_RC6 :
+				     GEN6_GT_GFX_RC6);
 
-		if (HAS_RC6p(i915))
-			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+	if (HAS_RC6p(i915))
+		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+
+	if (HAS_RC6pp(i915))
+		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+
+	return val;
+}
 
-		if (HAS_RC6pp(i915))
-			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+static u64 get_rc6(struct drm_i915_private *i915, bool locked)
+{
+#if IS_ENABLED(CONFIG_PM)
+	unsigned long flags;
+	u64 val;
 
+	if (intel_runtime_pm_get_if_in_use(i915)) {
+		val = __get_rc6(i915);
 		intel_runtime_pm_put(i915);
 
 		/*
@@ -481,6 +491,9 @@ static u64 get_rc6(struct drm_i915_private *i915, bool locked)
 	}
 
 	return val;
+#else
+	return __get_rc6(i915);
+#endif
 }
 
 static u64 __i915_pmu_event_read(struct perf_event *event, bool locked)
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1
  2018-02-13  9:20 ` patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Tvrtko Ursulin
@ 2018-02-13  9:38   ` Jani Nikula
  2018-02-13  9:57     ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
  0 siblings, 1 reply; 16+ messages in thread
From: Jani Nikula @ 2018-02-13  9:38 UTC (permalink / raw)
  To: Tvrtko Ursulin, Rodrigo Vivi, intel-gfx

On Tue, 13 Feb 2018, Tvrtko Ursulin <tursulin@ursulin.net> wrote:
> On 13/02/18 07:38, Rodrigo Vivi wrote:
>> 
>> These patches failed to cherry-pick on drm-intel-fixes
>> 
>> Please provide a backport or advise how to proceed.
>> 
>> fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers")
>> 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
>
>
> I'm building at the moment in order to test and then I'll send some 
> patches. Plural because, since drm-intel-fixes includes:
>
> commit 4900727d35bb20028f9bd83146ec4bf78afffe30
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Thu Jan 11 07:30:31 2018 +0000
>
>      drm/i915/pmu: Reconstruct active state on starting busy-stats
>
>
> We need to include a chain of two more fixups:
>
> commit 99e48bf98dd036090b480a12c39e8b971731247e
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Mon Jan 15 09:20:41 2018 +0000
>
>      drm/i915: Lock out execlist tasklet while peeking inside for busy-stats
>
>
> commit b2f78cda260bc6a1a2d382b1d85a29e69b5b3724
> Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Date:   Mon Feb 5 09:34:48 2018 +0000
>
>      drm/i915/pmu: Fix PMU enable vs execlists tasklet race
>
>
> And then the referenced fix, plus another fixup:
>
> commit 1fe699e30113ed6f6e853ff44710d256072ea627
> Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Date:   Tue Feb 6 18:33:11 2018 +0000
>
>      drm/i915/pmu: Fix sleep under atomic in RC6 readout
>
> commit 05273c950a3c93c5f96be8807eaf24f2cc9f1c1e
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Wed Feb 7 16:04:28 2018 +0000
>
>      drm/i915/pmu: Fix building without CONFIG_PM
>
>
> I'll send them when I test my rebases.

Please add the original dinq commit references in the format

(cherry picked from commit fd10e2ce9905030d922e179a8047a4d50daffd8e)

to the backports.

Thanks,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915/breadcrumbs: Ignore unsubmitted signalers (rev2)
  2018-02-13  7:38 patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Rodrigo Vivi
                   ` (2 preceding siblings ...)
  2018-02-13  9:29 ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
@ 2018-02-13  9:43 ` Patchwork
  3 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2018-02-13  9:43 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/breadcrumbs: Ignore unsubmitted signalers (rev2)
URL   : https://patchwork.freedesktop.org/series/37724/
State : failure

== Summary ==

Applying: drm/i915/breadcrumbs: Ignore unsubmitted signalers
error: Failed to merge in the changes.
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/intel_breadcrumbs.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_breadcrumbs.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_breadcrumbs.c
Patch failed at 0001 drm/i915/breadcrumbs: Ignore unsubmitted signalers
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats
  2018-02-13  9:38   ` Jani Nikula
@ 2018-02-13  9:57     ` Tvrtko Ursulin
  2018-02-13  9:57       ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
                         ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:57 UTC (permalink / raw)
  To: Intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

In order to prevent a race condition where we may end up overaccounting
the active state and leaving the busy-stats believing the GPU is 100%
busy, lock out the tasklet while we reconstruct the busy state. There is
no direct spinlock guard for the execlists->port[], so we need to
utilise tasklet_disable() as a synchronous barrier to prevent it, the
only writer to execlists->port[], from running at the same time as the
enable.

Fixes: 4900727d35bb ("drm/i915/pmu: Reconstruct active state on starting busy-stats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115092041.13509-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit 99e48bf98dd036090b480a12c39e8b971731247e)
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index acc661aa9c0c..fa960cfd2764 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1945,16 +1945,22 @@ intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
  */
 int intel_enable_engine_stats(struct intel_engine_cs *engine)
 {
+	struct intel_engine_execlists *execlists = &engine->execlists;
 	unsigned long flags;
+	int err = 0;
 
 	if (!intel_engine_supports_stats(engine))
 		return -ENODEV;
 
+	tasklet_disable(&execlists->tasklet);
 	spin_lock_irqsave(&engine->stats.lock, flags);
-	if (engine->stats.enabled == ~0)
-		goto busy;
+
+	if (unlikely(engine->stats.enabled == ~0)) {
+		err = -EBUSY;
+		goto unlock;
+	}
+
 	if (engine->stats.enabled++ == 0) {
-		struct intel_engine_execlists *execlists = &engine->execlists;
 		const struct execlist_port *port = execlists->port;
 		unsigned int num_ports = execlists_num_ports(execlists);
 
@@ -1969,14 +1975,12 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 		if (engine->stats.active)
 			engine->stats.start = engine->stats.enabled_at;
 	}
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
 
-	return 0;
-
-busy:
+unlock:
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	tasklet_enable(&execlists->tasklet);
 
-	return -EBUSY;
+	return err;
 }
 
 static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race
  2018-02-13  9:57     ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
@ 2018-02-13  9:57       ` Tvrtko Ursulin
  2018-02-13  9:57       ` [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout Tvrtko Ursulin
  2018-02-13  9:57       ` [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM Tvrtko Ursulin
  2 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:57 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Commit 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking
inside for busy-stats") added a tasklet_disable call in busy stats
enabling, but we failed to understand that the PMU enable callback runs
as an hard IRQ (IPI).

Consequence of this is that the PMU enable callback can interrupt the
execlists tasklet, and will then deadlock when it calls
intel_engine_stats_enable->tasklet_disable.

To fix this, I realized it is possible to move the engine stats enablement
and disablement to PMU event init and destroy hooks. This allows for much
simpler implementation since those hooks run in normal context (can
sleep).

v2: Extract engine_event_destroy. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats")
Testcase: igt/perf_pmu/enable-race-*
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205093448.13877-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit b2f78cda260bc6a1a2d382b1d85a29e69b5b3724)
---
 drivers/gpu/drm/i915/i915_pmu.c         | 125 +++++++++++++-------------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  14 ----
 2 files changed, 52 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 55a8a1e29424..337eaa6ede52 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -285,26 +285,41 @@ static u64 count_interrupts(struct drm_i915_private *i915)
 	return sum;
 }
 
-static void i915_pmu_event_destroy(struct perf_event *event)
+static void engine_event_destroy(struct perf_event *event)
 {
-	WARN_ON(event->parent);
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	struct intel_engine_cs *engine;
+
+	engine = intel_engine_lookup_user(i915,
+					  engine_event_class(event),
+					  engine_event_instance(event));
+	if (WARN_ON_ONCE(!engine))
+		return;
+
+	if (engine_event_sample(event) == I915_SAMPLE_BUSY &&
+	    intel_engine_supports_stats(engine))
+		intel_disable_engine_stats(engine);
 }
 
-static int engine_event_init(struct perf_event *event)
+static void i915_pmu_event_destroy(struct perf_event *event)
 {
-	struct drm_i915_private *i915 =
-		container_of(event->pmu, typeof(*i915), pmu.base);
+	WARN_ON(event->parent);
 
-	if (!intel_engine_lookup_user(i915, engine_event_class(event),
-				      engine_event_instance(event)))
-		return -ENODEV;
+	if (is_engine_event(event))
+		engine_event_destroy(event);
+}
 
-	switch (engine_event_sample(event)) {
+static int
+engine_event_status(struct intel_engine_cs *engine,
+		    enum drm_i915_pmu_engine_sample sample)
+{
+	switch (sample) {
 	case I915_SAMPLE_BUSY:
 	case I915_SAMPLE_WAIT:
 		break;
 	case I915_SAMPLE_SEMA:
-		if (INTEL_GEN(i915) < 6)
+		if (INTEL_GEN(engine->i915) < 6)
 			return -ENODEV;
 		break;
 	default:
@@ -314,6 +329,30 @@ static int engine_event_init(struct perf_event *event)
 	return 0;
 }
 
+static int engine_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	struct intel_engine_cs *engine;
+	u8 sample;
+	int ret;
+
+	engine = intel_engine_lookup_user(i915, engine_event_class(event),
+					  engine_event_instance(event));
+	if (!engine)
+		return -ENODEV;
+
+	sample = engine_event_sample(event);
+	ret = engine_event_status(engine, sample);
+	if (ret)
+		return ret;
+
+	if (sample == I915_SAMPLE_BUSY && intel_engine_supports_stats(engine))
+		ret = intel_enable_engine_stats(engine);
+
+	return ret;
+}
+
 static int i915_pmu_event_init(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -387,7 +426,7 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 		if (WARN_ON_ONCE(!engine)) {
 			/* Do nothing */
 		} else if (sample == I915_SAMPLE_BUSY &&
-			   engine->pmu.busy_stats) {
+			   intel_engine_supports_stats(engine)) {
 			val = ktime_to_ns(intel_engine_get_busy_time(engine));
 		} else {
 			val = engine->pmu.sample[sample].cur;
@@ -442,12 +481,6 @@ static void i915_pmu_event_read(struct perf_event *event)
 	local64_add(new - prev, &event->count);
 }
 
-static bool engine_needs_busy_stats(struct intel_engine_cs *engine)
-{
-	return intel_engine_supports_stats(engine) &&
-	       (engine->pmu.enable & BIT(I915_SAMPLE_BUSY));
-}
-
 static void i915_pmu_enable(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -487,21 +520,7 @@ static void i915_pmu_enable(struct perf_event *event)
 
 		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
 		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
-		if (engine->pmu.enable_count[sample]++ == 0) {
-			/*
-			 * Enable engine busy stats tracking if needed or
-			 * alternatively cancel the scheduled disable.
-			 *
-			 * If the delayed disable was pending, cancel it and
-			 * in this case do not enable since it already is.
-			 */
-			if (engine_needs_busy_stats(engine) &&
-			    !engine->pmu.busy_stats) {
-				engine->pmu.busy_stats = true;
-				if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
-					intel_enable_engine_stats(engine);
-			}
-		}
+		engine->pmu.enable_count[sample]++;
 	}
 
 	/*
@@ -514,14 +533,6 @@ static void i915_pmu_enable(struct perf_event *event)
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
 
-static void __disable_busy_stats(struct work_struct *work)
-{
-	struct intel_engine_cs *engine =
-	       container_of(work, typeof(*engine), pmu.disable_busy_stats.work);
-
-	intel_disable_engine_stats(engine);
-}
-
 static void i915_pmu_disable(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -545,26 +556,8 @@ static void i915_pmu_disable(struct perf_event *event)
 		 * Decrement the reference count and clear the enabled
 		 * bitmask when the last listener on an event goes away.
 		 */
-		if (--engine->pmu.enable_count[sample] == 0) {
+		if (--engine->pmu.enable_count[sample] == 0)
 			engine->pmu.enable &= ~BIT(sample);
-			if (!engine_needs_busy_stats(engine) &&
-			    engine->pmu.busy_stats) {
-				engine->pmu.busy_stats = false;
-				/*
-				 * We request a delayed disable to handle the
-				 * rapid on/off cycles on events, which can
-				 * happen when tools like perf stat start, in a
-				 * nicer way.
-				 *
-				 * In addition, this also helps with busy stats
-				 * accuracy with background CPU offline/online
-				 * migration events.
-				 */
-				queue_delayed_work(system_wq,
-						   &engine->pmu.disable_busy_stats,
-						   round_jiffies_up_relative(HZ));
-			}
-		}
 	}
 
 	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
@@ -797,8 +790,6 @@ static void i915_pmu_unregister_cpuhp_state(struct drm_i915_private *i915)
 
 void i915_pmu_register(struct drm_i915_private *i915)
 {
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
 	int ret;
 
 	if (INTEL_GEN(i915) <= 2) {
@@ -820,10 +811,6 @@ void i915_pmu_register(struct drm_i915_private *i915)
 	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	i915->pmu.timer.function = i915_sample;
 
-	for_each_engine(engine, i915, id)
-		INIT_DELAYED_WORK(&engine->pmu.disable_busy_stats,
-				  __disable_busy_stats);
-
 	ret = perf_pmu_register(&i915->pmu.base, "i915", -1);
 	if (ret)
 		goto err;
@@ -843,9 +830,6 @@ void i915_pmu_register(struct drm_i915_private *i915)
 
 void i915_pmu_unregister(struct drm_i915_private *i915)
 {
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
 	if (!i915->pmu.base.event_init)
 		return;
 
@@ -853,11 +837,6 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 
 	hrtimer_cancel(&i915->pmu.timer);
 
-	for_each_engine(engine, i915, id) {
-		GEM_BUG_ON(engine->pmu.busy_stats);
-		flush_delayed_work(&engine->pmu.disable_busy_stats);
-	}
-
 	i915_pmu_unregister_cpuhp_state(i915);
 
 	perf_pmu_unregister(&i915->pmu.base);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c5ff203e42d6..a0e7a6c2a57c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -366,20 +366,6 @@ struct intel_engine_cs {
 		 */
 #define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_SEMA + 1)
 		struct i915_pmu_sample sample[I915_ENGINE_SAMPLE_MAX];
-		/**
-		 * @busy_stats: Has enablement of engine stats tracking been
-		 * 		requested.
-		 */
-		bool busy_stats;
-		/**
-		 * @disable_busy_stats: Work item for busy stats disabling.
-		 *
-		 * Same as with @enable_busy_stats action, with the difference
-		 * that we delay it in case there are rapid enable-disable
-		 * actions, which can happen during tool startup (like perf
-		 * stat).
-		 */
-		struct delayed_work disable_busy_stats;
 	} pmu;
 
 	/*
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout
  2018-02-13  9:57     ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
  2018-02-13  9:57       ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
@ 2018-02-13  9:57       ` Tvrtko Ursulin
  2018-02-13  9:57       ` [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM Tvrtko Ursulin
  2 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:57 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We are not allowed to call intel_runtime_pm_get from the PMU counter read
callback since the former can sleep, and the latter is running under IRQ
context.

To workaround this, we record the last known RC6 and while runtime
suspended estimate its increase by querying the runtime PM core
timestamps.

Downside of this approach is that we can temporarily lose a chunk of RC6
time, from the last PMU read-out to runtime suspend entry, but that will
eventually catch up, once device comes back online and in the presence of
PMU queries.

Also, we have to be careful not to overshoot the RC6 estimate, so once
resumed after a period of approximation, we only update the counter once
it catches up. With the observation that RC6 is increasing while the
device is suspended, this should not pose a problem and can only cause
slight inaccuracies due clock base differences.

v2: Simplify by estimating on top of PM core counters. (Imre)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics")
Testcase: igt/perf_pmu/rc6-runtime-pm
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206183311.17924-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 1fe699e30113ed6f6e853ff44710d256072ea627)
---
 drivers/gpu/drm/i915/i915_pmu.c | 93 ++++++++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_pmu.h |  6 +++
 2 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 337eaa6ede52..e13859aaa2a3 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -409,7 +409,81 @@ static int i915_pmu_event_init(struct perf_event *event)
 	return 0;
 }
 
-static u64 __i915_pmu_event_read(struct perf_event *event)
+static u64 get_rc6(struct drm_i915_private *i915, bool locked)
+{
+	unsigned long flags;
+	u64 val;
+
+	if (intel_runtime_pm_get_if_in_use(i915)) {
+		val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
+						   VLV_GT_RENDER_RC6 :
+						   GEN6_GT_GFX_RC6);
+
+		if (HAS_RC6p(i915))
+			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+
+		if (HAS_RC6pp(i915))
+			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+
+		intel_runtime_pm_put(i915);
+
+		/*
+		 * If we are coming back from being runtime suspended we must
+		 * be careful not to report a larger value than returned
+		 * previously.
+		 */
+
+		if (!locked)
+			spin_lock_irqsave(&i915->pmu.lock, flags);
+
+		if (val >= i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) {
+			i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0;
+			i915->pmu.sample[__I915_SAMPLE_RC6].cur = val;
+		} else {
+			val = i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur;
+		}
+
+		if (!locked)
+			spin_unlock_irqrestore(&i915->pmu.lock, flags);
+	} else {
+		struct pci_dev *pdev = i915->drm.pdev;
+		struct device *kdev = &pdev->dev;
+		unsigned long flags2;
+
+		/*
+		 * We are runtime suspended.
+		 *
+		 * Report the delta from when the device was suspended to now,
+		 * on top of the last known real value, as the approximated RC6
+		 * counter value.
+		 */
+		if (!locked)
+			spin_lock_irqsave(&i915->pmu.lock, flags);
+
+		spin_lock_irqsave(&kdev->power.lock, flags2);
+
+		if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
+			i915->pmu.suspended_jiffies_last =
+						kdev->power.suspended_jiffies;
+
+		val = kdev->power.suspended_jiffies -
+		      i915->pmu.suspended_jiffies_last;
+		val += jiffies - kdev->power.accounting_timestamp;
+
+		spin_unlock_irqrestore(&kdev->power.lock, flags2);
+
+		val = jiffies_to_nsecs(val);
+		val += i915->pmu.sample[__I915_SAMPLE_RC6].cur;
+		i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val;
+
+		if (!locked)
+			spin_unlock_irqrestore(&i915->pmu.lock, flags);
+	}
+
+	return val;
+}
+
+static u64 __i915_pmu_event_read(struct perf_event *event, bool locked)
 {
 	struct drm_i915_private *i915 =
 		container_of(event->pmu, typeof(*i915), pmu.base);
@@ -447,18 +521,7 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 			val = count_interrupts(i915);
 			break;
 		case I915_PMU_RC6_RESIDENCY:
-			intel_runtime_pm_get(i915);
-			val = intel_rc6_residency_ns(i915,
-						     IS_VALLEYVIEW(i915) ?
-						     VLV_GT_RENDER_RC6 :
-						     GEN6_GT_GFX_RC6);
-			if (HAS_RC6p(i915))
-				val += intel_rc6_residency_ns(i915,
-							      GEN6_GT_GFX_RC6p);
-			if (HAS_RC6pp(i915))
-				val += intel_rc6_residency_ns(i915,
-							      GEN6_GT_GFX_RC6pp);
-			intel_runtime_pm_put(i915);
+			val = get_rc6(i915, locked);
 			break;
 		}
 	}
@@ -473,7 +536,7 @@ static void i915_pmu_event_read(struct perf_event *event)
 
 again:
 	prev = local64_read(&hwc->prev_count);
-	new = __i915_pmu_event_read(event);
+	new = __i915_pmu_event_read(event, false);
 
 	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
 		goto again;
@@ -528,7 +591,7 @@ static void i915_pmu_enable(struct perf_event *event)
 	 * for all listeners. Even when the event was already enabled and has
 	 * an existing non-zero value.
 	 */
-	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event, true));
 
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h
index 40c154d13565..bb62df15afa4 100644
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -27,6 +27,8 @@
 enum {
 	__I915_SAMPLE_FREQ_ACT = 0,
 	__I915_SAMPLE_FREQ_REQ,
+	__I915_SAMPLE_RC6,
+	__I915_SAMPLE_RC6_ESTIMATED,
 	__I915_NUM_PMU_SAMPLERS
 };
 
@@ -94,6 +96,10 @@ struct i915_pmu {
 	 * struct intel_engine_cs.
 	 */
 	struct i915_pmu_sample sample[__I915_NUM_PMU_SAMPLERS];
+	/**
+	 * @suspended_jiffies_last: Cached suspend time from PM core.
+	 */
+	unsigned long suspended_jiffies_last;
 };
 
 #ifdef CONFIG_PERF_EVENTS
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM
  2018-02-13  9:57     ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
  2018-02-13  9:57       ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
  2018-02-13  9:57       ` [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout Tvrtko Ursulin
@ 2018-02-13  9:57       ` Tvrtko Ursulin
  2018-02-14  0:58         ` Rodrigo Vivi
  2 siblings, 1 reply; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-02-13  9:57 UTC (permalink / raw)
  To: Intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

As we peek inside struct device to query members guarded by CONFIG_PM,
so must be the code.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207160428.17015-1-chris@chris-wilson.co.uk
(cherry picked from commit 05273c950a3c93c5f96be8807eaf24f2cc9f1c1e)
---
 drivers/gpu/drm/i915/i915_pmu.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index e13859aaa2a3..0e9b98c32b62 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -409,22 +409,32 @@ static int i915_pmu_event_init(struct perf_event *event)
 	return 0;
 }
 
-static u64 get_rc6(struct drm_i915_private *i915, bool locked)
+static u64 __get_rc6(struct drm_i915_private *i915)
 {
-	unsigned long flags;
 	u64 val;
 
-	if (intel_runtime_pm_get_if_in_use(i915)) {
-		val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
-						   VLV_GT_RENDER_RC6 :
-						   GEN6_GT_GFX_RC6);
+	val = intel_rc6_residency_ns(i915,
+				     IS_VALLEYVIEW(i915) ?
+				     VLV_GT_RENDER_RC6 :
+				     GEN6_GT_GFX_RC6);
 
-		if (HAS_RC6p(i915))
-			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+	if (HAS_RC6p(i915))
+		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+
+	if (HAS_RC6pp(i915))
+		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+
+	return val;
+}
 
-		if (HAS_RC6pp(i915))
-			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+static u64 get_rc6(struct drm_i915_private *i915, bool locked)
+{
+#if IS_ENABLED(CONFIG_PM)
+	unsigned long flags;
+	u64 val;
 
+	if (intel_runtime_pm_get_if_in_use(i915)) {
+		val = __get_rc6(i915);
 		intel_runtime_pm_put(i915);
 
 		/*
@@ -481,6 +491,9 @@ static u64 get_rc6(struct drm_i915_private *i915, bool locked)
 	}
 
 	return val;
+#else
+	return __get_rc6(i915);
+#endif
 }
 
 static u64 __i915_pmu_event_read(struct perf_event *event, bool locked)
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM
  2018-02-13  9:57       ` [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM Tvrtko Ursulin
@ 2018-02-14  0:58         ` Rodrigo Vivi
  0 siblings, 0 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2018-02-14  0:58 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx


On Tue, Feb 13, 2018 at 09:57:47AM +0000, Tvrtko Ursulin wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> As we peek inside struct device to query members guarded by CONFIG_PM,
> so must be the code.
> 
> Reported-by: kbuild test robot <fengguang.wu@intel.com>
> Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20180207160428.17015-1-chris@chris-wilson.co.uk
> (cherry picked from commit 05273c950a3c93c5f96be8807eaf24f2cc9f1c1e)

All 4 patches applied to fixes.
Thanks for backporting and for adding cherry-picked message.

> ---
>  drivers/gpu/drm/i915/i915_pmu.c | 33 +++++++++++++++++++++++----------
>  1 file changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index e13859aaa2a3..0e9b98c32b62 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -409,22 +409,32 @@ static int i915_pmu_event_init(struct perf_event *event)
>  	return 0;
>  }
>  
> -static u64 get_rc6(struct drm_i915_private *i915, bool locked)
> +static u64 __get_rc6(struct drm_i915_private *i915)
>  {
> -	unsigned long flags;
>  	u64 val;
>  
> -	if (intel_runtime_pm_get_if_in_use(i915)) {
> -		val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
> -						   VLV_GT_RENDER_RC6 :
> -						   GEN6_GT_GFX_RC6);
> +	val = intel_rc6_residency_ns(i915,
> +				     IS_VALLEYVIEW(i915) ?
> +				     VLV_GT_RENDER_RC6 :
> +				     GEN6_GT_GFX_RC6);
>  
> -		if (HAS_RC6p(i915))
> -			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> +	if (HAS_RC6p(i915))
> +		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> +
> +	if (HAS_RC6pp(i915))
> +		val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> +
> +	return val;
> +}
>  
> -		if (HAS_RC6pp(i915))
> -			val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> +static u64 get_rc6(struct drm_i915_private *i915, bool locked)
> +{
> +#if IS_ENABLED(CONFIG_PM)
> +	unsigned long flags;
> +	u64 val;
>  
> +	if (intel_runtime_pm_get_if_in_use(i915)) {
> +		val = __get_rc6(i915);
>  		intel_runtime_pm_put(i915);
>  
>  		/*
> @@ -481,6 +491,9 @@ static u64 get_rc6(struct drm_i915_private *i915, bool locked)
>  	}
>  
>  	return val;
> +#else
> +	return __get_rc6(i915);
> +#endif
>  }
>  
>  static u64 __i915_pmu_event_read(struct perf_event *event, bool locked)
> -- 
> 2.14.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers
  2018-02-13  9:01 ` [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers Chris Wilson
@ 2018-02-14  1:00     ` Rodrigo Vivi
  0 siblings, 0 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2018-02-14  1:00 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Tvrtko Ursulin, Joonas Lahtinen, stable

On Tue, Feb 13, 2018 at 09:01:54AM +0000, Chris Wilson wrote:
> When a request is preempted, it is unsubmitted from the HW queue and
> removed from the active list of breadcrumbs. In the process, this
> however triggers the signaler and it may see the clear rbtree with the
> old, and still valid, seqno, or it may match the cleared seqno with the
> now zero rq->global_seqno. This confuses the signaler into action and
> signaling the fence.
> 
> Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v4.12+
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@chris-wilson.co.uk
> (cherry picked from commit fd10e2ce9905030d922e179a8047a4d50daffd8e)

applied to fixes. Thanks

> ---
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 29 ++++++++++-------------------
>  1 file changed, 10 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index bd40fea16b4f..f54ddda9fdad 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -594,29 +594,16 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
>  	spin_unlock_irq(&b->rb_lock);
>  }
>  
> -static bool signal_valid(const struct drm_i915_gem_request *request)
> -{
> -	return intel_wait_check_request(&request->signaling.wait, request);
> -}
> -
>  static bool signal_complete(const struct drm_i915_gem_request *request)
>  {
>  	if (!request)
>  		return false;
>  
> -	/* If another process served as the bottom-half it may have already
> -	 * signalled that this wait is already completed.
> -	 */
> -	if (intel_wait_complete(&request->signaling.wait))
> -		return signal_valid(request);
> -
> -	/* Carefully check if the request is complete, giving time for the
> +	/*
> +	 * Carefully check if the request is complete, giving time for the
>  	 * seqno to be visible or if the GPU hung.
>  	 */
> -	if (__i915_request_irq_complete(request))
> -		return true;
> -
> -	return false;
> +	return __i915_request_irq_complete(request);
>  }
>  
>  static struct drm_i915_gem_request *to_signaler(struct rb_node *rb)
> @@ -659,9 +646,13 @@ static int intel_breadcrumbs_signaler(void *arg)
>  			request = i915_gem_request_get_rcu(request);
>  		rcu_read_unlock();
>  		if (signal_complete(request)) {
> -			local_bh_disable();
> -			dma_fence_signal(&request->fence);
> -			local_bh_enable(); /* kick start the tasklets */
> +			if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> +				      &request->fence.flags)) {
> +				local_bh_disable();
> +				dma_fence_signal(&request->fence);
> +				GEM_BUG_ON(!i915_gem_request_completed(request));
> +				local_bh_enable(); /* kick start the tasklets */
> +			}
>  
>  			spin_lock_irq(&b->rb_lock);
>  
> -- 
> 2.16.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers
@ 2018-02-14  1:00     ` Rodrigo Vivi
  0 siblings, 0 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2018-02-14  1:00 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On Tue, Feb 13, 2018 at 09:01:54AM +0000, Chris Wilson wrote:
> When a request is preempted, it is unsubmitted from the HW queue and
> removed from the active list of breadcrumbs. In the process, this
> however triggers the signaler and it may see the clear rbtree with the
> old, and still valid, seqno, or it may match the cleared seqno with the
> now zero rq->global_seqno. This confuses the signaler into action and
> signaling the fence.
> 
> Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v4.12+
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@chris-wilson.co.uk
> (cherry picked from commit fd10e2ce9905030d922e179a8047a4d50daffd8e)

applied to fixes. Thanks

> ---
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 29 ++++++++++-------------------
>  1 file changed, 10 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index bd40fea16b4f..f54ddda9fdad 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -594,29 +594,16 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
>  	spin_unlock_irq(&b->rb_lock);
>  }
>  
> -static bool signal_valid(const struct drm_i915_gem_request *request)
> -{
> -	return intel_wait_check_request(&request->signaling.wait, request);
> -}
> -
>  static bool signal_complete(const struct drm_i915_gem_request *request)
>  {
>  	if (!request)
>  		return false;
>  
> -	/* If another process served as the bottom-half it may have already
> -	 * signalled that this wait is already completed.
> -	 */
> -	if (intel_wait_complete(&request->signaling.wait))
> -		return signal_valid(request);
> -
> -	/* Carefully check if the request is complete, giving time for the
> +	/*
> +	 * Carefully check if the request is complete, giving time for the
>  	 * seqno to be visible or if the GPU hung.
>  	 */
> -	if (__i915_request_irq_complete(request))
> -		return true;
> -
> -	return false;
> +	return __i915_request_irq_complete(request);
>  }
>  
>  static struct drm_i915_gem_request *to_signaler(struct rb_node *rb)
> @@ -659,9 +646,13 @@ static int intel_breadcrumbs_signaler(void *arg)
>  			request = i915_gem_request_get_rcu(request);
>  		rcu_read_unlock();
>  		if (signal_complete(request)) {
> -			local_bh_disable();
> -			dma_fence_signal(&request->fence);
> -			local_bh_enable(); /* kick start the tasklets */
> +			if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> +				      &request->fence.flags)) {
> +				local_bh_disable();
> +				dma_fence_signal(&request->fence);
> +				GEM_BUG_ON(!i915_gem_request_completed(request));
> +				local_bh_enable(); /* kick start the tasklets */
> +			}
>  
>  			spin_lock_irq(&b->rb_lock);
>  
> -- 
> 2.16.1
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-02-14  1:00 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-13  7:38 patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Rodrigo Vivi
2018-02-13  9:01 ` [PATCH] drm/i915/breadcrumbs: Ignore unsubmitted signalers Chris Wilson
2018-02-14  1:00   ` Rodrigo Vivi
2018-02-14  1:00     ` Rodrigo Vivi
2018-02-13  9:20 ` patches that failed to cherry-pick on drm-intel-fixes for 4.16-rc1 Tvrtko Ursulin
2018-02-13  9:38   ` Jani Nikula
2018-02-13  9:57     ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
2018-02-13  9:57       ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
2018-02-13  9:57       ` [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout Tvrtko Ursulin
2018-02-13  9:57       ` [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM Tvrtko Ursulin
2018-02-14  0:58         ` Rodrigo Vivi
2018-02-13  9:29 ` [PATCH 1/4] drm/i915: Lock out execlist tasklet while peeking inside for busy-stats Tvrtko Ursulin
2018-02-13  9:29   ` [PATCH 2/4] drm/i915/pmu: Fix PMU enable vs execlists tasklet race Tvrtko Ursulin
2018-02-13  9:29   ` [PATCH 3/4] drm/i915/pmu: Fix sleep under atomic in RC6 readout Tvrtko Ursulin
2018-02-13  9:29   ` [PATCH 4/4] drm/i915/pmu: Fix building without CONFIG_PM Tvrtko Ursulin
2018-02-13  9:43 ` ✗ Fi.CI.BAT: failure for drm/i915/breadcrumbs: Ignore unsubmitted signalers (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.