* [RFC 0/6] Submitted queue depth stats
@ 2018-01-18 10:41 Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 1/6] drm/i915/pmu: Fix enable count array size and bounds checking Tvrtko Ursulin
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-18 10:41 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Per-engine queue depths are an interesting metric for analyzing system load,
and also useful to users who wish to load balance their submissions based on
them.

In this version I have split the metrics into three separate counters:

1. SUBMITTED - From execbuf time to request being runnable - meaning
	       dependencies have been resolved and fences signaled.
2. QUEUED - From runnable to running on the GPU.
3. RUNNING - Running on the GPU.
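
The three states above map onto counter transitions at well defined points. A
toy userspace model of the accounting (illustrative only: the helper names are
made up, the real driver uses atomic_t for the submitted count and derives
running from seqno deltas rather than a counter):

```c
#include <assert.h>

/*
 * Toy model of the per-engine request accounting proposed in this
 * series. Transition points mirror the patches: execbuf increments
 * submitted, resolving all dependencies moves a request to queued,
 * and handing it to the hardware moves it to running.
 */
struct engine_stats {
	unsigned int submitted; /* execbuf done, dependencies pending */
	unsigned int queued;    /* runnable, waiting for a GPU slot */
	unsigned int running;   /* executing on the GPU */
};

static void on_execbuf(struct engine_stats *e)     { e->submitted++; }
static void on_runnable(struct engine_stats *e)    { e->submitted--; e->queued++; }
static void on_hw_submit(struct engine_stats *e)   { e->queued--; e->running++; }
static void on_hw_complete(struct engine_stats *e) { e->running--; }
```

The PMU side then only needs to sample these three numbers periodically.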

When inspected with perf stat the output looks roughly like this:

#           time             counts unit events
   201.160490145               0.01      i915/rcs0-submitted/
   201.160490145              19.13      i915/rcs0-queued/
   201.160490145               2.39      i915/rcs0-running/

The reported numbers are average queue depths for the last query period.
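
In case it is not obvious where the averaging comes from: the sampling timer
accumulates the instantaneous depth at a fixed frequency and the read side
divides by that frequency, with two decimal places preserved via the exported
1e-2 event scale. A simplified sketch of that scheme (assuming the existing
200Hz sampler; helper names are mine, not the driver's):

```c
#include <assert.h>

#define SAMPLE_FREQUENCY 200 /* Hz, like FREQUENCY in i915_pmu.c */
#define SCALE_INV        100 /* 1 / 1e-2 event scale */

/* Called from the sampling timer: accumulate the current queue depth,
 * pre-multiplied so two decimal places survive integer maths. */
static void accumulate_depth(unsigned long long *cur, unsigned int depth)
{
	*cur += (unsigned long long)depth * SCALE_INV;
}

/* Called on event read: normalise by the sampling frequency so the
 * per-second counter delta is the mean depth (times SCALE_INV, which
 * perf undoes via the exported 1e-2 scale). */
static unsigned long long read_depth(unsigned long long cur)
{
	return cur / SAMPLE_FREQUENCY;
}
```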

Having the metrics split out should be more flexible for all users, and it is
still possible to fetch an atomic snapshot of all three using perf event
groups, for those wanting to combine them.

For users wanting instantaneous numbers instead of averages, we could
potentially expose them via the query API Lionel is working on.
(https://patchwork.freedesktop.org/series/36622/)

For instance a query packet could look like:

#define DRM_I915_QUERY_ENGINE_QUEUES		0x04

struct drm_i915_query_engine_queues {
	__u8 class;
	__u8 instance;

	__u8 pad[2];

	__u32 submitted;
	__u32 queued;
	__u32 running;
};
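
The packet is 16 bytes with naturally aligned fields; a standalone userspace
mirror of the layout can sanity check that (stdint types in place of the
kernel __u8/__u32; the surrounding query ioctl plumbing is deliberately not
shown since that API is still in review):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the proposed drm_i915_query_engine_queues. */
struct i915_query_engine_queues {
	uint8_t  class;     /* in: engine class */
	uint8_t  instance;  /* in: engine instance */
	uint8_t  pad[2];
	uint32_t submitted; /* out: instantaneous counts */
	uint32_t queued;
	uint32_t running;
};
```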

I also have patches to expose this via intel-gpu-top, using the perf API.
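
For anyone wanting to try the counters without intel-gpu-top, opening one
directly goes through the dynamic PMU type in sysfs and perf_event_open(2). A
minimal sketch (config encoding copied from i915_drm.h as extended by this
series; opening will naturally only succeed on a kernel carrying these
patches):

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Engine event config encoding, as in include/uapi/drm/i915_drm.h. */
#define I915_PMU_SAMPLE_BITS		4
#define I915_PMU_SAMPLE_INSTANCE_BITS	8
#define I915_PMU_CLASS_SHIFT \
	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
#define __I915_PMU_ENGINE(class, instance, sample) \
	((class) << I915_PMU_CLASS_SHIFT | \
	 (instance) << I915_PMU_SAMPLE_BITS | \
	 (sample))
#define I915_SAMPLE_QUEUED 3
#define I915_PMU_ENGINE_QUEUED(class, instance) \
	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)

/* Open an i915 PMU event; returns an fd to read() a u64 counter from,
 * or -1 if the i915 PMU is not present. */
static long i915_perf_open(uint64_t config)
{
	struct perf_event_attr attr;
	FILE *f = fopen("/sys/bus/event_source/devices/i915/type", "r");
	int type = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	if (type < 0)
		return -1;

	memset(&attr, 0, sizeof(attr));
	attr.type = type;
	attr.size = sizeof(attr);
	attr.config = config;

	/* Uncore-style event: pid == -1, pinned to a single CPU. */
	return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}
```

E.g. i915_perf_open(I915_PMU_ENGINE_QUEUED(0, 0)) for rcs0, then read() a u64
and delta it over the observation period.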

Tvrtko Ursulin (6):
  drm/i915/pmu: Fix enable count array size and bounds checking
  drm/i915: Keep a count of requests waiting for a slot on GPU
  drm/i915: Keep a count of requests submitted from userspace
  drm/i915/pmu: Add queued counter
  drm/i915/pmu: Add submitted counter
  drm/i915/pmu: Add running counter

 drivers/gpu/drm/i915/i915_gem_request.c | 10 +++++
 drivers/gpu/drm/i915/i915_pmu.c         | 67 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_engine_cs.c  |  6 ++-
 drivers/gpu/drm/i915/intel_lrc.c        |  2 +
 drivers/gpu/drm/i915/intel_ringbuffer.h | 19 +++++++++-
 include/uapi/drm/i915_drm.h             | 18 ++++++++-
 6 files changed, 109 insertions(+), 13 deletions(-)

-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [RFC 1/6] drm/i915/pmu: Fix enable count array size and bounds checking
  2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
@ 2018-01-18 10:41 ` Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 2/6] drm/i915: Keep a count of requests waiting for a slot on GPU Tvrtko Ursulin
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-18 10:41 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

The enable count array is supposed to have one counter for each possible
engine sampler. As such, array sizing and bounds checking are not correct
when more engine samplers are added.

At the same time, tidy the asserts for readability and robustness.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 13 +++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 065a28c713c4..cbfca4a255ab 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -476,7 +476,8 @@ static void i915_pmu_enable(struct perf_event *event)
 	 * Update the bitmask of enabled events and increment
 	 * the event reference counter.
 	 */
-	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	BUILD_BUG_ON(ARRAY_SIZE(i915->pmu.enable_count) != I915_PMU_MASK_BITS);
+	GEM_BUG_ON(bit >= ARRAY_SIZE(i915->pmu.enable_count));
 	GEM_BUG_ON(i915->pmu.enable_count[bit] == ~0);
 	i915->pmu.enable |= BIT_ULL(bit);
 	i915->pmu.enable_count[bit]++;
@@ -500,7 +501,10 @@ static void i915_pmu_enable(struct perf_event *event)
 		GEM_BUG_ON(!engine);
 		engine->pmu.enable |= BIT(sample);
 
-		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		BUILD_BUG_ON(ARRAY_SIZE(engine->pmu.enable_count) !=
+			     (1 << I915_PMU_SAMPLE_BITS));
+		GEM_BUG_ON(sample >= ARRAY_SIZE(engine->pmu.enable_count));
+		GEM_BUG_ON(sample >= ARRAY_SIZE(engine->pmu.sample));
 		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
 		if (engine->pmu.enable_count[sample]++ == 0) {
 			/*
@@ -554,7 +558,8 @@ static void i915_pmu_disable(struct perf_event *event)
 						  engine_event_class(event),
 						  engine_event_instance(event));
 		GEM_BUG_ON(!engine);
-		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(sample >= ARRAY_SIZE(engine->pmu.enable_count));
+		GEM_BUG_ON(sample >= ARRAY_SIZE(engine->pmu.sample));
 		GEM_BUG_ON(engine->pmu.enable_count[sample] == 0);
 		/*
 		 * Decrement the reference count and clear the enabled
@@ -582,7 +587,7 @@ static void i915_pmu_disable(struct perf_event *event)
 		}
 	}
 
-	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(bit >= ARRAY_SIZE(i915->pmu.enable_count));
 	GEM_BUG_ON(i915->pmu.enable_count[bit] == 0);
 	/*
 	 * Decrement the reference count and clear the enabled
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c5ff203e42d6..27a0c47db51e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -358,7 +358,7 @@ struct intel_engine_cs {
 		 *
 		 * Index number corresponds to the bit number from @enable.
 		 */
-		unsigned int enable_count[I915_PMU_SAMPLE_BITS];
+		unsigned int enable_count[1 << I915_PMU_SAMPLE_BITS];
 		/**
 		 * @sample: Counter values for sampling events.
 		 *
-- 
2.14.1


* [RFC 2/6] drm/i915: Keep a count of requests waiting for a slot on GPU
  2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 1/6] drm/i915/pmu: Fix enable count array size and bounds checking Tvrtko Ursulin
@ 2018-01-18 10:41 ` Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 3/6] drm/i915: Keep a count of requests submitted from userspace Tvrtko Ursulin
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-18 10:41 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Keep a per-engine number of runnable (waiting for GPU time) requests.

v2:
 * Move queued increment from insert_request to execlist_submit_request to
   avoid bumping when re-ordering for priority.
 * Support the counter on the ringbuffer submission path as well, albeit
   just notionally. (Chris Wilson)

v3:
 * Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_request.c | 7 +++++++
 drivers/gpu/drm/i915/intel_engine_cs.c  | 5 +++--
 drivers/gpu/drm/i915/intel_lrc.c        | 2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 6 ++++++
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 72bdc203716f..dc85445221e7 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -502,6 +502,9 @@ void __i915_gem_request_submit(struct drm_i915_gem_request *request)
 	engine->emit_breadcrumb(request,
 				request->ring->vaddr + request->postfix);
 
+	GEM_BUG_ON(engine->queued == 0);
+	engine->queued--;
+
 	spin_lock(&request->timeline->lock);
 	list_move_tail(&request->link, &timeline->requests);
 	spin_unlock(&request->timeline->lock);
@@ -517,6 +520,8 @@ void i915_gem_request_submit(struct drm_i915_gem_request *request)
 	/* Will be called from irq-context when using foreign fences. */
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
+	engine->queued++;
+
 	__i915_gem_request_submit(request);
 
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
@@ -548,6 +553,8 @@ void __i915_gem_request_unsubmit(struct drm_i915_gem_request *request)
 	timeline = request->timeline;
 	GEM_BUG_ON(timeline == engine->timeline);
 
+	engine->queued++;
+
 	spin_lock(&timeline->lock);
 	list_move(&request->link, &timeline->requests);
 	spin_unlock(&timeline->lock);
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index b221610f2365..062c7ff88413 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1725,12 +1725,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 	if (i915_terminally_wedged(&engine->i915->gpu_error))
 		drm_printf(m, "*** WEDGED ***\n");
 
-	drm_printf(m, "\tcurrent seqno %x, last %x, hangcheck %x [%d ms], inflight %d\n",
+	drm_printf(m, "\tcurrent seqno %x, last %x, hangcheck %x [%d ms], inflight %d, queued %u\n",
 		   intel_engine_get_seqno(engine),
 		   intel_engine_last_submit(engine),
 		   engine->hangcheck.seqno,
 		   jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp),
-		   engine->timeline->inflight_seqnos);
+		   engine->timeline->inflight_seqnos,
+		   engine->queued);
 	drm_printf(m, "\tReset count: %d (global %d)\n",
 		   i915_reset_engine_count(error, engine),
 		   i915_reset_count(error));
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ff25f209d0a5..24ce781d39b7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -965,6 +965,8 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
 	/* Will be called from irq-context when using foreign fences. */
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 
+	engine->queued++;
+
 	insert_request(engine, &request->priotree, request->priotree.priority);
 
 	GEM_BUG_ON(!engine->execlists.first);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 27a0c47db51e..332f728ac6b3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -302,6 +302,12 @@ struct intel_engine_cs {
 
 	struct intel_ring *buffer;
 	struct intel_timeline *timeline;
+	/**
+	 * @queued: Number of runnable requests submitted to the backend.
+	 *
+	 * Count of requests waiting for the GPU to execute them.
+	 */
+	unsigned int queued;
 
 	struct drm_i915_gem_object *default_state;
 
-- 
2.14.1


* [RFC 3/6] drm/i915: Keep a count of requests submitted from userspace
  2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 1/6] drm/i915/pmu: Fix enable count array size and bounds checking Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 2/6] drm/i915: Keep a count of requests waiting for a slot on GPU Tvrtko Ursulin
@ 2018-01-18 10:41 ` Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 4/6] drm/i915/pmu: Add queued counter Tvrtko Ursulin
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-18 10:41 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Keep a count of requests submitted from userspace which are not yet runnable
due to unresolved dependencies.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_request.c | 3 +++
 drivers/gpu/drm/i915/intel_engine_cs.c  | 3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h | 9 +++++++++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index dc85445221e7..76496b11a7ee 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -599,6 +599,7 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 		rcu_read_lock();
 		request->engine->submit_request(request);
 		rcu_read_unlock();
+		atomic_dec(&request->engine->submitted);
 		break;
 
 	case FENCE_FREE:
@@ -1056,6 +1057,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	if (engine->schedule)
 		engine->schedule(request, request->ctx->priority);
 
+	atomic_inc(&engine->submitted);
+
 	local_bh_disable();
 	i915_sw_fence_commit(&request->submit);
 	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 062c7ff88413..d572b18d39eb 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1725,12 +1725,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 	if (i915_terminally_wedged(&engine->i915->gpu_error))
 		drm_printf(m, "*** WEDGED ***\n");
 
-	drm_printf(m, "\tcurrent seqno %x, last %x, hangcheck %x [%d ms], inflight %d, queued %u\n",
+	drm_printf(m, "\tcurrent seqno %x, last %x, hangcheck %x [%d ms], inflight %d, submitted %u, queued %u\n",
 		   intel_engine_get_seqno(engine),
 		   intel_engine_last_submit(engine),
 		   engine->hangcheck.seqno,
 		   jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp),
 		   engine->timeline->inflight_seqnos,
+		   atomic_read(&engine->submitted),
 		   engine->queued);
 	drm_printf(m, "\tReset count: %d (global %d)\n",
 		   i915_reset_engine_count(error, engine),
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 332f728ac6b3..77fff2488cde 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -302,6 +302,15 @@ struct intel_engine_cs {
 
 	struct intel_ring *buffer;
 	struct intel_timeline *timeline;
+
+	/**
+	 * @submitted: Number of submitted requests with dependencies.
+	 *
+	 * Count of requests waiting for dependencies before they can be
+	 * submitted to the backend.
+	 */
+
+	atomic_t submitted;
 	/**
 	 * @queued: Number of runnable requests submitted to the backend.
 	 *
-- 
2.14.1


* [RFC 4/6] drm/i915/pmu: Add queued counter
  2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2018-01-18 10:41 ` [RFC 3/6] drm/i915: Keep a count of requests submitted from userspace Tvrtko Ursulin
@ 2018-01-18 10:41 ` Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 5/6] drm/i915/pmu: Add submitted counter Tvrtko Ursulin
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-18 10:41 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We add a PMU counter to expose the number of requests which are ready to
run and waiting on a free slot on the GPU.

This is useful to analyze the overall load of the system.

v2: Don't limit to gen8+.
v3:
 * Rebase for dynamic sysfs.
 * Drop currently executing requests.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 34 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 include/uapi/drm/i915_drm.h             |  8 +++++++-
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index cbfca4a255ab..aaf48e85c35e 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -36,7 +36,8 @@
 #define ENGINE_SAMPLE_MASK \
 	(BIT(I915_SAMPLE_BUSY) | \
 	 BIT(I915_SAMPLE_WAIT) | \
-	 BIT(I915_SAMPLE_SEMA))
+	 BIT(I915_SAMPLE_SEMA) | \
+	 BIT(I915_SAMPLE_QUEUED))
 
 #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
 
@@ -220,6 +221,11 @@ static void engines_sample(struct drm_i915_private *dev_priv)
 
 		update_sample(&engine->pmu.sample[I915_SAMPLE_SEMA],
 			      PERIOD, !!(val & RING_WAIT_SEMAPHORE));
+
+		if (engine->pmu.enable & BIT(I915_SAMPLE_QUEUED))
+			update_sample(&engine->pmu.sample[I915_SAMPLE_QUEUED],
+				      1 / I915_SAMPLE_QUEUED_SCALE,
+				      engine->queued);
 	}
 
 	if (fw)
@@ -297,6 +303,7 @@ engine_event_status(struct intel_engine_cs *engine,
 	switch (sample) {
 	case I915_SAMPLE_BUSY:
 	case I915_SAMPLE_WAIT:
+	case I915_SAMPLE_QUEUED:
 		break;
 	case I915_SAMPLE_SEMA:
 		if (INTEL_GEN(engine->i915) < 6)
@@ -407,6 +414,9 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 		} else {
 			val = engine->pmu.sample[sample].cur;
 		}
+
+		if (sample == I915_SAMPLE_QUEUED)
+			val = div_u64(val, FREQUENCY);
 	} else {
 		switch (event->attr.config) {
 		case I915_PMU_ACTUAL_FREQUENCY:
@@ -719,6 +729,16 @@ static const struct attribute_group *i915_pmu_attr_groups[] = {
 { \
 	.sample = (__sample), \
 	.name = (__name), \
+	.suffix = "unit", \
+	.value = "ns", \
+}
+
+#define __engine_event_scale(__sample, __name, __scale) \
+{ \
+	.sample = (__sample), \
+	.name = (__name), \
+	.suffix = "scale", \
+	.value = (__scale), \
 }
 
 static struct i915_ext_attribute *
@@ -762,10 +782,14 @@ create_event_attributes(struct drm_i915_private *i915)
 	static const struct {
 		enum drm_i915_pmu_engine_sample sample;
 		char *name;
+		char *suffix;
+		char *value;
 	} engine_events[] = {
 		__engine_event(I915_SAMPLE_BUSY, "busy"),
 		__engine_event(I915_SAMPLE_SEMA, "sema"),
 		__engine_event(I915_SAMPLE_WAIT, "wait"),
+		__engine_event_scale(I915_SAMPLE_QUEUED, "queued",
+				     __stringify(I915_SAMPLE_QUEUED_SCALE)),
 	};
 	unsigned int count = 0;
 	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
@@ -852,13 +876,15 @@ create_event_attributes(struct drm_i915_private *i915)
 								engine->instance,
 								engine_events[i].sample));
 
-			str = kasprintf(GFP_KERNEL, "%s-%s.unit",
-					engine->name, engine_events[i].name);
+			str = kasprintf(GFP_KERNEL, "%s-%s.%s",
+					engine->name, engine_events[i].name,
+					engine_events[i].suffix);
 			if (!str)
 				goto err;
 
 			*attr_iter++ = &pmu_iter->attr.attr;
-			pmu_iter = add_pmu_attr(pmu_iter, str, "ns");
+			pmu_iter = add_pmu_attr(pmu_iter, str,
+						engine_events[i].value);
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 77fff2488cde..84541b91bcd8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -379,7 +379,7 @@ struct intel_engine_cs {
 		 *
 		 * Our internal timer stores the current counters in this field.
 		 */
-#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_SEMA + 1)
+#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_QUEUED + 1)
 		struct i915_pmu_sample sample[I915_ENGINE_SAMPLE_MAX];
 		/**
 		 * @busy_stats: Has enablement of engine stats tracking been
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 536ee4febd74..83458e5b1ac7 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -110,9 +110,12 @@ enum drm_i915_gem_engine_class {
 enum drm_i915_pmu_engine_sample {
 	I915_SAMPLE_BUSY = 0,
 	I915_SAMPLE_WAIT = 1,
-	I915_SAMPLE_SEMA = 2
+	I915_SAMPLE_SEMA = 2,
+	I915_SAMPLE_QUEUED = 3
 };
 
+#define I915_SAMPLE_QUEUED_SCALE 1e-2 /* No braces please. */
+
 #define I915_PMU_SAMPLE_BITS (4)
 #define I915_PMU_SAMPLE_MASK (0xf)
 #define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +136,9 @@ enum drm_i915_pmu_engine_sample {
 #define I915_PMU_ENGINE_SEMA(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
 
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
 #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
-- 
2.14.1


* [RFC 5/6] drm/i915/pmu: Add submitted counter
  2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2018-01-18 10:41 ` [RFC 4/6] drm/i915/pmu: Add queued counter Tvrtko Ursulin
@ 2018-01-18 10:41 ` Tvrtko Ursulin
  2018-01-18 10:41 ` [RFC 6/6] drm/i915/pmu: Add running counter Tvrtko Ursulin
  2018-01-18 12:04 ` ✗ Fi.CI.BAT: warning for Submitted queue depth stats Patchwork
  6 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-18 10:41 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We add a PMU counter to expose the number of requests currently submitted
to the driver which are not yet runnable on the GPU (unresolved
dependencies or unsignalled fences).

This is useful to analyze the overall load of the system.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 14 ++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 include/uapi/drm/i915_drm.h             |  7 ++++++-
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index aaf48e85c35e..d6d08cca2bc8 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -37,7 +37,8 @@
 	(BIT(I915_SAMPLE_BUSY) | \
 	 BIT(I915_SAMPLE_WAIT) | \
 	 BIT(I915_SAMPLE_SEMA) | \
-	 BIT(I915_SAMPLE_QUEUED))
+	 BIT(I915_SAMPLE_QUEUED) | \
+	 BIT(I915_SAMPLE_SUBMITTED))
 
 #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
 
@@ -226,6 +227,11 @@ static void engines_sample(struct drm_i915_private *dev_priv)
 			update_sample(&engine->pmu.sample[I915_SAMPLE_QUEUED],
 				      1 / I915_SAMPLE_QUEUED_SCALE,
 				      engine->queued);
+
+		if (engine->pmu.enable & BIT(I915_SAMPLE_SUBMITTED))
+			update_sample(&engine->pmu.sample[I915_SAMPLE_SUBMITTED],
+				      1 / I915_SAMPLE_SUBMITTED_SCALE,
+				      atomic_read(&engine->submitted));
 	}
 
 	if (fw)
@@ -304,6 +310,7 @@ engine_event_status(struct intel_engine_cs *engine,
 	case I915_SAMPLE_BUSY:
 	case I915_SAMPLE_WAIT:
 	case I915_SAMPLE_QUEUED:
+	case I915_SAMPLE_SUBMITTED:
 		break;
 	case I915_SAMPLE_SEMA:
 		if (INTEL_GEN(engine->i915) < 6)
@@ -415,7 +422,8 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 			val = engine->pmu.sample[sample].cur;
 		}
 
-		if (sample == I915_SAMPLE_QUEUED)
+		if (sample == I915_SAMPLE_QUEUED ||
+		    sample == I915_SAMPLE_SUBMITTED)
 			val = div_u64(val, FREQUENCY);
 	} else {
 		switch (event->attr.config) {
@@ -790,6 +798,8 @@ create_event_attributes(struct drm_i915_private *i915)
 		__engine_event(I915_SAMPLE_WAIT, "wait"),
 		__engine_event_scale(I915_SAMPLE_QUEUED, "queued",
 				     __stringify(I915_SAMPLE_QUEUED_SCALE)),
+		__engine_event_scale(I915_SAMPLE_SUBMITTED, "submitted",
+				     __stringify(I915_SAMPLE_SUBMITTED_SCALE)),
 	};
 	unsigned int count = 0;
 	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 84541b91bcd8..01355fc24d1e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -379,7 +379,7 @@ struct intel_engine_cs {
 		 *
 		 * Our internal timer stores the current counters in this field.
 		 */
-#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_QUEUED + 1)
+#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_SUBMITTED + 1)
 		struct i915_pmu_sample sample[I915_ENGINE_SAMPLE_MAX];
 		/**
 		 * @busy_stats: Has enablement of engine stats tracking been
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 83458e5b1ac7..3285027a6ce0 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -111,10 +111,12 @@ enum drm_i915_pmu_engine_sample {
 	I915_SAMPLE_BUSY = 0,
 	I915_SAMPLE_WAIT = 1,
 	I915_SAMPLE_SEMA = 2,
-	I915_SAMPLE_QUEUED = 3
+	I915_SAMPLE_QUEUED = 3,
+	I915_SAMPLE_SUBMITTED = 4,
 };
 
 #define I915_SAMPLE_QUEUED_SCALE 1e-2 /* No braces please. */
+#define I915_SAMPLE_SUBMITTED_SCALE 1e-2 /* No braces please. */
 
 #define I915_PMU_SAMPLE_BITS (4)
 #define I915_PMU_SAMPLE_MASK (0xf)
@@ -139,6 +141,9 @@ enum drm_i915_pmu_engine_sample {
 #define I915_PMU_ENGINE_QUEUED(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
 
+#define I915_PMU_ENGINE_SUBMITTED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SUBMITTED)
+
 #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
-- 
2.14.1


* [RFC 6/6] drm/i915/pmu: Add running counter
  2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2018-01-18 10:41 ` [RFC 5/6] drm/i915/pmu: Add submitted counter Tvrtko Ursulin
@ 2018-01-18 10:41 ` Tvrtko Ursulin
  2018-01-18 11:57   ` Chris Wilson
  2018-01-18 12:04 ` ✗ Fi.CI.BAT: warning for Submitted queue depth stats Patchwork
  6 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-18 10:41 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We add a PMU counter to expose the number of requests currently executing
on the GPU.

This is useful to analyze the overall load of the system.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 14 ++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 include/uapi/drm/i915_drm.h             |  5 +++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index d6d08cca2bc8..93b86bc44c51 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -38,7 +38,8 @@
 	 BIT(I915_SAMPLE_WAIT) | \
 	 BIT(I915_SAMPLE_SEMA) | \
 	 BIT(I915_SAMPLE_QUEUED) | \
-	 BIT(I915_SAMPLE_SUBMITTED))
+	 BIT(I915_SAMPLE_SUBMITTED) | \
+	 BIT(I915_SAMPLE_RUNNING))
 
 #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
 
@@ -232,6 +233,11 @@ static void engines_sample(struct drm_i915_private *dev_priv)
 			update_sample(&engine->pmu.sample[I915_SAMPLE_SUBMITTED],
 				      1 / I915_SAMPLE_SUBMITTED_SCALE,
 				      atomic_read(&engine->submitted));
+
+		if (engine->pmu.enable & BIT(I915_SAMPLE_RUNNING))
+			update_sample(&engine->pmu.sample[I915_SAMPLE_RUNNING],
+				      1 / I915_SAMPLE_RUNNING_SCALE,
+				      last_seqno - current_seqno);
 	}
 
 	if (fw)
@@ -311,6 +317,7 @@ engine_event_status(struct intel_engine_cs *engine,
 	case I915_SAMPLE_WAIT:
 	case I915_SAMPLE_QUEUED:
 	case I915_SAMPLE_SUBMITTED:
+	case I915_SAMPLE_RUNNING:
 		break;
 	case I915_SAMPLE_SEMA:
 		if (INTEL_GEN(engine->i915) < 6)
@@ -423,7 +430,8 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 		}
 
 		if (sample == I915_SAMPLE_QUEUED ||
-		    sample == I915_SAMPLE_SUBMITTED)
+		    sample == I915_SAMPLE_SUBMITTED ||
+		    sample == I915_SAMPLE_RUNNING)
 			val = div_u64(val, FREQUENCY);
 	} else {
 		switch (event->attr.config) {
@@ -800,6 +808,8 @@ create_event_attributes(struct drm_i915_private *i915)
 				     __stringify(I915_SAMPLE_QUEUED_SCALE)),
 		__engine_event_scale(I915_SAMPLE_SUBMITTED, "submitted",
 				     __stringify(I915_SAMPLE_SUBMITTED_SCALE)),
+		__engine_event_scale(I915_SAMPLE_RUNNING, "running",
+				     __stringify(I915_SAMPLE_RUNNING_SCALE)),
 	};
 	unsigned int count = 0;
 	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 01355fc24d1e..b198df1f248c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -379,7 +379,7 @@ struct intel_engine_cs {
 		 *
 		 * Our internal timer stores the current counters in this field.
 		 */
-#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_SUBMITTED + 1)
+#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_RUNNING + 1)
 		struct i915_pmu_sample sample[I915_ENGINE_SAMPLE_MAX];
 		/**
 		 * @busy_stats: Has enablement of engine stats tracking been
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3285027a6ce0..c3b98fb8b691 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -113,10 +113,12 @@ enum drm_i915_pmu_engine_sample {
 	I915_SAMPLE_SEMA = 2,
 	I915_SAMPLE_QUEUED = 3,
 	I915_SAMPLE_SUBMITTED = 4,
+	I915_SAMPLE_RUNNING = 5,
 };
 
 #define I915_SAMPLE_QUEUED_SCALE 1e-2 /* No braces please. */
 #define I915_SAMPLE_SUBMITTED_SCALE 1e-2 /* No braces please. */
+#define I915_SAMPLE_RUNNING_SCALE 1e-2 /* No braces please. */
 
 #define I915_PMU_SAMPLE_BITS (4)
 #define I915_PMU_SAMPLE_MASK (0xf)
@@ -144,6 +146,9 @@ enum drm_i915_pmu_engine_sample {
 #define I915_PMU_ENGINE_SUBMITTED(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SUBMITTED)
 
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
 #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
-- 
2.14.1


* Re: [RFC 6/6] drm/i915/pmu: Add running counter
  2018-01-18 10:41 ` [RFC 6/6] drm/i915/pmu: Add running counter Tvrtko Ursulin
@ 2018-01-18 11:57   ` Chris Wilson
  2018-01-19 11:45     ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2018-01-18 11:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-01-18 10:41:36)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We add a PMU counter to expose the number of requests currently executing
> on the GPU.
> 
> This is useful to analyze the overall load of the system.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Ok, the split between queued (unmet dependencies),
submitted (met dependencies, ready for hw) and running (on hw) look good
to me. The usual slight inaccuracies that may arise due to trying to
sample across async hw + engines, but those should be minor. And the
counters seem very useful (at least for the trivial overlay).

The only suggestion I would make is perhaps

	engine->stats.unready_requests / requests_queued;
	engine->stats.requests_ready / requests_submitted;

(doesn't have to be stats, but I think we want a bit more verbosity
here).

Oh, the second suggestion is perhaps not to use 1e-2 :) Talk about
giving me a fright!
-Chris

* ✗ Fi.CI.BAT: warning for Submitted queue depth stats
  2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2018-01-18 10:41 ` [RFC 6/6] drm/i915/pmu: Add running counter Tvrtko Ursulin
@ 2018-01-18 12:04 ` Patchwork
  6 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2018-01-18 12:04 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Submitted queue depth stats
URL   : https://patchwork.freedesktop.org/series/36685/
State : warning

== Summary ==

Series 36685v1 Submitted queue depth stats
https://patchwork.freedesktop.org/api/1.0/series/36685/revisions/1/mbox/

Test debugfs_test:
        Subgroup read_all_entries:
                pass       -> DMESG-WARN (fi-elk-e7500) fdo#103989 +1
Test gem_ringfill:
        Subgroup basic-default:
                pass       -> SKIP       (fi-bsw-n3050)
        Subgroup basic-default-hang:
                dmesg-warn -> INCOMPLETE (fi-blb-e6850) fdo#101600 +1
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                fail       -> PASS       (fi-skl-6700k2) fdo#103191
        Subgroup suspend-read-crc-pipe-b:
                pass       -> INCOMPLETE (fi-snb-2520m) fdo#103713

fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
fdo#101600 https://bugs.freedesktop.org/show_bug.cgi?id=101600
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:421s
fi-bdw-gvtdvm    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:424s
fi-blb-e6850     total:146  pass:114  dwarn:0   dfail:0   fail:0   skip:31 
fi-bsw-n3050     total:288  pass:241  dwarn:0   dfail:0   fail:0   skip:47  time:484s
fi-bwr-2160      total:288  pass:183  dwarn:0   dfail:0   fail:0   skip:105 time:280s
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:483s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:483s
fi-byt-j1900     total:288  pass:253  dwarn:0   dfail:0   fail:0   skip:35  time:468s
fi-byt-n2820     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:452s
fi-elk-e7500     total:224  pass:168  dwarn:10  dfail:0   fail:0   skip:45 
fi-gdg-551       total:288  pass:179  dwarn:0   dfail:0   fail:1   skip:108 time:278s
fi-glk-1         total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:510s
fi-hsw-4770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:391s
fi-hsw-4770r     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:401s
fi-ilk-650       total:288  pass:228  dwarn:0   dfail:0   fail:0   skip:60  time:410s
fi-ivb-3520m     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:449s
fi-ivb-3770      total:288  pass:255  dwarn:0   dfail:0   fail:0   skip:33  time:415s
fi-kbl-7500u     total:288  pass:263  dwarn:1   dfail:0   fail:0   skip:24  time:457s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:497s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:452s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:505s
fi-pnv-d510      total:146  pass:113  dwarn:0   dfail:0   fail:0   skip:32 
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:430s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:508s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:530s
fi-skl-6700k2    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:490s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:480s
fi-skl-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:433s
fi-snb-2520m     total:245  pass:211  dwarn:0   dfail:0   fail:0   skip:33 
fi-snb-2600      total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:394s
Blacklisted hosts:
fi-cfl-s2        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:570s
fi-glk-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:477s

cb7287f74687f033ae8e50937b8294c16317170a drm-tip: 2018y-01m-18d-10h-31m-34s UTC integration manifest
4cd187b4f529 drm/i915/pmu: Add running counter
1dd7214fae43 drm/i915/pmu: Add submitted counter
d4574b651a10 drm/i915/pmu: Add queued counter
f572acba07ba drm/i915: Keep a count of requests submitted from userspace
1d56575bd8f9 drm/i915: Keep a count of requests waiting for a slot on GPU
18c6f5364f87 drm/i915/pmu: Fix enable count array size and bounds checking

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7707/issues.html

* Re: [RFC 6/6] drm/i915/pmu: Add running counter
  2018-01-18 11:57   ` Chris Wilson
@ 2018-01-19 11:45     ` Tvrtko Ursulin
  2018-01-19 13:40       ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-19 11:45 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 18/01/2018 11:57, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-18 10:41:36)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> We add a PMU counter to expose the number of requests currently executing
>> on the GPU.
>>
>> This is useful to analyze the overall load of the system.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Ok, the split between queued (unmet dependencies),
> submitted (met dependencies, ready for hw) and running (on hw) looks good
> to me. The usual slight inaccuracies may arise from trying to
> sample across async hw + engines, but those should be minor. And the
> counters seem very useful (at least for the trivial overlay).

Glad to hear positive feedback!

> The only suggestion I would make is perhaps
> 
> 	engine->stats.unready_requests / requests_queued;
> 	engine->stats.requests_ready / requests_submitted;
> 
> (doesn't have to be stats, but I think we want a bit more verbosity
> here).

Hm, what is that? You are suggesting some relative stats? Exposed as 
another counter? It can be calculated in userspace easily.
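
For illustration, a derived relative statistic of this kind could indeed be
computed entirely in userspace from the three averaged counters; a minimal
sketch (function name hypothetical, not part of any proposed uAPI):

```c
#include <assert.h>

/*
 * Hypothetical userspace-side helper: derive a relative statistic
 * from the averaged queued/submitted/running values read via perf,
 * instead of exposing another kernel counter.
 */
static double fraction_of_total(double part, double queued,
				double submitted, double running)
{
	double total = queued + submitted + running;

	/* Guard against an idle engine where all three averages are zero. */
	return total > 0.0 ? part / total : 0.0;
}
```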

> Oh, the second suggestion is perhaps not to use 1e-2 :) Talk about
> giving me a fright!

Yes, it's a bit scary; I just wanted to avoid having two defines which 
need to be kept in sync. "1e-2" is what needs to go as a string to 
sysfs for perf to work correctly. I'll add the second define.
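
To spell out the invariant the two defines have to uphold (names below are
hypothetical stand-ins for the uapi ones): perf multiplies the raw counter by
the sysfs ".scale" string, so the integer divisor used in the sampling loop
must be exactly the reciprocal of that scale, or the reported values are
silently wrong. A sketch:

```c
#include <assert.h>

/* Hypothetical mirror of the scale/divisor pairing. */
#define SAMPLE_QUEUED_SCALE	0.01	/* exported via the ".scale" attribute */
#define SAMPLE_QUEUED_DIVISOR	100	/* used by the kernel sampling loop */

/* What the perf tool effectively reports: raw counter times scale. */
static double reported(unsigned long long raw_count)
{
	return (double)raw_count * SAMPLE_QUEUED_SCALE;
}

/* What the kernel accumulates per sample tick for a given queue depth. */
static unsigned long long accumulate(unsigned long long raw,
				     unsigned int queue_depth)
{
	return raw + (unsigned long long)queue_depth * SAMPLE_QUEUED_DIVISOR;
}
```

A compile-time check such as BUILD_BUG_ON(DIVISOR != 1 / SCALE) can then keep
the pair in sync.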

So I'll wait to hear some positive feedback from the feature requestors 
and then respin. IGTs will be a pain...

Regards,

Tvrtko

* Re: [RFC 6/6] drm/i915/pmu: Add running counter
  2018-01-19 11:45     ` Tvrtko Ursulin
@ 2018-01-19 13:40       ` Chris Wilson
  2018-01-19 13:48         ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2018-01-19 13:40 UTC (permalink / raw)
  To: Tvrtko Ursulin, Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-01-19 11:45:24)
> 
> On 18/01/2018 11:57, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-01-18 10:41:36)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> We add a PMU counter to expose the number of requests currently executing
> >> on the GPU.
> >>
> >> This is useful to analyze the overall load of the system.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > Ok, the split between queued (unmet dependencies),
> > submitted (met dependencies, ready for hw) and running (on hw) looks good
> > to me. The usual slight inaccuracies may arise from trying to
> > sample across async hw + engines, but those should be minor. And the
> > counters seem very useful (at least for the trivial overlay).
> 
> Glad to hear positive feedback!
> 
> > The only suggestion I would make is perhaps
> > 
> >       engine->stats.unready_requests / requests_queued;
> >       engine->stats.requests_ready / requests_submitted;
> > 
> > (doesn't have to be stats, but I think we want a bit more verbosity
> > here).
> 
> Hm, what is that? You are suggesting some relative stats? Exposed as 
> another counter? It can be calculated in userspace easily.

Just trying to think of better names than engine->queued.
I like 'stats' as it says "this is not part of the normal engine
management, but some auxiliary information we tracked for user
convenience"; then I was just trying to think of a more apropos name
than 'queued'. I definitely think we want to let the reader know what's
queued :)
-Chris

* Re: [RFC 6/6] drm/i915/pmu: Add running counter
  2018-01-19 13:40       ` Chris Wilson
@ 2018-01-19 13:48         ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-19 13:48 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 19/01/2018 13:40, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-19 11:45:24)
>>
>> On 18/01/2018 11:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-01-18 10:41:36)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> We add a PMU counter to expose the number of requests currently executing
>>>> on the GPU.
>>>>
>>>> This is useful to analyze the overall load of the system.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Ok, the split between queued (unmet dependencies),
>>> submitted (met dependencies, ready for hw) and running (on hw) looks good
>>> to me. The usual slight inaccuracies may arise from trying to
>>> sample across async hw + engines, but those should be minor. And the
>>> counters seem very useful (at least for the trivial overlay).
>>
>> Glad to hear positive feedback!
>>
>>> The only suggestion I would make is perhaps
>>>
>>>        engine->stats.unready_requests / requests_queued;
>>>        engine->stats.requests_ready / requests_submitted;
>>>
>>> (doesn't have to be stats, but I think we want a bit more verbosity
>>> here).
>>
>> Hm, what is that? You are suggesting some relative stats? Exposed as
>> another counter? It can be calculated in userspace easily.
> 
> Just trying to think of better names than engine->queued.
> I like 'stats' as it says "this is not part of the normal engine
> management, but some auxiliary information we tracked for user
> convenience"; then I was just trying to think of a more apropos name
> than 'queued'. I definitely think we want to let the reader know what's
> queued :)

Oh right, a container definitely makes sense. More verbose naming can do 
as well.

Regards,

Tvrtko

* Re: [RFC 4/6] drm/i915/pmu: Add queued counter
  2018-01-22 18:56   ` Chris Wilson
@ 2018-01-24 18:01     ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-24 18:01 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 22/01/2018 18:56, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-22 18:43:56)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> We add a PMU counter to expose the number of requests which have been
>> submitted from userspace but are not yet runnable due to dependencies and
>> unsignaled fences.
>>
>> This is useful to analyze the overall load of the system.
>>
>> v2:
>>   * Rebase for name change and re-order.
>>   * Drop floating point constant. (Chris Wilson)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_pmu.c         | 40 +++++++++++++++++++++++++++++----
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
>>   include/uapi/drm/i915_drm.h             |  9 +++++++-
>>   3 files changed, 45 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>> index cbfca4a255ab..8eefdf09a30a 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -36,7 +36,8 @@
>>   #define ENGINE_SAMPLE_MASK \
>>          (BIT(I915_SAMPLE_BUSY) | \
>>           BIT(I915_SAMPLE_WAIT) | \
>> -        BIT(I915_SAMPLE_SEMA))
>> +        BIT(I915_SAMPLE_SEMA) | \
>> +        BIT(I915_SAMPLE_QUEUED))
>>   
>>   #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
>>   
>> @@ -220,6 +221,11 @@ static void engines_sample(struct drm_i915_private *dev_priv)
>>   
>>                  update_sample(&engine->pmu.sample[I915_SAMPLE_SEMA],
>>                                PERIOD, !!(val & RING_WAIT_SEMAPHORE));
>> +
>> +               if (engine->pmu.enable & BIT(I915_SAMPLE_QUEUED))
>> +                       update_sample(&engine->pmu.sample[I915_SAMPLE_QUEUED],
>> +                                     I915_SAMPLE_QUEUED_DIVISOR,
>> +                                     atomic_read(&engine->request_stats.queued));
> 
> engine->request_stats.foo works for me, and reads quite nicely.
> 
>> +/* No brackets or quotes below please. */
>> +#define I915_SAMPLE_QUEUED_SCALE 0.01
> 
>> + /* Divide counter value by divisor to get the real value. */
>> +#define I915_SAMPLE_QUEUED_DIVISOR (100)
> 
> I'm just thinking of favouring the sampler arithmetic by using 128. As
> far as userspace is concerned, the difference is not going to be that
> noticeable, even less so if you choose 256.

I'll do 1024 then, but the CPU usage in the sampling thread is so low 
anyway.
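
For reference, the arithmetic being favoured looks roughly like the sketch
below (names hypothetical), assuming the sampling tick accumulates queue depth
multiplied by the divisor, as the patch suggests: with a power-of-two divisor
such as 1024 the multiply strength-reduces to a shift, and the matching
".scale" string becomes 9.765625e-4 (i.e. 1/1024).

```c
#include <assert.h>

/* Hypothetical power-of-two divisor and its shift equivalent. */
#define QUEUED_DIVISOR		1024
#define QUEUED_DIVISOR_SHIFT	10	/* 1024 == 1 << 10 */

/* One sampling tick: acc += queue_depth * QUEUED_DIVISOR, as a shift. */
static unsigned long long sample_tick(unsigned long long acc,
				      unsigned int queue_depth)
{
	return acc + ((unsigned long long)queue_depth << QUEUED_DIVISOR_SHIFT);
}

/* Read side: average over n ticks, still scaled by QUEUED_DIVISOR;
 * userspace multiplies by the ".scale" of 1/1024 to recover the depth. */
static unsigned long long averaged(unsigned long long acc,
				   unsigned int n_ticks)
{
	return acc / n_ticks;
}
```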

Regards,

Tvrtko


* Re: [RFC 4/6] drm/i915/pmu: Add queued counter
  2018-01-22 18:43 ` [RFC 4/6] drm/i915/pmu: Add queued counter Tvrtko Ursulin
@ 2018-01-22 18:56   ` Chris Wilson
  2018-01-24 18:01     ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2018-01-22 18:56 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-01-22 18:43:56)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We add a PMU counter to expose the number of requests which have been
> submitted from userspace but are not yet runnable due to dependencies and
> unsignaled fences.
> 
> This is useful to analyze the overall load of the system.
> 
> v2:
>  * Rebase for name change and re-order.
>  * Drop floating point constant. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_pmu.c         | 40 +++++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
>  include/uapi/drm/i915_drm.h             |  9 +++++++-
>  3 files changed, 45 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index cbfca4a255ab..8eefdf09a30a 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -36,7 +36,8 @@
>  #define ENGINE_SAMPLE_MASK \
>         (BIT(I915_SAMPLE_BUSY) | \
>          BIT(I915_SAMPLE_WAIT) | \
> -        BIT(I915_SAMPLE_SEMA))
> +        BIT(I915_SAMPLE_SEMA) | \
> +        BIT(I915_SAMPLE_QUEUED))
>  
>  #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
>  
> @@ -220,6 +221,11 @@ static void engines_sample(struct drm_i915_private *dev_priv)
>  
>                 update_sample(&engine->pmu.sample[I915_SAMPLE_SEMA],
>                               PERIOD, !!(val & RING_WAIT_SEMAPHORE));
> +
> +               if (engine->pmu.enable & BIT(I915_SAMPLE_QUEUED))
> +                       update_sample(&engine->pmu.sample[I915_SAMPLE_QUEUED],
> +                                     I915_SAMPLE_QUEUED_DIVISOR,
> +                                     atomic_read(&engine->request_stats.queued));

engine->request_stats.foo works for me, and reads quite nicely.

> +/* No brackets or quotes below please. */
> +#define I915_SAMPLE_QUEUED_SCALE 0.01

> + /* Divide counter value by divisor to get the real value. */
> +#define I915_SAMPLE_QUEUED_DIVISOR (100)

I'm just thinking of favouring the sampler arithmetic by using 128. As
far as userspace is concerned, the difference is not going to be that
noticeable, even less so if you choose 256.
-Chris

* [RFC 4/6] drm/i915/pmu: Add queued counter
  2018-01-22 18:43 [RFC v2 0/6] Queued/runnable/running engine stats Tvrtko Ursulin
@ 2018-01-22 18:43 ` Tvrtko Ursulin
  2018-01-22 18:56   ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-01-22 18:43 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We add a PMU counter to expose the number of requests which have been
submitted from userspace but are not yet runnable due to dependencies and
unsignaled fences.

This is useful to analyze the overall load of the system.

v2:
 * Rebase for name change and re-order.
 * Drop floating point constant. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 40 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 include/uapi/drm/i915_drm.h             |  9 +++++++-
 3 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index cbfca4a255ab..8eefdf09a30a 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -36,7 +36,8 @@
 #define ENGINE_SAMPLE_MASK \
 	(BIT(I915_SAMPLE_BUSY) | \
 	 BIT(I915_SAMPLE_WAIT) | \
-	 BIT(I915_SAMPLE_SEMA))
+	 BIT(I915_SAMPLE_SEMA) | \
+	 BIT(I915_SAMPLE_QUEUED))
 
 #define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
 
@@ -220,6 +221,11 @@ static void engines_sample(struct drm_i915_private *dev_priv)
 
 		update_sample(&engine->pmu.sample[I915_SAMPLE_SEMA],
 			      PERIOD, !!(val & RING_WAIT_SEMAPHORE));
+
+		if (engine->pmu.enable & BIT(I915_SAMPLE_QUEUED))
+			update_sample(&engine->pmu.sample[I915_SAMPLE_QUEUED],
+				      I915_SAMPLE_QUEUED_DIVISOR,
+				      atomic_read(&engine->request_stats.queued));
 	}
 
 	if (fw)
@@ -297,6 +303,7 @@ engine_event_status(struct intel_engine_cs *engine,
 	switch (sample) {
 	case I915_SAMPLE_BUSY:
 	case I915_SAMPLE_WAIT:
+	case I915_SAMPLE_QUEUED:
 		break;
 	case I915_SAMPLE_SEMA:
 		if (INTEL_GEN(engine->i915) < 6)
@@ -407,6 +414,9 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 		} else {
 			val = engine->pmu.sample[sample].cur;
 		}
+
+		if (sample == I915_SAMPLE_QUEUED)
+			val = div_u64(val, FREQUENCY);
 	} else {
 		switch (event->attr.config) {
 		case I915_PMU_ACTUAL_FREQUENCY:
@@ -719,6 +729,16 @@ static const struct attribute_group *i915_pmu_attr_groups[] = {
 { \
 	.sample = (__sample), \
 	.name = (__name), \
+	.suffix = "unit", \
+	.value = "ns", \
+}
+
+#define __engine_event_scale(__sample, __name, __scale) \
+{ \
+	.sample = (__sample), \
+	.name = (__name), \
+	.suffix = "scale", \
+	.value = (__scale), \
 }
 
 static struct i915_ext_attribute *
@@ -746,6 +766,9 @@ add_pmu_attr(struct perf_pmu_events_attr *attr, const char *name,
 	return ++attr;
 }
 
+/* No brackets or quotes below please. */
+#define I915_SAMPLE_QUEUED_SCALE 0.01
+
 static struct attribute **
 create_event_attributes(struct drm_i915_private *i915)
 {
@@ -762,10 +785,14 @@ create_event_attributes(struct drm_i915_private *i915)
 	static const struct {
 		enum drm_i915_pmu_engine_sample sample;
 		char *name;
+		char *suffix;
+		char *value;
 	} engine_events[] = {
 		__engine_event(I915_SAMPLE_BUSY, "busy"),
 		__engine_event(I915_SAMPLE_SEMA, "sema"),
 		__engine_event(I915_SAMPLE_WAIT, "wait"),
+		__engine_event_scale(I915_SAMPLE_QUEUED, "queued",
+				     __stringify(I915_SAMPLE_QUEUED_SCALE)),
 	};
 	unsigned int count = 0;
 	struct perf_pmu_events_attr *pmu_attr = NULL, *pmu_iter;
@@ -775,6 +802,9 @@ create_event_attributes(struct drm_i915_private *i915)
 	enum intel_engine_id id;
 	unsigned int i;
 
+	BUILD_BUG_ON(I915_SAMPLE_QUEUED_DIVISOR !=
+		     (1 / I915_SAMPLE_QUEUED_SCALE));
+
 	/* Count how many counters we will be exposing. */
 	for (i = 0; i < ARRAY_SIZE(events); i++) {
 		if (!config_status(i915, events[i].config))
@@ -852,13 +882,15 @@ create_event_attributes(struct drm_i915_private *i915)
 								engine->instance,
 								engine_events[i].sample));
 
-			str = kasprintf(GFP_KERNEL, "%s-%s.unit",
-					engine->name, engine_events[i].name);
+			str = kasprintf(GFP_KERNEL, "%s-%s.%s",
+					engine->name, engine_events[i].name,
+					engine_events[i].suffix);
 			if (!str)
 				goto err;
 
 			*attr_iter++ = &pmu_iter->attr.attr;
-			pmu_iter = add_pmu_attr(pmu_iter, str, "ns");
+			pmu_iter = add_pmu_attr(pmu_iter, str,
+						engine_events[i].value);
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4519788cc5a1..580f07b2a5dd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -381,7 +381,7 @@ struct intel_engine_cs {
 		 *
 		 * Our internal timer stores the current counters in this field.
 		 */
-#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_SEMA + 1)
+#define I915_ENGINE_SAMPLE_MAX (I915_SAMPLE_QUEUED + 1)
 		struct i915_pmu_sample sample[I915_ENGINE_SAMPLE_MAX];
 		/**
 		 * @busy_stats: Has enablement of engine stats tracking been
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 536ee4febd74..968bdc3269cb 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -110,9 +110,13 @@ enum drm_i915_gem_engine_class {
 enum drm_i915_pmu_engine_sample {
 	I915_SAMPLE_BUSY = 0,
 	I915_SAMPLE_WAIT = 1,
-	I915_SAMPLE_SEMA = 2
+	I915_SAMPLE_SEMA = 2,
+	I915_SAMPLE_QUEUED = 3
 };
 
+ /* Divide counter value by divisor to get the real value. */
+#define I915_SAMPLE_QUEUED_DIVISOR (100)
+
 #define I915_PMU_SAMPLE_BITS (4)
 #define I915_PMU_SAMPLE_MASK (0xf)
 #define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +137,9 @@ enum drm_i915_pmu_engine_sample {
 #define I915_PMU_ENGINE_SEMA(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
 
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
 #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
-- 
2.14.1


end of thread, other threads:[~2018-01-24 18:02 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-18 10:41 [RFC 0/6] Submitted queue depth stats Tvrtko Ursulin
2018-01-18 10:41 ` [RFC 1/6] drm/i915/pmu: Fix enable count array size and bounds checking Tvrtko Ursulin
2018-01-18 10:41 ` [RFC 2/6] drm/i915: Keep a count of requests waiting for a slot on GPU Tvrtko Ursulin
2018-01-18 10:41 ` [RFC 3/6] drm/i915: Keep a count of requests submitted from userspace Tvrtko Ursulin
2018-01-18 10:41 ` [RFC 4/6] drm/i915/pmu: Add queued counter Tvrtko Ursulin
2018-01-18 10:41 ` [RFC 5/6] drm/i915/pmu: Add submitted counter Tvrtko Ursulin
2018-01-18 10:41 ` [RFC 6/6] drm/i915/pmu: Add running counter Tvrtko Ursulin
2018-01-18 11:57   ` Chris Wilson
2018-01-19 11:45     ` Tvrtko Ursulin
2018-01-19 13:40       ` Chris Wilson
2018-01-19 13:48         ` Tvrtko Ursulin
2018-01-18 12:04 ` ✗ Fi.CI.BAT: warning for Submitted queue depth stats Patchwork
2018-01-22 18:43 [RFC v2 0/6] Queued/runnable/running engine stats Tvrtko Ursulin
2018-01-22 18:43 ` [RFC 4/6] drm/i915/pmu: Add queued counter Tvrtko Ursulin
2018-01-22 18:56   ` Chris Wilson
2018-01-24 18:01     ` Tvrtko Ursulin
