* [RFC v4 0/7] Per-context and per-client engine busyness
@ 2018-04-17 12:27 ` Tvrtko Ursulin
  0 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-post of my earlier, now slightly updated, work to expose a DRM
client hierarchy in sysfs in order to enable a top-like tool:

intel-gpu-top - load avg 40.80, 27.11,  1.50;  882/ 950 MHz;    0% RC6;  13.26 Watts;   261903 irqs/s

      IMC reads:     5543 MiB/s
     IMC writes:      236 MiB/s

          ENGINE      BUSY                            QD     MI_SEMA MI_WAIT
     Render/3D/0    60.47% |███████████▍       |  28   0   1      0%      0%
       Blitter/0    92.70% |█████████████████▌ |   0   0   1      0%      0%
         Video/0   100.00% |███████████████████|  15  37   2      0%      0%
         Video/1    51.68% |█████████▊         |   0   0   1      0%      0%
  VideoEnhance/0     0.00% |                   |   0   0   0      0%      0%

  PID            NAME   rcs0       bcs0       vcs0       vcs1       vecs0
21664        gem_wsim |█████▍   ||         ||█████████||████▋    ||         |
21662     gem_latency |         ||████████▎||         ||         ||         |
21662     gem_latency |         ||         ||         ||         ||         |

Hopefully the screenshot is self-explanatory. It shows overall GPU per-engine
stats, plus per-client and per-engine busyness.

In this version all review feedback has been incorporated, and the context
param query for DCG is back as the last patch. (DCG should only need patches
1-2 & 7.)

For the rest I kept the interface as sysfs, in the hope that it would one day
have the advantage of solving the access control problem. (Setting ownership
on sysfs nodes to follow the DRM client? I still have to investigate this
angle.)

This time around I won't be sending the IGT patches, since I am building it
all on top of engine queue depths and on top of the intel-gpu-top rewrite.
Both are unmerged, so sending everything just to show this would be a bit
much.

Tvrtko Ursulin (7):
  drm/i915: Use seqlock in engine stats
  drm/i915: Track per-context engine busyness
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Expose per-engine client busyness
  drm/i915: Add sysfs toggle to enable per-client engine stats
  drm/i915: Allow clients to query own per-engine busyness

 drivers/gpu/drm/i915/i915_drv.h         |  39 +++++++
 drivers/gpu/drm/i915/i915_gem.c         | 192 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_context.c | 121 +++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.h |   8 ++
 drivers/gpu/drm/i915/i915_sysfs.c       |  80 +++++++++++++
 drivers/gpu/drm/i915/intel_engine_cs.c  |  50 +++++++--
 drivers/gpu/drm/i915/intel_lrc.c        |  14 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  66 +++++++++--
 include/uapi/drm/i915_drm.h             |  21 ++++
 9 files changed, 553 insertions(+), 38 deletions(-)

-- 
2.14.1

* [PATCH 1/7] drm/i915: Use seqlock in engine stats
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:27   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can convert engine stats from a spinlock to a seqlock to ensure interrupt
processing is never delayed, even a tiny bit, by parallel readers.

There is a smidgen more cost on the write lock side, and a chance that
readers will have to retry a few times in the face of heavy interrupt load.
But that should be extremely unlikely given how lightweight the read side
section is compared to the interrupt processing side, and also compared to
the rest of the code paths which can lead into it.
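
For reference, a minimal sketch of the seqlock pattern being adopted here
(the names are illustrative, not from the driver):

#include <linux/ktime.h>
#include <linux/seqlock.h>

static DEFINE_SEQLOCK(stats_lock);
static ktime_t stats_total;

/* Writer (e.g. the interrupt path): never blocked by readers. */
static void stats_add(ktime_t delta)
{
        unsigned long flags;

        write_seqlock_irqsave(&stats_lock, flags);
        stats_total = ktime_add(stats_total, delta);
        write_sequnlock_irqrestore(&stats_lock, flags);
}

/* Reader: lockless, simply retries if a write raced with the read. */
static ktime_t stats_get(void)
{
        unsigned int seq;
        ktime_t total;

        do {
                seq = read_seqbegin(&stats_lock);
                total = stats_total;
        } while (read_seqretry(&stats_lock, seq));

        return total;
}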

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 19 ++++++++++---------
 drivers/gpu/drm/i915/intel_ringbuffer.h | 11 ++++++-----
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e4992c2e23a4..de61d3d1653d 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -306,7 +306,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	/* Nothing to do here, execute in order of dependencies */
 	engine->schedule = NULL;
 
-	spin_lock_init(&engine->stats.lock);
+	seqlock_init(&engine->stats.lock);
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
@@ -2065,7 +2065,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 		return -ENODEV;
 
 	tasklet_disable(&execlists->tasklet);
-	spin_lock_irqsave(&engine->stats.lock, flags);
+	write_seqlock_irqsave(&engine->stats.lock, flags);
 
 	if (unlikely(engine->stats.enabled == ~0)) {
 		err = -EBUSY;
@@ -2089,7 +2089,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 	}
 
 unlock:
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 	tasklet_enable(&execlists->tasklet);
 
 	return err;
@@ -2118,12 +2118,13 @@ static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
  */
 ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
 {
+	unsigned int seq;
 	ktime_t total;
-	unsigned long flags;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
-	total = __intel_engine_get_busy_time(engine);
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	do {
+		seq = read_seqbegin(&engine->stats.lock);
+		total = __intel_engine_get_busy_time(engine);
+	} while (read_seqretry(&engine->stats.lock, seq));
 
 	return total;
 }
@@ -2141,13 +2142,13 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 	if (!intel_engine_supports_stats(engine))
 		return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+	write_seqlock_irqsave(&engine->stats.lock, flags);
 	WARN_ON_ONCE(engine->stats.enabled == 0);
 	if (--engine->stats.enabled == 0) {
 		engine->stats.total = __intel_engine_get_busy_time(engine);
 		engine->stats.active = 0;
 	}
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index d50b31eb43a5..f24ea9826037 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -3,6 +3,7 @@
 #define _INTEL_RINGBUFFER_H_
 
 #include <linux/hashtable.h>
+#include <linux/seqlock.h>
 
 #include "i915_gem_batch_pool.h"
 #include "i915_gem_timeline.h"
@@ -610,7 +611,7 @@ struct intel_engine_cs {
 		/**
 		 * @lock: Lock protecting the below fields.
 		 */
-		spinlock_t lock;
+		seqlock_t lock;
 		/**
 		 * @enabled: Reference count indicating number of listeners.
 		 */
@@ -1082,7 +1083,7 @@ static inline void intel_engine_context_in(struct intel_engine_cs *engine)
 	if (READ_ONCE(engine->stats.enabled) == 0)
 		return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+	write_seqlock_irqsave(&engine->stats.lock, flags);
 
 	if (engine->stats.enabled > 0) {
 		if (engine->stats.active++ == 0)
@@ -1090,7 +1091,7 @@ static inline void intel_engine_context_in(struct intel_engine_cs *engine)
 		GEM_BUG_ON(engine->stats.active == 0);
 	}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 }
 
 static inline void intel_engine_context_out(struct intel_engine_cs *engine)
@@ -1100,7 +1101,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 	if (READ_ONCE(engine->stats.enabled) == 0)
 		return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+	write_seqlock_irqsave(&engine->stats.lock, flags);
 
 	if (engine->stats.enabled > 0) {
 		ktime_t last;
@@ -1127,7 +1128,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 		}
 	}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 }
 
 int intel_enable_engine_stats(struct intel_engine_cs *engine);
-- 
2.14.1

* [RFC 2/7] drm/i915: Track per-context engine busyness
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:27   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, gordon.kelly

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some customers want to know how much GPU time their clients are using in
order to make dynamic load balancing decisions.

With the hooks already in place which track the overall engine busyness, we
can extend them slightly to split that time between contexts.

v2: Fix accounting for tail updates.
v3: Rebase.
v4: Mark currently running contexts as active on stats enable.
v5: Include some headers to fix the build.
v6: Added fine grained lock.
v7: Convert to seqlock. (Chris Wilson)
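
As an illustration of how these totals could be consumed, here is a hedged
sketch of sampling the running total twice and deriving utilisation over the
sampling interval. Only intel_context_engine_get_busy_time() is from this
patch; the helper name and the sampling approach are made up:

#include <linux/delay.h>
#include <linux/ktime.h>
#include <linux/math64.h>

/* Hypothetical helper: busyness of ctx on engine over period_ms, in %. */
static unsigned int
context_engine_busy_pct(struct i915_gem_context *ctx,
                        struct intel_engine_cs *engine,
                        unsigned int period_ms)
{
        ktime_t t0 = ktime_get();
        ktime_t b0 = intel_context_engine_get_busy_time(ctx, engine);
        ktime_t t1, b1;
        u64 busy, wall;

        msleep(period_ms);

        t1 = ktime_get();
        b1 = intel_context_engine_get_busy_time(ctx, engine);

        /* Busy time delta relative to elapsed wall time, scaled to percent. */
        busy = ktime_to_ns(ktime_sub(b1, b0));
        wall = ktime_to_ns(ktime_sub(t1, t0));

        return div64_u64(100 * busy, wall);
}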

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: gordon.kelly@intel.com
---
 drivers/gpu/drm/i915/i915_gem_context.c |  8 +++++
 drivers/gpu/drm/i915/i915_gem_context.h |  7 +++++
 drivers/gpu/drm/i915/intel_engine_cs.c  | 31 +++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 14 ++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 55 +++++++++++++++++++++++++++++----
 5 files changed, 104 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5cfac0255758..85627fa4565b 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -349,7 +349,9 @@ static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *dev_priv,
 			struct drm_i915_file_private *file_priv)
 {
+	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
+	enum intel_engine_id id;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
@@ -375,6 +377,12 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 		ctx->desc_template = default_desc_template(dev_priv, ppgtt);
 	}
 
+	/* Initialize the context stats lock. */
+	for_each_engine(engine, dev_priv, id) {
+		if (intel_engine_supports_stats(engine))
+			seqlock_init(&ctx->engine[id].stats.lock);
+	}
+
 	trace_i915_context_create(ctx);
 
 	return ctx;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 7854262ddfd9..159223c5fc5f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -28,6 +28,7 @@
 #include <linux/bitops.h>
 #include <linux/list.h>
 #include <linux/radix-tree.h>
+#include <linux/seqlock.h>
 
 #include "i915_gem.h"
 
@@ -160,6 +161,12 @@ struct i915_gem_context {
 		u32 *lrc_reg_state;
 		u64 lrc_desc;
 		int pin_count;
+		struct {
+			seqlock_t lock;
+			bool active;
+			ktime_t start;
+			ktime_t total;
+		} stats;
 	} engine[I915_NUM_ENGINES];
 
 	/** ring_size: size for allocating the per-engine ring buffer */
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index de61d3d1653d..e66166466d6d 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -2078,6 +2078,16 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 
 		engine->stats.enabled_at = ktime_get();
 
+		/* Mark currently running context as active. */
+		if (port_isset(port)) {
+			struct i915_request *req = port_request(port);
+			struct intel_context *ce =
+					&req->ctx->engine[engine->id];
+
+			ce->stats.start = engine->stats.enabled_at;
+			ce->stats.active = true;
+		}
+
 		/* XXX submission method oblivious? */
 		while (num_ports-- && port_isset(port)) {
 			engine->stats.active++;
@@ -2151,6 +2161,27 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 }
 
+ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
+					   struct intel_engine_cs *engine)
+{
+	struct intel_context *ce = &ctx->engine[engine->id];
+	unsigned int seq;
+	ktime_t total;
+
+	do {
+		seq = read_seqbegin(&ce->stats.lock);
+
+		total = ce->stats.total;
+
+		if (ce->stats.active)
+			total = ktime_add(total,
+					  ktime_sub(ktime_get(),
+						    ce->stats.start));
+	} while (read_seqretry(&ce->stats.lock, seq));
+
+	return total;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_engine.c"
 #include "selftests/intel_engine_cs.c"
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ddd14e30be6c..930fcf0b1e86 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -389,16 +389,18 @@ execlists_user_end(struct intel_engine_execlists *execlists)
 }
 
 static inline void
-execlists_context_schedule_in(struct i915_request *rq)
+execlists_context_schedule_in(struct i915_request *rq, unsigned int port)
 {
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
-	intel_engine_context_in(rq->engine);
+	intel_engine_context_in(rq->engine,
+				&rq->ctx->engine[rq->engine->id],
+				port == 0);
 }
 
 static inline void
 execlists_context_schedule_out(struct i915_request *rq)
 {
-	intel_engine_context_out(rq->engine);
+	intel_engine_context_out(rq->engine, &rq->ctx->engine[rq->engine->id]);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 }
 
@@ -463,7 +465,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		if (rq) {
 			GEM_BUG_ON(count > !n);
 			if (!count++)
-				execlists_context_schedule_in(rq);
+				execlists_context_schedule_in(rq, n);
 			port_set(&port[n], port_pack(rq, count));
 			desc = execlists_update_context(rq);
 			GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
@@ -751,7 +753,9 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 			  intel_engine_get_seqno(rq->engine));
 
 		GEM_BUG_ON(!execlists->active);
-		intel_engine_context_out(rq->engine);
+
+		intel_engine_context_out(rq->engine,
+					 &rq->ctx->engine[rq->engine->id]);
 
 		execlists_context_status_change(rq,
 						i915_request_completed(rq) ?
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f24ea9826037..4d122ba1cd30 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -6,6 +6,7 @@
 #include <linux/seqlock.h>
 
 #include "i915_gem_batch_pool.h"
+#include "i915_gem_context.h"
 #include "i915_gem_timeline.h"
 
 #include "i915_reg.h"
@@ -1076,25 +1077,44 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 struct intel_engine_cs *
 intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
 
-static inline void intel_engine_context_in(struct intel_engine_cs *engine)
+static inline void
+intel_engine_context_in(struct intel_engine_cs *engine,
+			struct intel_context *ce,
+			bool submit)
 {
 	unsigned long flags;
+	ktime_t now;
 
 	if (READ_ONCE(engine->stats.enabled) == 0)
 		return;
 
 	write_seqlock_irqsave(&engine->stats.lock, flags);
 
+	if (submit) {
+		now = ktime_get();
+		write_seqlock(&ce->stats.lock);
+		ce->stats.start = now;
+		ce->stats.active = true;
+		write_sequnlock(&ce->stats.lock);
+	} else {
+		now = 0;
+	}
+
 	if (engine->stats.enabled > 0) {
-		if (engine->stats.active++ == 0)
-			engine->stats.start = ktime_get();
+		if (engine->stats.active++ == 0) {
+			if (!now)
+				now = ktime_get();
+			engine->stats.start = now;
+		}
 		GEM_BUG_ON(engine->stats.active == 0);
 	}
 
 	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 }
 
-static inline void intel_engine_context_out(struct intel_engine_cs *engine)
+static inline void
+intel_engine_context_out(struct intel_engine_cs *engine,
+			 struct intel_context *ce)
 {
 	unsigned long flags;
 
@@ -1104,14 +1124,34 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 	write_seqlock_irqsave(&engine->stats.lock, flags);
 
 	if (engine->stats.enabled > 0) {
+		struct execlist_port *next_port = &engine->execlists.port[1];
+		ktime_t now = ktime_get();
 		ktime_t last;
 
+		write_seqlock(&ce->stats.lock);
+		GEM_BUG_ON(!ce->stats.start);
+		ce->stats.total = ktime_add(ce->stats.total,
+					    ktime_sub(now, ce->stats.start));
+		ce->stats.active = false;
+		write_sequnlock(&ce->stats.lock);
+
+		if (port_isset(next_port)) {
+			struct i915_request *next_req = port_request(next_port);
+			struct intel_context *next_ce =
+					&next_req->ctx->engine[engine->id];
+
+			write_seqlock(&next_ce->stats.lock);
+			next_ce->stats.start = now;
+			next_ce->stats.active = true;
+			write_sequnlock(&next_ce->stats.lock);
+		}
+
 		if (engine->stats.active && --engine->stats.active == 0) {
 			/*
 			 * Decrement the active context count and in case GPU
 			 * is now idle add up to the running total.
 			 */
-			last = ktime_sub(ktime_get(), engine->stats.start);
+			last = ktime_sub(now, engine->stats.start);
 
 			engine->stats.total = ktime_add(engine->stats.total,
 							last);
@@ -1121,7 +1161,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 			 * the first event in which case we account from the
 			 * time stats gathering was turned on.
 			 */
-			last = ktime_sub(ktime_get(), engine->stats.enabled_at);
+			last = ktime_sub(now, engine->stats.enabled_at);
 
 			engine->stats.total = ktime_add(engine->stats.total,
 							last);
@@ -1131,6 +1171,9 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 	write_sequnlock_irqrestore(&engine->stats.lock, flags);
 }
 
+ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
+					   struct intel_engine_cs *engine);
+
 int intel_enable_engine_stats(struct intel_engine_cs *engine);
 void intel_disable_engine_stats(struct intel_engine_cs *engine);
 
-- 
2.14.1

* [RFC 3/7] drm/i915: Expose list of clients in sysfs
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:27   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose a list of clients with open file handles in sysfs.

This will be a basis for a top-like utility showing per-client and per-
engine GPU load.

Currently we only expose each client's pid and name under opaque numbered
directories in /sys/class/drm/card0/clients/.

For instance:

/sys/class/drm/card0/clients/3/name: Xorg
/sys/class/drm/card0/clients/3/pid: 5664
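
For illustration, a minimal userspace sketch (not part of this series) which
walks the layout above and prints each client:

#include <dirent.h>
#include <stdio.h>

/* Print one attribute file, e.g. .../clients/3/name. */
static void print_attr(const char *base, const char *id, const char *attr)
{
        char path[512], buf[128];
        FILE *f;

        snprintf(path, sizeof(path), "%s/%s/%s", base, id, attr);
        f = fopen(path, "r");
        if (!f)
                return;
        if (fgets(buf, sizeof(buf), f))
                printf("  %s: %s\n", attr, buf);
        fclose(f);
}

int main(void)
{
        const char *base = "/sys/class/drm/card0/clients";
        struct dirent *dent;
        DIR *dir = opendir(base);

        if (!dir)
                return 1;

        while ((dent = readdir(dir))) {
                if (dent->d_name[0] == '.')
                        continue;

                printf("client %s\n", dent->d_name);
                print_attr(base, dent->d_name, "name");
                print_attr(base, dent->d_name, "pid");
        }

        closedir(dir);
        return 0;
}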

v2:
 Chris Wilson:
 * Enclose new members into dedicated structs.
 * Protect against failed sysfs registration.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |  19 +++++++
 drivers/gpu/drm/i915/i915_gem.c   | 117 +++++++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_sysfs.c |   8 +++
 3 files changed, 137 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 649c0f2f3bae..ff2ad9719b1e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -347,6 +347,20 @@ struct drm_i915_file_private {
  */
 #define I915_MAX_CLIENT_CONTEXT_BANS 3
 	atomic_t context_bans;
+
+	struct i915_drm_client {
+		unsigned int id;
+
+		pid_t pid;
+		char *name;
+
+		struct kobject *root;
+
+		struct {
+			struct device_attribute pid;
+			struct device_attribute name;
+		} attr;
+	} client;
 };
 
 /* Interface history:
@@ -2111,6 +2125,11 @@ struct drm_i915_private {
 
 	struct i915_pmu pmu;
 
+	struct i915_drm_clients {
+		struct kobject *root;
+		atomic_t serial;
+	} clients;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 062a9c743607..421e6c9e95f4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5708,6 +5708,94 @@ int i915_gem_freeze_late(struct drm_i915_private *dev_priv)
 	return 0;
 }
 
+static ssize_t
+show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct drm_i915_file_private *file_priv =
+		container_of(attr, struct drm_i915_file_private,
+			     client.attr.name);
+
+	return snprintf(buf, PAGE_SIZE, "%s", file_priv->client.name);
+}
+
+static ssize_t
+show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct drm_i915_file_private *file_priv =
+		container_of(attr, struct drm_i915_file_private,
+			     client.attr.pid);
+
+	return snprintf(buf, PAGE_SIZE, "%u", file_priv->client.pid);
+}
+
+static int
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,
+		struct task_struct *task,
+		unsigned int serial)
+{
+	int ret = -ENOMEM;
+	struct device_attribute *attr;
+	char id[32];
+
+	if (!i915->clients.root)
+		goto err_name;
+
+	file_priv->client.name = kstrdup(task->comm, GFP_KERNEL);
+	if (!file_priv->client.name)
+		goto err_name;
+
+	snprintf(id, sizeof(id), "%u", serial);
+	file_priv->client.root = kobject_create_and_add(id,
+							i915->clients.root);
+	if (!file_priv->client.root)
+		goto err_client;
+
+	attr = &file_priv->client.attr.name;
+	attr->attr.name = "name";
+	attr->attr.mode = 0444;
+	attr->show = show_client_name;
+
+	ret = sysfs_create_file(file_priv->client.root,
+				(struct attribute *)attr);
+	if (ret)
+		goto err_attr_name;
+
+	attr = &file_priv->client.attr.pid;
+	attr->attr.name = "pid";
+	attr->attr.mode = 0444;
+	attr->show = show_client_pid;
+
+	ret = sysfs_create_file(file_priv->client.root,
+				(struct attribute *)attr);
+	if (ret)
+		goto err_attr_pid;
+
+	file_priv->client.pid = pid_nr(get_task_pid(task, PIDTYPE_PID));
+
+	return 0;
+
+err_attr_pid:
+	sysfs_remove_file(file_priv->client.root,
+			  (struct attribute *)&file_priv->client.attr.name);
+err_attr_name:
+	kobject_put(file_priv->client.root);
+err_client:
+	kfree(file_priv->client.name);
+err_name:
+	return ret;
+}
+
+static void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
+{
+	sysfs_remove_file(file_priv->client.root,
+			  (struct attribute *)&file_priv->client.attr.pid);
+	sysfs_remove_file(file_priv->client.root,
+			  (struct attribute *)&file_priv->client.attr.name);
+	kobject_put(file_priv->client.root);
+	kfree(file_priv->client.name);
+}
+
 void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
@@ -5721,32 +5809,47 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 	list_for_each_entry(request, &file_priv->mm.request_list, client_link)
 		request->file_priv = NULL;
 	spin_unlock(&file_priv->mm.lock);
+
+	i915_gem_remove_client(file_priv);
 }
 
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 {
+	int ret = -ENOMEM;
 	struct drm_i915_file_private *file_priv;
-	int ret;
 
 	DRM_DEBUG("\n");
 
 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
-		return -ENOMEM;
+		goto err_alloc;
+
+	file_priv->client.id = atomic_inc_return(&i915->clients.serial);
+	ret = i915_gem_add_client(i915, file_priv, current,
+				  file_priv->client.id);
+	if (ret)
+		goto err_client;
 
 	file->driver_priv = file_priv;
+	ret = i915_gem_context_open(i915, file);
+	if (ret)
+		goto err_context;
+
 	file_priv->dev_priv = i915;
 	file_priv->file = file;
+	file_priv->bsd_engine = -1;
 
 	spin_lock_init(&file_priv->mm.lock);
 	INIT_LIST_HEAD(&file_priv->mm.request_list);
 
-	file_priv->bsd_engine = -1;
-
-	ret = i915_gem_context_open(i915, file);
-	if (ret)
-		kfree(file_priv);
+	return 0;
 
+err_context:
+	i915_gem_remove_client(file_priv);
+err_client:
+	atomic_dec(&i915->clients.serial);
+	kfree(file_priv);
+err_alloc:
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index e5e6f6bb2b05..d809259456ef 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -581,6 +581,11 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 	struct device *kdev = dev_priv->drm.primary->kdev;
 	int ret;
 
+	dev_priv->clients.root =
+		kobject_create_and_add("clients", &kdev->kobj);
+	if (!dev_priv->clients.root)
+		DRM_ERROR("Per-client sysfs setup failed\n");
+
 #ifdef CONFIG_PM
 	if (HAS_RC6(dev_priv)) {
 		ret = sysfs_merge_group(&kdev->kobj,
@@ -641,4 +646,7 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 	sysfs_unmerge_group(&kdev->kobj, &rc6_attr_group);
 	sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
 #endif
+
+	if (dev_priv->clients.root)
+		kobject_put(dev_priv->clients.root);
 }
-- 
2.14.1

* [RFC 4/7] drm/i915: Update client name on context create
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:27   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some clients have the DRM fd passed to them over a socket by the X server.

Grab the real client and pid when they create their first context and
update the exposed data for more useful enumeration.
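
For context, the scenario is ordinary SCM_RIGHTS fd passing over a Unix
domain socket, sketched below (all names illustrative). The receiver ends up
with a new fd referring to the same struct file, so the pid and name recorded
at open() time belong to the X server, not the real user; hence the re-grab
at context create:

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Pass fd (e.g. a DRM fd) to the peer on a connected AF_UNIX socket. */
static int send_fd(int sock, int fd)
{
        char data = 0;
        struct iovec iov = { .iov_base = &data, .iov_len = 1 };
        union {
                struct cmsghdr align;
                char buf[CMSG_SPACE(sizeof(int))];
        } u;
        struct msghdr msg = {
                .msg_iov = &iov,
                .msg_iovlen = 1,
                .msg_control = u.buf,
                .msg_controllen = sizeof(u.buf),
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        /* The kernel installs a duplicate fd in the receiving process,
         * referencing the same struct file and file_private. */
        return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}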

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  8 ++++++++
 drivers/gpu/drm/i915/i915_gem.c         |  4 ++--
 drivers/gpu/drm/i915/i915_gem_context.c | 16 +++++++++++++---
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ff2ad9719b1e..456dc495e017 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3195,6 +3195,14 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
 				int align);
+
+int
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,
+		struct task_struct *task,
+		unsigned int serial);
+void i915_gem_remove_client(struct drm_i915_file_private *file_priv);
+
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 421e6c9e95f4..4c20545e3539 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5728,7 +5728,7 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return snprintf(buf, PAGE_SIZE, "%u", file_priv->client.pid);
 }
 
-static int
+int
 i915_gem_add_client(struct drm_i915_private *i915,
 		struct drm_i915_file_private *file_priv,
 		struct task_struct *task,
@@ -5786,7 +5786,7 @@ i915_gem_add_client(struct drm_i915_private *i915,
 	return ret;
 }
 
-static void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
+void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
 {
 	sysfs_remove_file(file_priv->client.root,
 			  (struct attribute *)&file_priv->client.attr.pid);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 85627fa4565b..e9b0940ce7f9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -653,9 +653,10 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file)
 {
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	pid_t pid = pid_nr(get_task_pid(current, PIDTYPE_PID));
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_context_create *args = data;
-	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct i915_gem_context *ctx;
 	int ret;
 
@@ -667,8 +668,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 
 	if (client_is_banned(file_priv)) {
 		DRM_DEBUG("client %s[%d] banned from creating ctx\n",
-			  current->comm,
-			  pid_nr(get_task_pid(current, PIDTYPE_PID)));
+			  current->comm, pid);
 
 		return -EIO;
 	}
@@ -677,6 +677,16 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
+	if (file_priv->client.pid != pid) {
+		i915_gem_remove_client(file_priv);
+		ret = i915_gem_add_client(dev_priv, file_priv, current,
+					  file_priv->client.id);
+		if (ret) {
+			mutex_unlock(&dev->struct_mutex);
+			return ret;
+		}
+	}
+
 	ctx = i915_gem_create_context(dev_priv, file_priv);
 	mutex_unlock(&dev->struct_mutex);
 	if (IS_ERR(ctx))
-- 
2.14.1

* [RFC 5/7] drm/i915: Expose per-engine client busyness
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:27   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files are one per-engine instance and located under the 'busy'
directory.

Each contains a monotonically increasing, nanosecond-resolution count of
the time the client's jobs have spent executing on the GPU.

$ cat /sys/class/drm/card0/clients/5/busy/rcs0
32516602
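
Since there is one file per engine instance under the directory, a tool
can enumerate the engines by simply listing it, e.g. on a hypothetical
five engine part:

$ ls /sys/class/drm/card0/clients/5/busy/
bcs0  rcs0  vcs0  vcs1  vecs0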

This data can serve as an interface to implement a top like utility for
GPU jobs. For instance I have prototyped a tool in IGT which produces
periodic output like:

neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
     Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
    xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%

This tool can also be extended to use the i915 PMU to show overall engine
busyness, and engine loads using the queue depth metric.
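
For reference, a minimal stand-alone sampler on top of this interface
could look something like the sketch below (the client id, engine name
and sample period are illustrative only, and error handling is trimmed):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static uint64_t read_busy(const char *path)
{
	uint64_t ns = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%" SCNu64, &ns) != 1)
			ns = 0;
		fclose(f);
	}

	return ns;
}

int main(void)
{
	const char *path = "/sys/class/drm/card0/clients/5/busy/rcs0";
	uint64_t prev = read_busy(path);

	for (;;) {
		uint64_t now;

		sleep(1);
		now = read_busy(path);

		/* Busy times are in ns, so delta over a 1s period = load. */
		printf("rcs0: %6.2f%%\n", (now - prev) / 1e9 * 100.0);
		prev = now;
	}

	return 0;
}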

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  8 ++++
 drivers/gpu/drm/i915/i915_gem.c | 81 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 456dc495e017..f83b8dcac16b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -317,6 +317,12 @@ struct drm_i915_private;
 struct i915_mm_struct;
 struct i915_mmu_object;
 
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct drm_i915_file_private *file_priv;
+	struct intel_engine_cs *engine;
+};
+
 struct drm_i915_file_private {
 	struct drm_i915_private *dev_priv;
 	struct drm_file *file;
@@ -355,10 +361,12 @@ struct drm_i915_file_private {
 		char *name;
 
 		struct kobject *root;
+		struct kobject *busy_root;
 
 		struct {
 			struct device_attribute pid;
 			struct device_attribute name;
+			struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
 		} attr;
 	} client;
 };
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4c20545e3539..6a550fb110fb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5728,6 +5728,38 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return snprintf(buf, PAGE_SIZE, "%u", file_priv->client.pid);
 }
 
+struct busy_ctx {
+	struct intel_engine_cs *engine;
+	u64 total;
+};
+
+static int busy_add(int _id, void *p, void *data)
+{
+	struct i915_gem_context *ctx = p;
+	struct busy_ctx *bc = data;
+
+	bc->total +=
+		ktime_to_ns(intel_context_engine_get_busy_time(ctx,
+							       bc->engine));
+
+	return 0;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	struct drm_i915_file_private *file_priv = i915_attr->file_priv;
+	struct busy_ctx bc = { .engine = i915_attr->engine };
+
+	rcu_read_lock();
+	idr_for_each(&file_priv->context_idr, busy_add, &bc);
+	rcu_read_unlock();
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", bc.total);
+}
+
 int
 i915_gem_add_client(struct drm_i915_private *i915,
 		struct drm_i915_file_private *file_priv,
@@ -5735,8 +5767,10 @@ i915_gem_add_client(struct drm_i915_private *i915,
 		unsigned int serial)
 {
 	int ret = -ENOMEM;
+	struct intel_engine_cs *engine;
 	struct device_attribute *attr;
-	char id[32];
+	enum intel_engine_id id, id2;
+	char idstr[32];
 
 	if (!i915->clients.root)
 		goto err_name;
@@ -5745,8 +5779,8 @@ i915_gem_add_client(struct drm_i915_private *i915,
 	if (!file_priv->client.name)
 		goto err_name;
 
-	snprintf(id, sizeof(id), "%u", serial);
-	file_priv->client.root = kobject_create_and_add(id,
+	snprintf(idstr, sizeof(idstr), "%u", serial);
+	file_priv->client.root = kobject_create_and_add(idstr,
 							i915->clients.root);
 	if (!file_priv->client.root)
 		goto err_client;
@@ -5771,10 +5805,41 @@ i915_gem_add_client(struct drm_i915_private *i915,
 	if (ret)
 		goto err_attr_pid;
 
+	file_priv->client.busy_root =
+			kobject_create_and_add("busy", file_priv->client.root);
+	if (!file_priv->client.busy_root)
+		goto err_busy_root;
+
+	for_each_engine(engine, i915, id) {
+		file_priv->client.attr.busy[id].file_priv = file_priv;
+		file_priv->client.attr.busy[id].engine = engine;
+		attr = &file_priv->client.attr.busy[id].attr;
+		attr->attr.name = engine->name;
+		attr->attr.mode = 0444;
+		attr->show = show_client_busy;
+
+		ret = sysfs_create_file(file_priv->client.busy_root,
+				        (struct attribute *)attr);
+		if (ret)
+			goto err_attr_busy;
+	}
+
 	file_priv->client.pid = pid_nr(get_task_pid(task, PIDTYPE_PID));
 
 	return 0;
 
+err_attr_busy:
+	for_each_engine(engine, i915, id2) {
+		if (id2 == id)
+			break;
+
+		sysfs_remove_file(file_priv->client.busy_root,
+				  (struct attribute *)&file_priv->client.attr.busy[id2]);
+	}
+	kobject_put(file_priv->client.busy_root);
+err_busy_root:
+	sysfs_remove_file(file_priv->client.root,
+			  (struct attribute *)&file_priv->client.attr.pid);
 err_attr_pid:
 	sysfs_remove_file(file_priv->client.root,
 			  (struct attribute *)&file_priv->client.attr.name);
@@ -5788,10 +5853,20 @@ i915_gem_add_client(struct drm_i915_private *i915,
 
 void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
 {
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, file_priv->dev_priv, id)
+		sysfs_remove_file(file_priv->client.busy_root,
+				  (struct attribute *)&file_priv->client.attr.busy[id]);
+
+	kobject_put(file_priv->client.busy_root);
+
 	sysfs_remove_file(file_priv->client.root,
 			  (struct attribute *)&file_priv->client.attr.pid);
 	sysfs_remove_file(file_priv->client.root,
 			  (struct attribute *)&file_priv->client.attr.name);
+
 	kobject_put(file_priv->client.root);
 	kfree(file_priv->client.name);
 }
-- 
2.14.1


* [RFC 6/7] drm/i915: Add sysfs toggle to enable per-client engine stats
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:27   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

By default we do not collect any per-engine and per-context
statistics.

Add a new sysfs toggle to enable this facility:

$ echo 1 >/sys/class/drm/card0/clients/enable_stats
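
Reading the file back reports the current state, e.g.:

$ cat /sys/class/drm/card0/clients/enable_stats
1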

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |  4 +++
 drivers/gpu/drm/i915/i915_sysfs.c | 72 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f83b8dcac16b..aa82df864dd4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2136,6 +2136,10 @@ struct drm_i915_private {
 	struct i915_drm_clients {
 		struct kobject *root;
 		atomic_t serial;
+		struct {
+			bool enabled;
+			struct device_attribute attr;
+		} stats;
 	} clients;
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index d809259456ef..70115072d56f 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -576,9 +576,67 @@ static void i915_setup_error_capture(struct device *kdev) {}
 static void i915_teardown_error_capture(struct device *kdev) {}
 #endif
 
+static ssize_t
+show_client_stats(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct drm_i915_private *i915 =
+		container_of(attr, struct drm_i915_private, clients.stats.attr);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", i915->clients.stats.enabled);
+}
+
+static ssize_t
+store_client_stats(struct device *kdev, struct device_attribute *attr,
+		   const char *buf, size_t count)
+{
+	struct drm_i915_private *i915 =
+		container_of(attr, struct drm_i915_private, clients.stats.attr);
+	bool disable = false;
+	bool enable = false;
+	bool val = false;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int ret;
+
+	/* Use RCS as proxy for all engines. */
+	if (!intel_engine_supports_stats(i915->engine[RCS]))
+		return -EINVAL;
+
+	ret = kstrtobool(buf, &val);
+	if (ret)
+		return ret;
+
+	ret = i915_mutex_lock_interruptible(&i915->drm);
+	if (ret)
+		return ret;
+
+	if (val && !i915->clients.stats.enabled)
+		enable = true;
+	else if (!val && i915->clients.stats.enabled)
+		disable = true;
+
+	if (!enable && !disable)
+		goto out;
+
+	for_each_engine(engine, i915, id) {
+		if (enable)
+			WARN_ON_ONCE(intel_enable_engine_stats(engine));
+		else if (disable)
+			intel_disable_engine_stats(engine);
+	}
+
+	i915->clients.stats.enabled = val;
+
+out:
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return count;
+}
+
 void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 {
 	struct device *kdev = dev_priv->drm.primary->kdev;
+	struct device_attribute *attr;
 	int ret;
 
 	dev_priv->clients.root =
@@ -586,6 +644,17 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 	if (!dev_priv->clients.root)
 		DRM_ERROR("Per-client sysfs setup failed\n");
 
+	attr = &dev_priv->clients.stats.attr;
+	attr->attr.name = "enable_stats";
+	attr->attr.mode = 0664;
+	attr->show = show_client_stats;
+	attr->store = store_client_stats;
+
+	ret = sysfs_create_file(dev_priv->clients.root,
+				(struct attribute *)attr);
+	if (ret)
+		DRM_ERROR("Per-client sysfs setup failed! (%d)\n", ret);
+
 #ifdef CONFIG_PM
 	if (HAS_RC6(dev_priv)) {
 		ret = sysfs_merge_group(&kdev->kobj,
@@ -647,6 +716,9 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 	sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
 #endif
 
+	sysfs_remove_file(dev_priv->clients.root,
+			  (struct attribute *)&dev_priv->clients.stats.attr);
+
 	if (dev_priv->clients.root)
 		kobject_put(dev_priv->clients.root);
 }
-- 
2.14.1


* [RFC 7/7] drm/i915: Allow clients to query own per-engine busyness
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:27   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2018-04-17 12:27 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, gordon.kelly

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some customers want to know how much GPU time their clients are using
in order to make dynamic load balancing decisions.

With the accounting infrastructure in place in the previous patch, we add
a new context param (I915_CONTEXT_GET_ENGINES_BUSY) which points to struct
drm_i915_context_engines_busy, followed by a variable number of structs
drm_i915_context_engine_busy.

Userspace needs to provide the number of attached structures in the
num_engines field, as well as set args->size to the byte size of the
provided buffer.

Each attached drm_i915_context_engine_busy object needs its class and
instance fields initialized to identify the engine whose busyness
userspace wants to query.

The kernel will then report accumulated engine busyness as a
monotonically increasing number of nanoseconds the engine spent
executing jobs belonging to this context.
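
For illustration, userspace could query the render engine busyness of a
context along the lines of the below sketch (the engine class value,
context id handling and error handling are placeholders only):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

static uint64_t query_rcs_busy(int drm_fd, uint32_t ctx_id)
{
	struct drm_i915_context_engines_busy *q;
	struct drm_i915_gem_context_param arg;
	size_t sz = sizeof(*q) + sizeof(q->engines[0]);
	uint64_t busy;

	/* Header plus one engine entry, all mbz fields cleared. */
	q = calloc(1, sz);
	if (!q)
		return 0;
	q->num_engines = 1;
	q->engines[0].class = 0;	/* I915_ENGINE_CLASS_RENDER */
	q->engines[0].instance = 0;

	memset(&arg, 0, sizeof(arg));
	arg.ctx_id = ctx_id;
	arg.size = sz;
	arg.param = I915_CONTEXT_GET_ENGINES_BUSY;
	arg.value = (uintptr_t)q;

	if (drmIoctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &arg))
		perror("I915_CONTEXT_GET_ENGINES_BUSY");

	busy = q->engines[0].busy;
	free(q);

	return busy;
}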

v2:
 * Use intel_context_engine_get_busy_time.
 * Refactor to only use struct_mutex while initially enabling engine
   stats.

v3:
 * Fix stats enabling.

v4:
 * Change uAPI to enable querying multiple engines at a time.
   (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: gordon.kelly@intel.com
---
This version has been compile tested only until acceptable to
everyone.
---
 drivers/gpu/drm/i915/i915_gem_context.c | 97 ++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.h |  1 +
 include/uapi/drm/i915_drm.h             | 21 +++++++
 3 files changed, 116 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e9b0940ce7f9..7cfec17b51cf 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -126,6 +126,9 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	for (i = 0; i < I915_NUM_ENGINES; i++) {
 		struct intel_context *ce = &ctx->engine[i];
 
+		if (ctx->i915->engine[i] && ce->stats.enabled)
+			intel_disable_engine_stats(ctx->i915->engine[i]);
+
 		if (!ce->state)
 			continue;
 
@@ -730,11 +733,95 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int
+get_engines_busy(struct drm_i915_private *i915,
+		 struct i915_gem_context *ctx,
+		 struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_context_engine_busy __user *busy_user;
+	struct drm_i915_context_engines_busy engines;
+	struct drm_i915_context_engine_busy busy;
+	bool mutex = false;
+	unsigned int i;
+	int ret = 0;
+
+	if (args->size < sizeof(engines))
+		return -EINVAL;
+
+	if (copy_from_user(&engines, u64_to_user_ptr(args->value),
+			   sizeof(engines)))
+		return -EFAULT;
+
+	if (engines.pad || engines.mbz)
+		return -EINVAL;
+
+	if (engines.num_engines == 0 || engines.num_engines > I915_NUM_ENGINES)
+		return -EINVAL;
+
+	if (!access_ok(VERIFY_WRITE, u64_to_user_ptr(args->value),
+		       sizeof(engines) + engines.num_engines * sizeof(busy)))
+		return -EFAULT;
+
+	busy_user = (struct drm_i915_context_engine_busy __user *)
+		    u64_to_user_ptr(args->value + sizeof(engines));
+
+	for (i = 0; i < engines.num_engines; i++, busy_user++) {
+		struct intel_engine_cs *engine;
+		struct intel_context *ce;
+
+		__copy_from_user(&busy, busy_user, sizeof(busy));
+
+		if (busy.mbz || busy.flags || busy.busy) {
+			ret = -EINVAL;
+			goto out;
+		}
+
+		engine = intel_engine_lookup_user(i915,
+						  busy.class, busy.instance);
+		if (!engine) {
+			ret = -EINVAL;
+			goto out;
+		}
+
+		/* Enable stats on first query. */
+		ce = &ctx->engine[engine->id];
+		if (!READ_ONCE(ce->stats.enabled)) {
+			/* Grab mutex if need to enable engine stats. */
+			if (!mutex) {
+				ret = i915_mutex_lock_interruptible(&i915->drm);
+				if (ret)
+					goto out;
+				mutex = true;
+			}
+
+			if (!ce->stats.enabled) {
+				ret = intel_enable_engine_stats(engine);
+				if (ret)
+					goto out;
+				ce->stats.enabled = true;
+			}
+		}
+
+		busy.busy =
+			ktime_to_ns(intel_context_engine_get_busy_time(ctx,
+								       engine));
+
+		__copy_to_user(busy_user, &busy, sizeof(busy));
+	}
+
+out:
+	if (mutex)
+		mutex_unlock(&i915->drm.struct_mutex);
+
+	return ret;
+}
+
 int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct drm_i915_gem_context_param *args = data;
+	struct drm_i915_private *i915 = to_i915(dev);
 	struct i915_gem_context *ctx;
 	int ret = 0;
 
@@ -753,10 +840,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_CONTEXT_PARAM_GTT_SIZE:
 		if (ctx->ppgtt)
 			args->value = ctx->ppgtt->base.total;
-		else if (to_i915(dev)->mm.aliasing_ppgtt)
-			args->value = to_i915(dev)->mm.aliasing_ppgtt->base.total;
+		else if (i915->mm.aliasing_ppgtt)
+			args->value = i915->mm.aliasing_ppgtt->base.total;
 		else
-			args->value = to_i915(dev)->ggtt.base.total;
+			args->value = i915->ggtt.base.total;
 		break;
 	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
 		args->value = i915_gem_context_no_error_capture(ctx);
@@ -767,6 +854,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_CONTEXT_PARAM_PRIORITY:
 		args->value = ctx->priority;
 		break;
+	case I915_CONTEXT_GET_ENGINES_BUSY:
+		ret = get_engines_busy(i915, ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -842,6 +932,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		}
 		break;
 
+	case I915_CONTEXT_GET_ENGINES_BUSY:
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 159223c5fc5f..e468c971a7f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -163,6 +163,7 @@ struct i915_gem_context {
 		int pin_count;
 		struct {
 			seqlock_t lock;
+			bool enabled;
 			bool active;
 			ktime_t start;
 			ktime_t total;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index c82035b71824..c1bdae484594 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1460,6 +1460,26 @@ struct drm_i915_gem_userptr {
 	__u32 handle;
 };
 
+struct drm_i915_context_engine_busy {
+	__u8 class; /* in/out */
+	__u8 instance; /* in/out */
+
+	__u16 mbz;
+
+	__u32 flags; /* in/out/mbz */
+
+	__u64 busy; /* out/mbz */
+};
+
+struct drm_i915_context_engines_busy {
+	__u32 num_engines; /* in */
+	__u32 pad; /* mbz */
+
+	__u64 mbz;
+
+	struct drm_i915_context_engine_busy engines[0];
+};
+
 struct drm_i915_gem_context_param {
 	__u32 ctx_id;
 	__u32 size;
@@ -1473,6 +1493,7 @@ struct drm_i915_gem_context_param {
 #define   I915_CONTEXT_MAX_USER_PRIORITY	1023 /* inclusive */
 #define   I915_CONTEXT_DEFAULT_PRIORITY		0
 #define   I915_CONTEXT_MIN_USER_PRIORITY	-1023 /* inclusive */
+#define I915_CONTEXT_GET_ENGINES_BUSY	0x7
 	__u64 value;
 };
 
-- 
2.14.1


* Re: [RFC v4 0/7] Per-context and per-client engine busyness
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:39   ` Chris Wilson
  -1 siblings, 0 replies; 27+ messages in thread
From: Chris Wilson @ 2018-04-17 12:39 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2018-04-17 13:27:29)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Another re-post of my earlier, now slightly updated work, to expose a DRM client
> hierarchy in sysfs in order to enable a top like tool:
> 
> intel-gpu-top - load avg 40.80, 27.11,  1.50;  882/ 950 MHz;    0% RC6;  13.26 Watts;   261903 irqs/s
> 
>       IMC reads:     5543 MiB/s
>      IMC writes:      236 MiB/s
> 
>           ENGINE      BUSY                            QD     MI_SEMA MI_WAIT

QD could do with more explanation, maybe like so?

            ENGINE      BUSY                         E   Q   W MI_SEMA MI_WAIT
>      Render/3D/0    60.47% |███████████▍       |  28   0   1      0%      0%

Render/Compute?

Another visual break after QD?

  Render/Compute/0    60.47% |███████████▍       |  28   0   1 |    0%      0%

>        Blitter/0    92.70% |█████████████████▌ |   0   0   1      0%      0%
>          Video/0   100.00% |███████████████████|  15  37   2      0%      0%
>          Video/1    51.68% |█████████▊         |   0   0   1      0%      0%
>   VideoEnhance/0     0.00% |                   |   0   0   0      0%      0%
> 
>   PID            NAME   rcs0       bcs0       vcs0       vcs1       vecs0
> 21664        gem_wsim |█████▍   ||         ||█████████||████▋    ||         |
> 21662     gem_latency |         ||████████▎||         ||         ||         |
> 21662     gem_latency |         ||         ||         ||         ||         |
-Chris

* Re: [PATCH 1/7] drm/i915: Use seqlock in engine stats
  2018-04-17 12:27   ` [igt-dev] " Tvrtko Ursulin
@ 2018-04-17 12:47     ` Chris Wilson
  -1 siblings, 0 replies; 27+ messages in thread
From: Chris Wilson @ 2018-04-17 12:47 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2018-04-17 13:27:30)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We can convert engine stats from a spinlock to seqlock to ensure interrupt
> processing is never even a tiny bit delayed by parallel readers.
> 
> There is a smidgen more cost on the write lock side, and an extremely
> unlikely chance that readers will have to retry a few times in the face
> of heavy interrupt load. But that should be rare given how lightweight
> the read side section is compared to the interrupt processing side, and
> also compared to the rest of the code paths which can lead into it.

Also mention that readers are informative only; it's the writers that are
doing the work/submission.

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c  | 19 ++++++++++---------
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 11 ++++++-----
>  2 files changed, 16 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index e4992c2e23a4..de61d3d1653d 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -306,7 +306,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>         /* Nothing to do here, execute in order of dependencies */
>         engine->schedule = NULL;
>  
> -       spin_lock_init(&engine->stats.lock);
> +       seqlock_init(&engine->stats.lock);
>  
>         ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>  
> @@ -2065,7 +2065,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
>                 return -ENODEV;
>  
>         tasklet_disable(&execlists->tasklet);
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>  
>         if (unlikely(engine->stats.enabled == ~0)) {
>                 err = -EBUSY;
> @@ -2089,7 +2089,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
>         }
>  
>  unlock:
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>         tasklet_enable(&execlists->tasklet);
>  
>         return err;
> @@ -2118,12 +2118,13 @@ static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
>   */
>  ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
>  {
> +       unsigned int seq;
>         ktime_t total;
> -       unsigned long flags;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> -       total = __intel_engine_get_busy_time(engine);
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       do {
> +               seq = read_seqbegin(&engine->stats.lock);
> +               total = __intel_engine_get_busy_time(engine);
> +       } while (read_seqretry(&engine->stats.lock, seq));
>  
>         return total;
>  }
> @@ -2141,13 +2142,13 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
>         if (!intel_engine_supports_stats(engine))
>                 return;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>         WARN_ON_ONCE(engine->stats.enabled == 0);
>         if (--engine->stats.enabled == 0) {
>                 engine->stats.total = __intel_engine_get_busy_time(engine);
>                 engine->stats.active = 0;
>         }
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }
>  
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index d50b31eb43a5..f24ea9826037 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -3,6 +3,7 @@
>  #define _INTEL_RINGBUFFER_H_
>  
>  #include <linux/hashtable.h>
> +#include <linux/seqlock.h>
>  
>  #include "i915_gem_batch_pool.h"
>  #include "i915_gem_timeline.h"
> @@ -610,7 +611,7 @@ struct intel_engine_cs {
>                 /**
>                  * @lock: Lock protecting the below fields.
>                  */
> -               spinlock_t lock;
> +               seqlock_t lock;
>                 /**
>                  * @enabled: Reference count indicating number of listeners.
>                  */
> @@ -1082,7 +1083,7 @@ static inline void intel_engine_context_in(struct intel_engine_cs *engine)
>         if (READ_ONCE(engine->stats.enabled) == 0)
>                 return;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>  
>         if (engine->stats.enabled > 0) {
>                 if (engine->stats.active++ == 0)
> @@ -1090,7 +1091,7 @@ static inline void intel_engine_context_in(struct intel_engine_cs *engine)
>                 GEM_BUG_ON(engine->stats.active == 0);
>         }
>  
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }
>  
>  static inline void intel_engine_context_out(struct intel_engine_cs *engine)
> @@ -1100,7 +1101,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
>         if (READ_ONCE(engine->stats.enabled) == 0)
>                 return;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>  
>         if (engine->stats.enabled > 0) {
>                 ktime_t last;
> @@ -1127,7 +1128,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
>                 }
>         }
>  
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

I think that's good enough as a stand-alone patch, even though we were
unable to realise the no irq_disable gains.
-Chris

* Re: [igt-dev] [PATCH 1/7] drm/i915: Use seqlock in engine stats
@ 2018-04-17 12:47     ` Chris Wilson
  0 siblings, 0 replies; 27+ messages in thread
From: Chris Wilson @ 2018-04-17 12:47 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2018-04-17 13:27:30)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We can convert engine stats from a spinlock to seqlock to ensure interrupt
> processing is never even a tiny bit delayed by parallel readers.
> 
> There is a smidgen bit more cost on the write lock side, and an extremely
> unlikely chance that readers will have to retry a few times in face of
> heavy interrupt load.Bbut it should be extremely unlikely given how
> lightweight read side section is compared to the interrupt processing side,
> and also compared to the rest of the code paths which can lead into it.

Also mention that readers are informative only, its the writers that are
doing the work/submission.

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c  | 19 ++++++++++---------
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 11 ++++++-----
>  2 files changed, 16 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index e4992c2e23a4..de61d3d1653d 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -306,7 +306,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>         /* Nothing to do here, execute in order of dependencies */
>         engine->schedule = NULL;
>  
> -       spin_lock_init(&engine->stats.lock);
> +       seqlock_init(&engine->stats.lock);
>  
>         ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>  
> @@ -2065,7 +2065,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
>                 return -ENODEV;
>  
>         tasklet_disable(&execlists->tasklet);
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>  
>         if (unlikely(engine->stats.enabled == ~0)) {
>                 err = -EBUSY;
> @@ -2089,7 +2089,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
>         }
>  
>  unlock:
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>         tasklet_enable(&execlists->tasklet);
>  
>         return err;
> @@ -2118,12 +2118,13 @@ static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
>   */
>  ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
>  {
> +       unsigned int seq;
>         ktime_t total;
> -       unsigned long flags;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> -       total = __intel_engine_get_busy_time(engine);
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       do {
> +               seq = read_seqbegin(&engine->stats.lock);
> +               total = __intel_engine_get_busy_time(engine);
> +       } while (read_seqretry(&engine->stats.lock, seq));
>  
>         return total;
>  }
> @@ -2141,13 +2142,13 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
>         if (!intel_engine_supports_stats(engine))
>                 return;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>         WARN_ON_ONCE(engine->stats.enabled == 0);
>         if (--engine->stats.enabled == 0) {
>                 engine->stats.total = __intel_engine_get_busy_time(engine);
>                 engine->stats.active = 0;
>         }
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }
>  
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index d50b31eb43a5..f24ea9826037 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -3,6 +3,7 @@
>  #define _INTEL_RINGBUFFER_H_
>  
>  #include <linux/hashtable.h>
> +#include <linux/seqlock.h>
>  
>  #include "i915_gem_batch_pool.h"
>  #include "i915_gem_timeline.h"
> @@ -610,7 +611,7 @@ struct intel_engine_cs {
>                 /**
>                  * @lock: Lock protecting the below fields.
>                  */
> -               spinlock_t lock;
> +               seqlock_t lock;
>                 /**
>                  * @enabled: Reference count indicating number of listeners.
>                  */
> @@ -1082,7 +1083,7 @@ static inline void intel_engine_context_in(struct intel_engine_cs *engine)
>         if (READ_ONCE(engine->stats.enabled) == 0)
>                 return;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>  
>         if (engine->stats.enabled > 0) {
>                 if (engine->stats.active++ == 0)
> @@ -1090,7 +1091,7 @@ static inline void intel_engine_context_in(struct intel_engine_cs *engine)
>                 GEM_BUG_ON(engine->stats.active == 0);
>         }
>  
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }
>  
>  static inline void intel_engine_context_out(struct intel_engine_cs *engine)
> @@ -1100,7 +1101,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
>         if (READ_ONCE(engine->stats.enabled) == 0)
>                 return;
>  
> -       spin_lock_irqsave(&engine->stats.lock, flags);
> +       write_seqlock_irqsave(&engine->stats.lock, flags);
>  
>         if (engine->stats.enabled > 0) {
>                 ktime_t last;
> @@ -1127,7 +1128,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
>                 }
>         }
>  
> -       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +       write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

I think that's good enough as a stand-alone patch, even though we were
unable to realise the hoped-for gain of avoiding the irq disable.
-Chris
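
(For reference, the seqlock pattern being adopted above, shown in
isolation -- a minimal illustrative sketch, not code from this series.
Readers optimistically snapshot and retry if a writer raced with them,
so the write side, which runs on the context-switch hot path, never
waits on a stats query:

	#include <linux/seqlock.h>
	#include <linux/ktime.h>

	static DEFINE_SEQLOCK(stats_lock);
	static ktime_t stats_total;	/* only written under write_seqlock() */

	static ktime_t read_stats_total(void)
	{
		unsigned int seq;
		ktime_t total;

		do {
			seq = read_seqbegin(&stats_lock);	/* sample sequence count */
			total = stats_total;			/* speculative read */
		} while (read_seqretry(&stats_lock, seq));	/* retry on writer overlap */

		return total;
	}
)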

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [igt-dev] [RFC 2/7] drm/i915: Track per-context engine busyness
  2018-04-17 12:27   ` [Intel-gfx] " Tvrtko Ursulin
@ 2018-04-17 12:58     ` Chris Wilson
  -1 siblings, 0 replies; 27+ messages in thread
From: Chris Wilson @ 2018-04-17 12:58 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, gordon.kelly

Quoting Tvrtko Ursulin (2018-04-17 13:27:31)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Some customers want to know how much of the GPU time their clients are
> using in order to make dynamic load balancing decisions.
> 
> With the hooks already in place which track the overall engine busyness,
> we can extend that slightly to split that time between contexts.
> 
> v2: Fix accounting for tail updates.
> v3: Rebase.
> v4: Mark currently running contexts as active on stats enable.
> v5: Include some headers to fix the build.
> v6: Added fine grained lock.
> v7: Convert to seqlock. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: gordon.kelly@intel.com
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c |  8 +++++
>  drivers/gpu/drm/i915/i915_gem_context.h |  7 +++++
>  drivers/gpu/drm/i915/intel_engine_cs.c  | 31 +++++++++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c        | 14 ++++++---
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 55 +++++++++++++++++++++++++++++----
>  5 files changed, 104 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 5cfac0255758..85627fa4565b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -349,7 +349,9 @@ static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *dev_priv,
>                         struct drm_i915_file_private *file_priv)
>  {
> +       struct intel_engine_cs *engine;
>         struct i915_gem_context *ctx;
> +       enum intel_engine_id id;
>  
>         lockdep_assert_held(&dev_priv->drm.struct_mutex);
>  
> @@ -375,6 +377,12 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>                 ctx->desc_template = default_desc_template(dev_priv, ppgtt);
>         }
>  
> +       /* Initialize the context stats lock. */
> +       for_each_engine(engine, dev_priv, id) {
> +               if (intel_engine_supports_stats(engine))
> +                       seqlock_init(&ctx->engine[id].stats.lock);

Keep it simple, always init the substruct.
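
I.e. a sketch of the suggestion -- drop the supports_stats() check and
initialise the stats lock unconditionally:

	for_each_engine(engine, dev_priv, id)
		seqlock_init(&ctx->engine[id].stats.lock);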

> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index de61d3d1653d..e66166466d6d 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -2078,6 +2078,16 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
>  
>                 engine->stats.enabled_at = ktime_get();
>  
> +               /* Mark currently running context as active. */
> +               if (port_isset(port)) {
> +                       struct i915_request *req = port_request(port);
> +                       struct intel_context *ce =
> +                                       &req->ctx->engine[engine->id];

Bleh, better break out the
	struct i915_request {
		struct intel_context *ce;
	}
patch, as this is getting repetitive ;)
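
(A sketch of what such a patch would buy at the call sites --
hypothetical, the cached pointer does not exist at this point in the
series:

	/* cached once at request construction... */
	rq->ce = &ctx->engine[engine->id];

	/* ...so look-ups like this */
	ce = &rq->ctx->engine[engine->id];
	/* shrink to */
	ce = rq->ce;
)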
> +
> +                       ce->stats.start = engine->stats.enabled_at;
> +                       ce->stats.active = true;
> +               }
> +
>                 /* XXX submission method oblivious? */
>                 while (num_ports-- && port_isset(port)) {
>                         engine->stats.active++;
> @@ -2151,6 +2161,27 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
>         write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }
>  
> +ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
> +                                          struct intel_engine_cs *engine)
> +{
> +       struct intel_context *ce = &ctx->engine[engine->id];
> +       unsigned int seq;
> +       ktime_t total;
> +
> +       do {
> +               seq = read_seqbegin(&ce->stats.lock);
> +
> +               total = ce->stats.total;
> +
> +               if (ce->stats.active)
> +                       total = ktime_add(total,
> +                                         ktime_sub(ktime_get(),
> +                                                   ce->stats.start));
> +       } while (read_seqretry(&ce->stats.lock, seq));
> +
> +       return total;
> +}
> +
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>  #include "selftests/mock_engine.c"
>  #include "selftests/intel_engine_cs.c"
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ddd14e30be6c..930fcf0b1e86 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -389,16 +389,18 @@ execlists_user_end(struct intel_engine_execlists *execlists)
>  }
>  
>  static inline void
> -execlists_context_schedule_in(struct i915_request *rq)
> +execlists_context_schedule_in(struct i915_request *rq, unsigned int port)
>  {
>         execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
> -       intel_engine_context_in(rq->engine);
> +       intel_engine_context_in(rq->engine,
> +                               &rq->ctx->engine[rq->engine->id],
> +                               port == 0);
>  }
>  
>  static inline void
>  execlists_context_schedule_out(struct i915_request *rq)
>  {
> -       intel_engine_context_out(rq->engine);
> +       intel_engine_context_out(rq->engine, &rq->ctx->engine[rq->engine->id]);
>         execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
>  }
>  
> @@ -463,7 +465,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>                 if (rq) {
>                         GEM_BUG_ON(count > !n);
>                         if (!count++)
> -                               execlists_context_schedule_in(rq);
> +                               execlists_context_schedule_in(rq, n);
>                         port_set(&port[n], port_pack(rq, count));
>                         desc = execlists_update_context(rq);
>                         GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
> @@ -751,7 +753,9 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
>                           intel_engine_get_seqno(rq->engine));
>  
>                 GEM_BUG_ON(!execlists->active);
> -               intel_engine_context_out(rq->engine);
> +
> +               intel_engine_context_out(rq->engine,
> +                                        &rq->ctx->engine[rq->engine->id]);
>  
>                 execlists_context_status_change(rq,
>                                                 i915_request_completed(rq) ?
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index f24ea9826037..4d122ba1cd30 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -6,6 +6,7 @@
>  #include <linux/seqlock.h>
>  
>  #include "i915_gem_batch_pool.h"
> +#include "i915_gem_context.h"
>  #include "i915_gem_timeline.h"
>  
>  #include "i915_reg.h"
> @@ -1076,25 +1077,44 @@ void intel_engine_dump(struct intel_engine_cs *engine,
>  struct intel_engine_cs *
>  intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
>  
> -static inline void intel_engine_context_in(struct intel_engine_cs *engine)
> +static inline void
> +intel_engine_context_in(struct intel_engine_cs *engine,
> +                       struct intel_context *ce,
> +                       bool submit)
>  {
>         unsigned long flags;
> +       ktime_t now;
>  
>         if (READ_ONCE(engine->stats.enabled) == 0)
>                 return;
>  
>         write_seqlock_irqsave(&engine->stats.lock, flags);
>  
> +       if (submit) {
> +               now = ktime_get();
> +               write_seqlock(&ce->stats.lock);
> +               ce->stats.start = now;
> +               ce->stats.active = true;
> +               write_sequnlock(&ce->stats.lock);
> +       } else {
> +               now = 0;
> +       }

So "submit" means that this is the first request in a sequence. But you
then only update ce->stats.start/active when the context is submitted
into port 0. If it is pulled in via port N, we never start the client
timer when it is later promoted to port 0.

I must have misunderstood.
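
(A hypothetical shape for covering that case -- start the clock whenever
a request lands in ELSP[0], not only on first submission; the helper
name is illustrative:

	static inline void
	intel_context_stats_start(struct intel_context *ce, ktime_t now)
	{
		write_seqlock(&ce->stats.lock);
		ce->stats.start = now;
		ce->stats.active = true;
		write_sequnlock(&ce->stats.lock);
	}

called from both first submission and promotion into port 0.)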

> +
>         if (engine->stats.enabled > 0) {
> -               if (engine->stats.active++ == 0)
> -                       engine->stats.start = ktime_get();
> +               if (engine->stats.active++ == 0) {
> +                       if (!now)
> +                               now = ktime_get();
> +                       engine->stats.start = now;
> +               }
>                 GEM_BUG_ON(engine->stats.active == 0);
>         }
>  
>         write_sequnlock_irqrestore(&engine->stats.lock, flags);
>  }
>  
> -static inline void intel_engine_context_out(struct intel_engine_cs *engine)
> +static inline void
> +intel_engine_context_out(struct intel_engine_cs *engine,
> +                        struct intel_context *ce)
>  {
>         unsigned long flags;
>  
> @@ -1104,14 +1124,34 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
>         write_seqlock_irqsave(&engine->stats.lock, flags);
>  
>         if (engine->stats.enabled > 0) {
> +               struct execlist_port *next_port = &engine->execlists.port[1];

Yikes, we had better pass in the next port. There's a similarity here
with execlists_begin_user, maybe we should refactor that to help in this
case (single hook?).

Yeah, I'm thinking that we now have 3 or 4 different cases that try to
update bookkeeping between context switches, all of which want the same
hooks.
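
(One hypothetical shape for such a single hook, purely illustrative --
context_stats_stop()/context_stats_start() are assumed helpers wrapping
the ce->stats seqlock updates shown earlier:

	/* called once per ELSP transition instead of the in/out pairs */
	static void
	intel_engine_context_switch(struct intel_engine_cs *engine,
				    struct i915_request *prev,
				    struct i915_request *next)
	{
		ktime_t now = ktime_get();

		if (prev)
			context_stats_stop(&prev->ctx->engine[engine->id], now);
		if (next)
			context_stats_start(&next->ctx->engine[engine->id], now);
	}
)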
-Chris

^ permalink raw reply	[flat|nested] 27+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for Per-context and per-client engine busyness (rev5)
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
                   ` (8 preceding siblings ...)
  (?)
@ 2018-04-17 12:59 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2018-04-17 12:59 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per-context and per-client engine busyness (rev5)
URL   : https://patchwork.freedesktop.org/series/32645/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
a367a6c9fd96 drm/i915: Use seqlock in engine stats
557e1f39ef83 drm/i915: Track per-context engine busyness
2faa04d0e6db drm/i915: Expose list of clients in sysfs
-:93: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#93: FILE: drivers/gpu/drm/i915/i915_gem.c:5735:
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,

total: 0 errors, 0 warnings, 1 checks, 197 lines checked
0aa560d499c9 drm/i915: Update client name on context create
-:24: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#24: FILE: drivers/gpu/drm/i915/i915_drv.h:3200:
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,

total: 0 errors, 0 warnings, 1 checks, 66 lines checked
83bedddd78f1 drm/i915: Expose per-engine client busyness
-:22: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#22: 
neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%

-:55: ERROR:CODE_INDENT: code indent should use tabs where possible
#55: FILE: drivers/gpu/drm/i915/i915_drv.h:364:
+ ^I^Istruct kobject *busy_root;$

-:55: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#55: FILE: drivers/gpu/drm/i915/i915_drv.h:364:
+ ^I^Istruct kobject *busy_root;$

-:55: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#55: FILE: drivers/gpu/drm/i915/i915_drv.h:364:
+ ^I^Istruct kobject *busy_root;$

-:148: ERROR:CODE_INDENT: code indent should use tabs where possible
#148: FILE: drivers/gpu/drm/i915/i915_gem.c:5824:
+^I^I^I^I        (struct attribute *)attr);$

-:148: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#148: FILE: drivers/gpu/drm/i915/i915_gem.c:5824:
+		ret = sysfs_create_file(file_priv->client.busy_root,
+				        (struct attribute *)attr);

total: 2 errors, 3 warnings, 1 checks, 144 lines checked
163d3d9b5151 drm/i915: Add sysfs toggle to enable per-client engine stats
3f6953ea6afe drm/i915: Allow clients to query own per-engine busyness


^ permalink raw reply	[flat|nested] 27+ messages in thread

* ✗ Fi.CI.SPARSE: warning for Per-context and per-client engine busyness (rev5)
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
                   ` (9 preceding siblings ...)
  (?)
@ 2018-04-17 13:02 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2018-04-17 13:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per-context and per-client engine busyness (rev5)
URL   : https://patchwork.freedesktop.org/series/32645/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Commit: drm/i915: Use seqlock in engine stats
Okay!

Commit: drm/i915: Track per-context engine busyness
Okay!

Commit: drm/i915: Expose list of clients in sysfs
-drivers/gpu/drm/i915/selftests/../i915_drv.h:2207:33: warning: constant 0xffffea0000000000 is so big it is unsigned long
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3655:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:2226:33: warning: constant 0xffffea0000000000 is so big it is unsigned long
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3674:16: warning: expression using sizeof(void)

Commit: drm/i915: Update client name on context create
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3674:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3682:16: warning: expression using sizeof(void)

Commit: drm/i915: Expose per-engine client busyness
-drivers/gpu/drm/i915/selftests/../i915_drv.h:2226:33: warning: constant 0xffffea0000000000 is so big it is unsigned long
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3682:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:2234:33: warning: constant 0xffffea0000000000 is so big it is unsigned long
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3690:16: warning: expression using sizeof(void)

Commit: drm/i915: Add sysfs toggle to enable per-client engine stats
-drivers/gpu/drm/i915/selftests/../i915_drv.h:2234:33: warning: constant 0xffffea0000000000 is so big it is unsigned long
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3690:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:2238:33: warning: constant 0xffffea0000000000 is so big it is unsigned long
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3694:16: warning: expression using sizeof(void)

Commit: drm/i915: Allow clients to query own per-engine busyness
+drivers/gpu/drm/i915/i915_gem_context.c:767:14:    expected void const volatile [noderef] <asn:1>*<noident>
+drivers/gpu/drm/i915/i915_gem_context.c:767:14:    got unsigned long long [unsigned] [usertype] value
+drivers/gpu/drm/i915/i915_gem_context.c:767:14: warning: incorrect type in argument 1 (different base types)
+drivers/gpu/drm/i915/i915_gem_context.c:778:34:    expected void *to
+drivers/gpu/drm/i915/i915_gem_context.c:778:34:    got struct drm_i915_context_engine_busy [noderef] <asn:1>*[assigned] busy_user
+drivers/gpu/drm/i915/i915_gem_context.c:778:34: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.c:778:46:    expected void const [noderef] <asn:1>*from
+drivers/gpu/drm/i915/i915_gem_context.c:778:46:    got struct drm_i915_context_engine_busy *<noident>
+drivers/gpu/drm/i915/i915_gem_context.c:778:46: warning: incorrect type in argument 2 (different address spaces)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* ✗ Fi.CI.BAT: failure for Per-context and per-client engine busyness (rev5)
  2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
                   ` (10 preceding siblings ...)
  (?)
@ 2018-04-17 13:16 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2018-04-17 13:16 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per-context and per-client engine busyness (rev5)
URL   : https://patchwork.freedesktop.org/series/32645/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4059 -> Patchwork_8706 =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_8706 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_8706, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/32645/revisions/5/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_8706:

  === IGT changes ===

    ==== Possible regressions ====

    igt@gem_exec_suspend@basic-s4-devices:
      fi-skl-guc:         PASS -> FAIL

    
== Known issues ==

  Here are the changes found in Patchwork_8706 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
      fi-bxt-dsi:         PASS -> INCOMPLETE (fdo#103927)

    igt@pm_rpm@basic-rte:
      fi-glk-1:           PASS -> INCOMPLETE (fdo#103359, k.org#198133)

    
    ==== Possible fixes ====

    igt@gem_exec_suspend@basic-s3:
      fi-ivb-3520m:       DMESG-WARN (fdo#106084) -> PASS +1

    igt@gem_mmap_gtt@basic-small-bo-tiledx:
      fi-gdg-551:         FAIL (fdo#102575) -> PASS

    igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a-frame-sequence:
      fi-elk-e7500:       INCOMPLETE (fdo#103989) -> PASS

    
  fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
  fdo#103359 https://bugs.freedesktop.org/show_bug.cgi?id=103359
  fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
  fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
  fdo#106084 https://bugs.freedesktop.org/show_bug.cgi?id=106084
  k.org#198133 https://bugzilla.kernel.org/show_bug.cgi?id=198133


== Participating hosts (36 -> 33) ==

  Missing    (3): fi-ctg-p8600 fi-ilk-m540 fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4059 -> Patchwork_8706

  CI_DRM_4059: c1645edc253f2b52a8c94565a75b479a6782e75f @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4435: ddbe5a4d8bb1780ecf07f72e815062d3bce8ff71 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_8706: 3f6953ea6afeb5d0a630313a1284dceb6bdd0676 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4435: e60d247eb359f044caf0c09904da14e39d7adca1 @ git://anongit.freedesktop.org/piglit


== Linux commits ==

3f6953ea6afe drm/i915: Allow clients to query own per-engine busyness
163d3d9b5151 drm/i915: Add sysfs toggle to enable per-client engine stats
83bedddd78f1 drm/i915: Expose per-engine client busyness
0aa560d499c9 drm/i915: Update client name on context create
2faa04d0e6db drm/i915: Expose list of clients in sysfs
557e1f39ef83 drm/i915: Track per-context engine busyness
a367a6c9fd96 drm/i915: Use seqlock in engine stats


== Kernel 32bit build ==

Warning: Kernel 32bit buildtest failed:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8706/build_32bit_failure.log

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8706/issues.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC 6/7] drm/i915: Add sysfs toggle to enable per-client engine stats
  2018-04-17 12:27   ` [Intel-gfx] " Tvrtko Ursulin
@ 2018-04-17 13:51     ` Chris Wilson
  -1 siblings, 0 replies; 27+ messages in thread
From: Chris Wilson @ 2018-04-17 13:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2018-04-17 13:27:35)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> By default we are not collecting any per-engine and per-context
> statistics.
> 
> Add a new sysfs toggle to enable this facility:
> 
> $ echo 1 >/sys/class/drm/card0/clients/enable_stats

I have to ask, what would be the intersection with cgroup here?
-Chris
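
(For completeness, a hypothetical userspace sketch of the proposed
interface; the enable_stats node is from the quoted patch, but the
per-client node layout below is assumed, not confirmed in this thread:

	#include <stdio.h>

	int main(void)
	{
		char buf[64];
		FILE *f;

		/* enable per-client stats collection (root only) */
		f = fopen("/sys/class/drm/card0/clients/enable_stats", "w");
		if (!f)
			return 1;
		fputs("1\n", f);
		fclose(f);

		/* read one client's busyness for one engine (hypothetical layout) */
		f = fopen("/sys/class/drm/card0/clients/0/busy/0", "r");
		if (f && fgets(buf, sizeof(buf), f))
			printf("rcs0 busy: %s", buf);
		if (f)
			fclose(f);

		return 0;
	}
)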

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2018-04-17 13:51 UTC | newest]

Thread overview: 27+ messages
-- links below jump to the message on this page --
2018-04-17 12:27 [RFC v4 0/7] Per-context and per-client engine busyness Tvrtko Ursulin
2018-04-17 12:27 ` [igt-dev] " Tvrtko Ursulin
2018-04-17 12:27 ` [PATCH 1/7] drm/i915: Use seqlock in engine stats Tvrtko Ursulin
2018-04-17 12:27   ` [igt-dev] " Tvrtko Ursulin
2018-04-17 12:47   ` Chris Wilson
2018-04-17 12:47     ` [igt-dev] " Chris Wilson
2018-04-17 12:27 ` [RFC 2/7] drm/i915: Track per-context engine busyness Tvrtko Ursulin
2018-04-17 12:27   ` [Intel-gfx] " Tvrtko Ursulin
2018-04-17 12:58   ` [igt-dev] " Chris Wilson
2018-04-17 12:58     ` Chris Wilson
2018-04-17 12:27 ` [RFC 3/7] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
2018-04-17 12:27   ` [igt-dev] " Tvrtko Ursulin
2018-04-17 12:27 ` [RFC 4/7] drm/i915: Update client name on context create Tvrtko Ursulin
2018-04-17 12:27   ` [igt-dev] " Tvrtko Ursulin
2018-04-17 12:27 ` [RFC 5/7] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
2018-04-17 12:27   ` [igt-dev] " Tvrtko Ursulin
2018-04-17 12:27 ` [RFC 6/7] drm/i915: Add sysfs toggle to enable per-client engine stats Tvrtko Ursulin
2018-04-17 12:27   ` [Intel-gfx] " Tvrtko Ursulin
2018-04-17 13:51   ` Chris Wilson
2018-04-17 13:51     ` [Intel-gfx] " Chris Wilson
2018-04-17 12:27 ` [RFC 7/7] drm/i915: Allow clients to query own per-engine busyness Tvrtko Ursulin
2018-04-17 12:27   ` [Intel-gfx] " Tvrtko Ursulin
2018-04-17 12:39 ` [RFC v4 0/7] Per-context and per-client engine busyness Chris Wilson
2018-04-17 12:39   ` [Intel-gfx] " Chris Wilson
2018-04-17 12:59 ` ✗ Fi.CI.CHECKPATCH: warning for Per-context and per-client engine busyness (rev5) Patchwork
2018-04-17 13:02 ` ✗ Fi.CI.SPARSE: " Patchwork
2018-04-17 13:16 ` ✗ Fi.CI.BAT: failure " Patchwork
