* [RFC 0/5] Per-client engine stats
@ 2018-02-14 18:50 Tvrtko Ursulin
  2018-02-14 18:50 ` [RFC 1/5] drm/i915: Track per-context engine busyness Tvrtko Ursulin
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-14 18:50 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-post of my earlier, now slightly updated, work to expose a DRM client
hierarchy in sysfs in order to enable a top-like tool:

intel-gpu-top - load avg  9.47,  3.34,  0.13; 1000/ 950 MHz;    0% RC6;   5202mW;     1206 irqs/s

    rcs0 100.00% ( 0.56/ 4.78/ 3.86) |██████████████████████████████████████████████████████████████████|  0.00% wait,   0.00% sema
    bcs0  17.72% ( 0.81/ 0.00/ 0.17) |███████████▋                                                      |  0.00% wait,   0.00% sema
    vcs0  61.08% ( 0.81/ 0.00/ 0.63) |████████████████████████████████████████▎                         |  0.00% wait,   0.00% sema
   vecs0   0.00% ( 0.00/ 0.00/ 0.00) |                                                                  |  0.00% wait,   0.00% sema

  PID            NAME           rcs0                       bcs0                       vcs0                       vecs0
 9654       neverball |████████████             ||                         ||                         ||                         |
 7792            Xorg |████████▊                ||                         ||                         ||                         |
 9659        gem_wsim |████                     ||████▍                    ||███████████████▎         ||                         |
 7846           xfwm4 |                         ||                         ||                         ||                         |

I'll post the IGT support for this as well. It requires both this series and
the engine queue depth PMU support RFC-ed recently.

Tvrtko Ursulin (5):
  drm/i915: Track per-context engine busyness
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Expose per-engine client busyness
  drm/i915: Add sysfs toggle to enable per-client engine stats

 drivers/gpu/drm/i915/i915_drv.h         |  31 ++++++
 drivers/gpu/drm/i915/i915_gem.c         | 192 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_context.c |  23 +++-
 drivers/gpu/drm/i915/i915_gem_context.h |   7 ++
 drivers/gpu/drm/i915/i915_sysfs.c       |  80 +++++++++++++
 drivers/gpu/drm/i915/intel_engine_cs.c  |  29 +++++
 drivers/gpu/drm/i915/intel_lrc.c        |  14 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  56 +++++++++-
 8 files changed, 412 insertions(+), 20 deletions(-)

-- 
2.14.1


* [RFC 1/5] drm/i915: Track per-context engine busyness
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
@ 2018-02-14 18:50 ` Tvrtko Ursulin
  2018-02-14 19:07   ` Chris Wilson
  2018-02-14 18:50 ` [RFC 2/5] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-14 18:50 UTC (permalink / raw)
  To: Intel-gfx; +Cc: gordon.kelly

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some customers want to know how much of the GPU time their clients are
using in order to make dynamic load balancing decisions.

With the hooks already in place which track the overall engine busyness,
we can extend them slightly to also split that time between contexts.
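
In rough outline the accounting works like this (a simplified sketch of
the scheme in this patch, with illustrative names, not the exact code):

#include <linux/ktime.h>
#include <linux/spinlock.h>

struct ctx_engine_stats {
	spinlock_t lock;
	bool active;	/* context currently executing on the engine */
	ktime_t start;	/* timestamp of the last schedule-in */
	ktime_t total;	/* accumulated busy time */
};

/* Called when the context is submitted to ELSP port 0. */
static void ctx_stats_in(struct ctx_engine_stats *s, ktime_t now)
{
	spin_lock(&s->lock);
	s->start = now;
	s->active = true;
	spin_unlock(&s->lock);
}

/* Called when the context is scheduled out (or preempted). */
static void ctx_stats_out(struct ctx_engine_stats *s, ktime_t now)
{
	spin_lock(&s->lock);
	s->total = ktime_add(s->total, ktime_sub(now, s->start));
	s->active = false;
	spin_unlock(&s->lock);
}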

v2: Fix accounting for tail updates.
v3: Rebase.
v4: Mark currently running contexts as active on stats enable.
v5: Include some headers to fix the build.
v6: Added fine grained lock.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: gordon.kelly@intel.com
---
 drivers/gpu/drm/i915/i915_gem_context.c |  8 +++++
 drivers/gpu/drm/i915/i915_gem_context.h |  7 +++++
 drivers/gpu/drm/i915/intel_engine_cs.c  | 29 +++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 14 ++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 56 +++++++++++++++++++++++++++++----
 5 files changed, 103 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 3d75f484f6e5..7c6a406fd309 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -342,7 +342,9 @@ static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *dev_priv,
 			struct drm_i915_file_private *file_priv)
 {
+	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
+	enum intel_engine_id id;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
@@ -368,6 +370,12 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 		ctx->desc_template = default_desc_template(dev_priv, ppgtt);
 	}
 
+	/* Initialize the context stats lock. */
+	for_each_engine(engine, dev_priv, id) {
+		if (intel_engine_supports_stats(engine))
+			spin_lock_init(&ctx->engine[id].stats.lock);
+	}
+
 	trace_i915_context_create(ctx);
 
 	return ctx;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index a681c5b891ff..4de078408dc1 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -28,6 +28,7 @@
 #include <linux/bitops.h>
 #include <linux/list.h>
 #include <linux/radix-tree.h>
+#include <linux/spinlock.h>
 
 #include "i915_gem.h"
 
@@ -160,6 +161,12 @@ struct i915_gem_context {
 		u32 *lrc_reg_state;
 		u64 lrc_desc;
 		int pin_count;
+		struct {
+			spinlock_t lock;
+			bool active;
+			ktime_t start;
+			ktime_t total;
+		} stats;
 	} engine[I915_NUM_ENGINES];
 
 	/** ring_size: size for allocating the per-engine ring buffer */
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index c90e649b7458..152b09514d27 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1990,6 +1990,16 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 
 		engine->stats.enabled_at = ktime_get();
 
+		/* Mark currently running context as active. */
+		if (port_isset(port)) {
+			struct drm_i915_gem_request *req = port_request(port);
+			struct intel_context *ce =
+					&req->ctx->engine[engine->id];
+
+			ce->stats.start = engine->stats.enabled_at;
+			ce->stats.active = true;
+		}
+
 		/* XXX submission method oblivious? */
 		while (num_ports-- && port_isset(port)) {
 			engine->stats.active++;
@@ -2062,6 +2072,25 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
 }
 
+ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
+					   struct intel_engine_cs *engine)
+{
+	struct intel_context *ce = &ctx->engine[engine->id];
+	ktime_t total;
+
+	spin_lock_irq(&ce->stats.lock);
+
+	total = ce->stats.total;
+
+	if (ce->stats.active)
+		total = ktime_add(total,
+				  ktime_sub(ktime_get(), ce->stats.start));
+
+	spin_unlock_irq(&ce->stats.lock);
+
+	return total;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_engine.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 827c6a3d30b8..fba2089e16c9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -331,16 +331,19 @@ execlists_context_status_change(struct drm_i915_gem_request *rq,
 }
 
 static inline void
-execlists_context_schedule_in(struct drm_i915_gem_request *rq)
+execlists_context_schedule_in(struct drm_i915_gem_request *rq,
+			      unsigned int port)
 {
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
-	intel_engine_context_in(rq->engine);
+	intel_engine_context_in(rq->engine,
+				&rq->ctx->engine[rq->engine->id],
+				port == 0);
 }
 
 static inline void
 execlists_context_schedule_out(struct drm_i915_gem_request *rq)
 {
-	intel_engine_context_out(rq->engine);
+	intel_engine_context_out(rq->engine, &rq->ctx->engine[rq->engine->id]);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 }
 
@@ -393,7 +396,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		if (rq) {
 			GEM_BUG_ON(count > !n);
 			if (!count++)
-				execlists_context_schedule_in(rq);
+				execlists_context_schedule_in(rq, n);
 			port_set(&port[n], port_pack(rq, count));
 			desc = execlists_update_context(rq);
 			GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
@@ -661,7 +664,8 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 		struct drm_i915_gem_request *rq = port_request(port);
 
 		GEM_BUG_ON(!execlists->active);
-		intel_engine_context_out(rq->engine);
+		intel_engine_context_out(rq->engine,
+					 &rq->ctx->engine[rq->engine->id]);
 		execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_PREEMPTED);
 		i915_gem_request_put(rq);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f542969ead83..967275e29f1f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -4,6 +4,7 @@
 
 #include <linux/hashtable.h>
 #include "i915_gem_batch_pool.h"
+#include "i915_gem_context.h"
 #include "i915_gem_request.h"
 #include "i915_gem_timeline.h"
 #include "i915_pmu.h"
@@ -1029,25 +1030,44 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 struct intel_engine_cs *
 intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
 
-static inline void intel_engine_context_in(struct intel_engine_cs *engine)
+static inline void
+intel_engine_context_in(struct intel_engine_cs *engine,
+			struct intel_context *ce,
+			bool submit)
 {
 	unsigned long flags;
+	ktime_t now;
 
 	if (READ_ONCE(engine->stats.enabled) == 0)
 		return;
 
 	spin_lock_irqsave(&engine->stats.lock, flags);
 
+	if (submit) {
+		now = ktime_get();
+		spin_lock(&ce->stats.lock);
+		ce->stats.start = now;
+		ce->stats.active = true;
+		spin_unlock(&ce->stats.lock);
+	} else {
+		now = 0;
+	}
+
 	if (engine->stats.enabled > 0) {
-		if (engine->stats.active++ == 0)
-			engine->stats.start = ktime_get();
+		if (engine->stats.active++ == 0) {
+			if (!now)
+				now = ktime_get();
+			engine->stats.start = now;
+		}
 		GEM_BUG_ON(engine->stats.active == 0);
 	}
 
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
 }
 
-static inline void intel_engine_context_out(struct intel_engine_cs *engine)
+static inline void
+intel_engine_context_out(struct intel_engine_cs *engine,
+			 struct intel_context *ce)
 {
 	unsigned long flags;
 
@@ -1057,14 +1077,35 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 	spin_lock_irqsave(&engine->stats.lock, flags);
 
 	if (engine->stats.enabled > 0) {
+		struct execlist_port *next_port = &engine->execlists.port[1];
+		ktime_t now = ktime_get();
 		ktime_t last;
 
+		spin_lock(&ce->stats.lock);
+		GEM_BUG_ON(!ce->stats.start);
+		ce->stats.total = ktime_add(ce->stats.total,
+					    ktime_sub(now, ce->stats.start));
+		ce->stats.active = false;
+		spin_unlock(&ce->stats.lock);
+
+		if (port_isset(next_port)) {
+			struct drm_i915_gem_request *next_req =
+						port_request(next_port);
+			struct intel_context *next_ce =
+					&next_req->ctx->engine[engine->id];
+
+			spin_lock(&next_ce->stats.lock);
+			next_ce->stats.start = now;
+			next_ce->stats.active = true;
+			spin_unlock(&next_ce->stats.lock);
+		}
+
 		if (engine->stats.active && --engine->stats.active == 0) {
 			/*
 			 * Decrement the active context count and in case GPU
 			 * is now idle add up to the running total.
 			 */
-			last = ktime_sub(ktime_get(), engine->stats.start);
+			last = ktime_sub(now, engine->stats.start);
 
 			engine->stats.total = ktime_add(engine->stats.total,
 							last);
@@ -1074,7 +1115,7 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 			 * the first event in which case we account from the
 			 * time stats gathering was turned on.
 			 */
-			last = ktime_sub(ktime_get(), engine->stats.enabled_at);
+			last = ktime_sub(now, engine->stats.enabled_at);
 
 			engine->stats.total = ktime_add(engine->stats.total,
 							last);
@@ -1084,6 +1125,9 @@ static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
 }
 
+ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
+					   struct intel_engine_cs *engine);
+
 int intel_enable_engine_stats(struct intel_engine_cs *engine);
 void intel_disable_engine_stats(struct intel_engine_cs *engine);
 
-- 
2.14.1


* [RFC 2/5] drm/i915: Expose list of clients in sysfs
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
  2018-02-14 18:50 ` [RFC 1/5] drm/i915: Track per-context engine busyness Tvrtko Ursulin
@ 2018-02-14 18:50 ` Tvrtko Ursulin
  2018-02-14 19:13   ` Chris Wilson
  2018-02-14 18:50 ` [RFC 3/5] drm/i915: Update client name on context create Tvrtko Ursulin
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-14 18:50 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose a list of clients with open file handles in sysfs.

This will be a basis for a top-like utility showing per-client and per-
engine GPU load.

Currently we only expose each client's pid and name under opaque numbered
directories in /sys/class/drm/card0/clients/.

For instance:

/sys/class/drm/card0/clients/3/name: Xorg
/sys/class/drm/card0/clients/3/pid: 5664
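
A minimal userspace sketch of how a tool could enumerate these entries
(paths and attribute names as above, most error handling omitted):

#include <dirent.h>
#include <stdio.h>

static void show(const char *id, const char *attr)
{
	char path[256], buf[64] = "";
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/class/drm/card0/clients/%s/%s", id, attr);
	f = fopen(path, "r");
	if (f) {
		if (fgets(buf, sizeof(buf), f))
			printf("  %s: %s\n", attr, buf);
		fclose(f);
	}
}

int main(void)
{
	DIR *d = opendir("/sys/class/drm/card0/clients");
	struct dirent *de;

	if (!d)
		return 1;

	while ((de = readdir(d))) {
		if (de->d_name[0] == '.')
			continue;
		printf("client %s\n", de->d_name);
		show(de->d_name, "name");
		show(de->d_name, "pid");
	}

	closedir(d);
	return 0;
}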

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |  13 +++++
 drivers/gpu/drm/i915/i915_gem.c   | 112 +++++++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_sysfs.c |   8 +++
 3 files changed, 126 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 70cf289855af..2133e67f5fbc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -345,6 +345,16 @@ struct drm_i915_file_private {
  */
 #define I915_MAX_CLIENT_CONTEXT_BANS 3
 	atomic_t context_bans;
+
+	unsigned int client_sysfs_id;
+	unsigned int client_pid;
+	char *client_name;
+	struct kobject *client_root;
+
+	struct {
+		struct device_attribute pid;
+		struct device_attribute name;
+	} attr;
 };
 
 /* Interface history:
@@ -2365,6 +2375,9 @@ struct drm_i915_private {
 
 	struct i915_pmu pmu;
 
+	struct kobject *clients_root;
+	atomic_t client_serial;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fc68b35854df..24607bce2efe 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5613,6 +5613,89 @@ int i915_gem_freeze_late(struct drm_i915_private *dev_priv)
 	return 0;
 }
 
+static ssize_t
+show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct drm_i915_file_private *file_priv =
+		container_of(attr, struct drm_i915_file_private, attr.name);
+
+	return snprintf(buf, PAGE_SIZE, "%s", file_priv->client_name);
+}
+
+static ssize_t
+show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct drm_i915_file_private *file_priv =
+		container_of(attr, struct drm_i915_file_private, attr.pid);
+
+	return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
+}
+
+static int
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,
+		struct task_struct *task,
+		unsigned int serial)
+{
+	int ret = -ENOMEM;
+	struct device_attribute *attr;
+	char id[32];
+
+	file_priv->client_name = kstrdup(task->comm, GFP_KERNEL);
+	if (!file_priv->client_name)
+		goto err_name;
+
+	snprintf(id, sizeof(id), "%u", serial);
+	file_priv->client_root = kobject_create_and_add(id,
+							i915->clients_root);
+	if (!file_priv->client_root)
+		goto err_client;
+
+	attr = &file_priv->attr.name;
+	attr->attr.name = "name";
+	attr->attr.mode = 0444;
+	attr->show = show_client_name;
+
+	ret = sysfs_create_file(file_priv->client_root,
+				(struct attribute *)attr);
+	if (ret)
+		goto err_attr_name;
+
+	attr = &file_priv->attr.pid;
+	attr->attr.name = "pid";
+	attr->attr.mode = 0444;
+	attr->show = show_client_pid;
+
+	ret = sysfs_create_file(file_priv->client_root,
+				(struct attribute *)attr);
+	if (ret)
+		goto err_attr_pid;
+
+	file_priv->client_pid = pid_nr(get_task_pid(task, PIDTYPE_PID));
+
+	return 0;
+
+err_attr_pid:
+	sysfs_remove_file(file_priv->client_root,
+			  (struct attribute *)&file_priv->attr.name);
+err_attr_name:
+	kobject_put(file_priv->client_root);
+err_client:
+	kfree(file_priv->client_name);
+err_name:
+	return ret;
+}
+
+static void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
+{
+	sysfs_remove_file(file_priv->client_root,
+			  (struct attribute *)&file_priv->attr.pid);
+	sysfs_remove_file(file_priv->client_root,
+			  (struct attribute *)&file_priv->attr.name);
+	kobject_put(file_priv->client_root);
+	kfree(file_priv->client_name);
+}
+
 void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
@@ -5626,32 +5709,47 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 	list_for_each_entry(request, &file_priv->mm.request_list, client_link)
 		request->file_priv = NULL;
 	spin_unlock(&file_priv->mm.lock);
+
+	i915_gem_remove_client(file_priv);
 }
 
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 {
+	int ret = -ENOMEM;
 	struct drm_i915_file_private *file_priv;
-	int ret;
 
 	DRM_DEBUG("\n");
 
 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
-		return -ENOMEM;
+		goto err_alloc;
+
+	file_priv->client_sysfs_id = atomic_inc_return(&i915->client_serial);
+	ret = i915_gem_add_client(i915, file_priv, current,
+				  file_priv->client_sysfs_id);
+	if (ret)
+		goto err_client;
 
 	file->driver_priv = file_priv;
+	ret = i915_gem_context_open(i915, file);
+	if (ret)
+		goto err_context;
+
 	file_priv->dev_priv = i915;
 	file_priv->file = file;
+	file_priv->bsd_engine = -1;
 
 	spin_lock_init(&file_priv->mm.lock);
 	INIT_LIST_HEAD(&file_priv->mm.request_list);
 
-	file_priv->bsd_engine = -1;
-
-	ret = i915_gem_context_open(i915, file);
-	if (ret)
-		kfree(file_priv);
+	return 0;
 
+err_context:
+	i915_gem_remove_client(file_priv);
+err_client:
+	atomic_dec(&i915->client_serial);
+	kfree(file_priv);
+err_alloc:
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index b33d2158c234..6cf032bc1573 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -575,6 +575,11 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 	struct device *kdev = dev_priv->drm.primary->kdev;
 	int ret;
 
+	dev_priv->clients_root =
+		kobject_create_and_add("clients", &kdev->kobj);
+	if (!dev_priv->clients_root)
+		DRM_ERROR("Per-client sysfs setup failed\n");
+
 #ifdef CONFIG_PM
 	if (HAS_RC6(dev_priv)) {
 		ret = sysfs_merge_group(&kdev->kobj,
@@ -635,4 +640,7 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 	sysfs_unmerge_group(&kdev->kobj, &rc6_attr_group);
 	sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
 #endif
+
+	if (dev_priv->clients_root)
+		kobject_put(dev_priv->clients_root);
 }
-- 
2.14.1


* [RFC 3/5] drm/i915: Update client name on context create
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
  2018-02-14 18:50 ` [RFC 1/5] drm/i915: Track per-context engine busyness Tvrtko Ursulin
  2018-02-14 18:50 ` [RFC 2/5] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
@ 2018-02-14 18:50 ` Tvrtko Ursulin
  2018-02-14 18:50 ` [RFC 4/5] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-14 18:50 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some clients have the DRM fd passed to them over a socket by the X server.

Grab the real client name and pid when they create their first context and
update the exposed data for more useful enumeration.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  8 ++++++++
 drivers/gpu/drm/i915/i915_gem.c         |  4 ++--
 drivers/gpu/drm/i915/i915_gem_context.c | 15 +++++++++++++--
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2133e67f5fbc..372d13cb2472 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3418,6 +3418,14 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
 				int align);
+
+int
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,
+		struct task_struct *task,
+		unsigned int serial);
+void i915_gem_remove_client(struct drm_i915_file_private *file_priv);
+
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 24607bce2efe..46ac7b3ca348 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5631,7 +5631,7 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
 }
 
-static int
+int
 i915_gem_add_client(struct drm_i915_private *i915,
 		struct drm_i915_file_private *file_priv,
 		struct task_struct *task,
@@ -5686,7 +5686,7 @@ i915_gem_add_client(struct drm_i915_private *i915,
 	return ret;
 }
 
-static void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
+void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
 {
 	sysfs_remove_file(file_priv->client_root,
 			  (struct attribute *)&file_priv->attr.pid);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 7c6a406fd309..710ccfa18714 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -645,6 +645,7 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file)
 {
+	unsigned int pid = pid_nr(get_task_pid(current, PIDTYPE_PID));
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_context_create *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
@@ -659,8 +660,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 
 	if (client_is_banned(file_priv)) {
 		DRM_DEBUG("client %s[%d] banned from creating ctx\n",
-			  current->comm,
-			  pid_nr(get_task_pid(current, PIDTYPE_PID)));
+			  current->comm, pid);
 
 		return -EIO;
 	}
@@ -669,6 +669,17 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
+	if (file_priv->client_pid != pid) {
+		unsigned int id = file_priv->client_sysfs_id;
+
+		i915_gem_remove_client(file_priv);
+		ret = i915_gem_add_client(dev_priv, file_priv, current, id);
+		if (ret) {
+			mutex_unlock(&dev->struct_mutex);
+			return ret;
+		}
+	}
+
 	ctx = i915_gem_create_context(dev_priv, file_priv);
 	mutex_unlock(&dev->struct_mutex);
 	if (IS_ERR(ctx))
-- 
2.14.1


* [RFC 4/5] drm/i915: Expose per-engine client busyness
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2018-02-14 18:50 ` [RFC 3/5] drm/i915: Update client name on context create Tvrtko Ursulin
@ 2018-02-14 18:50 ` Tvrtko Ursulin
  2018-02-14 19:17   ` Chris Wilson
  2018-02-14 18:50 ` [RFC 5/5] drm/i915: Add sysfs toggle to enable per-client engine stats Tvrtko Ursulin
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-14 18:50 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files, one per engine instance, are located under the 'busy'
directory.

Each contains a monotonically increasing nanosecond resolution count of
the time the respective client's jobs were executing on the GPU.

$ cat /sys/class/drm/card0/clients/5/busy/rcs0
32516602

This data can serve as an interface for implementing a top-like utility
for GPU jobs. For instance, I have prototyped a tool in IGT which produces
periodic output like:

neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
     Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
    xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%

This tool can also be extended to use the i915 PMU to show overall engine
busyness, and engine load using the queue depth metric.
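
As a rough illustration (the client id and engine name below are just
examples, and a real tool would re-scan clients and handle errors), the
percentages can be derived from two samples of a busy counter:

#include <inttypes.h>
#include <stdio.h>
#include <unistd.h>

/* Read the accumulated busy time in nanoseconds. */
static uint64_t read_busy(const char *path)
{
	uint64_t ns = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		fscanf(f, "%" SCNu64, &ns);
		fclose(f);
	}

	return ns;
}

int main(void)
{
	const char *path = "/sys/class/drm/card0/clients/5/busy/rcs0";
	uint64_t t0 = read_busy(path);

	sleep(1);

	/* Busy nanoseconds accumulated over a ~1s window, as a percentage. */
	printf("rcs0: %6.2f%%\n", (read_busy(path) - t0) / 1e9 * 100.0);

	return 0;
}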

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  8 ++++
 drivers/gpu/drm/i915/i915_gem.c | 86 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 372d13cb2472..d6b2883b42fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -315,6 +315,12 @@ struct drm_i915_private;
 struct i915_mm_struct;
 struct i915_mmu_object;
 
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct drm_i915_file_private *file_priv;
+	struct intel_engine_cs *engine;
+};
+
 struct drm_i915_file_private {
 	struct drm_i915_private *dev_priv;
 	struct drm_file *file;
@@ -350,10 +356,12 @@ struct drm_i915_file_private {
 	unsigned int client_pid;
 	char *client_name;
 	struct kobject *client_root;
+	struct kobject *busy_root;
 
 	struct {
 		struct device_attribute pid;
 		struct device_attribute name;
+		struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
 	} attr;
 };
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 46ac7b3ca348..01298d924524 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5631,6 +5631,45 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
 }
 
+struct busy_ctx {
+	struct intel_engine_cs *engine;
+	u64 total;
+};
+
+static int busy_add(int _id, void *p, void *data)
+{
+	struct i915_gem_context *ctx = p;
+	struct busy_ctx *bc = data;
+
+	bc->total +=
+		ktime_to_ns(intel_context_engine_get_busy_time(ctx,
+							       bc->engine));
+
+	return 0;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	struct drm_i915_file_private *file_priv = i915_attr->file_priv;
+	struct intel_engine_cs *engine = i915_attr->engine;
+	struct drm_i915_private *i915 = engine->i915;
+	struct busy_ctx bc = { .engine = engine };
+	int ret;
+
+	ret = i915_mutex_lock_interruptible(&i915->drm);
+	if (ret)
+		return ret;
+
+	idr_for_each(&file_priv->context_idr, busy_add, &bc);
+
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", bc.total);
+}
+
 int
 i915_gem_add_client(struct drm_i915_private *i915,
 		struct drm_i915_file_private *file_priv,
@@ -5638,15 +5677,17 @@ i915_gem_add_client(struct drm_i915_private *i915,
 		unsigned int serial)
 {
 	int ret = -ENOMEM;
+	struct intel_engine_cs *engine;
 	struct device_attribute *attr;
-	char id[32];
+	enum intel_engine_id id, id2;
+	char idstr[32];
 
 	file_priv->client_name = kstrdup(task->comm, GFP_KERNEL);
 	if (!file_priv->client_name)
 		goto err_name;
 
-	snprintf(id, sizeof(id), "%u", serial);
-	file_priv->client_root = kobject_create_and_add(id,
+	snprintf(idstr, sizeof(idstr), "%u", serial);
+	file_priv->client_root = kobject_create_and_add(idstr,
 							i915->clients_root);
 	if (!file_priv->client_root)
 		goto err_client;
@@ -5671,10 +5712,41 @@ i915_gem_add_client(struct drm_i915_private *i915,
 	if (ret)
 		goto err_attr_pid;
 
+	file_priv->busy_root = kobject_create_and_add("busy",
+						      file_priv->client_root);
+	if (!file_priv->busy_root)
+		goto err_busy_root;
+
+	for_each_engine(engine, i915, id) {
+		file_priv->attr.busy[id].file_priv = file_priv;
+		file_priv->attr.busy[id].engine = engine;
+		attr = &file_priv->attr.busy[id].attr;
+		attr->attr.name = engine->name;
+		attr->attr.mode = 0444;
+		attr->show = show_client_busy;
+
+		ret = sysfs_create_file(file_priv->busy_root,
+				        (struct attribute *)attr);
+		if (ret)
+			goto err_attr_busy;
+	}
+
 	file_priv->client_pid = pid_nr(get_task_pid(task, PIDTYPE_PID));
 
 	return 0;
 
+err_attr_busy:
+	for_each_engine(engine, i915, id2) {
+		if (id2 == id)
+			break;
+
+		sysfs_remove_file(file_priv->busy_root,
+				  (struct attribute *)&file_priv->attr.busy[id2]);
+	}
+	kobject_put(file_priv->busy_root);
+err_busy_root:
+	sysfs_remove_file(file_priv->client_root,
+			  (struct attribute *)&file_priv->attr.pid);
 err_attr_pid:
 	sysfs_remove_file(file_priv->client_root,
 			  (struct attribute *)&file_priv->attr.name);
@@ -5688,6 +5760,14 @@ i915_gem_add_client(struct drm_i915_private *i915,
 
 void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
 {
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, file_priv->dev_priv, id) {
+		sysfs_remove_file(file_priv->busy_root,
+				  (struct attribute *)&file_priv->attr.busy[id]);
+	}
+	kobject_put(file_priv->busy_root);
 	sysfs_remove_file(file_priv->client_root,
 			  (struct attribute *)&file_priv->attr.pid);
 	sysfs_remove_file(file_priv->client_root,
-- 
2.14.1


* [RFC 5/5] drm/i915: Add sysfs toggle to enable per-client engine stats
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2018-02-14 18:50 ` [RFC 4/5] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
@ 2018-02-14 18:50 ` Tvrtko Ursulin
  2018-02-14 18:55 ` ✗ Fi.CI.CHECKPATCH: warning for Per-client " Patchwork
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-14 18:50 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

By default we are not collecting any per-engine and per-context
statistics.

Add a new sysfs toggle to enable this facility:

$ echo 1 >/sys/class/drm/card0/clients/enable_stats

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |  2 ++
 drivers/gpu/drm/i915/i915_sysfs.c | 72 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d6b2883b42fe..fe8d91ce2af5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2385,6 +2385,8 @@ struct drm_i915_private {
 
 	struct kobject *clients_root;
 	atomic_t client_serial;
+	struct device_attribute client_stats_attr;
+	bool client_stats_enabled;
 
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 6cf032bc1573..2eb720042007 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -570,9 +570,67 @@ static void i915_setup_error_capture(struct device *kdev) {}
 static void i915_teardown_error_capture(struct device *kdev) {}
 #endif
 
+static ssize_t
+show_client_stats(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct drm_i915_private *i915 =
+		container_of(attr, struct drm_i915_private, client_stats_attr);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", i915->client_stats_enabled);
+}
+
+static ssize_t
+store_client_stats(struct device *kdev, struct device_attribute *attr,
+		   const char *buf, size_t count)
+{
+	struct drm_i915_private *i915 =
+		container_of(attr, struct drm_i915_private, client_stats_attr);
+	bool disable = false;
+	bool enable = false;
+	bool val = false;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int ret;
+
+	 /* Use RCS as proxy for all engines. */
+	if (!intel_engine_supports_stats(i915->engine[RCS]))
+		return -EINVAL;
+
+	ret = kstrtobool(buf, &val);
+	if (ret)
+		return ret;
+
+	ret = i915_mutex_lock_interruptible(&i915->drm);
+	if (ret)
+		return ret;
+
+	if (val && !i915->client_stats_enabled)
+		enable = true;
+	else if (!val && i915->client_stats_enabled)
+		disable = true;
+
+	if (!enable && !disable)
+		goto out;
+
+	for_each_engine(engine, i915, id) {
+		if (enable)
+			WARN_ON_ONCE(intel_enable_engine_stats(engine));
+		else if (disable)
+			intel_disable_engine_stats(engine);
+	}
+
+	i915->client_stats_enabled = val;
+
+out:
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return count;
+}
+
 void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 {
 	struct device *kdev = dev_priv->drm.primary->kdev;
+	struct device_attribute *attr;
 	int ret;
 
 	dev_priv->clients_root =
@@ -580,6 +638,17 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 	if (!dev_priv->clients_root)
 		DRM_ERROR("Per-client sysfs setup failed\n");
 
+	attr = &dev_priv->client_stats_attr;
+	attr->attr.name = "enable_stats";
+	attr->attr.mode = 0664;
+	attr->show = show_client_stats;
+	attr->store = store_client_stats;
+
+	ret = sysfs_create_file(dev_priv->clients_root,
+				(struct attribute *)attr);
+	if (ret)
+		DRM_ERROR("Per-client sysfs setup failed! (%d)\n", ret);
+
 #ifdef CONFIG_PM
 	if (HAS_RC6(dev_priv)) {
 		ret = sysfs_merge_group(&kdev->kobj,
@@ -641,6 +710,9 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 	sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
 #endif
 
+	sysfs_remove_file(dev_priv->clients_root,
+			  (struct attribute *)&dev_priv->client_stats_attr);
+
 	if (dev_priv->clients_root)
 		kobject_put(dev_priv->clients_root);
 }
-- 
2.14.1


* ✗ Fi.CI.CHECKPATCH: warning for Per-client engine stats
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2018-02-14 18:50 ` [RFC 5/5] drm/i915: Add sysfs toggle to enable per-client engine stats Tvrtko Ursulin
@ 2018-02-14 18:55 ` Patchwork
  2018-02-14 19:11 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2018-02-14 18:55 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per-client engine stats
URL   : https://patchwork.freedesktop.org/series/38281/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
f4e5fd1ffe1f drm/i915: Track per-context engine busyness
-:65: CHECK: spinlock_t definition without comment
#65: FILE: drivers/gpu/drm/i915/i915_gem_context.h:165:
+			spinlock_t lock;

total: 0 errors, 0 warnings, 1 checks, 228 lines checked
eb22b6934bda drm/i915: Expose list of clients in sysfs
-:80: CHECK: Alignment should match open parenthesis
#80: FILE: drivers/gpu/drm/i915/i915_gem.c:5636:
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,

total: 0 errors, 0 warnings, 1 checks, 186 lines checked
431e3341d2bc drm/i915: Update client name on context create
-:24: CHECK: Alignment should match open parenthesis
#24: FILE: drivers/gpu/drm/i915/i915_drv.h:3428:
+i915_gem_add_client(struct drm_i915_private *i915,
+		struct drm_i915_file_private *file_priv,

total: 0 errors, 0 warnings, 1 checks, 63 lines checked
5f54047b452c drm/i915: Expose per-engine client busyness
-:22: WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#22: 
neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%

-:153: ERROR: code indent should use tabs where possible
#153: FILE: drivers/gpu/drm/i915/i915_gem.c:5729:
+^I^I^I^I        (struct attribute *)attr);$

-:153: CHECK: Alignment should match open parenthesis
#153: FILE: drivers/gpu/drm/i915/i915_gem.c:5729:
+		ret = sysfs_create_file(file_priv->busy_root,
+				        (struct attribute *)attr);

-:168: WARNING: line over 80 characters
#168: FILE: drivers/gpu/drm/i915/i915_gem.c:5744:
+				  (struct attribute *)&file_priv->attr.busy[id2]);

-:186: WARNING: line over 80 characters
#186: FILE: drivers/gpu/drm/i915/i915_gem.c:5768:
+				  (struct attribute *)&file_priv->attr.busy[id]);

total: 1 errors, 3 warnings, 1 checks, 144 lines checked
deb4b442c974 drm/i915: Add sysfs toggle to enable per-client engine stats


* Re: [RFC 1/5] drm/i915: Track per-context engine busyness
  2018-02-14 18:50 ` [RFC 1/5] drm/i915: Track per-context engine busyness Tvrtko Ursulin
@ 2018-02-14 19:07   ` Chris Wilson
  2018-02-15  9:29     ` Tvrtko Ursulin
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2018-02-14 19:07 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: gordon.kelly

Quoting Tvrtko Ursulin (2018-02-14 18:50:31)
> +ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
> +                                          struct intel_engine_cs *engine)
> +{
> +       struct intel_context *ce = &ctx->engine[engine->id];
> +       ktime_t total;
> +
> +       spin_lock_irq(&ce->stats.lock);
> +
> +       total = ce->stats.total;
> +
> +       if (ce->stats.active)
> +               total = ktime_add(total,
> +                                 ktime_sub(ktime_get(), ce->stats.start));
> +
> +       spin_unlock_irq(&ce->stats.lock);

Looks like we can just use a seqlock here.
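
Something along these lines perhaps, assuming ce->stats.lock is turned
into a seqlock_t and the writers switch to
write_seqlock_irqsave()/write_sequnlock_irqrestore() (rough sketch,
untested):

ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
					   struct intel_engine_cs *engine)
{
	struct intel_context *ce = &ctx->engine[engine->id];
	unsigned int seq;
	ktime_t total;

	do {
		seq = read_seqbegin(&ce->stats.lock);

		total = ce->stats.total;
		if (ce->stats.active)
			total = ktime_add(total,
					  ktime_sub(ktime_get(),
						    ce->stats.start));
	} while (read_seqretry(&ce->stats.lock, seq));

	return total;
}
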
-Chris

* ✓ Fi.CI.BAT: success for Per-client engine stats
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2018-02-14 18:55 ` ✗ Fi.CI.CHECKPATCH: warning for Per-client " Patchwork
@ 2018-02-14 19:11 ` Patchwork
  2018-02-14 19:20 ` [RFC 0/5] " Chris Wilson
  2018-02-15  2:19 ` ✓ Fi.CI.IGT: success for " Patchwork
  8 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2018-02-14 19:11 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per-client engine stats
URL   : https://patchwork.freedesktop.org/series/38281/
State : success

== Summary ==

Series 38281v1 Per-client engine stats
https://patchwork.freedesktop.org/api/1.0/series/38281/revisions/1/mbox/

Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                fail       -> PASS       (fi-gdg-551) fdo#102575
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                pass       -> DMESG-WARN (fi-cnl-y3) fdo#105058
        Subgroup suspend-read-crc-pipe-b:
                pass       -> INCOMPLETE (fi-snb-2520m) fdo#103713

fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#105058 https://bugs.freedesktop.org/show_bug.cgi?id=105058
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:431s
fi-bdw-gvtdvm    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:420s
fi-blb-e6850     total:288  pass:223  dwarn:1   dfail:0   fail:0   skip:64  time:365s
fi-bsw-n3050     total:288  pass:242  dwarn:0   dfail:0   fail:0   skip:46  time:491s
fi-bwr-2160      total:288  pass:183  dwarn:0   dfail:0   fail:0   skip:105 time:274s
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:503s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:506s
fi-byt-j1900     total:288  pass:253  dwarn:0   dfail:0   fail:0   skip:35  time:514s
fi-byt-n2820     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:463s
fi-cfl-s2        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:623s
fi-cnl-y3        total:288  pass:261  dwarn:1   dfail:0   fail:0   skip:26  time:599s
fi-elk-e7500     total:288  pass:229  dwarn:0   dfail:0   fail:0   skip:59  time:409s
fi-gdg-551       total:288  pass:180  dwarn:0   dfail:0   fail:0   skip:108 time:263s
fi-glk-1         total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:533s
fi-hsw-4770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:414s
fi-ilk-650       total:288  pass:228  dwarn:0   dfail:0   fail:0   skip:60  time:402s
fi-ivb-3520m     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:490s
fi-ivb-3770      total:288  pass:255  dwarn:0   dfail:0   fail:0   skip:33  time:441s
fi-kbl-7500u     total:288  pass:263  dwarn:1   dfail:0   fail:0   skip:24  time:464s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:498s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:537s
fi-pnv-d510      total:288  pass:222  dwarn:1   dfail:0   fail:0   skip:65  time:496s
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:438s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:517s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:566s
fi-skl-6700k2    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:503s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:496s
fi-skl-guc       total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:437s
fi-skl-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:427s
fi-snb-2520m     total:245  pass:211  dwarn:0   dfail:0   fail:0   skip:33 
fi-snb-2600      total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:438s
Blacklisted hosts:
fi-glk-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:490s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:456s

5ad7866768dcc5d993f1bba687a80fced1070635 drm-tip: 2018y-02m-14d-15h-14m-50s UTC integration manifest
deb4b442c974 drm/i915: Add sysfs toggle to enable per-client engine stats
5f54047b452c drm/i915: Expose per-engine client busyness
431e3341d2bc drm/i915: Update client name on context create
eb22b6934bda drm/i915: Expose list of clients in sysfs
f4e5fd1ffe1f drm/i915: Track per-context engine busyness

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8028/issues.html

* Re: [RFC 2/5] drm/i915: Expose list of clients in sysfs
  2018-02-14 18:50 ` [RFC 2/5] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
@ 2018-02-14 19:13   ` Chris Wilson
  2018-02-15  9:35     ` Tvrtko Ursulin
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2018-02-14 19:13 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-02-14 18:50:32)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Expose a list of clients with open file handles in sysfs.
> 
> This will be a basis for a top-like utility showing per-client and per-
> engine GPU load.
> 
> Currently we only expose each client's pid and name under opaque numbered
> directories in /sys/class/drm/card0/clients/.
> 
> For instance:
> 
> /sys/class/drm/card0/clients/3/name: Xorg
> /sys/class/drm/card0/clients/3/pid: 5664
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h   |  13 +++++
>  drivers/gpu/drm/i915/i915_gem.c   | 112 +++++++++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_sysfs.c |   8 +++
>  3 files changed, 126 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 70cf289855af..2133e67f5fbc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -345,6 +345,16 @@ struct drm_i915_file_private {
>   */
>  #define I915_MAX_CLIENT_CONTEXT_BANS 3
>         atomic_t context_bans;
> +

struct {

> +       unsigned int client_sysfs_id;
> +       unsigned int client_pid;
> +       char *client_name;
> +       struct kobject *client_root;

} client;

> +
> +       struct {
> +               struct device_attribute pid;
> +               struct device_attribute name;
> +       } attr;

sysfs_attr?

>  };
>  
>  /* Interface history:
> @@ -2365,6 +2375,9 @@ struct drm_i915_private {
>  
>         struct i915_pmu pmu;
>  
> +       struct kobject *clients_root;
> +       atomic_t client_serial;

I'd start a new substruct { } clients;
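
i.e. something along the lines of (sketch):

	struct {
		struct kobject *root;
		atomic_t serial;
	} clients;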

>         /*
>          * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>          * will be rejected. Instead look for a better place.
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fc68b35854df..24607bce2efe 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5613,6 +5613,89 @@ int i915_gem_freeze_late(struct drm_i915_private *dev_priv)
>         return 0;
>  }
>  
> +static ssize_t
> +show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
> +{
> +       struct drm_i915_file_private *file_priv =
> +               container_of(attr, struct drm_i915_file_private, attr.name);
> +
> +       return snprintf(buf, PAGE_SIZE, "%s", file_priv->client_name);
> +}
> +
> +static ssize_t
> +show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
> +{
> +       struct drm_i915_file_private *file_priv =
> +               container_of(attr, struct drm_i915_file_private, attr.pid);
> +
> +       return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
> +}
> +
> +static int
> +i915_gem_add_client(struct drm_i915_private *i915,
> +               struct drm_i915_file_private *file_priv,
> +               struct task_struct *task,
> +               unsigned int serial)
> +{
> +       int ret = -ENOMEM;
> +       struct device_attribute *attr;
> +       char id[32];
> +
> +       file_priv->client_name = kstrdup(task->comm, GFP_KERNEL);
> +       if (!file_priv->client_name)
> +               goto err_name;
> +
> +       snprintf(id, sizeof(id), "%u", serial);
> +       file_priv->client_root = kobject_create_and_add(id,
> +                                                       i915->clients_root);

Do we have to care about i915->clients_root being NULL here?

> +       if (!file_priv->client_root)
> +               goto err_client;
> +
> +       attr = &file_priv->attr.name;
> +       attr->attr.name = "name";
> +       attr->attr.mode = 0444;
> +       attr->show = show_client_name;
> +
> +       ret = sysfs_create_file(file_priv->client_root,
> +                               (struct attribute *)attr);
> +       if (ret)
> +               goto err_attr_name;
> +
> +       attr = &file_priv->attr.pid;
> +       attr->attr.name = "pid";
> +       attr->attr.mode = 0444;
> +       attr->show = show_client_pid;
> +
> +       ret = sysfs_create_file(file_priv->client_root,
> +                               (struct attribute *)attr);
> +       if (ret)
> +               goto err_attr_pid;
> +
> +       file_priv->client_pid = pid_nr(get_task_pid(task, PIDTYPE_PID));

Are we before or after the "create_context get new pid"? Just wondering
aloud how that's going to tie in.

> +
> +       return 0;
> +
> +err_attr_pid:
> +       sysfs_remove_file(file_priv->client_root,
> +                         (struct attribute *)&file_priv->attr.name);
> +err_attr_name:
> +       kobject_put(file_priv->client_root);
> +err_client:
> +       kfree(file_priv->client_name);
> +err_name:
> +       return ret;
> +}
> +
> +static void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
> +{
> +       sysfs_remove_file(file_priv->client_root,
> +                         (struct attribute *)&file_priv->attr.pid);
> +       sysfs_remove_file(file_priv->client_root,
> +                         (struct attribute *)&file_priv->attr.name);
> +       kobject_put(file_priv->client_root);
> +       kfree(file_priv->client_name);
> +}
> +
>  void i915_gem_release(struct drm_device *dev, struct drm_file *file)
>  {
>         struct drm_i915_file_private *file_priv = file->driver_priv;
> @@ -5626,32 +5709,47 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
>         list_for_each_entry(request, &file_priv->mm.request_list, client_link)
>                 request->file_priv = NULL;
>         spin_unlock(&file_priv->mm.lock);
> +
> +       i915_gem_remove_client(file_priv);
>  }
>  
>  int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
>  {
> +       int ret = -ENOMEM;
>         struct drm_i915_file_private *file_priv;
> -       int ret;
>  
>         DRM_DEBUG("\n");
>  
>         file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
>         if (!file_priv)
> -               return -ENOMEM;
> +               goto err_alloc;
> +
> +       file_priv->client_sysfs_id = atomic_inc_return(&i915->client_serial);

What's the use for client_sysfs_id? I don't see it used again.

> +       ret = i915_gem_add_client(i915, file_priv, current,
> +                                 file_priv->client_sysfs_id);
> +       if (ret)
> +               goto err_client;
>  
>         file->driver_priv = file_priv;
> +       ret = i915_gem_context_open(i915, file);
> +       if (ret)
> +               goto err_context;
> +
>         file_priv->dev_priv = i915;
>         file_priv->file = file;
> +       file_priv->bsd_engine = -1;
>  
>         spin_lock_init(&file_priv->mm.lock);
>         INIT_LIST_HEAD(&file_priv->mm.request_list);
>  
> -       file_priv->bsd_engine = -1;
> -
> -       ret = i915_gem_context_open(i915, file);
> -       if (ret)
> -               kfree(file_priv);
> +       return 0;
>  
> +err_context:
> +       i915_gem_remove_client(file_priv);
> +err_client:
> +       atomic_dec(&i915->client_serial);
> +       kfree(file_priv);
> +err_alloc:
>         return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index b33d2158c234..6cf032bc1573 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -575,6 +575,11 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
>         struct device *kdev = dev_priv->drm.primary->kdev;
>         int ret;
>  
> +       dev_priv->clients_root =
> +               kobject_create_and_add("clients", &kdev->kobj);
> +       if (!dev_priv->clients_root)
> +               DRM_ERROR("Per-client sysfs setup failed\n");
> +
>  #ifdef CONFIG_PM
>         if (HAS_RC6(dev_priv)) {
>                 ret = sysfs_merge_group(&kdev->kobj,
> @@ -635,4 +640,7 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
>         sysfs_unmerge_group(&kdev->kobj, &rc6_attr_group);
>         sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
>  #endif
> +
> +       if (dev_priv->clients_root)
> +               kobject_put(dev_priv->clients_root);
>  }
> -- 
> 2.14.1
> 

* Re: [RFC 4/5] drm/i915: Expose per-engine client busyness
  2018-02-14 18:50 ` [RFC 4/5] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
@ 2018-02-14 19:17   ` Chris Wilson
  2018-02-15  9:41     ` Tvrtko Ursulin
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2018-02-14 19:17 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-02-14 18:50:34)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Expose per-client and per-engine busyness under the previously added sysfs
> client root.
> 
> The new files are one per-engine instance and located under the 'busy'
> directory.
> 
> Each contains a monotonically increasing nano-second resolution times each
> client's jobs were executing on the GPU.
> 
> $ cat /sys/class/drm/card0/clients/5/busy/rcs0
> 32516602
> 
> This data can serve as an interface to implement a top like utility for
> GPU jobs. For instance I have prototyped a tool in IGT which produces
> periodic output like:
> 
> neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>      Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>     xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
> 
> This tools can also be extended to use the i915 PMU and show overall engine
> busyness, and engine loads using the queue depth metric.
> 
> v2: Use intel_context_engine_get_busy_time.
> v3: New directory structure.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |  8 ++++
>  drivers/gpu/drm/i915/i915_gem.c | 86 +++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 91 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 372d13cb2472..d6b2883b42fe 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -315,6 +315,12 @@ struct drm_i915_private;
>  struct i915_mm_struct;
>  struct i915_mmu_object;
>  
> +struct i915_engine_busy_attribute {
> +       struct device_attribute attr;
> +       struct drm_i915_file_private *file_priv;
> +       struct intel_engine_cs *engine;
> +};
> +
>  struct drm_i915_file_private {
>         struct drm_i915_private *dev_priv;
>         struct drm_file *file;
> @@ -350,10 +356,12 @@ struct drm_i915_file_private {
>         unsigned int client_pid;
>         char *client_name;
>         struct kobject *client_root;
> +       struct kobject *busy_root;
>  
>         struct {
>                 struct device_attribute pid;
>                 struct device_attribute name;
> +               struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
>         } attr;
>  };
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 46ac7b3ca348..01298d924524 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5631,6 +5631,45 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
>         return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
>  }
>  
> +struct busy_ctx {
> +       struct intel_engine_cs *engine;
> +       u64 total;
> +};
> +
> +static int busy_add(int _id, void *p, void *data)
> +{
> +       struct i915_gem_context *ctx = p;
> +       struct busy_ctx *bc = data;
> +
> +       bc->total +=
> +               ktime_to_ns(intel_context_engine_get_busy_time(ctx,
> +                                                              bc->engine));
> +
> +       return 0;
> +}
> +
> +static ssize_t
> +show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
> +{
> +       struct i915_engine_busy_attribute *i915_attr =
> +               container_of(attr, typeof(*i915_attr), attr);
> +       struct drm_i915_file_private *file_priv = i915_attr->file_priv;
> +       struct intel_engine_cs *engine = i915_attr->engine;
> +       struct drm_i915_private *i915 = engine->i915;
> +       struct busy_ctx bc = { .engine = engine };
> +       int ret;
> +
> +       ret = i915_mutex_lock_interruptible(&i915->drm);
> +       if (ret)
> +               return ret;
> +

Doesn't need struct_mutex, just rcu_read_lock() will suffice.

Neither the context nor idr will be freed too soon, and the data is
involatile when the context is unreffed (and contexts don't have the
nasty zombie/undead status of requests). So the busy-time will be
stable.
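
Something along these lines, perhaps (sketch only -- it assumes the body
of the read is the idr_for_each() walk of file_priv->context_idr with
busy_add(), which isn't visible in the quoted hunk):

static ssize_t
show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
{
        struct i915_engine_busy_attribute *i915_attr =
                container_of(attr, typeof(*i915_attr), attr);
        struct drm_i915_file_private *file_priv = i915_attr->file_priv;
        struct busy_ctx bc = { .engine = i915_attr->engine };

        /*
         * No struct_mutex: RCU keeps the contexts and the idr alive while
         * we sum up the (stable) per-context busy times.
         */
        rcu_read_lock();
        idr_for_each(&file_priv->context_idr, busy_add, &bc);
        rcu_read_unlock();

        return snprintf(buf, PAGE_SIZE, "%llu", (unsigned long long)bc.total);
}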
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 0/5] Per-client engine stats
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2018-02-14 19:11 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-02-14 19:20 ` Chris Wilson
  2018-02-15  9:44   ` Tvrtko Ursulin
  2018-02-15  2:19 ` ✓ Fi.CI.IGT: success for " Patchwork
  8 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2018-02-14 19:20 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-02-14 18:50:30)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Another re-post of my earlier, now slightly updated work, to expose a DRM client
> hierarchy in sysfs in order to enable a top like tool:

So what I don't like about it is that it is a new interface in sysfs. We
already have a PMU interface for statistics and would rather see that
extended than abandoned. If perf can handle new processes coming and
going, surely we can handle new clients? :|
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* ✓ Fi.CI.IGT: success for Per-client engine stats
  2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2018-02-14 19:20 ` [RFC 0/5] " Chris Wilson
@ 2018-02-15  2:19 ` Patchwork
  8 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2018-02-15  2:19 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per-client engine stats
URL   : https://patchwork.freedesktop.org/series/38281/
State : success

== Summary ==

Test kms_flip:
        Subgroup plain-flip-fb-recreate:
                pass       -> FAIL       (shard-hsw) fdo#100368
        Subgroup dpms-vs-vblank-race-interruptible:
                pass       -> FAIL       (shard-hsw) fdo#103060
Test kms_vblank:
        Subgroup crtc-id:
                skip       -> PASS       (shard-snb)
        Subgroup pipe-a-ts-continuation-modeset:
                skip       -> PASS       (shard-snb)
Test gem_softpin:
        Subgroup noreloc-s3:
                incomplete -> PASS       (shard-hsw) fdo#103540
Test kms_cursor_legacy:
        Subgroup flip-vs-cursor-crc-atomic:
                pass       -> FAIL       (shard-apl) fdo#102670
Test kms_frontbuffer_tracking:
        Subgroup fbc-1p-offscren-pri-shrfb-draw-pwrite:
                pass       -> FAIL       (shard-apl) fdo#101623

fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#103540 https://bugs.freedesktop.org/show_bug.cgi?id=103540
fdo#102670 https://bugs.freedesktop.org/show_bug.cgi?id=102670
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623

shard-apl        total:3346 pass:1735 dwarn:1   dfail:0   fail:23  skip:1586 time:12696s
shard-hsw        total:3427 pass:1756 dwarn:1   dfail:0   fail:13  skip:1656 time:14314s
shard-snb        total:3427 pass:1349 dwarn:1   dfail:0   fail:10  skip:2067 time:7284s
Blacklisted hosts:
shard-kbl        total:3409 pass:1891 dwarn:1   dfail:0   fail:23  skip:1493 time:10485s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8028/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 1/5] drm/i915: Track per-context engine busyness
  2018-02-14 19:07   ` Chris Wilson
@ 2018-02-15  9:29     ` Tvrtko Ursulin
  2018-02-15  9:35       ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-15  9:29 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx; +Cc: gordon.kelly


On 14/02/2018 19:07, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-14 18:50:31)
>> +ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
>> +                                          struct intel_engine_cs *engine)
>> +{
>> +       struct intel_context *ce = &ctx->engine[engine->id];
>> +       ktime_t total;
>> +
>> +       spin_lock_irq(&ce->stats.lock);
>> +
>> +       total = ce->stats.total;
>> +
>> +       if (ce->stats.active)
>> +               total = ktime_add(total,
>> +                                 ktime_sub(ktime_get(), ce->stats.start));
>> +
>> +       spin_unlock_irq(&ce->stats.lock);
> 
> Looks like we can just use a seqlock here.

Hm, you may have suggested this before? Even for whole engine stats.

I think it could yes, with the benefit of not delaying writers (execlist 
processing) in presence of readers. But since the code is so writer 
heavy, and readers so infrequent and light weight, I wouldn't think we 
are in any danger of writer starvation, or even injecting any relevant 
latencies into command submission.

Also, could we get into reader live-lock situations under heavy 
interrupts? Probably not, since the reader section is again so 
lightweight compared to the rest of the work the code would have to do 
to trigger it.

I can try it and see what happens.
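
Something like this is what I would try first (just a sketch, assuming 
ce->stats.lock is converted from a spinlock to a seqlock_t and the 
execlists writers take write_seqlock()/write_sequnlock() around their 
updates):

ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
                                           struct intel_engine_cs *engine)
{
        struct intel_context *ce = &ctx->engine[engine->id];
        unsigned int seq;
        ktime_t total;

        do {
                seq = read_seqbegin(&ce->stats.lock);

                total = ce->stats.total;
                if (ce->stats.active)
                        total = ktime_add(total,
                                          ktime_sub(ktime_get(),
                                                    ce->stats.start));
        } while (read_seqretry(&ce->stats.lock, seq));

        return total;
}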

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 1/5] drm/i915: Track per-context engine busyness
  2018-02-15  9:29     ` Tvrtko Ursulin
@ 2018-02-15  9:35       ` Chris Wilson
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2018-02-15  9:35 UTC (permalink / raw)
  To: Tvrtko Ursulin, Tvrtko Ursulin, Intel-gfx; +Cc: gordon.kelly

Quoting Tvrtko Ursulin (2018-02-15 09:29:57)
> 
> On 14/02/2018 19:07, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-14 18:50:31)
> >> +ktime_t intel_context_engine_get_busy_time(struct i915_gem_context *ctx,
> >> +                                          struct intel_engine_cs *engine)
> >> +{
> >> +       struct intel_context *ce = &ctx->engine[engine->id];
> >> +       ktime_t total;
> >> +
> >> +       spin_lock_irq(&ce->stats.lock);
> >> +
> >> +       total = ce->stats.total;
> >> +
> >> +       if (ce->stats.active)
> >> +               total = ktime_add(total,
> >> +                                 ktime_sub(ktime_get(), ce->stats.start));
> >> +
> >> +       spin_unlock_irq(&ce->stats.lock);
> > 
> > Looks like we can just use a seqlock here.
> 
> Hm, you may have suggested this before? Even for whole engine stats.
> 
> I think it could yes, with the benefit of not delaying writers (execlist 
> processing) in presence of readers. But since the code is so writer 
> heavy, and readers so infrequent and light weight, I wouldn't think we 
> are in any danger of writer starvation, or even injecting any relevant 
> latencies into command submission.
> 
> Also, could we get into reader live-lock situations under heavy 
> interrupts? Probably not, since the reader section is again so 
> lightweight compared to the rest of the work the code would have to do 
> to trigger it.

In that scenario, I heavily favour the writer. They are responding to
the interrupt and latency in the writer means a potential GPU stall.

irq -> submission latency being the bane of my existence atm.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 2/5] drm/i915: Expose list of clients in sysfs
  2018-02-14 19:13   ` Chris Wilson
@ 2018-02-15  9:35     ` Tvrtko Ursulin
  0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-15  9:35 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 14/02/2018 19:13, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-14 18:50:32)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Expose a list of clients with open file handles in sysfs.
>>
>> This will be a basis for a top-like utility showing per-client and per-
>> engine GPU load.
>>
>> Currently we only expose each client's pid and name under opaque numbered
>> directories in /sys/class/drm/card0/clients/.
>>
>> For instance:
>>
>> /sys/class/drm/card0/clients/3/name: Xorg
>> /sys/class/drm/card0/clients/3/pid: 5664
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h   |  13 +++++
>>   drivers/gpu/drm/i915/i915_gem.c   | 112 +++++++++++++++++++++++++++++++++++---
>>   drivers/gpu/drm/i915/i915_sysfs.c |   8 +++
>>   3 files changed, 126 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 70cf289855af..2133e67f5fbc 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -345,6 +345,16 @@ struct drm_i915_file_private {
>>    */
>>   #define I915_MAX_CLIENT_CONTEXT_BANS 3
>>          atomic_t context_bans;
>> +
> 
> struct {
> 
>> +       unsigned int client_sysfs_id;
>> +       unsigned int client_pid;
>> +       char *client_name;
>> +       struct kobject *client_root;
> 
> } client;

client_sysfs?

>> +
>> +       struct {
>> +               struct device_attribute pid;
>> +               struct device_attribute name;
>> +       } attr;
> 
> sysfs_attr?

Ok.

>>   };
>>   
>>   /* Interface history:
>> @@ -2365,6 +2375,9 @@ struct drm_i915_private {
>>   
>>          struct i915_pmu pmu;
>>   
>> +       struct kobject *clients_root;
>> +       atomic_t client_serial;
> 
> I'd start a new substruct { } clients;

Ok.

>>          /*
>>           * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>>           * will be rejected. Instead look for a better place.
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index fc68b35854df..24607bce2efe 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -5613,6 +5613,89 @@ int i915_gem_freeze_late(struct drm_i915_private *dev_priv)
>>          return 0;
>>   }
>>   
>> +static ssize_t
>> +show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
>> +{
>> +       struct drm_i915_file_private *file_priv =
>> +               container_of(attr, struct drm_i915_file_private, attr.name);
>> +
>> +       return snprintf(buf, PAGE_SIZE, "%s", file_priv->client_name);
>> +}
>> +
>> +static ssize_t
>> +show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
>> +{
>> +       struct drm_i915_file_private *file_priv =
>> +               container_of(attr, struct drm_i915_file_private, attr.pid);
>> +
>> +       return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
>> +}
>> +
>> +static int
>> +i915_gem_add_client(struct drm_i915_private *i915,
>> +               struct drm_i915_file_private *file_priv,
>> +               struct task_struct *task,
>> +               unsigned int serial)
>> +{
>> +       int ret = -ENOMEM;
>> +       struct device_attribute *attr;
>> +       char id[32];
>> +
>> +       file_priv->client_name = kstrdup(task->comm, GFP_KERNEL);
>> +       if (!file_priv->client_name)
>> +               goto err_name;
>> +
>> +       snprintf(id, sizeof(id), "%u", serial);
>> +       file_priv->client_root = kobject_create_and_add(id,
>> +                                                       i915->clients_root);
> 
> Do we have to care about i915->clients_root being NULL here?

Yep, due to our lax handling of sysfs registration failures. :/

>> +       if (!file_priv->client_root)
>> +               goto err_client;
>> +
>> +       attr = &file_priv->attr.name;
>> +       attr->attr.name = "name";
>> +       attr->attr.mode = 0444;
>> +       attr->show = show_client_name;
>> +
>> +       ret = sysfs_create_file(file_priv->client_root,
>> +                               (struct attribute *)attr);
>> +       if (ret)
>> +               goto err_attr_name;
>> +
>> +       attr = &file_priv->attr.pid;
>> +       attr->attr.name = "pid";
>> +       attr->attr.mode = 0444;
>> +       attr->show = show_client_pid;
>> +
>> +       ret = sysfs_create_file(file_priv->client_root,
>> +                               (struct attribute *)attr);
>> +       if (ret)
>> +               goto err_attr_pid;
>> +
>> +       file_priv->client_pid = pid_nr(get_task_pid(task, PIDTYPE_PID));
> 
> Are we before or after the "create_context get new pid"? Just wondering
> aloud how that's going to tie in.

Before; context create then updates it in a following patch. That 
probably needs improving to avoid the remove and re-add, and to add a 
proper update helper - something like the sketch below.
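
Hypothetical helper, not part of this series, and it glosses over 
serialising against concurrent sysfs reads of the old name:

static void
i915_gem_client_update(struct drm_i915_file_private *file_priv,
                       struct task_struct *task)
{
        char *name;

        /*
         * The "name" and "pid" attributes dereference file_priv at
         * ->show() time, so there is no need to remove and re-add the
         * sysfs files - just refresh what they report.
         */
        name = kstrdup(task->comm, GFP_KERNEL);
        if (!name)
                return;

        kfree(file_priv->client_name);
        file_priv->client_name = name;
        file_priv->client_pid = pid_nr(get_task_pid(task, PIDTYPE_PID));
}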

> 
>> +
>> +       return 0;
>> +
>> +err_attr_pid:
>> +       sysfs_remove_file(file_priv->client_root,
>> +                         (struct attribute *)&file_priv->attr.name);
>> +err_attr_name:
>> +       kobject_put(file_priv->client_root);
>> +err_client:
>> +       kfree(file_priv->client_name);
>> +err_name:
>> +       return ret;
>> +}
>> +
>> +static void i915_gem_remove_client(struct drm_i915_file_private *file_priv)
>> +{
>> +       sysfs_remove_file(file_priv->client_root,
>> +                         (struct attribute *)&file_priv->attr.pid);
>> +       sysfs_remove_file(file_priv->client_root,
>> +                         (struct attribute *)&file_priv->attr.name);
>> +       kobject_put(file_priv->client_root);
>> +       kfree(file_priv->client_name);
>> +}
>> +
>>   void i915_gem_release(struct drm_device *dev, struct drm_file *file)
>>   {
>>          struct drm_i915_file_private *file_priv = file->driver_priv;
>> @@ -5626,32 +5709,47 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
>>          list_for_each_entry(request, &file_priv->mm.request_list, client_link)
>>                  request->file_priv = NULL;
>>          spin_unlock(&file_priv->mm.lock);
>> +
>> +       i915_gem_remove_client(file_priv);
>>   }
>>   
>>   int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
>>   {
>> +       int ret = -ENOMEM;
>>          struct drm_i915_file_private *file_priv;
>> -       int ret;
>>   
>>          DRM_DEBUG("\n");
>>   
>>          file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
>>          if (!file_priv)
>> -               return -ENOMEM;
>> +               goto err_alloc;
>> +
>> +       file_priv->client_sysfs_id = atomic_inc_return(&i915->client_serial);
> 
> What's the use for client_sysfs_id? I don't see it used again.

Used from the context create pid update, but it's weak... I'll 
eliminate it after the above-mentioned improvements.

Regards,

Tvrtko

> 
>> +       ret = i915_gem_add_client(i915, file_priv, current,
>> +                                 file_priv->client_sysfs_id);
>> +       if (ret)
>> +               goto err_client;
>>   
>>          file->driver_priv = file_priv;
>> +       ret = i915_gem_context_open(i915, file);
>> +       if (ret)
>> +               goto err_context;
>> +
>>          file_priv->dev_priv = i915;
>>          file_priv->file = file;
>> +       file_priv->bsd_engine = -1;
>>   
>>          spin_lock_init(&file_priv->mm.lock);
>>          INIT_LIST_HEAD(&file_priv->mm.request_list);
>>   
>> -       file_priv->bsd_engine = -1;
>> -
>> -       ret = i915_gem_context_open(i915, file);
>> -       if (ret)
>> -               kfree(file_priv);
>> +       return 0;
>>   
>> +err_context:
>> +       i915_gem_remove_client(file_priv);
>> +err_client:
>> +       atomic_dec(&i915->client_serial);
>> +       kfree(file_priv);
>> +err_alloc:
>>          return ret;
>>   }
>>   
>> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
>> index b33d2158c234..6cf032bc1573 100644
>> --- a/drivers/gpu/drm/i915/i915_sysfs.c
>> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
>> @@ -575,6 +575,11 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
>>          struct device *kdev = dev_priv->drm.primary->kdev;
>>          int ret;
>>   
>> +       dev_priv->clients_root =
>> +               kobject_create_and_add("clients", &kdev->kobj);
>> +       if (!dev_priv->clients_root)
>> +               DRM_ERROR("Per-client sysfs setup failed\n");
>> +
>>   #ifdef CONFIG_PM
>>          if (HAS_RC6(dev_priv)) {
>>                  ret = sysfs_merge_group(&kdev->kobj,
>> @@ -635,4 +640,7 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
>>          sysfs_unmerge_group(&kdev->kobj, &rc6_attr_group);
>>          sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
>>   #endif
>> +
>> +       if (dev_priv->clients_root)
>> +               kobject_put(dev_priv->clients_root);
>>   }
>> -- 
>> 2.14.1
>>
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 4/5] drm/i915: Expose per-engine client busyness
  2018-02-14 19:17   ` Chris Wilson
@ 2018-02-15  9:41     ` Tvrtko Ursulin
  2018-02-15  9:44       ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-15  9:41 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 14/02/2018 19:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-14 18:50:34)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Expose per-client and per-engine busyness under the previously added sysfs
>> client root.
>>
>> The new files are one per-engine instance and located under the 'busy'
>> directory.
>>
>> Each contains a monotonically increasing, nanosecond-resolution total of
>> the time the client's jobs were executing on the GPU.
>>
>> $ cat /sys/class/drm/card0/clients/5/busy/rcs0
>> 32516602
>>
>> This data can serve as an interface to implement a top like utility for
>> GPU jobs. For instance I have prototyped a tool in IGT which produces
>> periodic output like:
>>
>> neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>>       Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>>      xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>>
>> This tool can also be extended to use the i915 PMU and show overall engine
>> busyness, and engine loads using the queue depth metric.
>>
>> v2: Use intel_context_engine_get_busy_time.
>> v3: New directory structure.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h |  8 ++++
>>   drivers/gpu/drm/i915/i915_gem.c | 86 +++++++++++++++++++++++++++++++++++++++--
>>   2 files changed, 91 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 372d13cb2472..d6b2883b42fe 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -315,6 +315,12 @@ struct drm_i915_private;
>>   struct i915_mm_struct;
>>   struct i915_mmu_object;
>>   
>> +struct i915_engine_busy_attribute {
>> +       struct device_attribute attr;
>> +       struct drm_i915_file_private *file_priv;
>> +       struct intel_engine_cs *engine;
>> +};
>> +
>>   struct drm_i915_file_private {
>>          struct drm_i915_private *dev_priv;
>>          struct drm_file *file;
>> @@ -350,10 +356,12 @@ struct drm_i915_file_private {
>>          unsigned int client_pid;
>>          char *client_name;
>>          struct kobject *client_root;
>> +       struct kobject *busy_root;
>>   
>>          struct {
>>                  struct device_attribute pid;
>>                  struct device_attribute name;
>> +               struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
>>          } attr;
>>   };
>>   
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 46ac7b3ca348..01298d924524 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -5631,6 +5631,45 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
>>          return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
>>   }
>>   
>> +struct busy_ctx {
>> +       struct intel_engine_cs *engine;
>> +       u64 total;
>> +};
>> +
>> +static int busy_add(int _id, void *p, void *data)
>> +{
>> +       struct i915_gem_context *ctx = p;
>> +       struct busy_ctx *bc = data;
>> +
>> +       bc->total +=
>> +               ktime_to_ns(intel_context_engine_get_busy_time(ctx,
>> +                                                              bc->engine));
>> +
>> +       return 0;
>> +}
>> +
>> +static ssize_t
>> +show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
>> +{
>> +       struct i915_engine_busy_attribute *i915_attr =
>> +               container_of(attr, typeof(*i915_attr), attr);
>> +       struct drm_i915_file_private *file_priv = i915_attr->file_priv;
>> +       struct intel_engine_cs *engine = i915_attr->engine;
>> +       struct drm_i915_private *i915 = engine->i915;
>> +       struct busy_ctx bc = { .engine = engine };
>> +       int ret;
>> +
>> +       ret = i915_mutex_lock_interruptible(&i915->drm);
>> +       if (ret)
>> +               return ret;
>> +
> 
> Doesn't need struct_mutex, just rcu_read_lock() will suffice.
> 
> Neither the context nor idr will be freed too soon, and the data is
> involatile when the context is unreffed (and contexts don't have the
> nasty zombie/undead status of requests). So the busy-time will be
> stable.

Are you sure? What holds a reference to contexts while userspace might 
be in sysfs reading the stat? It would be super nice if we could avoid 
struct_mutex here... I just don't understand at the moment why it would 
be safe.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 4/5] drm/i915: Expose per-engine client busyness
  2018-02-15  9:41     ` Tvrtko Ursulin
@ 2018-02-15  9:44       ` Chris Wilson
  2018-02-15 15:13         ` Tvrtko Ursulin
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2018-02-15  9:44 UTC (permalink / raw)
  To: Tvrtko Ursulin, Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-02-15 09:41:53)
> 
> On 14/02/2018 19:17, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-14 18:50:34)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Expose per-client and per-engine busyness under the previously added sysfs
> >> client root.
> >>
> >> The new files are one per-engine instance and located under the 'busy'
> >> directory.
> >>
> >> Each contains a monotonically increasing, nanosecond-resolution total of
> >> the time the client's jobs were executing on the GPU.
> >>
> >> $ cat /sys/class/drm/card0/clients/5/busy/rcs0
> >> 32516602
> >>
> >> This data can serve as an interface to implement a top like utility for
> >> GPU jobs. For instance I have prototyped a tool in IGT which produces
> >> periodic output like:
> >>
> >> neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
> >>       Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
> >>      xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
> >>
> >> This tool can also be extended to use the i915 PMU and show overall engine
> >> busyness, and engine loads using the queue depth metric.
> >>
> >> v2: Use intel_context_engine_get_busy_time.
> >> v3: New directory structure.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_drv.h |  8 ++++
> >>   drivers/gpu/drm/i915/i915_gem.c | 86 +++++++++++++++++++++++++++++++++++++++--
> >>   2 files changed, 91 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index 372d13cb2472..d6b2883b42fe 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -315,6 +315,12 @@ struct drm_i915_private;
> >>   struct i915_mm_struct;
> >>   struct i915_mmu_object;
> >>   
> >> +struct i915_engine_busy_attribute {
> >> +       struct device_attribute attr;
> >> +       struct drm_i915_file_private *file_priv;
> >> +       struct intel_engine_cs *engine;
> >> +};
> >> +
> >>   struct drm_i915_file_private {
> >>          struct drm_i915_private *dev_priv;
> >>          struct drm_file *file;
> >> @@ -350,10 +356,12 @@ struct drm_i915_file_private {
> >>          unsigned int client_pid;
> >>          char *client_name;
> >>          struct kobject *client_root;
> >> +       struct kobject *busy_root;
> >>   
> >>          struct {
> >>                  struct device_attribute pid;
> >>                  struct device_attribute name;
> >> +               struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
> >>          } attr;
> >>   };
> >>   
> >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >> index 46ac7b3ca348..01298d924524 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem.c
> >> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> @@ -5631,6 +5631,45 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
> >>          return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
> >>   }
> >>   
> >> +struct busy_ctx {
> >> +       struct intel_engine_cs *engine;
> >> +       u64 total;
> >> +};
> >> +
> >> +static int busy_add(int _id, void *p, void *data)
> >> +{
> >> +       struct i915_gem_context *ctx = p;
> >> +       struct busy_ctx *bc = data;
> >> +
> >> +       bc->total +=
> >> +               ktime_to_ns(intel_context_engine_get_busy_time(ctx,
> >> +                                                              bc->engine));
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static ssize_t
> >> +show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
> >> +{
> >> +       struct i915_engine_busy_attribute *i915_attr =
> >> +               container_of(attr, typeof(*i915_attr), attr);
> >> +       struct drm_i915_file_private *file_priv = i915_attr->file_priv;
> >> +       struct intel_engine_cs *engine = i915_attr->engine;
> >> +       struct drm_i915_private *i915 = engine->i915;
> >> +       struct busy_ctx bc = { .engine = engine };
> >> +       int ret;
> >> +
> >> +       ret = i915_mutex_lock_interruptible(&i915->drm);
> >> +       if (ret)
> >> +               return ret;
> >> +
> > 
> > Doesn't need struct_mutex, just rcu_read_lock() will suffice.
> > 
> > Neither the context nor idr will be freed too soon, and the data is
> > involatile when the context is unreffed (and contexts don't have the
> > nasty zombie/undead status of requests). So the busy-time will be
> > stable.
> 
> Are you sure? What holds a reference to contexts while userspace might 
> be in sysfs reading the stat? It would be super nice if we could avoid 
> struct_mutex here... I just don't understand at the moment why it would 
> be safe.

RCU keeps the pointer alive, and in this case the context is not reused
before being freed (unlike requests). So given that it's unreffed, it is
dead and the stats will be stable, just not yet freed.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 0/5] Per-client engine stats
  2018-02-14 19:20 ` [RFC 0/5] " Chris Wilson
@ 2018-02-15  9:44   ` Tvrtko Ursulin
  2018-02-15  9:47     ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-15  9:44 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 14/02/2018 19:20, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-14 18:50:30)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Another re-post of my earlier, now slightly updated work, to expose a DRM client
>> hierarchy in sysfs in order to enable a top like tool:
> 
> So what I don't like about it is that it is a new interface in sysfs. We
> already have a PMU interface for statistics and would rather see that
> extended than abandoned. If perf can handle new processes coming and
> going, surely we can handle new clients? :|

I don't think it means abandoning the PMU, just that I don't see it 
suitable for this use case.

Even if we go with adding a PMU task mode, that is a separate thing from 
this. It would allow profiling of a single task, but not enumerating and 
profiling all clients/tasks from perf/PMU.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 0/5] Per-client engine stats
  2018-02-15  9:44   ` Tvrtko Ursulin
@ 2018-02-15  9:47     ` Chris Wilson
  2018-02-15 10:50       ` Tvrtko Ursulin
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2018-02-15  9:47 UTC (permalink / raw)
  To: Tvrtko Ursulin, Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-02-15 09:44:58)
> 
> On 14/02/2018 19:20, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-14 18:50:30)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Another re-post of my earlier, now slightly updated work, to expose a DRM client
> >> hierarchy in sysfs in order to enable a top like tool:
> > 
> > So what I don't like about it is that it is a new interface in sysfs. We
> > already have a PMU interface for statistics and would rather see that
> > extended than abandoned. If perf can handle new processes coming and
> > going, surely we can handle new clients? :|
> 
> I don't think it means abandoning the PMU, just that I don't see it 
> suitable for this use case.
> 
> Even if we go with adding a PMU task mode, that is a separate thing from 
> this. It would allow profiling of a single task, but not enumerating and 
> profiling all clients/tasks from perf/PMU.

I think perf top seems to handle processes coming and going, so I don't
think it's a fundamental limitation of perf, just our understanding :)

I'd rather have one interface to maintain :)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 0/5] Per-client engine stats
  2018-02-15  9:47     ` Chris Wilson
@ 2018-02-15 10:50       ` Tvrtko Ursulin
  0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-15 10:50 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 15/02/2018 09:47, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-15 09:44:58)
>>
>> On 14/02/2018 19:20, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-02-14 18:50:30)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Another re-post of my earlier, now slightly updated work, to expose a DRM client
>>>> hierarchy in sysfs in order to enable a top like tool:
>>>
>>> So what I don't like about it is that it is a new interface in sysfs. We
>>> already have a PMU interface for statistics and would rather see that
>>> extended than abandoned. If perf can handle new processes coming and
>>> going, surely we can handle new clients? :|
>>
>> I don't think it means abandoning the PMU, just that I don't see it
>> suitable for this use case.
>>
>> Even if we go with adding a PMU task mode, that is a separate thing from
>> this. It would allow profiling of a single task, but not enumerating and
>> profiling all clients/tasks from perf/PMU.
> 
> I think perf top seems to handle processes coming and going, so I don't
> think it's a fundamental limitation of perf, just our understanding :)
> 
> I'd rather have one interface to maintain :)

Referencing my old branch from when I had barely started on the 
per-task PMU, I think the idea was to add another i915 PMU instance 
which allows events with task contexts.

Then in the implementation we would do something like 
i915_get_engine_busy_for_task(event->ctx->task).

This would work for "perf stat -e i915/rcs0-busy some-program".

But not for perf top - that one actually creates sampling counters which 
need to provide things like PIDs and call-chains on each sample and I 
don't see that we can ever do this.

Ignoring "perf top", we could implement a top like tool using the above 
described new per-task PMU, but with two limitations:

1. No per-client support - only per-task.
2. More overhead - we need a data structure, plus its management, to map 
from tasks to lists of DRM clients etc.

My point is that I did not see the sysfs interface as a substantial 
additional burden. Apart from the sysfs management bits, the rest is 
actually a set of building blocks for the per-task PMU, because the 
solution from point 2 above would still need to aggregate the 
per-client stats once it is able to walk a task's clients.
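
For the record, the read-out side of such a per-task event would be 
roughly along these lines (pure sketch - i915_get_engine_busy_for_task() 
is the hypothetical helper from above, and selecting the engine from 
event->attr.config is glossed over):

static void i915_pmu_task_event_read(struct perf_event *event)
{
        struct hw_perf_event *hwc = &event->hw;
        u64 prev, new;

again:
        prev = local64_read(&hwc->prev_count);

        /* Sum this task's busy time across its clients' contexts. */
        new = i915_get_engine_busy_for_task(event->ctx->task);

        if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
                goto again;

        local64_add(new - prev, &event->count);
}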

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 4/5] drm/i915: Expose per-engine client busyness
  2018-02-15  9:44       ` Chris Wilson
@ 2018-02-15 15:13         ` Tvrtko Ursulin
  0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2018-02-15 15:13 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 15/02/2018 09:44, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-15 09:41:53)
>>
>> On 14/02/2018 19:17, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-02-14 18:50:34)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Expose per-client and per-engine busyness under the previously added sysfs
>>>> client root.
>>>>
>>>> The new files are one per-engine instance and located under the 'busy'
>>>> directory.
>>>>
>>>> Each contains a monotonically increasing, nanosecond-resolution total of
>>>> the time the client's jobs were executing on the GPU.
>>>>
>>>> $ cat /sys/class/drm/card0/clients/5/busy/rcs0
>>>> 32516602
>>>>
>>>> This data can serve as an interface to implement a top like utility for
>>>> GPU jobs. For instance I have prototyped a tool in IGT which produces
>>>> periodic output like:
>>>>
>>>> neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>>>>        Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>>>>       xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
>>>>
>>>> This tool can also be extended to use the i915 PMU and show overall engine
>>>> busyness, and engine loads using the queue depth metric.
>>>>
>>>> v2: Use intel_context_engine_get_busy_time.
>>>> v3: New directory structure.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_drv.h |  8 ++++
>>>>    drivers/gpu/drm/i915/i915_gem.c | 86 +++++++++++++++++++++++++++++++++++++++--
>>>>    2 files changed, 91 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>> index 372d13cb2472..d6b2883b42fe 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -315,6 +315,12 @@ struct drm_i915_private;
>>>>    struct i915_mm_struct;
>>>>    struct i915_mmu_object;
>>>>    
>>>> +struct i915_engine_busy_attribute {
>>>> +       struct device_attribute attr;
>>>> +       struct drm_i915_file_private *file_priv;
>>>> +       struct intel_engine_cs *engine;
>>>> +};
>>>> +
>>>>    struct drm_i915_file_private {
>>>>           struct drm_i915_private *dev_priv;
>>>>           struct drm_file *file;
>>>> @@ -350,10 +356,12 @@ struct drm_i915_file_private {
>>>>           unsigned int client_pid;
>>>>           char *client_name;
>>>>           struct kobject *client_root;
>>>> +       struct kobject *busy_root;
>>>>    
>>>>           struct {
>>>>                   struct device_attribute pid;
>>>>                   struct device_attribute name;
>>>> +               struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
>>>>           } attr;
>>>>    };
>>>>    
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>> index 46ac7b3ca348..01298d924524 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>> @@ -5631,6 +5631,45 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
>>>>           return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
>>>>    }
>>>>    
>>>> +struct busy_ctx {
>>>> +       struct intel_engine_cs *engine;
>>>> +       u64 total;
>>>> +};
>>>> +
>>>> +static int busy_add(int _id, void *p, void *data)
>>>> +{
>>>> +       struct i915_gem_context *ctx = p;
>>>> +       struct busy_ctx *bc = data;
>>>> +
>>>> +       bc->total +=
>>>> +               ktime_to_ns(intel_context_engine_get_busy_time(ctx,
>>>> +                                                              bc->engine));
>>>> +
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static ssize_t
>>>> +show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
>>>> +{
>>>> +       struct i915_engine_busy_attribute *i915_attr =
>>>> +               container_of(attr, typeof(*i915_attr), attr);
>>>> +       struct drm_i915_file_private *file_priv = i915_attr->file_priv;
>>>> +       struct intel_engine_cs *engine = i915_attr->engine;
>>>> +       struct drm_i915_private *i915 = engine->i915;
>>>> +       struct busy_ctx bc = { .engine = engine };
>>>> +       int ret;
>>>> +
>>>> +       ret = i915_mutex_lock_interruptible(&i915->drm);
>>>> +       if (ret)
>>>> +               return ret;
>>>> +
>>>
>>> Doesn't need struct_mutex, just rcu_read_lock() will suffice.
>>>
>>> Neither the context nor idr will be freed too soon, and the data is
>>> involatile when the context is unreffed (and contexts don't have the
>>> nasty zombie/undead status of requests). So the busy-time will be
>>> stable.
>>
>> Are you sure? What holds a reference to contexts while userspace might
>> be in sysfs reading the stat? It would be super nice if we could avoid
>> struct_mutex here... I just don't understand at the moment why it would
>> be safe.
> 
> RCU keeps the pointer alive, and in this case the context is not reused
> before being freed (unlike requests). So given that it's unreffed, it is
> dead and the stats will be stable, just not yet freed.

Somehow I missed the suggestion to replace it with rcu_read_lock(). 
Cool, this is then even more lightweight than I imagined.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-02-15 15:13 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-14 18:50 [RFC 0/5] Per-client engine stats Tvrtko Ursulin
2018-02-14 18:50 ` [RFC 1/5] drm/i915: Track per-context engine busyness Tvrtko Ursulin
2018-02-14 19:07   ` Chris Wilson
2018-02-15  9:29     ` Tvrtko Ursulin
2018-02-15  9:35       ` Chris Wilson
2018-02-14 18:50 ` [RFC 2/5] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
2018-02-14 19:13   ` Chris Wilson
2018-02-15  9:35     ` Tvrtko Ursulin
2018-02-14 18:50 ` [RFC 3/5] drm/i915: Update client name on context create Tvrtko Ursulin
2018-02-14 18:50 ` [RFC 4/5] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
2018-02-14 19:17   ` Chris Wilson
2018-02-15  9:41     ` Tvrtko Ursulin
2018-02-15  9:44       ` Chris Wilson
2018-02-15 15:13         ` Tvrtko Ursulin
2018-02-14 18:50 ` [RFC 5/5] drm/i915: Add sysfs toggle to enable per-client engine stats Tvrtko Ursulin
2018-02-14 18:55 ` ✗ Fi.CI.CHECKPATCH: warning for Per-client " Patchwork
2018-02-14 19:11 ` ✓ Fi.CI.BAT: success " Patchwork
2018-02-14 19:20 ` [RFC 0/5] " Chris Wilson
2018-02-15  9:44   ` Tvrtko Ursulin
2018-02-15  9:47     ` Chris Wilson
2018-02-15 10:50       ` Tvrtko Ursulin
2018-02-15  2:19 ` ✓ Fi.CI.IGT: success for " Patchwork
