intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 0/9] Per client engine busyness
@ 2020-03-18 11:01 Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 1/9] drm/i915: Update client name on context create Tvrtko Ursulin
                   ` (10 more replies)
  0 siblings, 11 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Checkpatch cleanup and bits of review feedback only.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, cgroups controller, dynamic SSEU tuning,...

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

Sysfs interface enables us to implement a "top-like" tool for GPU tasks. Or with
a "screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation wise we add a a bunch of files in sysfs like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── 9
	    ├── busy
	    │   ├── 0
	    │   ├── 1
	    │   ├── 2
	    │   └── 3
	    ├── name
	    └── pid

Files in 'busy' directories are numbered using the engine class ABI values and
they contain accumulated nanoseconds each client spent on engines of a
respective class.

It is stil a RFC since it misses dedicated test cases to ensure things really
work as advertised.

Tvrtko Ursulin (9):
  drm/i915: Update client name on context create
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Use explicit flag to mark unreachable intel_context
  drm/i915: Track runtime spent in unreachable intel_contexts
  drm/i915: Track runtime spent in closed GEM contexts
  drm/i915: Track all user contexts per client
  drm/i915: Expose per-engine client busyness
  drm/i915: Track context current active time
  drm/i915: Prefer software tracked context busyness

 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  63 +++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  21 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  18 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   6 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  25 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  55 +++-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |  10 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  29 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 274 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h        |  33 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  25 +-
 12 files changed, 473 insertions(+), 88 deletions(-)

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 1/9] drm/i915: Update client name on context create
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 2/9] drm/i915: Make GEM contexts track DRM clients Tvrtko Ursulin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some clients have the DRM fd passed to them over a socket by the X server.

Grab the real client and pid when they create their first context and
update the exposed data for more useful enumeration.

To enable lockless access to client name and pid data from the following
patches, we also make these fields rcu protected. In this way asynchronous
code paths where both contexts which remain after the client exit, and
access to client name and pid as they are getting updated due context
creation running in parallel with name/pid queries.

v2:
 * Do not leak the pid reference and borrow context idr_lock. (Chris)

v3:
 * More avoiding leaks. (Chris)

v4:
 * Move update completely to drm client. (Chris)
 * Do not lose previous client data on failure to re-register and simplify
   update to only touch what it needs.

v5:
 * Reuse ext_data local. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |   5 +
 drivers/gpu/drm/i915/i915_drm_client.c      | 103 ++++++++++++++++++--
 drivers/gpu/drm/i915/i915_drm_client.h      |  10 +-
 3 files changed, 106 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c0e476fcd1fa..841838d6c1dd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -74,6 +74,7 @@
 #include "gt/intel_engine_user.h"
 #include "gt/intel_ring.h"
 
+#include "i915_drm_client.h"
 #include "i915_gem_context.h"
 #include "i915_globals.h"
 #include "i915_trace.h"
@@ -2357,6 +2358,10 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}
 
+	ret = i915_drm_client_update(ext_data.fpriv->client, current);
+	if (ret)
+		return ret;
+
 	ext_data.ctx = i915_gem_create_context(i915, args->flags);
 	if (IS_ERR(ext_data.ctx))
 		return PTR_ERR(ext_data.ctx);
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 2067fbcdb795..342a11554573 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -7,6 +7,9 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 
+#include <drm/drm_print.h>
+
+#include "i915_drv.h"
 #include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
@@ -22,10 +25,15 @@ show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
 {
 	struct i915_drm_client *client =
 		container_of(attr, typeof(*client), attr.name);
+	int ret;
+
+	rcu_read_lock();
+	ret = snprintf(buf, PAGE_SIZE,
+		       READ_ONCE(client->closed) ? "<%s>" : "%s",
+		       rcu_dereference(client->name));
+	rcu_read_unlock();
 
-	return snprintf(buf, PAGE_SIZE,
-			READ_ONCE(client->closed) ? "<%s>" : "%s",
-			client->name);
+	return ret;
 }
 
 static ssize_t
@@ -33,10 +41,15 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 {
 	struct i915_drm_client *client =
 		container_of(attr, typeof(*client), attr.pid);
+	int ret;
 
-	return snprintf(buf, PAGE_SIZE,
-			READ_ONCE(client->closed) ? "<%u>" : "%u",
-			pid_nr(client->pid));
+	rcu_read_lock();
+	ret = snprintf(buf, PAGE_SIZE,
+		       READ_ONCE(client->closed) ? "<%u>" : "%u",
+		       pid_nr(rcu_dereference(client->pid)));
+	rcu_read_unlock();
+
+	return ret;
 }
 
 static int
@@ -101,8 +114,8 @@ __i915_drm_client_register(struct i915_drm_client *client,
 	if (!name)
 		return -ENOMEM;
 
-	client->pid = get_task_pid(task, PIDTYPE_PID);
-	client->name = name;
+	rcu_assign_pointer(client->pid, get_task_pid(task, PIDTYPE_PID));
+	rcu_assign_pointer(client->name, name);
 
 	if (!clients->root)
 		return 0; /* intel_fbdev_init registers a client before sysfs */
@@ -125,8 +138,8 @@ __i915_drm_client_unregister(struct i915_drm_client *client)
 {
 	__client_unregister_sysfs(client);
 
-	put_pid(fetch_and_zero(&client->pid));
-	kfree(fetch_and_zero(&client->name));
+	put_pid(rcu_replace_pointer(client->pid, NULL, true));
+	kfree(rcu_replace_pointer(client->name, NULL, true));
 }
 
 struct i915_drm_client *
@@ -140,6 +153,7 @@ i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
 		return ERR_PTR(-ENOMEM);
 
 	kref_init(&client->kref);
+	mutex_init(&client->update_lock);
 	client->clients = clients;
 
 	ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
@@ -177,3 +191,72 @@ void i915_drm_client_close(struct i915_drm_client *client)
 	WRITE_ONCE(client->closed, true);
 	i915_drm_client_put(client);
 }
+
+struct client_update_free {
+	struct rcu_head rcu;
+	struct pid *pid;
+	char *name;
+};
+
+static void __client_update_free(struct rcu_head *rcu)
+{
+	struct client_update_free *old = container_of(rcu, typeof(*old), rcu);
+
+	put_pid(old->pid);
+	kfree(old->name);
+	kfree(old);
+}
+
+int
+i915_drm_client_update(struct i915_drm_client *client,
+		       struct task_struct *task)
+{
+	struct drm_i915_private *i915 =
+		container_of(client->clients, typeof(*i915), clients);
+	struct client_update_free *old;
+	struct pid *pid;
+	char *name;
+	int ret;
+
+	old = kmalloc(sizeof(*old), GFP_KERNEL);
+	if (!old)
+		return -ENOMEM;
+
+	ret = mutex_lock_interruptible(&client->update_lock);
+	if (ret)
+		goto out_free;
+
+	pid = get_task_pid(task, PIDTYPE_PID);
+	if (!pid)
+		goto out_pid;
+	if (pid == client->pid)
+		goto out_name;
+
+	name = kstrdup(task->comm, GFP_KERNEL);
+	if (!name) {
+		drm_notice(&i915->drm,
+			   "Failed to update client id=%u,name=%s,pid=%u! (%d)\n",
+			   client->id, client->name, pid_nr(client->pid), ret);
+		goto out_name;
+	}
+
+	init_rcu_head(&old->rcu);
+
+	old->pid = rcu_replace_pointer(client->pid, pid, true);
+	old->name = rcu_replace_pointer(client->name, name, true);
+
+	mutex_unlock(&client->update_lock);
+
+	call_rcu(&old->rcu, __client_update_free);
+
+	return 0;
+
+out_name:
+	put_pid(pid);
+out_pid:
+	mutex_unlock(&client->update_lock);
+out_free:
+	kfree(old);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index af6998c74d4c..11b48383881d 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -9,6 +9,7 @@
 #include <linux/device.h>
 #include <linux/kobject.h>
 #include <linux/kref.h>
+#include <linux/mutex.h>
 #include <linux/pid.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
@@ -26,9 +27,11 @@ struct i915_drm_client {
 
 	struct rcu_head rcu;
 
+	struct mutex update_lock; /* Serializes name and pid updates. */
+
 	unsigned int id;
-	struct pid *pid;
-	char *name;
+	struct pid __rcu *pid;
+	char __rcu *name;
 	bool closed;
 
 	struct i915_drm_clients *clients;
@@ -61,4 +64,7 @@ void i915_drm_client_close(struct i915_drm_client *client);
 struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients,
 					    struct task_struct *task);
 
+int i915_drm_client_update(struct i915_drm_client *client,
+			   struct task_struct *task);
+
 #endif /* !__I915_DRM_CLIENT_H__ */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 2/9] drm/i915: Make GEM contexts track DRM clients
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 1/9] drm/i915: Update client name on context create Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 3/9] drm/i915: Use explicit flag to mark unreachable intel_context Tvrtko Ursulin
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If we make GEM contexts keep a reference to i915_drm_client for the whole
of their lifetime, we can consolidate the current task pid and name usage
by getting it from the client.

v2:
 * Don't bother supporting selftests contexts from debugfs. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 23 ++++++++++++---
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 13 ++-------
 drivers/gpu/drm/i915/i915_debugfs.c           | 29 +++++++------------
 drivers/gpu/drm/i915/i915_gpu_error.c         | 21 ++++++++------
 4 files changed, 44 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 841838d6c1dd..9afc60ab95e0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -340,8 +340,13 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 
 static void i915_gem_context_free(struct i915_gem_context *ctx)
 {
+	struct i915_drm_client *client = ctx->client;
+
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 
+	if (client)
+		i915_drm_client_put(client);
+
 	spin_lock(&ctx->i915->gem.contexts.lock);
 	list_del(&ctx->link);
 	spin_unlock(&ctx->i915->gem.contexts.lock);
@@ -351,7 +356,6 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	if (ctx->timeline)
 		intel_timeline_put(ctx->timeline);
 
-	put_pid(ctx->pid);
 	mutex_destroy(&ctx->mutex);
 
 	kfree_rcu(ctx, rcu);
@@ -911,6 +915,7 @@ static int gem_context_register(struct i915_gem_context *ctx,
 				struct drm_i915_file_private *fpriv,
 				u32 *id)
 {
+	struct i915_drm_client *client;
 	struct i915_address_space *vm;
 	int ret;
 
@@ -922,15 +927,25 @@ static int gem_context_register(struct i915_gem_context *ctx,
 		WRITE_ONCE(vm->file, fpriv); /* XXX */
 	mutex_unlock(&ctx->mutex);
 
-	ctx->pid = get_task_pid(current, PIDTYPE_PID);
+	client = i915_drm_client_get(fpriv->client);
+
+	rcu_read_lock();
 	snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
-		 current->comm, pid_nr(ctx->pid));
+		 rcu_dereference(client->name),
+		 pid_nr(rcu_dereference(client->pid)));
+	rcu_read_unlock();
 
 	/* And finally expose ourselves to userspace via the idr */
 	ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
 	if (ret)
-		put_pid(fetch_and_zero(&ctx->pid));
+		goto err;
+
+	ctx->client = client;
 
+	return 0;
+
+err:
+	i915_drm_client_put(client);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 28760bd03265..b0e03380c690 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -96,20 +96,13 @@ struct i915_gem_context {
 	 */
 	struct i915_address_space __rcu *vm;
 
-	/**
-	 * @pid: process id of creator
-	 *
-	 * Note that who created the context may not be the principle user,
-	 * as the context may be shared across a local socket. However,
-	 * that should only affect the default context, all contexts created
-	 * explicitly by the client are expected to be isolated.
-	 */
-	struct pid *pid;
-
 	/** link: place with &drm_i915_private.context_list */
 	struct list_head link;
 	struct llist_node free_link;
 
+	/** client: struct i915_drm_client */
+	struct i915_drm_client *client;
+
 	/**
 	 * @ref: reference count
 	 *
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6ca797128aa1..03018a108916 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -330,17 +330,15 @@ static void print_context_stats(struct seq_file *m,
 				.vm = rcu_access_pointer(ctx->vm),
 			};
 			struct drm_file *file = ctx->file_priv->file;
-			struct task_struct *task;
 			char name[80];
 
 			rcu_read_lock();
+
 			idr_for_each(&file->object_idr, per_file_stats, &stats);
-			rcu_read_unlock();
 
-			rcu_read_lock();
-			task = pid_task(ctx->pid ?: file->pid, PIDTYPE_PID);
 			snprintf(name, sizeof(name), "%s",
-				 task ? task->comm : "<unknown>");
+				 rcu_dereference(ctx->client->name));
+
 			rcu_read_unlock();
 
 			print_file_stats(m, name, stats);
@@ -1059,20 +1057,13 @@ static int i915_context_status(struct seq_file *m, void *unused)
 		spin_unlock(&i915->gem.contexts.lock);
 
 		seq_puts(m, "HW context ");
-		if (ctx->pid) {
-			struct task_struct *task;
-
-			task = get_pid_task(ctx->pid, PIDTYPE_PID);
-			if (task) {
-				seq_printf(m, "(%s [%d]) ",
-					   task->comm, task->pid);
-				put_task_struct(task);
-			}
-		} else if (IS_ERR(ctx->file_priv)) {
-			seq_puts(m, "(deleted) ");
-		} else {
-			seq_puts(m, "(kernel) ");
-		}
+
+		rcu_read_lock();
+		seq_printf(m, "(%s [%d]) %s",
+			   rcu_dereference(ctx->client->name),
+			   pid_nr(rcu_dereference(ctx->client->pid)),
+			   ctx->client->closed ? "(closed) " : "");
+		rcu_read_unlock();
 
 		seq_putc(m, ctx->remap_slice ? 'R' : 'r');
 		seq_putc(m, '\n');
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2a4cd0ba5464..653e1bc5050e 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1221,7 +1221,8 @@ static void record_request(const struct i915_request *request,
 	rcu_read_lock();
 	ctx = rcu_dereference(request->context->gem_context);
 	if (ctx)
-		erq->pid = pid_nr(ctx->pid);
+		erq->pid = I915_SELFTEST_ONLY(!ctx->client) ?
+			   0 : pid_nr(rcu_dereference(ctx->client->pid));
 	rcu_read_unlock();
 }
 
@@ -1241,23 +1242,25 @@ static bool record_context(struct i915_gem_context_coredump *e,
 			   const struct i915_request *rq)
 {
 	struct i915_gem_context *ctx;
-	struct task_struct *task;
 	bool simulated;
 
 	rcu_read_lock();
+
 	ctx = rcu_dereference(rq->context->gem_context);
 	if (ctx && !kref_get_unless_zero(&ctx->ref))
 		ctx = NULL;
-	rcu_read_unlock();
-	if (!ctx)
+	if (!ctx) {
+		rcu_read_unlock();
 		return true;
+	}
 
-	rcu_read_lock();
-	task = pid_task(ctx->pid, PIDTYPE_PID);
-	if (task) {
-		strcpy(e->comm, task->comm);
-		e->pid = task->pid;
+	if (I915_SELFTEST_ONLY(!ctx->client)) {
+		strcpy(e->comm, "[kernel]");
+	} else {
+		strcpy(e->comm, rcu_dereference(ctx->client->name));
+		e->pid = pid_nr(rcu_dereference(ctx->client->pid));
 	}
+
 	rcu_read_unlock();
 
 	e->sched_attr = ctx->sched;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 3/9] drm/i915: Use explicit flag to mark unreachable intel_context
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 1/9] drm/i915: Update client name on context create Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 2/9] drm/i915: Make GEM contexts track DRM clients Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 4/9] drm/i915: Track runtime spent in unreachable intel_contexts Tvrtko Ursulin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

I need to keep the GEM context around a bit longer so adding an explicit
flag for syncing execbuf with closed/abandonded contexts.

v2:
 * Use already available context flags. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c    | 3 ++-
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h  | 1 +
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9afc60ab95e0..7c119a3a2cbd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -579,7 +579,8 @@ static void engines_idle_release(struct i915_gem_context *ctx,
 		int err = 0;
 
 		/* serialises with execbuf */
-		RCU_INIT_POINTER(ce->gem_context, NULL);
+		set_bit(INTEL_CONTEXT_CLOSED, &ce->flags);
+
 		if (!intel_context_pin_if_active(ce))
 			continue;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d3f4f28e9468..875da020d6c8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2316,7 +2316,7 @@ static void eb_request_add(struct i915_execbuffer *eb)
 	prev = __i915_request_commit(rq);
 
 	/* Check that the context wasn't destroyed before submission */
-	if (likely(rcu_access_pointer(eb->context->gem_context))) {
+	if (likely(!test_bit(INTEL_CONTEXT_CLOSED, &eb->context->flags))) {
 		attr = eb->gem_context->sched;
 
 		/*
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 0f3b68b95c56..d5925c25f109 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -66,6 +66,7 @@ struct intel_context {
 #define CONTEXT_BANNED			4
 #define CONTEXT_FORCE_SINGLE_SUBMISSION	5
 #define CONTEXT_NOPREEMPT		6
+#define INTEL_CONTEXT_CLOSED		7
 
 	u32 *lrc_reg_state;
 	u64 lrc_desc;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 4/9] drm/i915: Track runtime spent in unreachable intel_contexts
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 3/9] drm/i915: Use explicit flag to mark unreachable intel_context Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 5/9] drm/i915: Track runtime spent in closed GEM contexts Tvrtko Ursulin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

As contexts are abandoned we want to remember how much GPU time they used
(per class) so later we can used it for smarter purposes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 13 ++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h |  5 +++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 7c119a3a2cbd..5edf79ed6247 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -257,7 +257,19 @@ static void free_engines_rcu(struct rcu_head *rcu)
 {
 	struct i915_gem_engines *engines =
 		container_of(rcu, struct i915_gem_engines, rcu);
+	struct i915_gem_context *ctx = engines->ctx;
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+
+	/* Transfer accumulated runtime to the parent GEM context. */
+	for_each_gem_engine(ce, engines, it) {
+		unsigned int class = ce->engine->uabi_class;
 
+		GEM_BUG_ON(class >= ARRAY_SIZE(ctx->past_runtime));
+		atomic64_add(ce->runtime.total, &ctx->past_runtime[class]);
+	}
+
+	i915_gem_context_put(ctx);
 	i915_sw_fence_fini(&engines->fence);
 	free_engines(engines);
 }
@@ -278,7 +290,6 @@ engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 			list_del(&engines->link);
 			spin_unlock_irqrestore(&ctx->stale.lock, flags);
 		}
-		i915_gem_context_put(engines->ctx);
 		break;
 
 	case FENCE_FREE:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index b0e03380c690..f0d7441aafc8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -177,6 +177,11 @@ struct i915_gem_context {
 		spinlock_t lock;
 		struct list_head engines;
 	} stale;
+
+	/**
+	 * @past_runtime: Accumulation of freed intel_context pphwsp runtimes.
+	 */
+	atomic64_t past_runtime[MAX_ENGINE_CLASS + 1];
 };
 
 #endif /* __I915_GEM_CONTEXT_TYPES_H__ */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 5/9] drm/i915: Track runtime spent in closed GEM contexts
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 4/9] drm/i915: Track runtime spent in unreachable intel_contexts Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 6/9] drm/i915: Track all user contexts per client Tvrtko Ursulin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

As GEM contexts are closed we want to have the DRM client remember how
much GPU time they used (per class) so later we can used it for smarter
purposes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 12 +++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h      |  7 +++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5edf79ed6247..912375fb8a3b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -355,8 +355,18 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 
-	if (client)
+	if (client) {
+		unsigned int i;
+
+		/* Transfer accumulated runtime to the parent drm client. */
+		BUILD_BUG_ON(ARRAY_SIZE(client->past_runtime) !=
+			     ARRAY_SIZE(ctx->past_runtime));
+		for (i = 0; i < ARRAY_SIZE(client->past_runtime); i++)
+			atomic64_add(atomic64_read(&ctx->past_runtime[i]),
+				     &client->past_runtime[i]);
+
 		i915_drm_client_put(client);
+	}
 
 	spin_lock(&ctx->i915->gem.contexts.lock);
 	list_del(&ctx->link);
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 11b48383881d..29b116606596 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -15,6 +15,8 @@
 #include <linux/sched.h>
 #include <linux/xarray.h>
 
+#include "gt/intel_engine_types.h"
+
 struct i915_drm_clients {
 	struct xarray xarray;
 	u32 next_id;
@@ -41,6 +43,11 @@ struct i915_drm_client {
 		struct device_attribute pid;
 		struct device_attribute name;
 	} attr;
+
+	/**
+	 * @past_runtime: Accumulation of pphwsp runtimes from closed contexts.
+	 */
+	atomic64_t past_runtime[MAX_ENGINE_CLASS + 1];
 };
 
 void i915_drm_clients_init(struct i915_drm_clients *clients);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 6/9] drm/i915: Track all user contexts per client
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 5/9] drm/i915: Track runtime spent in closed GEM contexts Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We soon want to start answering questions like how much GPU time is the
context belonging to a client which exited still using.

To enable this we start tracking all context belonging to a client on a
separate list.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 8 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 3 +++
 drivers/gpu/drm/i915/i915_drm_client.c            | 3 +++
 drivers/gpu/drm/i915/i915_drm_client.h            | 5 +++++
 4 files changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 912375fb8a3b..9654f0ab3f45 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -358,6 +358,10 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	if (client) {
 		unsigned int i;
 
+		spin_lock(&client->ctx_lock);
+		list_del_rcu(&ctx->client_link);
+		spin_unlock(&client->ctx_lock);
+
 		/* Transfer accumulated runtime to the parent drm client. */
 		BUILD_BUG_ON(ARRAY_SIZE(client->past_runtime) !=
 			     ARRAY_SIZE(ctx->past_runtime));
@@ -964,6 +968,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
 
 	ctx->client = client;
 
+	spin_lock(&client->ctx_lock);
+	list_add_tail_rcu(&ctx->client_link, &client->ctx_list);
+	spin_unlock(&client->ctx_lock);
+
 	return 0;
 
 err:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index f0d7441aafc8..255fcc469d9b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -103,6 +103,9 @@ struct i915_gem_context {
 	/** client: struct i915_drm_client */
 	struct i915_drm_client *client;
 
+	/** link: &drm_client.context_list */
+	struct list_head client_link;
+
 	/**
 	 * @ref: reference count
 	 *
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 342a11554573..c88d9ff448e0 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -154,6 +154,9 @@ i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
 
 	kref_init(&client->kref);
 	mutex_init(&client->update_lock);
+	spin_lock_init(&client->ctx_lock);
+	INIT_LIST_HEAD(&client->ctx_list);
+
 	client->clients = clients;
 
 	ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 29b116606596..0be27aa9bbda 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -9,10 +9,12 @@
 #include <linux/device.h>
 #include <linux/kobject.h>
 #include <linux/kref.h>
+#include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/pid.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
+#include <linux/spinlock.h>
 #include <linux/xarray.h>
 
 #include "gt/intel_engine_types.h"
@@ -36,6 +38,9 @@ struct i915_drm_client {
 	char __rcu *name;
 	bool closed;
 
+	spinlock_t ctx_lock; /* For add/remove from ctx_list. */
+	struct list_head ctx_list; /* List of contexts belonging to client. */
+
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 6/9] drm/i915: Track all user contexts per client Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 8/9] drm/i915: Track context current active time Tvrtko Ursulin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files are one per-engine instance and located under the 'busy'
directory. Each contains a monotonically increasing nano-second resolution
times each client's jobs were executing on the GPU.

This enables userspace to create a top-like tool for GPU utilization:

==========================================================================
intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

      IMC reads:     1401 MiB/s
     IMC writes:        4 MiB/s

          ENGINE      BUSY                                 MI_SEMA MI_WAIT
     Render/3D/0   63.73% |███████████████████           |      3%      0%
       Blitter/0    9.53% |██▊                           |      6%      0%
         Video/0   39.32% |███████████▊                  |     16%      0%
         Video/1   15.62% |████▋                         |      0%      0%
  VideoEnhance/0    0.00% |                              |      0%      0%

  PID            NAME     RCS          BCS          VCS         VECS
 4084        gem_wsim |█████▌     ||█          ||           ||           |
 4086        gem_wsim |█▌         ||           ||███        ||           |
==========================================================================

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.
v4: Rebase.
v5: sysfs_attr_init.
v6: Small tidy in i915_gem_add_client.
v7: Rebase to be engine class based.
v8:
 * Always enable stats.
 * Walk all client contexts.
v9:
 * Skip unsupported engine classes. (Chris)
 * Use scheduler caps. (Chris)
v10:
 * Use pphwsp runtime only.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 110 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h |  11 +++
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index c88d9ff448e0..0a2d933fe83c 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -9,8 +9,13 @@
 
 #include <drm/drm_print.h>
 
+#include <uapi/drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_drm_client.h"
+#include "gem/i915_gem_context.h"
+#include "gt/intel_engine_user.h"
+#include "i915_drv.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
 
@@ -52,6 +57,104 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return ret;
 }
 
+static u64
+pphwsp_busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+	struct i915_gem_engines *engines = rcu_dereference(ctx->engines);
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+	u64 total = 0;
+
+	for_each_gem_engine(ce, engines, it) {
+		if (ce->engine->uabi_class == class)
+			total += ce->runtime.total;
+	}
+
+	return total;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	unsigned int class = i915_attr->engine_class;
+	struct i915_drm_client *client = i915_attr->client;
+	u64 total = atomic64_read(&client->past_runtime[class]);
+	struct list_head *list = &client->ctx_list;
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ctx, list, client_link) {
+		total += atomic64_read(&ctx->past_runtime[class]);
+		total += pphwsp_busy_add(ctx, class);
+	}
+	rcu_read_unlock();
+
+	total *= RUNTIME_INFO(i915_attr->i915)->cs_timestamp_period_ns;
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
+}
+
+static const char * const uabi_class_names[] = {
+	[I915_ENGINE_CLASS_RENDER] = "0",
+	[I915_ENGINE_CLASS_COPY] = "1",
+	[I915_ENGINE_CLASS_VIDEO] = "2",
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = "3",
+};
+
+static int
+__client_register_sysfs_busy(struct i915_drm_client *client)
+{
+	struct i915_drm_clients *clients = client->clients;
+	struct drm_i915_private *i915 =
+		container_of(clients, typeof(*i915), clients);
+	unsigned int i;
+	int ret = 0;
+
+	if (!HAS_LOGICAL_RING_CONTEXTS(i915))
+		return 0;
+
+	client->busy_root = kobject_create_and_add("busy", client->root);
+	if (!client->busy_root)
+		return -ENOMEM;
+
+	for (i = 0; i < ARRAY_SIZE(uabi_class_names); i++) {
+		struct i915_engine_busy_attribute *i915_attr =
+			&client->attr.busy[i];
+		struct device_attribute *attr = &i915_attr->attr;
+
+		if (!intel_engine_lookup_user(i915, i, 0))
+			continue;
+
+		i915_attr->client = client;
+		i915_attr->i915 = i915;
+		i915_attr->engine_class = i;
+
+		sysfs_attr_init(&attr->attr);
+
+		attr->attr.name = uabi_class_names[i];
+		attr->attr.mode = 0444;
+		attr->show = show_client_busy;
+
+		ret = sysfs_create_file(client->busy_root,
+					(struct attribute *)attr);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	kobject_put(client->busy_root);
+	return ret;
+}
+
+static void __client_unregister_sysfs_busy(struct i915_drm_client *client)
+{
+	kobject_put(fetch_and_zero(&client->busy_root));
+}
+
 static int
 __client_register_sysfs(struct i915_drm_client *client)
 {
@@ -88,9 +191,12 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 		ret = sysfs_create_file(client->root, (struct attribute *)attr);
 		if (ret)
-			break;
+			goto out;
 	}
 
+	ret = __client_register_sysfs_busy(client);
+
+out:
 	if (ret)
 		kobject_put(client->root);
 
@@ -99,6 +205,8 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 static void __client_unregister_sysfs(struct i915_drm_client *client)
 {
+	__client_unregister_sysfs_busy(client);
+
 	kobject_put(fetch_and_zero(&client->root));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 0be27aa9bbda..98c5fc24fcb4 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -26,6 +26,15 @@ struct i915_drm_clients {
 	struct kobject *root;
 };
 
+struct i915_drm_client;
+
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct drm_i915_private *i915;
+	struct i915_drm_client *client;
+	unsigned int engine_class;
+};
+
 struct i915_drm_client {
 	struct kref kref;
 
@@ -44,9 +53,11 @@ struct i915_drm_client {
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
+	struct kobject *busy_root;
 	struct {
 		struct device_attribute pid;
 		struct device_attribute name;
+		struct i915_engine_busy_attribute busy[MAX_ENGINE_CLASS + 1];
 	} attr;
 
 	/**
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 8/9] drm/i915: Track context current active time
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 9/9] drm/i915: Prefer software tracked context busyness Tvrtko Ursulin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Track context active (on hardware) status together with the start
timestamp.

This will be used to provide better granularity of context
runtime reporting in conjunction with already tracked pphwsp accumulated
runtime.

The latter is only updated on context save so does not give us visibility
to any currently executing work.

As part of the patch the existing runtime tracking data is moved under the
new ce->stats member and updated under the seqlock. This provides the
ability to atomically read out accumulated plus active runtime.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  3 +-
 drivers/gpu/drm/i915/gt/intel_context.c       | 18 +++++-
 drivers/gpu/drm/i915/gt/intel_context.h       |  6 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h | 24 +++++---
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 55 +++++++++++++++----
 drivers/gpu/drm/i915/gt/selftest_lrc.c        | 10 ++--
 drivers/gpu/drm/i915/i915_drm_client.c        |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  4 +-
 8 files changed, 90 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9654f0ab3f45..8c67a1269142 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -266,7 +266,8 @@ static void free_engines_rcu(struct rcu_head *rcu)
 		unsigned int class = ce->engine->uabi_class;
 
 		GEM_BUG_ON(class >= ARRAY_SIZE(ctx->past_runtime));
-		atomic64_add(ce->runtime.total, &ctx->past_runtime[class]);
+		atomic64_add(ce->stats.runtime.total,
+			     &ctx->past_runtime[class]);
 	}
 
 	i915_gem_context_put(ctx);
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 01474d3a558b..1f371f8b202a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -288,7 +288,7 @@ intel_context_init(struct intel_context *ce,
 	ce->sseu = engine->sseu;
 	ce->ring = __intel_context_ring_size(SZ_4K);
 
-	ewma_runtime_init(&ce->runtime.avg);
+	ewma_runtime_init(&ce->stats.runtime.avg);
 
 	ce->vm = i915_vm_get(engine->gt->vm);
 
@@ -296,6 +296,7 @@ intel_context_init(struct intel_context *ce,
 	INIT_LIST_HEAD(&ce->signals);
 
 	mutex_init(&ce->pin_mutex);
+	seqlock_init(&ce->stats.lock);
 
 	i915_active_init(&ce->active,
 			 __intel_context_active, __intel_context_retire);
@@ -390,6 +391,21 @@ struct i915_request *intel_context_create_request(struct intel_context *ce)
 	return rq;
 }
 
+ktime_t intel_context_get_active_time(struct intel_context *ce)
+{
+	struct intel_context_stats *stats = &ce->stats;
+	unsigned int seq;
+	ktime_t total = 0;
+
+	do {
+		seq = read_seqbegin(&stats->lock);
+		if (stats->active)
+			total = ktime_sub(ktime_get(), stats->start);
+	} while (read_seqretry(&stats->lock, seq));
+
+	return total;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_context.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 18efad255124..1f8b6e0bbfaf 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -233,7 +233,7 @@ static inline u64 intel_context_get_total_runtime_ns(struct intel_context *ce)
 	const u32 period =
 		RUNTIME_INFO(ce->engine->i915)->cs_timestamp_period_ns;
 
-	return READ_ONCE(ce->runtime.total) * period;
+	return READ_ONCE(ce->stats.runtime.total) * period;
 }
 
 static inline u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
@@ -241,7 +241,9 @@ static inline u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
 	const u32 period =
 		RUNTIME_INFO(ce->engine->i915)->cs_timestamp_period_ns;
 
-	return mul_u32_u32(ewma_runtime_read(&ce->runtime.avg), period);
+	return mul_u32_u32(ewma_runtime_read(&ce->stats.runtime.avg), period);
 }
 
+ktime_t intel_context_get_active_time(struct intel_context *ce);
+
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index d5925c25f109..fe85a3b0b7f4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -12,6 +12,7 @@
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/types.h>
+#include <linux/seqlock.h>
 
 #include "i915_active_types.h"
 #include "i915_utils.h"
@@ -72,14 +73,21 @@ struct intel_context {
 	u64 lrc_desc;
 	u32 tag; /* cookie passed to HW to track this context on submission */
 
-	/* Time on GPU as tracked by the hw. */
-	struct {
-		struct ewma_runtime avg;
-		u64 total;
-		u32 last;
-		I915_SELFTEST_DECLARE(u32 num_underflow);
-		I915_SELFTEST_DECLARE(u32 max_underflow);
-	} runtime;
+	/** stats: Context GPU engine busyness tracking. */
+	struct intel_context_stats {
+		seqlock_t lock;
+		bool active;
+		ktime_t start;
+
+		/* Time on GPU as tracked by the hw. */
+		struct {
+			struct ewma_runtime avg;
+			u64 total;
+			u32 last;
+			I915_SELFTEST_DECLARE(u32 num_underflow);
+			I915_SELFTEST_DECLARE(u32 max_underflow);
+		} runtime;
+	} stats;
 
 	unsigned int active_count; /* protected by timeline->mutex */
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 112531b29f59..6143c6a9dea4 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1206,35 +1206,61 @@ static u32 intel_context_get_runtime(const struct intel_context *ce)
 	return READ_ONCE(ce->lrc_reg_state[CTX_TIMESTAMP]);
 }
 
-static void st_update_runtime_underflow(struct intel_context *ce, s32 dt)
+static void
+st_update_runtime_underflow(struct intel_context_stats *stats, s32 dt)
 {
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-	ce->runtime.num_underflow += dt < 0;
-	ce->runtime.max_underflow = max_t(u32, ce->runtime.max_underflow, -dt);
+	stats->runtime.num_underflow += dt < 0;
+	stats->runtime.max_underflow =
+		max_t(u32, stats->runtime.max_underflow, -dt);
 #endif
 }
 
 static void intel_context_update_runtime(struct intel_context *ce)
 {
+	struct intel_context_stats *stats = &ce->stats;
 	u32 old;
 	s32 dt;
 
 	if (intel_context_is_barrier(ce))
 		return;
 
-	old = ce->runtime.last;
-	ce->runtime.last = intel_context_get_runtime(ce);
-	dt = ce->runtime.last - old;
+	old = stats->runtime.last;
+	stats->runtime.last = intel_context_get_runtime(ce);
+	dt = stats->runtime.last - old;
 
 	if (unlikely(dt <= 0)) {
 		CE_TRACE(ce, "runtime underflow: last=%u, new=%u, delta=%d\n",
-			 old, ce->runtime.last, dt);
-		st_update_runtime_underflow(ce, dt);
+			 old, stats->runtime.last, dt);
+		st_update_runtime_underflow(stats, dt);
 		return;
 	}
 
-	ewma_runtime_add(&ce->runtime.avg, dt);
-	ce->runtime.total += dt;
+	ewma_runtime_add(&stats->runtime.avg, dt);
+	stats->runtime.total += dt;
+}
+
+static void intel_context_stats_start(struct intel_context *ce)
+{
+	struct intel_context_stats *stats = &ce->stats;
+	unsigned long flags;
+
+	write_seqlock_irqsave(&stats->lock, flags);
+	stats->start = ktime_get();
+	stats->active = true;
+	write_sequnlock_irqrestore(&stats->lock, flags);
+}
+
+static void intel_context_stats_stop(struct intel_context *ce)
+{
+	struct intel_context_stats *stats = &ce->stats;
+	unsigned long flags;
+
+	write_seqlock_irqsave(&stats->lock, flags);
+	stats->active = false;
+	stats->start = 0;
+	intel_context_update_runtime(ce);
+	write_sequnlock_irqrestore(&stats->lock, flags);
 }
 
 static inline struct intel_engine_cs *
@@ -1304,7 +1330,7 @@ static inline void
 __execlists_schedule_out(struct i915_request *rq,
 			 struct intel_engine_cs * const engine)
 {
-	struct intel_context * const ce = rq->context;
+	struct intel_context *ce = rq->context;
 
 	/*
 	 * NB process_csb() is not under the engine->active.lock and hence
@@ -1320,8 +1346,8 @@ __execlists_schedule_out(struct i915_request *rq,
 	    i915_request_completed(rq))
 		intel_engine_add_retire(engine, ce->timeline);
 
-	intel_context_update_runtime(ce);
 	intel_engine_context_out(engine);
+	intel_context_stats_stop(ce);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put_async(engine->gt);
 
@@ -1790,6 +1816,11 @@ active_timeslice(const struct intel_engine_cs *engine)
 
 static void set_timeslice(struct intel_engine_cs *engine)
 {
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+
+	if (*execlists->active)
+		intel_context_stats_start((*execlists->active)->context);
+
 	if (!intel_engine_has_timeslices(engine))
 		return;
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 6f06ba750a0a..6a51e7fa03ca 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -5479,8 +5479,8 @@ static int __live_pphwsp_runtime(struct intel_engine_cs *engine)
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
-	ce->runtime.num_underflow = 0;
-	ce->runtime.max_underflow = 0;
+	ce->stats.runtime.num_underflow = 0;
+	ce->stats.runtime.max_underflow = 0;
 
 	do {
 		unsigned int loop = 1024;
@@ -5518,11 +5518,11 @@ static int __live_pphwsp_runtime(struct intel_engine_cs *engine)
 		intel_context_get_avg_runtime_ns(ce));
 
 	err = 0;
-	if (ce->runtime.num_underflow) {
+	if (ce->stats.runtime.num_underflow) {
 		pr_err("%s: pphwsp underflow %u time(s), max %u cycles!\n",
 		       engine->name,
-		       ce->runtime.num_underflow,
-		       ce->runtime.max_underflow);
+		       ce->stats.runtime.num_underflow,
+		       ce->stats.runtime.max_underflow);
 		GEM_TRACE_DUMP();
 		err = -EOVERFLOW;
 	}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 0a2d933fe83c..485a2b75d3e1 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -67,7 +67,7 @@ pphwsp_busy_add(struct i915_gem_context *ctx, unsigned int class)
 
 	for_each_gem_engine(ce, engines, it) {
 		if (ce->engine->uabi_class == class)
-			total += ce->runtime.total;
+			total += ce->stats.runtime.total;
 	}
 
 	return total;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 653e1bc5050e..ce752572c38f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1267,8 +1267,8 @@ static bool record_context(struct i915_gem_context_coredump *e,
 	e->guilty = atomic_read(&ctx->guilty_count);
 	e->active = atomic_read(&ctx->active_count);
 
-	e->total_runtime = rq->context->runtime.total;
-	e->avg_runtime = ewma_runtime_read(&rq->context->runtime.avg);
+	e->total_runtime = rq->context->stats.runtime.total;
+	e->avg_runtime = ewma_runtime_read(&rq->context->stats.runtime.avg);
 
 	simulated = i915_gem_context_no_error_capture(ctx);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 9/9] drm/i915: Prefer software tracked context busyness
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 8/9] drm/i915: Track context current active time Tvrtko Ursulin
@ 2020-03-18 11:01 ` Tvrtko Ursulin
  2020-03-18 11:49 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Per client engine busyness (rev8) Patchwork
  2020-03-18 12:16 ` [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 11:01 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

When available prefer context tracked context busyness because it provides
visibility into currently executing contexts as well.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 68 ++++++++++++++++++++++++--
 1 file changed, 63 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 485a2b75d3e1..31f0d373caae 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -96,6 +96,61 @@ show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
 	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
 }
 
+static u64
+sw_busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+	struct i915_gem_engines *engines = rcu_dereference(ctx->engines);
+	u32 period_ns = RUNTIME_INFO(ctx->i915)->cs_timestamp_period_ns;
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+	u64 total = 0;
+
+	for_each_gem_engine(ce, engines, it) {
+		struct intel_context_stats *stats;
+		unsigned int seq;
+		u64 t;
+
+		if (ce->engine->uabi_class != class)
+			continue;
+
+		stats = &ce->stats;
+
+		do {
+			seq = read_seqbegin(&stats->lock);
+			t = ce->stats.runtime.total * period_ns;
+			t += intel_context_get_active_time(ce);
+		} while (read_seqretry(&stats->lock, seq));
+
+		total += t;
+	}
+
+	return total;
+}
+
+static ssize_t
+show_client_sw_busy(struct device *kdev,
+		    struct device_attribute *attr,
+		    char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	unsigned int class = i915_attr->engine_class;
+	struct i915_drm_client *client = i915_attr->client;
+	u32 period_ns = RUNTIME_INFO(i915_attr->i915)->cs_timestamp_period_ns;
+	u64 total = atomic64_read(&client->past_runtime[class]) * period_ns;
+	struct list_head *list = &client->ctx_list;
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ctx, list, client_link) {
+		total += atomic64_read(&ctx->past_runtime[class]) * period_ns +
+			 sw_busy_add(ctx, class);
+	}
+	rcu_read_unlock();
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
+}
+
 static const char * const uabi_class_names[] = {
 	[I915_ENGINE_CLASS_RENDER] = "0",
 	[I915_ENGINE_CLASS_COPY] = "1",
@@ -109,6 +164,8 @@ __client_register_sysfs_busy(struct i915_drm_client *client)
 	struct i915_drm_clients *clients = client->clients;
 	struct drm_i915_private *i915 =
 		container_of(clients, typeof(*i915), clients);
+	bool sw_stats = i915->caps.scheduler &
+			I915_SCHEDULER_CAP_ENGINE_BUSY_STATS;
 	unsigned int i;
 	int ret = 0;
 
@@ -135,18 +192,19 @@ __client_register_sysfs_busy(struct i915_drm_client *client)
 
 		attr->attr.name = uabi_class_names[i];
 		attr->attr.mode = 0444;
-		attr->show = show_client_busy;
+		attr->show = sw_stats ?
+			     show_client_sw_busy : show_client_busy;
 
 		ret = sysfs_create_file(client->busy_root,
 					(struct attribute *)attr);
 		if (ret)
-			goto err;
+			goto out;
 	}
 
-	return 0;
+out:
+	if (ret)
+		kobject_put(client->busy_root);
 
-err:
-	kobject_put(client->busy_root);
 	return ret;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for Per client engine busyness (rev8)
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (8 preceding siblings ...)
  2020-03-18 11:01 ` [Intel-gfx] [PATCH 9/9] drm/i915: Prefer software tracked context busyness Tvrtko Ursulin
@ 2020-03-18 11:49 ` Patchwork
  2020-03-18 12:16 ` [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
  10 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2020-03-18 11:49 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per client engine busyness (rev8)
URL   : https://patchwork.freedesktop.org/series/70977/
State : failure

== Summary ==

Applying: drm/i915: Update client name on context create
error: sha1 information is lacking or useless (drivers/gpu/drm/i915/i915_drm_client.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 drm/i915: Update client name on context create
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Intel-gfx] [PATCH 0/9] Per client engine busyness
  2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
                   ` (9 preceding siblings ...)
  2020-03-18 11:49 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Per client engine busyness (rev8) Patchwork
@ 2020-03-18 12:16 ` Tvrtko Ursulin
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-03-18 12:16 UTC (permalink / raw)
  To: Intel-gfx


On 18/03/2020 11:01, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Another re-spin of the per-client engine busyness series. Highlights from this
> version:
> 

Broken version with one patch missing, apologies for the spam.

Regards,

Tvrtko

>   * Checkpatch cleanup and bits of review feedback only.
> 
> Internally we track time spent on engines for each struct intel_context. This
> can serve as a building block for several features from the want list:
> smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
> wanted by some customers, cgroups controller, dynamic SSEU tuning,...
> 
> Externally, in sysfs, we expose time spent on GPU per client and per engine
> class.
> 
> Sysfs interface enables us to implement a "top-like" tool for GPU tasks. Or with
> a "screenshot":
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s
> 
>        IMC reads:     4414 MiB/s
>       IMC writes:     3805 MiB/s
> 
>            ENGINE      BUSY                                      MI_SEMA MI_WAIT
>       Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
>         Blitter/0    0.00% |                                   |      0%      0%
>           Video/0    0.00% |                                   |      0%      0%
>    VideoEnhance/0    0.00% |                                   |      0%      0%
> 
>    PID            NAME  Render/3D      Blitter        Video      VideoEnhance
>   2733       neverball |██████▌     ||            ||            ||            |
>   2047            Xorg |███▊        ||            ||            ||            |
>   2737        glxgears |█▍          ||            ||            ||            |
>   2128           xfwm4 |            ||            ||            ||            |
>   2047            Xorg |            ||            ||            ||            |
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Implementation wise we add a a bunch of files in sysfs like:
> 
> 	# cd /sys/class/drm/card0/clients/
> 	# tree
> 	.
> 	├── 7
> 	│   ├── busy
> 	│   │   ├── 0
> 	│   │   ├── 1
> 	│   │   ├── 2
> 	│   │   └── 3
> 	│   ├── name
> 	│   └── pid
> 	├── 8
> 	│   ├── busy
> 	│   │   ├── 0
> 	│   │   ├── 1
> 	│   │   ├── 2
> 	│   │   └── 3
> 	│   ├── name
> 	│   └── pid
> 	└── 9
> 	    ├── busy
> 	    │   ├── 0
> 	    │   ├── 1
> 	    │   ├── 2
> 	    │   └── 3
> 	    ├── name
> 	    └── pid
> 
> Files in 'busy' directories are numbered using the engine class ABI values and
> they contain accumulated nanoseconds each client spent on engines of a
> respective class.
> 
> It is stil a RFC since it misses dedicated test cases to ensure things really
> work as advertised.
> 
> Tvrtko Ursulin (9):
>    drm/i915: Update client name on context create
>    drm/i915: Make GEM contexts track DRM clients
>    drm/i915: Use explicit flag to mark unreachable intel_context
>    drm/i915: Track runtime spent in unreachable intel_contexts
>    drm/i915: Track runtime spent in closed GEM contexts
>    drm/i915: Track all user contexts per client
>    drm/i915: Expose per-engine client busyness
>    drm/i915: Track context current active time
>    drm/i915: Prefer software tracked context busyness
> 
>   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  63 +++-
>   .../gpu/drm/i915/gem/i915_gem_context_types.h |  21 +-
>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
>   drivers/gpu/drm/i915/gt/intel_context.c       |  18 +-
>   drivers/gpu/drm/i915/gt/intel_context.h       |   6 +-
>   drivers/gpu/drm/i915/gt/intel_context_types.h |  25 +-
>   drivers/gpu/drm/i915/gt/intel_lrc.c           |  55 +++-
>   drivers/gpu/drm/i915/gt/selftest_lrc.c        |  10 +-
>   drivers/gpu/drm/i915/i915_debugfs.c           |  29 +-
>   drivers/gpu/drm/i915/i915_drm_client.c        | 274 +++++++++++++++++-
>   drivers/gpu/drm/i915/i915_drm_client.h        |  33 ++-
>   drivers/gpu/drm/i915/i915_gpu_error.c         |  25 +-
>   12 files changed, 473 insertions(+), 88 deletions(-)
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness
  2021-01-19 12:10 [Intel-gfx] [PATCH 1/9] drm/i915/gt: Show the per-engine runtime in sysfs Chris Wilson
@ 2021-01-19 12:10 ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2021-01-19 12:10 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files are one per-engine instance and located under the 'busy'
directory. Each contains a monotonically increasing nano-second resolution
times each client's jobs were executing on the GPU.

This enables userspace to create a top-like tool for GPU utilization:

==========================================================================
intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

      IMC reads:     1401 MiB/s
     IMC writes:        4 MiB/s

          ENGINE      BUSY                                 MI_SEMA MI_WAIT
     Render/3D/0   63.73% |███████████████████           |      3%      0%
       Blitter/0    9.53% |██▊                           |      6%      0%
         Video/0   39.32% |███████████▊                  |     16%      0%
         Video/1   15.62% |████▋                         |      0%      0%
  VideoEnhance/0    0.00% |                              |      0%      0%

  PID            NAME     RCS          BCS          VCS         VECS
 4084        gem_wsim |█████▌     ||█          ||           ||           |
 4086        gem_wsim |█▌         ||           ||███        ||           |
==========================================================================

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.
v4: Rebase.
v5: sysfs_attr_init.
v6: Small tidy in i915_gem_add_client.
v7: Rebase to be engine class based.
v8:
 * Always enable stats.
 * Walk all client contexts.
v9:
 * Skip unsupported engine classes. (Chris)
 * Use scheduler caps. (Chris)
v10:
 * Use pphwsp runtime only.

Link: https://patchwork.freedesktop.org/series/71182/
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 103 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h |  10 +++
 2 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index b08c8d7def02..ce51a02bf33b 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -9,6 +9,11 @@
 
 #include <drm/drm_print.h>
 
+#include <uapi/drm/i915_drm.h>
+
+#include "gem/i915_gem_context.h"
+#include "gt/intel_engine_user.h"
+
 #include "i915_drm_client.h"
 #include "i915_drv.h"
 #include "i915_gem.h"
@@ -55,6 +60,97 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return ret;
 }
 
+static u64 pphwsp_busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+	struct i915_gem_engines *engines = rcu_dereference(ctx->engines);
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+	u64 total = 0;
+
+	for_each_gem_engine(ce, engines, it) {
+		if (ce->engine->uabi_class == class)
+			total += ce->runtime.total;
+	}
+
+	return total;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	unsigned int class = i915_attr->engine_class;
+	struct i915_drm_client *client = i915_attr->client;
+	u64 total = atomic64_read(&client->past_runtime[class]);
+	struct list_head *list = &client->ctx_list;
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ctx, list, client_link)
+		total += pphwsp_busy_add(ctx, class);
+	rcu_read_unlock();
+
+	total *= client->clients->i915->gt.clock_period_ns;
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
+}
+
+static const char * const uabi_class_names[] = {
+	[I915_ENGINE_CLASS_RENDER] = "0",
+	[I915_ENGINE_CLASS_COPY] = "1",
+	[I915_ENGINE_CLASS_VIDEO] = "2",
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = "3",
+};
+
+static int __client_register_sysfs_busy(struct i915_drm_client *client)
+{
+	struct i915_drm_clients *clients = client->clients;
+	unsigned int i;
+	int ret = 0;
+
+	if (!HAS_LOGICAL_RING_CONTEXTS(clients->i915))
+		return 0;
+
+	client->busy_root = kobject_create_and_add("busy", client->root);
+	if (!client->busy_root)
+		return -ENOMEM;
+
+	for (i = 0; i < ARRAY_SIZE(uabi_class_names); i++) {
+		struct i915_engine_busy_attribute *i915_attr =
+			&client->attr.busy[i];
+		struct device_attribute *attr = &i915_attr->attr;
+
+		if (!intel_engine_lookup_user(clients->i915, i, 0))
+			continue;
+
+		i915_attr->client = client;
+		i915_attr->engine_class = i;
+
+		sysfs_attr_init(&attr->attr);
+
+		attr->attr.name = uabi_class_names[i];
+		attr->attr.mode = 0444;
+		attr->show = show_client_busy;
+
+		ret = sysfs_create_file(client->busy_root,
+					(struct attribute *)attr);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	kobject_put(client->busy_root);
+	return ret;
+}
+
+static void __client_unregister_sysfs_busy(struct i915_drm_client *client)
+{
+	kobject_put(fetch_and_zero(&client->busy_root));
+}
+
 static int __client_register_sysfs(struct i915_drm_client *client)
 {
 	const struct {
@@ -90,9 +186,12 @@ static int __client_register_sysfs(struct i915_drm_client *client)
 
 		ret = sysfs_create_file(client->root, (struct attribute *)attr);
 		if (ret)
-			break;
+			goto out;
 	}
 
+	ret = __client_register_sysfs_busy(client);
+
+out:
 	if (ret)
 		kobject_put(client->root);
 
@@ -101,6 +200,8 @@ static int __client_register_sysfs(struct i915_drm_client *client)
 
 static void __client_unregister_sysfs(struct i915_drm_client *client)
 {
+	__client_unregister_sysfs_busy(client);
+
 	kobject_put(fetch_and_zero(&client->root));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index b74565e0dee5..00e1b83b8ce4 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -30,6 +30,14 @@ struct i915_drm_clients {
 	struct kobject *root;
 };
 
+struct i915_drm_client;
+
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct i915_drm_client *client;
+	unsigned int engine_class;
+};
+
 struct i915_drm_client_name {
 	struct rcu_head rcu;
 	struct i915_drm_client *client;
@@ -54,9 +62,11 @@ struct i915_drm_client {
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
+	struct kobject *busy_root;
 	struct {
 		struct device_attribute pid;
 		struct device_attribute name;
+		struct i915_engine_busy_attribute busy[MAX_ENGINE_CLASS + 1];
 	} attr;
 
 	/**
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness
  2020-09-14 13:12 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
@ 2020-09-14 13:12 ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-09-14 13:12 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files are one per-engine instance and located under the 'busy'
directory. Each contains a monotonically increasing nano-second resolution
times each client's jobs were executing on the GPU.

This enables userspace to create a top-like tool for GPU utilization:

==========================================================================
intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

      IMC reads:     1401 MiB/s
     IMC writes:        4 MiB/s

          ENGINE      BUSY                                 MI_SEMA MI_WAIT
     Render/3D/0   63.73% |███████████████████           |      3%      0%
       Blitter/0    9.53% |██▊                           |      6%      0%
         Video/0   39.32% |███████████▊                  |     16%      0%
         Video/1   15.62% |████▋                         |      0%      0%
  VideoEnhance/0    0.00% |                              |      0%      0%

  PID            NAME     RCS          BCS          VCS         VECS
 4084        gem_wsim |█████▌     ||█          ||           ||           |
 4086        gem_wsim |█▌         ||           ||███        ||           |
==========================================================================

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.
v4: Rebase.
v5: sysfs_attr_init.
v6: Small tidy in i915_gem_add_client.
v7: Rebase to be engine class based.
v8:
 * Always enable stats.
 * Walk all client contexts.
v9:
 * Skip unsupported engine classes. (Chris)
 * Use scheduler caps. (Chris)
v10:
 * Use pphwsp runtime only.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 110 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h |  11 +++
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index d99673508f8c..e21d97d810a5 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -9,8 +9,13 @@
 
 #include <drm/drm_print.h>
 
+#include <uapi/drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_drm_client.h"
+#include "gem/i915_gem_context.h"
+#include "gt/intel_engine_user.h"
+#include "i915_drv.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
 
@@ -52,6 +57,104 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return ret;
 }
 
+static u64
+pphwsp_busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+	struct i915_gem_engines *engines = rcu_dereference(ctx->engines);
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+	u64 total = 0;
+
+	for_each_gem_engine(ce, engines, it) {
+		if (ce->engine->uabi_class == class)
+			total += ce->runtime.total;
+	}
+
+	return total;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	unsigned int class = i915_attr->engine_class;
+	struct i915_drm_client *client = i915_attr->client;
+	u64 total = atomic64_read(&client->past_runtime[class]);
+	struct list_head *list = &client->ctx_list;
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ctx, list, client_link) {
+		total += atomic64_read(&ctx->past_runtime[class]);
+		total += pphwsp_busy_add(ctx, class);
+	}
+	rcu_read_unlock();
+
+	total *= RUNTIME_INFO(i915_attr->i915)->cs_timestamp_period_ns;
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
+}
+
+static const char * const uabi_class_names[] = {
+	[I915_ENGINE_CLASS_RENDER] = "0",
+	[I915_ENGINE_CLASS_COPY] = "1",
+	[I915_ENGINE_CLASS_VIDEO] = "2",
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = "3",
+};
+
+static int
+__client_register_sysfs_busy(struct i915_drm_client *client)
+{
+	struct i915_drm_clients *clients = client->clients;
+	struct drm_i915_private *i915 =
+		container_of(clients, typeof(*i915), clients);
+	unsigned int i;
+	int ret = 0;
+
+	if (!HAS_LOGICAL_RING_CONTEXTS(i915))
+		return 0;
+
+	client->busy_root = kobject_create_and_add("busy", client->root);
+	if (!client->busy_root)
+		return -ENOMEM;
+
+	for (i = 0; i < ARRAY_SIZE(uabi_class_names); i++) {
+		struct i915_engine_busy_attribute *i915_attr =
+			&client->attr.busy[i];
+		struct device_attribute *attr = &i915_attr->attr;
+
+		if (!intel_engine_lookup_user(i915, i, 0))
+			continue;
+
+		i915_attr->client = client;
+		i915_attr->i915 = i915;
+		i915_attr->engine_class = i;
+
+		sysfs_attr_init(&attr->attr);
+
+		attr->attr.name = uabi_class_names[i];
+		attr->attr.mode = 0444;
+		attr->show = show_client_busy;
+
+		ret = sysfs_create_file(client->busy_root,
+					(struct attribute *)attr);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	kobject_put(client->busy_root);
+	return ret;
+}
+
+static void __client_unregister_sysfs_busy(struct i915_drm_client *client)
+{
+	kobject_put(fetch_and_zero(&client->busy_root));
+}
+
 static int
 __client_register_sysfs(struct i915_drm_client *client)
 {
@@ -88,9 +191,12 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 		ret = sysfs_create_file(client->root, (struct attribute *)attr);
 		if (ret)
-			break;
+			goto out;
 	}
 
+	ret = __client_register_sysfs_busy(client);
+
+out:
 	if (ret)
 		kobject_put(client->root);
 
@@ -99,6 +205,8 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 static void __client_unregister_sysfs(struct i915_drm_client *client)
 {
+	__client_unregister_sysfs_busy(client);
+
 	kobject_put(fetch_and_zero(&client->root));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 0be27aa9bbda..98c5fc24fcb4 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -26,6 +26,15 @@ struct i915_drm_clients {
 	struct kobject *root;
 };
 
+struct i915_drm_client;
+
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct drm_i915_private *i915;
+	struct i915_drm_client *client;
+	unsigned int engine_class;
+};
+
 struct i915_drm_client {
 	struct kref kref;
 
@@ -44,9 +53,11 @@ struct i915_drm_client {
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
+	struct kobject *busy_root;
 	struct {
 		struct device_attribute pid;
 		struct device_attribute name;
+		struct i915_engine_busy_attribute busy[MAX_ENGINE_CLASS + 1];
 	} attr;
 
 	/**
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness
  2020-09-04 12:59 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
@ 2020-09-04 12:59 ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-09-04 12:59 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files are one per-engine instance and located under the 'busy'
directory. Each contains a monotonically increasing nano-second resolution
times each client's jobs were executing on the GPU.

This enables userspace to create a top-like tool for GPU utilization:

==========================================================================
intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

      IMC reads:     1401 MiB/s
     IMC writes:        4 MiB/s

          ENGINE      BUSY                                 MI_SEMA MI_WAIT
     Render/3D/0   63.73% |███████████████████           |      3%      0%
       Blitter/0    9.53% |██▊                           |      6%      0%
         Video/0   39.32% |███████████▊                  |     16%      0%
         Video/1   15.62% |████▋                         |      0%      0%
  VideoEnhance/0    0.00% |                              |      0%      0%

  PID            NAME     RCS          BCS          VCS         VECS
 4084        gem_wsim |█████▌     ||█          ||           ||           |
 4086        gem_wsim |█▌         ||           ||███        ||           |
==========================================================================

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.
v4: Rebase.
v5: sysfs_attr_init.
v6: Small tidy in i915_gem_add_client.
v7: Rebase to be engine class based.
v8:
 * Always enable stats.
 * Walk all client contexts.
v9:
 * Skip unsupported engine classes. (Chris)
 * Use scheduler caps. (Chris)
v10:
 * Use pphwsp runtime only.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 110 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h |  11 +++
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 064358b14079..55241702bea8 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -10,8 +10,13 @@
 
 #include <drm/drm_print.h>
 
+#include <uapi/drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_drm_client.h"
+#include "gem/i915_gem_context.h"
+#include "gt/intel_engine_user.h"
+#include "i915_drv.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
 
@@ -53,6 +58,104 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return ret;
 }
 
+static u64
+pphwsp_busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+	struct i915_gem_engines *engines = rcu_dereference(ctx->engines);
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+	u64 total = 0;
+
+	for_each_gem_engine(ce, engines, it) {
+		if (ce->engine->uabi_class == class)
+			total += ce->runtime.total;
+	}
+
+	return total;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	unsigned int class = i915_attr->engine_class;
+	struct i915_drm_client *client = i915_attr->client;
+	u64 total = atomic64_read(&client->past_runtime[class]);
+	struct list_head *list = &client->ctx_list;
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ctx, list, client_link) {
+		total += atomic64_read(&ctx->past_runtime[class]);
+		total += pphwsp_busy_add(ctx, class);
+	}
+	rcu_read_unlock();
+
+	total *= RUNTIME_INFO(i915_attr->i915)->cs_timestamp_period_ns;
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
+}
+
+static const char * const uabi_class_names[] = {
+	[I915_ENGINE_CLASS_RENDER] = "0",
+	[I915_ENGINE_CLASS_COPY] = "1",
+	[I915_ENGINE_CLASS_VIDEO] = "2",
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = "3",
+};
+
+static int
+__client_register_sysfs_busy(struct i915_drm_client *client)
+{
+	struct i915_drm_clients *clients = client->clients;
+	struct drm_i915_private *i915 =
+		container_of(clients, typeof(*i915), clients);
+	unsigned int i;
+	int ret = 0;
+
+	if (!HAS_LOGICAL_RING_CONTEXTS(i915))
+		return 0;
+
+	client->busy_root = kobject_create_and_add("busy", client->root);
+	if (!client->busy_root)
+		return -ENOMEM;
+
+	for (i = 0; i < ARRAY_SIZE(uabi_class_names); i++) {
+		struct i915_engine_busy_attribute *i915_attr =
+			&client->attr.busy[i];
+		struct device_attribute *attr = &i915_attr->attr;
+
+		if (!intel_engine_lookup_user(i915, i, 0))
+			continue;
+
+		i915_attr->client = client;
+		i915_attr->i915 = i915;
+		i915_attr->engine_class = i;
+
+		sysfs_attr_init(&attr->attr);
+
+		attr->attr.name = uabi_class_names[i];
+		attr->attr.mode = 0444;
+		attr->show = show_client_busy;
+
+		ret = sysfs_create_file(client->busy_root,
+					(struct attribute *)attr);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	kobject_put(client->busy_root);
+	return ret;
+}
+
+static void __client_unregister_sysfs_busy(struct i915_drm_client *client)
+{
+	kobject_put(fetch_and_zero(&client->busy_root));
+}
+
 static int
 __client_register_sysfs(struct i915_drm_client *client)
 {
@@ -89,9 +192,12 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 		ret = sysfs_create_file(client->root, (struct attribute *)attr);
 		if (ret)
-			break;
+			goto out;
 	}
 
+	ret = __client_register_sysfs_busy(client);
+
+out:
 	if (ret)
 		kobject_put(client->root);
 
@@ -100,6 +206,8 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 static void __client_unregister_sysfs(struct i915_drm_client *client)
 {
+	__client_unregister_sysfs_busy(client);
+
 	kobject_put(fetch_and_zero(&client->root));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 8f528166e07b..ef589569df3c 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -27,6 +27,15 @@ struct i915_drm_clients {
 	struct kobject *root;
 };
 
+struct i915_drm_client;
+
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct drm_i915_private *i915;
+	struct i915_drm_client *client;
+	unsigned int engine_class;
+};
+
 struct i915_drm_client {
 	struct kref kref;
 
@@ -45,9 +54,11 @@ struct i915_drm_client {
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
+	struct kobject *busy_root;
 	struct {
 		struct device_attribute pid;
 		struct device_attribute name;
+		struct i915_engine_busy_attribute busy[MAX_ENGINE_CLASS + 1];
 	} attr;
 
 	/**
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness
  2020-04-15 10:11 Tvrtko Ursulin
@ 2020-04-15 10:11 ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2020-04-15 10:11 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files are one per-engine instance and located under the 'busy'
directory. Each contains a monotonically increasing nano-second resolution
times each client's jobs were executing on the GPU.

This enables userspace to create a top-like tool for GPU utilization:

==========================================================================
intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

      IMC reads:     1401 MiB/s
     IMC writes:        4 MiB/s

          ENGINE      BUSY                                 MI_SEMA MI_WAIT
     Render/3D/0   63.73% |███████████████████           |      3%      0%
       Blitter/0    9.53% |██▊                           |      6%      0%
         Video/0   39.32% |███████████▊                  |     16%      0%
         Video/1   15.62% |████▋                         |      0%      0%
  VideoEnhance/0    0.00% |                              |      0%      0%

  PID            NAME     RCS          BCS          VCS         VECS
 4084        gem_wsim |█████▌     ||█          ||           ||           |
 4086        gem_wsim |█▌         ||           ||███        ||           |
==========================================================================

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.
v4: Rebase.
v5: sysfs_attr_init.
v6: Small tidy in i915_gem_add_client.
v7: Rebase to be engine class based.
v8:
 * Always enable stats.
 * Walk all client contexts.
v9:
 * Skip unsupported engine classes. (Chris)
 * Use scheduler caps. (Chris)
v10:
 * Use pphwsp runtime only.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 110 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h |  11 +++
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index c88d9ff448e0..0a2d933fe83c 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -9,8 +9,13 @@
 
 #include <drm/drm_print.h>
 
+#include <uapi/drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_drm_client.h"
+#include "gem/i915_gem_context.h"
+#include "gt/intel_engine_user.h"
+#include "i915_drv.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
 
@@ -52,6 +57,104 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return ret;
 }
 
+static u64
+pphwsp_busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+	struct i915_gem_engines *engines = rcu_dereference(ctx->engines);
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+	u64 total = 0;
+
+	for_each_gem_engine(ce, engines, it) {
+		if (ce->engine->uabi_class == class)
+			total += ce->runtime.total;
+	}
+
+	return total;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	unsigned int class = i915_attr->engine_class;
+	struct i915_drm_client *client = i915_attr->client;
+	u64 total = atomic64_read(&client->past_runtime[class]);
+	struct list_head *list = &client->ctx_list;
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ctx, list, client_link) {
+		total += atomic64_read(&ctx->past_runtime[class]);
+		total += pphwsp_busy_add(ctx, class);
+	}
+	rcu_read_unlock();
+
+	total *= RUNTIME_INFO(i915_attr->i915)->cs_timestamp_period_ns;
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
+}
+
+static const char * const uabi_class_names[] = {
+	[I915_ENGINE_CLASS_RENDER] = "0",
+	[I915_ENGINE_CLASS_COPY] = "1",
+	[I915_ENGINE_CLASS_VIDEO] = "2",
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = "3",
+};
+
+static int
+__client_register_sysfs_busy(struct i915_drm_client *client)
+{
+	struct i915_drm_clients *clients = client->clients;
+	struct drm_i915_private *i915 =
+		container_of(clients, typeof(*i915), clients);
+	unsigned int i;
+	int ret = 0;
+
+	if (!HAS_LOGICAL_RING_CONTEXTS(i915))
+		return 0;
+
+	client->busy_root = kobject_create_and_add("busy", client->root);
+	if (!client->busy_root)
+		return -ENOMEM;
+
+	for (i = 0; i < ARRAY_SIZE(uabi_class_names); i++) {
+		struct i915_engine_busy_attribute *i915_attr =
+			&client->attr.busy[i];
+		struct device_attribute *attr = &i915_attr->attr;
+
+		if (!intel_engine_lookup_user(i915, i, 0))
+			continue;
+
+		i915_attr->client = client;
+		i915_attr->i915 = i915;
+		i915_attr->engine_class = i;
+
+		sysfs_attr_init(&attr->attr);
+
+		attr->attr.name = uabi_class_names[i];
+		attr->attr.mode = 0444;
+		attr->show = show_client_busy;
+
+		ret = sysfs_create_file(client->busy_root,
+					(struct attribute *)attr);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	kobject_put(client->busy_root);
+	return ret;
+}
+
+static void __client_unregister_sysfs_busy(struct i915_drm_client *client)
+{
+	kobject_put(fetch_and_zero(&client->busy_root));
+}
+
 static int
 __client_register_sysfs(struct i915_drm_client *client)
 {
@@ -88,9 +191,12 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 		ret = sysfs_create_file(client->root, (struct attribute *)attr);
 		if (ret)
-			break;
+			goto out;
 	}
 
+	ret = __client_register_sysfs_busy(client);
+
+out:
 	if (ret)
 		kobject_put(client->root);
 
@@ -99,6 +205,8 @@ __client_register_sysfs(struct i915_drm_client *client)
 
 static void __client_unregister_sysfs(struct i915_drm_client *client)
 {
+	__client_unregister_sysfs_busy(client);
+
 	kobject_put(fetch_and_zero(&client->root));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 0be27aa9bbda..98c5fc24fcb4 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -26,6 +26,15 @@ struct i915_drm_clients {
 	struct kobject *root;
 };
 
+struct i915_drm_client;
+
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct drm_i915_private *i915;
+	struct i915_drm_client *client;
+	unsigned int engine_class;
+};
+
 struct i915_drm_client {
 	struct kref kref;
 
@@ -44,9 +53,11 @@ struct i915_drm_client {
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
+	struct kobject *busy_root;
 	struct {
 		struct device_attribute pid;
 		struct device_attribute name;
+		struct i915_engine_busy_attribute busy[MAX_ENGINE_CLASS + 1];
 	} attr;
 
 	/**
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-01-19 12:10 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-18 11:01 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 1/9] drm/i915: Update client name on context create Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 2/9] drm/i915: Make GEM contexts track DRM clients Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 3/9] drm/i915: Use explicit flag to mark unreachable intel_context Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 4/9] drm/i915: Track runtime spent in unreachable intel_contexts Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 5/9] drm/i915: Track runtime spent in closed GEM contexts Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 6/9] drm/i915: Track all user contexts per client Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 8/9] drm/i915: Track context current active time Tvrtko Ursulin
2020-03-18 11:01 ` [Intel-gfx] [PATCH 9/9] drm/i915: Prefer software tracked context busyness Tvrtko Ursulin
2020-03-18 11:49 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Per client engine busyness (rev8) Patchwork
2020-03-18 12:16 ` [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
2020-04-15 10:11 Tvrtko Ursulin
2020-04-15 10:11 ` [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
2020-09-04 12:59 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
2020-09-04 12:59 ` [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
2020-09-14 13:12 [Intel-gfx] [PATCH 0/9] Per client engine busyness Tvrtko Ursulin
2020-09-14 13:12 ` [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
2021-01-19 12:10 [Intel-gfx] [PATCH 1/9] drm/i915/gt: Show the per-engine runtime in sysfs Chris Wilson
2021-01-19 12:10 ` [Intel-gfx] [PATCH 7/9] drm/i915: Expose per-engine client busyness Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).