* [Intel-gfx] [RFC 0/8] Per client engine busyness
@ 2020-02-07 16:13 Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 1/6] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Refactor of the i915_drm_client concept based on feedback from Chris.
 * Tracking contexts belonging to a client had a flaw where reaped contexts
   would stop contributing to the busyness, making the counters go backwards.
   It was simply broken. In this version I had to track per-client stats at the
   same call-sites where per-context stats are tracked. It's a little bit more
   runtime cost but not too bad.
 * GuC support is out - needs more time than I have at the moment to properly
   wire it up with the latest changes. Plus there is a monotonicity issue with
   either the value stored in pphwsp by the GPU or a bug in my patch which also
   needs to be debugged.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, a cgroups controller, dynamic SSEU tuning, ...

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

The sysfs interface enables us to implement a "top-like" tool for GPU tasks. In
"screenshot" form:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation-wise, we add a bunch of files in sysfs, like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── 9
	    ├── busy
	    │   ├── 0
	    │   ├── 1
	    │   ├── 2
	    │   └── 3
	    ├── name
	    └── pid


Files in the 'busy' directories are numbered using the engine class ABI values
and contain the accumulated nanoseconds each client spent on engines of the
respective class.
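
For illustration, here is a minimal userspace sketch (not part of the series)
of how a tool could turn two samples of a busy file into a utilization
percentage. The card, client id and class in the path are examples and will
differ per system:

	#include <inttypes.h>
	#include <stdio.h>
	#include <unistd.h>

	static uint64_t read_busy_ns(const char *path)
	{
		uint64_t ns = 0;
		FILE *f = fopen(path, "r");

		if (f) {
			if (fscanf(f, "%" SCNu64, &ns) != 1)
				ns = 0;
			fclose(f);
		}

		return ns;
	}

	int main(void)
	{
		/* Example: client 7, engine class 0 (Render/3D). */
		const char *path = "/sys/class/drm/card0/clients/7/busy/0";
		uint64_t t0, t1;

		t0 = read_busy_ns(path);
		sleep(1);
		t1 = read_busy_ns(path);

		/* Accumulated nanoseconds over one second of wall time. */
		printf("Render/3D busy: %.2f%%\n",
		       (double)(t1 - t0) / 1e9 * 100.0);

		return 0;
	}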

It is still an RFC since it is missing dedicated test cases to ensure things
really work as advertised.

Tvrtko Ursulin (6):
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track per-context engine busyness
  drm/i915: Track per drm client engine class busyness
  drm/i915: Expose per-engine client busyness

 drivers/gpu/drm/i915/Makefile                 |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  31 +-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  13 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 +
 drivers/gpu/drm/i915/gt/intel_context.h       |  35 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  65 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |  31 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 350 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        |  88 +++++
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  35 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  21 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 15 files changed, 673 insertions(+), 57 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

-- 
2.20.1


* [Intel-gfx] [PATCH 1/6] drm/i915: Expose list of clients in sysfs
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
@ 2020-02-07 16:13 ` Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 2/6] drm/i915: Update client name on context create Tvrtko Ursulin
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose a list of clients with open file handles in sysfs.

This will be a basis for a top-like utility showing per-client and per-
engine GPU load.

Currently we only expose each client's pid and name under opaque numbered
directories in /sys/class/drm/card0/clients/.

For instance:

/sys/class/drm/card0/clients/3/name: Xorg
/sys/class/drm/card0/clients/3/pid: 5664
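
As an aside, a hypothetical userspace sketch (not part of the series) which
enumerates this layout; the card0 path is an example:

	#include <dirent.h>
	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		const char *root = "/sys/class/drm/card0/clients";
		struct dirent *de;
		DIR *dir = opendir(root);

		if (!dir)
			return 1;

		while ((de = readdir(dir))) {
			char path[512], name[64] = "?", pid[16] = "?";
			FILE *f;

			if (de->d_name[0] == '.')
				continue;

			snprintf(path, sizeof(path), "%s/%s/name",
				 root, de->d_name);
			f = fopen(path, "r");
			if (f) {
				if (fgets(name, sizeof(name), f))
					name[strcspn(name, "\n")] = '\0';
				fclose(f);
			}

			snprintf(path, sizeof(path), "%s/%s/pid",
				 root, de->d_name);
			f = fopen(path, "r");
			if (f) {
				if (fgets(pid, sizeof(pid), f))
					pid[strcspn(pid, "\n")] = '\0';
				fclose(f);
			}

			printf("client %s: %s (pid %s)\n",
			       de->d_name, name, pid);
		}

		closedir(dir);

		return 0;
	}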

v2:
 Chris Wilson:
 * Enclose new members into dedicated structs.
 * Protect against failed sysfs registration.

v3:
 * sysfs_attr_init.

v4:
 * Fix for internal clients.

v5:
 * Use cyclic ida for client id. (Chris)
 * Do not leak pid reference. (Chris)
 * Tidy code with some locals.

v6:
 * Use xa_alloc_cyclic to simplify locking. (Chris)
 * No need to unregister individual sysfs files. (Chris)
 * Rebase on top of fpriv kref.
 * Track client closed status and reflect in sysfs.

v7:
 * Make drm_client more standalone concept.

v8:
 * Simplify sysfs show. (Chris)
 * Always track name and pid.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Makefile          |   3 +-
 drivers/gpu/drm/i915/i915_drm_client.c | 152 +++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h |  61 ++++++++++
 drivers/gpu/drm/i915/i915_drv.h        |   5 +
 drivers/gpu/drm/i915/i915_gem.c        |  35 ++++--
 drivers/gpu/drm/i915/i915_sysfs.c      |   8 ++
 6 files changed, 256 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 49eed50ef0a4..eca62c272ca2 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -36,7 +36,8 @@ subdir-ccflags-y += -I$(srctree)/$(src)
 # Please keep these build lists sorted!
 
 # core driver code
-i915-y += i915_drv.o \
+i915-y += i915_drm_client.o \
+	  i915_drv.o \
 	  i915_irq.o \
 	  i915_getparam.o \
 	  i915_params.o \
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
new file mode 100644
index 000000000000..f97989329df5
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -0,0 +1,152 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include "i915_drm_client.h"
+#include "i915_gem.h"
+#include "i915_utils.h"
+
+static ssize_t
+show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_drm_client *client =
+		container_of(attr, typeof(*client), attr.name);
+
+	return snprintf(buf, PAGE_SIZE,
+			client->closed ? "<%s>" : "%s",
+			client->name);
+}
+
+static ssize_t
+show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_drm_client *client =
+		container_of(attr, typeof(*client), attr.pid);
+
+	return snprintf(buf, PAGE_SIZE,
+			client->closed ? "<%u>" : "%u",
+			pid_nr(client->pid));
+}
+
+static int
+__i915_drm_client_register(struct i915_drm_client *client,
+			   struct task_struct *task)
+{
+	struct i915_drm_clients *clients = client->clients;
+	struct device_attribute *attr;
+	int ret = -ENOMEM;
+	char idstr[32];
+
+	client->pid = get_task_pid(task, PIDTYPE_PID);
+
+	client->name = kstrdup(task->comm, GFP_KERNEL);
+	if (!client->name)
+		goto err_name;
+
+	if (!clients->root)
+		return 0; /* intel_fbdev_init registers a client before sysfs */
+
+	snprintf(idstr, sizeof(idstr), "%u", client->id);
+	client->root = kobject_create_and_add(idstr, clients->root);
+	if (!client->root)
+		goto err_client;
+
+	attr = &client->attr.name;
+	sysfs_attr_init(&attr->attr);
+	attr->attr.name = "name";
+	attr->attr.mode = 0444;
+	attr->show = show_client_name;
+
+	ret = sysfs_create_file(client->root, (struct attribute *)attr);
+	if (ret)
+		goto err_attr;
+
+	attr = &client->attr.pid;
+	sysfs_attr_init(&attr->attr);
+	attr->attr.name = "pid";
+	attr->attr.mode = 0444;
+	attr->show = show_client_pid;
+
+	ret = sysfs_create_file(client->root, (struct attribute *)attr);
+	if (ret)
+		goto err_attr;
+
+	return 0;
+
+err_attr:
+	kobject_put(client->root);
+err_client:
+	kfree(client->name);
+err_name:
+	put_pid(client->pid);
+
+	return ret;
+}
+
+static void
+__i915_drm_client_unregister(struct i915_drm_client *client)
+{
+	put_pid(fetch_and_zero(&client->pid));
+	kfree(fetch_and_zero(&client->name));
+
+	if (!client->root)
+		return; /* fbdev client or error during drm open */
+
+	kobject_put(fetch_and_zero(&client->root));
+}
+
+struct i915_drm_client *
+i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
+{
+	struct i915_drm_client *client;
+	u32 next = 0;
+	int ret;
+
+	client = kzalloc(sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return ERR_PTR(-ENOMEM);
+
+	kref_init(&client->kref);
+	client->clients = clients;
+
+	ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
+			      xa_limit_32b, &next, GFP_KERNEL);
+	if (ret)
+		goto err_id;
+
+	ret = __i915_drm_client_register(client, task);
+	if (ret)
+		goto err_register;
+
+	return client;
+
+err_register:
+	xa_erase(&clients->xarray, client->id);
+err_id:
+	kfree(client);
+
+	return ERR_PTR(ret);
+}
+
+void __i915_drm_client_free(struct kref *kref)
+{
+	struct i915_drm_client *client =
+		container_of(kref, typeof(*client), kref);
+
+	__i915_drm_client_unregister(client);
+	xa_erase(&client->clients->xarray, client->id);
+	kfree_rcu(client, rcu);
+}
+
+void i915_drm_client_close(struct i915_drm_client *client)
+{
+	GEM_BUG_ON(client->closed);
+	client->closed = true;
+	i915_drm_client_put(client);
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
new file mode 100644
index 000000000000..8b261dc4a1f3
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -0,0 +1,61 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __I915_DRM_CLIENT_H__
+#define __I915_DRM_CLIENT_H__
+
+#include <linux/device.h>
+#include <linux/kobject.h>
+#include <linux/kref.h>
+#include <linux/pid.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/xarray.h>
+
+struct i915_drm_clients {
+	struct xarray xarray;
+	struct kobject *root;
+};
+
+struct i915_drm_client {
+	struct kref kref;
+
+	struct rcu_head rcu;
+
+	unsigned int id;
+	struct pid *pid;
+	char *name;
+	bool closed;
+
+	struct i915_drm_clients *clients;
+
+	struct kobject *root;
+	struct {
+		struct device_attribute pid;
+		struct device_attribute name;
+	} attr;
+};
+
+static inline struct i915_drm_client *
+i915_drm_client_get(struct i915_drm_client *client)
+{
+	kref_get(&client->kref);
+	return client;
+}
+
+void __i915_drm_client_free(struct kref *kref);
+
+static inline void i915_drm_client_put(struct i915_drm_client *client)
+{
+	kref_put(&client->kref, __i915_drm_client_free);
+}
+
+void i915_drm_client_close(struct i915_drm_client *client);
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients,
+					    struct task_struct *task);
+
+#endif /* !__I915_DRM_CLIENT_H__ */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3452926d7b77..6e7685323ccd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -92,6 +92,7 @@
 #include "intel_wakeref.h"
 #include "intel_wopcm.h"
 
+#include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_gem_fence_reg.h"
 #include "i915_gem_gtt.h"
@@ -223,6 +224,8 @@ struct drm_i915_file_private {
 	/** ban_score: Accumulated score of all ctx bans and fast hangs. */
 	atomic_t ban_score;
 	unsigned long hang_timestamp;
+
+	struct i915_drm_client *client;
 };
 
 /* Interface history:
@@ -1261,6 +1264,8 @@ struct drm_i915_private {
 
 	struct i915_pmu pmu;
 
+	struct i915_drm_clients clients;
+
 	struct i915_hdcp_comp_master *hdcp_master;
 	bool hdcp_comp_added;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a712e60b016a..2571ffb9e567 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1222,12 +1222,14 @@ static void i915_gem_init__mm(struct drm_i915_private *i915)
 	i915_gem_init__objects(i915);
 }
 
-void i915_gem_init_early(struct drm_i915_private *dev_priv)
+void i915_gem_init_early(struct drm_i915_private *i915)
 {
-	i915_gem_init__mm(dev_priv);
-	i915_gem_init__contexts(dev_priv);
+	i915_gem_init__mm(i915);
+	i915_gem_init__contexts(i915);
 
-	spin_lock_init(&dev_priv->fb_tracking.lock);
+	spin_lock_init(&i915->fb_tracking.lock);
+
+	xa_init_flags(&i915->clients.xarray, XA_FLAGS_ALLOC);
 }
 
 void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
@@ -1236,6 +1238,8 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
 	GEM_BUG_ON(!llist_empty(&dev_priv->mm.free_list));
 	GEM_BUG_ON(atomic_read(&dev_priv->mm.free_count));
 	drm_WARN_ON(&dev_priv->drm, dev_priv->mm.shrink_count);
+	drm_WARN_ON(&dev_priv->drm, !xa_empty(&dev_priv->clients.xarray));
+	xa_destroy(&dev_priv->clients.xarray);
 }
 
 int i915_gem_freeze(struct drm_i915_private *dev_priv)
@@ -1290,6 +1294,8 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct i915_request *request;
 
+	i915_drm_client_close(file_priv->client);
+
 	/* Clean up our request list when the client is going away, so that
 	 * later retire_requests won't dereference our soon-to-be-gone
 	 * file_priv.
@@ -1303,17 +1309,25 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv;
-	int ret;
+	struct i915_drm_client *client;
+	int ret = -ENOMEM;
 
 	DRM_DEBUG("\n");
 
 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
-		return -ENOMEM;
+		goto err_alloc;
+
+	client = i915_drm_client_add(&i915->clients, current);
+	if (IS_ERR(client)) {
+		ret = PTR_ERR(client);
+		goto err_client;
+	}
 
 	file->driver_priv = file_priv;
 	file_priv->dev_priv = i915;
 	file_priv->file = file;
+	file_priv->client = client;
 
 	spin_lock_init(&file_priv->mm.lock);
 	INIT_LIST_HEAD(&file_priv->mm.request_list);
@@ -1323,8 +1337,15 @@ int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 
 	ret = i915_gem_context_open(i915, file);
 	if (ret)
-		kfree(file_priv);
+		goto err_context;
+
+	return 0;
 
+err_context:
+	i915_drm_client_close(client);
+err_client:
+	kfree(file_priv);
+err_alloc:
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index c14d762bd652..476ef6561769 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -559,6 +559,11 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 	struct device *kdev = dev_priv->drm.primary->kdev;
 	int ret;
 
+	dev_priv->clients.root =
+		kobject_create_and_add("clients", &kdev->kobj);
+	if (!dev_priv->clients.root)
+		DRM_ERROR("Per-client sysfs setup failed\n");
+
 #ifdef CONFIG_PM
 	if (HAS_RC6(dev_priv)) {
 		ret = sysfs_merge_group(&kdev->kobj,
@@ -624,4 +629,7 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 	sysfs_unmerge_group(&kdev->kobj, &rc6_attr_group);
 	sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
 #endif
+
+	if (dev_priv->clients.root)
+		kobject_put(dev_priv->clients.root);
 }
-- 
2.20.1


* [Intel-gfx] [PATCH 2/6] drm/i915: Update client name on context create
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 1/6] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
@ 2020-02-07 16:13 ` Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 3/6] drm/i915: Make GEM contexts track DRM clients Tvrtko Ursulin
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some clients have the DRM fd passed to them over a socket by the X server.

Grab the real client and pid when they create their first context and
update the exposed data for more useful enumeration.

To enable lockless access to client name and pid data from the following
patches, we also make these fields RCU protected. This covers both
asynchronous code paths looking at contexts which remain after the client
has exited, and client name and pid queries running in parallel with the
updates done at context creation time.
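
Condensed, the reader/updater pairing this relies on looks roughly like the
fragments below (simplified from the diff, not standalone code):

	/* Reader, e.g. a sysfs show() callback - lockless under RCU. */
	rcu_read_lock();
	ret = snprintf(buf, PAGE_SIZE, "%s", rcu_dereference(client->name));
	rcu_read_unlock();

	/*
	 * Updater, on context create - serialized by client->update_lock,
	 * with the old name/pid pair reclaimed only after a grace period
	 * so in-flight readers never see freed memory.
	 */
	mutex_lock(&client->update_lock);
	old->pid = rcu_replace_pointer(client->pid, new_pid, true);
	old->name = rcu_replace_pointer(client->name, new_name, true);
	mutex_unlock(&client->update_lock);
	call_rcu(&old->rcu, __client_update_free); /* put_pid() + kfree() */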

v2:
 * Do not leak the pid reference and borrow context idr_lock. (Chris)

v3:
 * More avoiding leaks. (Chris)

v4:
 * Move update completely to drm client. (Chris)
 * Do not lose previous client data on failure to re-register and simplify
   update to only touch what it needs.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |   8 +-
 drivers/gpu/drm/i915/i915_drm_client.c      | 110 +++++++++++++++++---
 drivers/gpu/drm/i915/i915_drm_client.h      |  10 +-
 3 files changed, 113 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 52a749691a8d..7dfabd8a04a1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -75,6 +75,7 @@
 #include "gt/intel_engine_user.h"
 #include "gt/intel_ring.h"
 
+#include "i915_drm_client.h"
 #include "i915_gem_context.h"
 #include "i915_globals.h"
 #include "i915_trace.h"
@@ -2092,6 +2093,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_context_create_ext *args = data;
+	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct create_ext ext_data;
 	int ret;
 	u32 id;
@@ -2106,7 +2108,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
-	ext_data.fpriv = file->driver_priv;
+	ext_data.fpriv = file_priv;
 	if (client_is_banned(ext_data.fpriv)) {
 		drm_dbg(&i915->drm,
 			"client %s[%d] banned from creating ctx\n",
@@ -2114,6 +2116,10 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}
 
+	ret = i915_drm_client_update(file_priv->client, current);
+	if (ret)
+		return ret;
+
 	ext_data.ctx = i915_gem_create_context(i915, args->flags);
 	if (IS_ERR(ext_data.ctx))
 		return PTR_ERR(ext_data.ctx);
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index f97989329df5..770094ef2e67 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -8,6 +8,9 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 
+#include <drm/drm_print.h>
+
+#include "i915_drv.h"
 #include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
@@ -17,10 +20,15 @@ show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
 {
 	struct i915_drm_client *client =
 		container_of(attr, typeof(*client), attr.name);
+	int ret;
 
-	return snprintf(buf, PAGE_SIZE,
-			client->closed ? "<%s>" : "%s",
-			client->name);
+	rcu_read_lock();
+	ret = snprintf(buf, PAGE_SIZE,
+		       client->closed ? "<%s>" : "%s",
+		       rcu_dereference(client->name));
+	rcu_read_unlock();
+
+	return ret;
 }
 
 static ssize_t
@@ -28,10 +36,15 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 {
 	struct i915_drm_client *client =
 		container_of(attr, typeof(*client), attr.pid);
+	int ret;
+
+	rcu_read_lock();
+	ret = snprintf(buf, PAGE_SIZE,
+		       client->closed ? "<%u>" : "%u",
+		       pid_nr(rcu_dereference(client->pid)));
+	rcu_read_unlock();
 
-	return snprintf(buf, PAGE_SIZE,
-			client->closed ? "<%u>" : "%u",
-			pid_nr(client->pid));
+	return ret;
 }
 
 static int
@@ -42,12 +55,14 @@ __i915_drm_client_register(struct i915_drm_client *client,
 	struct device_attribute *attr;
 	int ret = -ENOMEM;
 	char idstr[32];
+	char *name;
 
-	client->pid = get_task_pid(task, PIDTYPE_PID);
+	rcu_assign_pointer(client->pid, get_task_pid(task, PIDTYPE_PID));
 
-	client->name = kstrdup(task->comm, GFP_KERNEL);
-	if (!client->name)
+	name = kstrdup(task->comm, GFP_KERNEL);
+	if (!name)
 		goto err_name;
+	rcu_assign_pointer(client->name, name);
 
 	if (!clients->root)
 		return 0; /* intel_fbdev_init registers a client before sysfs */
@@ -82,7 +97,7 @@ __i915_drm_client_register(struct i915_drm_client *client,
 err_attr:
 	kobject_put(client->root);
 err_client:
-	kfree(client->name);
+	kfree(name);
 err_name:
 	put_pid(client->pid);
 
@@ -92,8 +107,8 @@ __i915_drm_client_register(struct i915_drm_client *client,
 static void
 __i915_drm_client_unregister(struct i915_drm_client *client)
 {
-	put_pid(fetch_and_zero(&client->pid));
-	kfree(fetch_and_zero(&client->name));
+	put_pid(rcu_replace_pointer(client->pid, NULL, true));
+	kfree(rcu_replace_pointer(client->name, NULL, true));
 
 	if (!client->root)
 		return; /* fbdev client or error during drm open */
@@ -113,6 +128,7 @@ i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
 		return ERR_PTR(-ENOMEM);
 
 	kref_init(&client->kref);
+	mutex_init(&client->update_lock);
 	client->clients = clients;
 
 	ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
@@ -150,3 +166,73 @@ void i915_drm_client_close(struct i915_drm_client *client)
 	client->closed = true;
 	i915_drm_client_put(client);
 }
+
+struct client_update_free
+{
+	struct rcu_head rcu;
+	struct pid *pid;
+	char *name;
+};
+
+static void __client_update_free(struct rcu_head *rcu)
+{
+	struct client_update_free *old = container_of(rcu, typeof(*old), rcu);
+
+	put_pid(old->pid);
+	kfree(old->name);
+	kfree(old);
+}
+
+int
+i915_drm_client_update(struct i915_drm_client *client,
+		       struct task_struct *task)
+{
+	struct drm_i915_private *i915 =
+		container_of(client->clients, typeof(*i915), clients);
+	struct client_update_free *old;
+	struct pid *pid;
+	char *name;
+	int ret;
+
+	old = kmalloc(sizeof(*old), GFP_KERNEL);
+	if (!old)
+		return -ENOMEM;
+
+	ret = mutex_lock_interruptible(&client->update_lock);
+	if (ret)
+		goto out_free;
+
+	pid = get_task_pid(task, PIDTYPE_PID);
+	if (!pid)
+		goto out_pid;
+	if (pid == client->pid)
+		goto out_name;
+
+	name = kstrdup(task->comm, GFP_KERNEL);
+	if (!name) {
+		drm_notice(&i915->drm,
+			   "Failed to update client id=%u,name=%s,pid=%u! (%d)\n",
+			   client->id, client->name, pid_nr(client->pid), ret);
+		goto out_name;
+	}
+
+	init_rcu_head(&old->rcu);
+
+	old->pid = rcu_replace_pointer(client->pid, pid, true);
+	old->name = rcu_replace_pointer(client->name, name, true);
+
+	mutex_unlock(&client->update_lock);
+
+	call_rcu(&old->rcu, __client_update_free);
+
+	return 0;
+
+out_name:
+	put_pid(pid);
+out_pid:
+	mutex_unlock(&client->update_lock);
+out_free:
+	kfree(old);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 8b261dc4a1f3..a9d4d7396e43 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -10,6 +10,7 @@
 #include <linux/device.h>
 #include <linux/kobject.h>
 #include <linux/kref.h>
+#include <linux/mutex.h>
 #include <linux/pid.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
@@ -25,9 +26,11 @@ struct i915_drm_client {
 
 	struct rcu_head rcu;
 
+	struct mutex update_lock;
+
 	unsigned int id;
-	struct pid *pid;
-	char *name;
+	struct pid __rcu *pid;
+	char __rcu *name;
 	bool closed;
 
 	struct i915_drm_clients *clients;
@@ -58,4 +61,7 @@ void i915_drm_client_close(struct i915_drm_client *client);
 struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients,
 					    struct task_struct *task);
 
+int i915_drm_client_update(struct i915_drm_client *client,
+			   struct task_struct *task);
+
 #endif /* !__I915_DRM_CLIENT_H__ */
-- 
2.20.1


* [Intel-gfx] [PATCH 3/6] drm/i915: Make GEM contexts track DRM clients
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 1/6] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 2/6] drm/i915: Update client name on context create Tvrtko Ursulin
@ 2020-02-07 16:13 ` Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 4/6] drm/i915: Track per-context engine busyness Tvrtko Ursulin
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If we make GEM contexts keep a reference to i915_drm_client for the whole
of their lifetime, we can consolidate the current task pid and name usage
by getting it from the client.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 23 +++++++++++---
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 13 ++------
 drivers/gpu/drm/i915/i915_debugfs.c           | 31 +++++++++----------
 drivers/gpu/drm/i915/i915_gpu_error.c         | 21 +++++++------
 4 files changed, 48 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 7dfabd8a04a1..828b56113067 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -299,8 +299,13 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 
 static void i915_gem_context_free(struct i915_gem_context *ctx)
 {
+	struct i915_drm_client *client = ctx->client;
+
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 
+	if (client)
+		i915_drm_client_put(client);
+
 	spin_lock(&ctx->i915->gem.contexts.lock);
 	list_del(&ctx->link);
 	spin_unlock(&ctx->i915->gem.contexts.lock);
@@ -311,7 +316,6 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	if (ctx->timeline)
 		intel_timeline_put(ctx->timeline);
 
-	put_pid(ctx->pid);
 	mutex_destroy(&ctx->mutex);
 
 	kfree_rcu(ctx, rcu);
@@ -780,6 +784,7 @@ static int gem_context_register(struct i915_gem_context *ctx,
 				struct drm_i915_file_private *fpriv,
 				u32 *id)
 {
+	struct i915_drm_client *client;
 	struct i915_address_space *vm;
 	int ret;
 
@@ -791,15 +796,25 @@ static int gem_context_register(struct i915_gem_context *ctx,
 		WRITE_ONCE(vm->file, fpriv); /* XXX */
 	mutex_unlock(&ctx->mutex);
 
-	ctx->pid = get_task_pid(current, PIDTYPE_PID);
+	client = i915_drm_client_get(fpriv->client);
+
+	rcu_read_lock();
 	snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
-		 current->comm, pid_nr(ctx->pid));
+		 rcu_dereference(client->name),
+		 pid_nr(rcu_dereference(client->pid)));
+	rcu_read_unlock();
 
 	/* And finally expose ourselves to userspace via the idr */
 	ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
 	if (ret)
-		put_pid(fetch_and_zero(&ctx->pid));
+		goto err;
+
+	ctx->client = client;
 
+	return 0;
+
+err:
+	i915_drm_client_put(client);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 017ca803ab47..0c5fb203b754 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -90,20 +90,13 @@ struct i915_gem_context {
 	 */
 	struct i915_address_space __rcu *vm;
 
-	/**
-	 * @pid: process id of creator
-	 *
-	 * Note that who created the context may not be the principle user,
-	 * as the context may be shared across a local socket. However,
-	 * that should only affect the default context, all contexts created
-	 * explicitly by the client are expected to be isolated.
-	 */
-	struct pid *pid;
-
 	/** link: place with &drm_i915_private.context_list */
 	struct list_head link;
 	struct llist_node free_link;
 
+	/** client: struct i915_drm_client */
+	struct i915_drm_client *client;
+
 	/**
 	 * @ref: reference count
 	 *
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c313c90405cb..6b7e1ff0355e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -339,17 +339,17 @@ static void print_context_stats(struct seq_file *m,
 				.vm = rcu_access_pointer(ctx->vm),
 			};
 			struct drm_file *file = ctx->file_priv->file;
-			struct task_struct *task;
 			char name[80];
 
 			rcu_read_lock();
+
 			idr_for_each(&file->object_idr, per_file_stats, &stats);
-			rcu_read_unlock();
 
-			rcu_read_lock();
-			task = pid_task(ctx->pid ?: file->pid, PIDTYPE_PID);
 			snprintf(name, sizeof(name), "%s",
-				 task ? task->comm : "<unknown>");
+				 I915_SELFTEST_ONLY(!ctx->client) ?
+				 "[kernel]" :
+				 rcu_dereference(ctx->client->name));
+
 			rcu_read_unlock();
 
 			print_file_stats(m, name, stats);
@@ -1493,19 +1493,16 @@ static int i915_context_status(struct seq_file *m, void *unused)
 		spin_unlock(&i915->gem.contexts.lock);
 
 		seq_puts(m, "HW context ");
-		if (ctx->pid) {
-			struct task_struct *task;
-
-			task = get_pid_task(ctx->pid, PIDTYPE_PID);
-			if (task) {
-				seq_printf(m, "(%s [%d]) ",
-					   task->comm, task->pid);
-				put_task_struct(task);
-			}
-		} else if (IS_ERR(ctx->file_priv)) {
-			seq_puts(m, "(deleted) ");
+
+		if (I915_SELFTEST_ONLY(!ctx->client)) {
+			seq_puts(m, "([kernel]) ");
 		} else {
-			seq_puts(m, "(kernel) ");
+			rcu_read_lock();
+			seq_printf(m, "(%s [%d]) %s",
+				   rcu_dereference(ctx->client->name),
+				   pid_nr(rcu_dereference(ctx->client->pid)),
+				   ctx->client->closed ? "(closed) " : "");
+			rcu_read_unlock();
 		}
 
 		seq_putc(m, ctx->remap_slice ? 'R' : 'r');
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 5a1517d0bf3b..86939de90bff 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1217,7 +1217,8 @@ static void record_request(const struct i915_request *request,
 	rcu_read_lock();
 	ctx = rcu_dereference(request->context->gem_context);
 	if (ctx)
-		erq->pid = pid_nr(ctx->pid);
+		erq->pid = I915_SELFTEST_ONLY(!ctx->client) ?
+			   0 : pid_nr(rcu_dereference(ctx->client->pid));
 	rcu_read_unlock();
 }
 
@@ -1237,23 +1238,25 @@ static bool record_context(struct i915_gem_context_coredump *e,
 			   const struct i915_request *rq)
 {
 	struct i915_gem_context *ctx;
-	struct task_struct *task;
 	bool simulated;
 
 	rcu_read_lock();
+
 	ctx = rcu_dereference(rq->context->gem_context);
 	if (ctx && !kref_get_unless_zero(&ctx->ref))
 		ctx = NULL;
-	rcu_read_unlock();
-	if (!ctx)
+	if (!ctx) {
+		rcu_read_unlock();
 		return true;
+	}
 
-	rcu_read_lock();
-	task = pid_task(ctx->pid, PIDTYPE_PID);
-	if (task) {
-		strcpy(e->comm, task->comm);
-		e->pid = task->pid;
+	if (I915_SELFTEST_ONLY(!ctx->client)) {
+		strcpy(e->comm, "[kernel]");
+	} else {
+		strcpy(e->comm, rcu_dereference(ctx->client->name));
+		e->pid = pid_nr(rcu_dereference(ctx->client->pid));
 	}
+
 	rcu_read_unlock();
 
 	e->sched_attr = ctx->sched;
-- 
2.20.1


* [Intel-gfx] [PATCH 4/6] drm/i915: Track per-context engine busyness
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 3/6] drm/i915: Make GEM contexts track DRM clients Tvrtko Ursulin
@ 2020-02-07 16:13 ` Tvrtko Ursulin
  2020-02-09 11:02   ` Chris Wilson
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness Tvrtko Ursulin
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Some customers want to know how much GPU time their clients are using in
order to make dynamic load balancing decisions.

With the hooks already in place which track the overall engine busyness,
we can extend that slightly to split that time between contexts.
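
In essence the accounting reduces to the following pair of hooks (a
simplified sketch of the changes below, with names shortened; the real code
uses the irqsave seqlock variants and runs from the execlists schedule
in/out paths):

	/* Schedule in: open an activity segment, but only once. */
	static void stats_start(struct intel_context *ce, ktime_t now)
	{
		write_seqlock(&ce->stats.lock);
		if (!ce->stats.active) {
			ce->stats.start = now;
			ce->stats.active = true;
		}
		write_sequnlock(&ce->stats.lock);
	}

	/* Schedule out: close the segment and fold it into the total. */
	static void stats_stop(struct intel_context *ce, ktime_t now)
	{
		write_seqlock(&ce->stats.lock);
		ce->stats.total = ktime_add(ce->stats.total,
					    ktime_sub(now, ce->stats.start));
		ce->stats.active = false;
		write_sequnlock(&ce->stats.lock);
	}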

v2: Fix accounting for tail updates.
v3: Rebase.
v4: Mark currently running contexts as active on stats enable.
v5: Include some headers to fix the build.
v6: Added fine grained lock.
v7: Convert to seqlock. (Chris Wilson)
v8: Rebase and tidy with helpers.
v9: Refactor.
v10: Move recording start to promotion. (Chris)
v11: Consolidate duplicated code. (Chris)
v12: execlists->active cannot be NULL. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c       | 20 ++++++++
 drivers/gpu/drm/i915/gt/intel_context.h       | 13 ++++++
 drivers/gpu/drm/i915/gt/intel_context_types.h |  9 ++++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 15 +++++-
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 46 ++++++++++++++++---
 5 files changed, 95 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 57e8a051ddc2..c8669036c303 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -286,6 +286,7 @@ intel_context_init(struct intel_context *ce,
 	INIT_LIST_HEAD(&ce->signals);
 
 	mutex_init(&ce->pin_mutex);
+	seqlock_init(&ce->stats.lock);
 
 	i915_active_init(&ce->active,
 			 __intel_context_active, __intel_context_retire);
@@ -380,6 +381,25 @@ struct i915_request *intel_context_create_request(struct intel_context *ce)
 	return rq;
 }
 
+ktime_t intel_context_get_busy_time(struct intel_context *ce)
+{
+	unsigned int seq;
+	ktime_t total;
+
+	do {
+		seq = read_seqbegin(&ce->stats.lock);
+
+		total = ce->stats.total;
+
+		if (ce->stats.active)
+			total = ktime_add(total,
+					  ktime_sub(ktime_get(),
+						    ce->stats.start));
+	} while (read_seqretry(&ce->stats.lock, seq));
+
+	return total;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_context.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 604d5cfc46ba..871196aa7d5a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -227,4 +227,17 @@ intel_context_clear_nopreempt(struct intel_context *ce)
 	clear_bit(CONTEXT_NOPREEMPT, &ce->flags);
 }
 
+static inline void
+__intel_context_stats_start(struct intel_context *ce, ktime_t now)
+{
+	struct intel_context_stats *stats = &ce->stats;
+
+	if (!stats->active) {
+		stats->start = now;
+		stats->active = true;
+	}
+}
+
+ktime_t intel_context_get_busy_time(struct intel_context *ce);
+
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index ca1420fb8b53..963d33dc5289 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -11,6 +11,7 @@
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/types.h>
+#include <linux/seqlock.h>
 
 #include "i915_active_types.h"
 #include "i915_utils.h"
@@ -83,6 +84,14 @@ struct intel_context {
 
 	/** sseu: Control eu/slice partitioning */
 	struct intel_sseu sseu;
+
+	/** stats: Context GPU engine busyness tracking. */
+	struct intel_context_stats {
+		seqlock_t lock;
+		bool active;
+		ktime_t start;
+		ktime_t total;
+	} stats;
 };
 
 #endif /* __INTEL_CONTEXT_TYPES__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index b1c7b1ed6149..a5344d2c7c10 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1599,8 +1599,19 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 
 		engine->stats.enabled_at = ktime_get();
 
-		/* XXX submission method oblivious? */
-		for (port = execlists->active; (rq = *port); port++)
+		/*
+		 * Mark currently running context as active.
+		 * XXX submission method oblivious?
+		 */
+
+		rq = NULL;
+		port = execlists->active;
+		rq = *port;
+		if (rq)
+			__intel_context_stats_start(rq->context,
+						    engine->stats.enabled_at);
+
+		for (; (rq = *port); port++)
 			engine->stats.active++;
 
 		for (port = execlists->pending; (rq = *port); port++) {
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index b350e01d86d2..1cd38b6eb189 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1197,6 +1197,32 @@ static void reset_active(struct i915_request *rq,
 	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
 }
 
+static void intel_context_stats_start(struct intel_context *ce)
+{
+	struct intel_context_stats *stats = &ce->stats;
+	unsigned long flags;
+
+	write_seqlock_irqsave(&stats->lock, flags);
+	__intel_context_stats_start(ce, ktime_get());
+	write_sequnlock_irqrestore(&stats->lock, flags);
+}
+
+static void intel_context_stats_stop(struct intel_context *ce)
+{
+	struct intel_context_stats *stats = &ce->stats;
+	unsigned long flags;
+
+	if (!READ_ONCE(stats->active))
+		return;
+
+	write_seqlock_irqsave(&stats->lock, flags);
+	GEM_BUG_ON(!READ_ONCE(stats->active));
+	stats->total = ktime_add(stats->total,
+				 ktime_sub(ktime_get(), stats->start));
+	stats->active = false;
+	write_sequnlock_irqrestore(&stats->lock, flags);
+}
+
 static inline struct intel_engine_cs *
 __execlists_schedule_in(struct i915_request *rq)
 {
@@ -1264,7 +1290,7 @@ static inline void
 __execlists_schedule_out(struct i915_request *rq,
 			 struct intel_engine_cs * const engine)
 {
-	struct intel_context * const ce = rq->context;
+	struct intel_context *ce = rq->context;
 
 	/*
 	 * NB process_csb() is not under the engine->active.lock and hence
@@ -1281,6 +1307,7 @@ __execlists_schedule_out(struct i915_request *rq,
 		intel_engine_add_retire(engine, ce->timeline);
 
 	intel_engine_context_out(engine);
+	intel_context_stats_stop(ce);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put_async(engine->gt);
 
@@ -2262,6 +2289,7 @@ static void process_csb(struct intel_engine_cs *engine)
 	rmb();
 
 	do {
+		struct i915_request *rq;
 		bool promote;
 
 		if (++head == num_entries)
@@ -2316,7 +2344,11 @@ static void process_csb(struct intel_engine_cs *engine)
 
 			WRITE_ONCE(execlists->pending[0], NULL);
 		} else {
-			GEM_BUG_ON(!*execlists->active);
+			rq = *execlists->active++;
+			GEM_BUG_ON(!rq);
+
+			GEM_BUG_ON(execlists->active - execlists->inflight >
+				   execlists_num_ports(execlists));
 
 			/* port0 completed, advanced to port1 */
 			trace_ports(execlists, "completed", execlists->active);
@@ -2327,13 +2359,15 @@ static void process_csb(struct intel_engine_cs *engine)
 			 * coherent (visible from the CPU) before the
 			 * user interrupt and CSB is processed.
 			 */
-			GEM_BUG_ON(!i915_request_completed(*execlists->active) &&
+			GEM_BUG_ON(!i915_request_completed(rq) &&
 				   !reset_in_progress(execlists));
-			execlists_schedule_out(*execlists->active++);
 
-			GEM_BUG_ON(execlists->active - execlists->inflight >
-				   execlists_num_ports(execlists));
+			execlists_schedule_out(rq);
 		}
+
+		rq = *execlists->active;
+		if (rq)
+			intel_context_stats_start(rq->context);
 	} while (head != tail);
 
 	execlists->csb_head = head;
-- 
2.20.1


* [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 4/6] drm/i915: Track per-context engine busyness Tvrtko Ursulin
@ 2020-02-07 16:13 ` Tvrtko Ursulin
  2020-02-07 16:33   ` Chris Wilson
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 6/6] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add per-client, per-engine-class tracking of on-GPU runtime on top of the
previously added per-context tracking.

This data will then be exported via sysfs in a following patch in order to
implement a per-DRM-client view of GPU utilization.

To implement this we add a per-engine-class stats structure to each DRM
client, used to track: the total time spent on engines of a certain class
before the last activity transition, the count of currently active contexts,
and the timestamp of the last activity transition (any increase or decrease).
This allows accumulating activity segments and reporting in-progress
activity.

There is a small locked section (seqlock) from which time accounting updates
are made as contexts enter and exit the GPU. The lock is per class, so
contention is limited to the same client's activity on the same engine
class. Other scenarios add no new lock contention.
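
To make the accumulation rule concrete, a hypothetical trace and a condensed
sketch of the transition update follow (the helper name is made up; the real
code open-codes this in the start/stop paths, using the irqsave seqlock
variants). Contexts of one client can overlap on a class with multiple
engine instances, hence the weighting by the active count:

	/*
	 * Example trace for one client on one engine class:
	 *
	 *   t= 0ms  ctx A in   active 0->1, busy +=  0ms * 0 (total  0ms)
	 *   t= 4ms  ctx B in   active 1->2, busy +=  4ms * 1 (total  4ms)
	 *   t=10ms  ctx A out  active 2->1, busy +=  6ms * 2 (total 16ms)
	 *   t=12ms  ctx B out  active 1->0, busy +=  2ms * 1 (total 18ms)
	 */
	static void
	client_stats_transition(struct i915_drm_client_stats *cstats,
				ktime_t now, bool start)
	{
		write_seqlock(&cstats->lock);
		/* Close the previous segment, weighted by contexts running. */
		cstats->busy += ktime_to_ns(ktime_sub(now, cstats->start)) *
				cstats->active;
		cstats->active += start ? 1 : -1;
		cstats->start = now;
		write_sequnlock(&cstats->lock);
	}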

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.h   | 32 ++++++++++++++++++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  1 +
 drivers/gpu/drm/i915/gt/intel_lrc.c       | 33 ++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_drm_client.c    |  4 ++-
 drivers/gpu/drm/i915/i915_drm_client.h    | 10 +++++++
 5 files changed, 67 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 871196aa7d5a..a0a77a590566 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -12,6 +12,7 @@
 #include <linux/types.h>
 
 #include "i915_active.h"
+#include "i915_drm_client.h"
 #include "intel_context_types.h"
 #include "intel_engine_types.h"
 #include "intel_ring_types.h"
@@ -228,14 +229,35 @@ intel_context_clear_nopreempt(struct intel_context *ce)
 }
 
 static inline void
-__intel_context_stats_start(struct intel_context *ce, ktime_t now)
+__intel_context_stats_start(struct intel_context *ce,
+			    struct intel_engine_cs *engine,
+			    ktime_t now)
 {
 	struct intel_context_stats *stats = &ce->stats;
-
-	if (!stats->active) {
-		stats->start = now;
-		stats->active = true;
+	struct i915_gem_context *ctx;
+
+	if (stats->active)
+		return;
+
+	stats->start = now;
+	stats->active = true;
+
+	rcu_read_lock();
+	ctx = rcu_dereference(ce->gem_context);
+	if (ctx && ctx->client) {
+		struct i915_drm_client_stats *cstats =
+			&ctx->client->stats[engine->uabi_class];
+		ktime_t now = ktime_get();
+		unsigned long flags;
+
+		write_seqlock_irqsave(&cstats->lock, flags);
+		cstats->busy += ktime_to_ns(ktime_sub(now, cstats->start)) *
+				cstats->active++;
+		GEM_WARN_ON(cstats->active == 0);
+		cstats->start = now;
+		write_sequnlock_irqrestore(&cstats->lock, flags);
 	}
+	rcu_read_unlock();
 }
 
 ktime_t intel_context_get_busy_time(struct intel_context *ce);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index a5344d2c7c10..a9f8c5af8f81 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1609,6 +1609,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 		rq = *port;
 		if (rq)
 			__intel_context_stats_start(rq->context,
+						    engine,
 						    engine->stats.enabled_at);
 
 		for (; (rq = *port); port++)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1cd38b6eb189..7eefa7fe160b 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1197,19 +1197,23 @@ static void reset_active(struct i915_request *rq,
 	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
 }
 
-static void intel_context_stats_start(struct intel_context *ce)
+static void intel_context_stats_start(struct intel_context *ce,
+				      struct intel_engine_cs *engine)
 {
 	struct intel_context_stats *stats = &ce->stats;
 	unsigned long flags;
 
 	write_seqlock_irqsave(&stats->lock, flags);
-	__intel_context_stats_start(ce, ktime_get());
+	__intel_context_stats_start(ce, engine, ktime_get());
 	write_sequnlock_irqrestore(&stats->lock, flags);
 }
 
-static void intel_context_stats_stop(struct intel_context *ce)
+static void intel_context_stats_stop(struct intel_context *ce,
+				     struct intel_engine_cs *engine)
 {
 	struct intel_context_stats *stats = &ce->stats;
+	struct i915_gem_context *ctx;
+	ktime_t now, runtime;
 	unsigned long flags;
 
 	if (!READ_ONCE(stats->active))
@@ -1217,10 +1221,25 @@ static void intel_context_stats_stop(struct intel_context *ce)
 
 	write_seqlock_irqsave(&stats->lock, flags);
 	GEM_BUG_ON(!READ_ONCE(stats->active));
-	stats->total = ktime_add(stats->total,
-				 ktime_sub(ktime_get(), stats->start));
+	now = ktime_get();
+	runtime = ktime_sub(now, stats->start);
+	stats->total = ktime_add(stats->total, runtime);
 	stats->active = false;
 	write_sequnlock_irqrestore(&stats->lock, flags);
+
+	rcu_read_lock();
+	ctx = rcu_dereference(ce->gem_context);
+	if (ctx && ctx->client) {
+		struct i915_drm_client_stats *cstats =
+			&ctx->client->stats[engine->uabi_class];
+
+		write_seqlock_irqsave(&cstats->lock, flags);
+		GEM_WARN_ON(cstats->active == 0);
+		cstats->busy += ktime_to_ns(runtime) * cstats->active--;
+		cstats->start = now;
+		write_sequnlock_irqrestore(&cstats->lock, flags);
+	}
+	rcu_read_unlock();
 }
 
 static inline struct intel_engine_cs *
@@ -1307,7 +1326,7 @@ __execlists_schedule_out(struct i915_request *rq,
 		intel_engine_add_retire(engine, ce->timeline);
 
 	intel_engine_context_out(engine);
-	intel_context_stats_stop(ce);
+	intel_context_stats_stop(ce, engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 	intel_gt_pm_put_async(engine->gt);
 
@@ -2367,7 +2386,7 @@ static void process_csb(struct intel_engine_cs *engine)
 
 		rq = *execlists->active;
 		if (rq)
-			intel_context_stats_start(rq->context);
+			intel_context_stats_start(rq->context, engine);
 	} while (head != tail);
 
 	execlists->csb_head = head;
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 770094ef2e67..d26583d5825f 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -121,7 +121,7 @@ i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
 {
 	struct i915_drm_client *client;
 	u32 next = 0;
-	int ret;
+	int i, ret;
 
 	client = kzalloc(sizeof(*client), GFP_KERNEL);
 	if (!client)
@@ -129,6 +129,8 @@ i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
 
 	kref_init(&client->kref);
 	mutex_init(&client->update_lock);
+	for (i = 0; i < ARRAY_SIZE(client->stats); i++)
+		seqlock_init(&client->stats[i].lock);
 	client->clients = clients;
 
 	ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index a9d4d7396e43..6361976a9f05 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -14,8 +14,11 @@
 #include <linux/pid.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
+#include <linux/seqlock.h>
 #include <linux/xarray.h>
 
+#include "gt/intel_engine_types.h"
+
 struct i915_drm_clients {
 	struct xarray xarray;
 	struct kobject *root;
@@ -33,6 +36,13 @@ struct i915_drm_client {
 	char __rcu *name;
 	bool closed;
 
+	struct i915_drm_client_stats {
+		seqlock_t lock;
+		unsigned int active;
+		ktime_t start;
+		u64 busy;
+	} stats[MAX_ENGINE_CLASS + 1];
+
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
-- 
2.20.1


* [Intel-gfx] [PATCH 6/6] drm/i915: Expose per-engine client busyness
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness Tvrtko Ursulin
@ 2020-02-07 16:13 ` Tvrtko Ursulin
  2020-02-07 20:19 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Per client engine busyness (rev4) Patchwork
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

The new files, one per engine class, are located under the 'busy' directory.
Each contains a monotonically increasing, nanosecond resolution accumulation
of the time the client's jobs spent executing on the GPU.

This enables userspace to create a top-like tool for GPU utilization:

==========================================================================
intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

      IMC reads:     1401 MiB/s
     IMC writes:        4 MiB/s

          ENGINE      BUSY                                 MI_SEMA MI_WAIT
     Render/3D/0   63.73% |███████████████████           |      3%      0%
       Blitter/0    9.53% |██▊                           |      6%      0%
         Video/0   39.32% |███████████▊                  |     16%      0%
         Video/1   15.62% |████▋                         |      0%      0%
  VideoEnhance/0    0.00% |                              |      0%      0%

  PID            NAME     RCS          BCS          VCS         VECS
 4084        gem_wsim |█████▌     ||█          ||           ||           |
 4086        gem_wsim |█▌         ||           ||███        ||           |
==========================================================================

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.
v4: Rebase.
v5: sysfs_attr_init.
v6: Small tidy in i915_gem_add_client.
v7: Rebase to be engine class based.
v8:
 * Always enable stats.
 * Walk all client contexts.
v9:
 * Skip unsupported engine classes. (Chris)
 * Use scheduler caps. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 112 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h |  11 +++
 2 files changed, 122 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index d26583d5825f..e305aba8744f 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -10,8 +10,13 @@
 
 #include <drm/drm_print.h>
 
+#include <uapi/drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_drm_client.h"
+#include "gem/i915_gem_context.h"
+#include "gt/intel_engine_user.h"
+#include "i915_drv.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
 
@@ -47,13 +52,46 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
 	return ret;
 }
 
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_engine_busy_attribute *i915_attr =
+		container_of(attr, typeof(*i915_attr), attr);
+	struct i915_drm_client_stats *cstats =
+		&i915_attr->client->stats[i915_attr->engine_class];
+	unsigned int seq;
+	u64 total;
+
+	if (i915_attr->no_busy_stats)
+		return -ENODEV;
+
+	do {
+		seq = read_seqbegin(&cstats->lock);
+		total = cstats->busy +
+			ktime_to_ns(ktime_sub(ktime_get(), cstats->start)) *
+			cstats->active;
+	} while (read_seqretry(&cstats->lock, seq));
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n", total);
+}
+
+static const char *uabi_class_names[] = {
+	[I915_ENGINE_CLASS_RENDER] = "0",
+	[I915_ENGINE_CLASS_COPY] = "1",
+	[I915_ENGINE_CLASS_VIDEO] = "2",
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = "3",
+};
+
 static int
 __i915_drm_client_register(struct i915_drm_client *client,
 			   struct task_struct *task)
 {
 	struct i915_drm_clients *clients = client->clients;
+	struct drm_i915_private *i915 =
+		container_of(clients, typeof(*i915), clients);
+	struct intel_engine_cs *engine;
 	struct device_attribute *attr;
-	int ret = -ENOMEM;
+	int i, ret = -ENOMEM;
 	char idstr[32];
 	char *name;
 
@@ -92,8 +130,70 @@ __i915_drm_client_register(struct i915_drm_client *client,
 	if (ret)
 		goto err_attr;
 
+	if (i915->caps.scheduler & I915_SCHEDULER_CAP_ENGINE_BUSY_STATS) {
+		client->busy_root =
+			kobject_create_and_add("busy", client->root);
+		if (!client->busy_root)
+			goto err_attr;
+
+		for (i = 0; i < ARRAY_SIZE(uabi_class_names); i++) {
+			struct i915_engine_busy_attribute *i915_attr =
+				&client->attr.busy[i];
+
+			if (!intel_engine_lookup_user(i915, i, 0))
+				continue;
+
+			i915_attr->client = client;
+			i915_attr->engine_class = i;
+
+			attr = &i915_attr->attr;
+
+			sysfs_attr_init(&attr->attr);
+
+			attr->attr.name = uabi_class_names[i];
+			attr->attr.mode = 0444;
+			attr->show = show_client_busy;
+
+			ret = sysfs_create_file(client->busy_root,
+						(struct attribute *)attr);
+			if (ret)
+				goto err_busy;
+		}
+
+		/* Enable busy stats on all engines. */
+		i = 0;
+		for_each_uabi_engine(engine, i915) {
+			ret = intel_enable_engine_stats(engine);
+			if (ret) {
+				int j, k;
+
+				/* Unwind if not available. */
+				j = 0;
+				for_each_uabi_engine(engine, i915) {
+					if (j++ == i)
+						break;
+
+					intel_disable_engine_stats(engine);
+				}
+
+				for (k = 0;
+				     k < ARRAY_SIZE(uabi_class_names);
+				     k++)
+					client->attr.busy[k].no_busy_stats = true;
+
+				dev_notice_once(i915->drm.dev,
+						"Engine busy stats not available! (%d)",
+						ret);
+				break;
+			}
+			i++;
+		}
+	}
+
 	return 0;
 
+err_busy:
+	kobject_put(client->busy_root);
 err_attr:
 	kobject_put(client->root);
 err_client:
@@ -113,6 +213,16 @@ __i915_drm_client_unregister(struct i915_drm_client *client)
 	if (!client->root)
 		return; /* fbdev client or error during drm open */
 
+	if (client->busy_root && !client->attr.busy[0].no_busy_stats) {
+		struct drm_i915_private *i915 =
+			container_of(client->clients, typeof(*i915), clients);
+		struct intel_engine_cs *engine;
+
+		for_each_uabi_engine(engine, i915)
+			intel_disable_engine_stats(engine);
+	}
+
+	kobject_put(fetch_and_zero(&client->busy_root));
 	kobject_put(fetch_and_zero(&client->root));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 6361976a9f05..d1875e7c46c3 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -24,6 +24,15 @@ struct i915_drm_clients {
 	struct kobject *root;
 };
 
+struct i915_drm_client;
+
+struct i915_engine_busy_attribute {
+	struct device_attribute attr;
+	struct i915_drm_client *client;
+	unsigned int engine_class;
+	bool no_busy_stats;
+};
+
 struct i915_drm_client {
 	struct kref kref;
 
@@ -46,9 +55,11 @@ struct i915_drm_client {
 	struct i915_drm_clients *clients;
 
 	struct kobject *root;
+	struct kobject *busy_root;
 	struct {
 		struct device_attribute pid;
 		struct device_attribute name;
+		struct i915_engine_busy_attribute busy[MAX_ENGINE_CLASS];
 	} attr;
 };
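
For illustration, a minimal userspace sketch of how a monitoring tool
could consume this interface by sampling one busy file twice (the
client id and engine class below are placeholders; a real tool would
enumerate the clients directory):

	#include <inttypes.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <unistd.h>

	static uint64_t read_busy_ns(const char *path)
	{
		FILE *f = fopen(path, "r");
		uint64_t val = 0;

		if (f) {
			if (fscanf(f, "%" SCNu64, &val) != 1)
				val = 0;
			fclose(f);
		}

		return val;
	}

	int main(void)
	{
		/* Placeholder: client 7, render class 0. */
		const char *path = "/sys/class/drm/card0/clients/7/busy/0";
		uint64_t t0, t1;

		t0 = read_busy_ns(path);
		sleep(1); /* 1s sample period == 1e9 busy ns at 100% */
		t1 = read_busy_ns(path);

		printf("Render/3D: %.2f%% busy\n", (t1 - t0) / 1e9 * 100.0);

		return 0;
	}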
 
-- 
2.20.1


* Re: [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness Tvrtko Ursulin
@ 2020-02-07 16:33   ` Chris Wilson
  2020-02-07 16:49     ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2020-02-07 16:33 UTC (permalink / raw)
  To: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2020-02-07 16:13:30)
>  static inline void
> -__intel_context_stats_start(struct intel_context *ce, ktime_t now)
> +__intel_context_stats_start(struct intel_context *ce,
> +                           struct intel_engine_cs *engine,
> +                           ktime_t now)
>  {
>         struct intel_context_stats *stats = &ce->stats;
> -
> -       if (!stats->active) {
> -               stats->start = now;
> -               stats->active = true;
> +       struct i915_gem_context *ctx;
> +
> +       if (stats->active)
> +               return;
> +
> +       stats->start = now;
> +       stats->active = true;
> +
> +       rcu_read_lock();
> +       ctx = rcu_dereference(ce->gem_context);
> +       if (ctx && ctx->client) {

I'd rather avoid having to dig into the GEM context down here next to
the HW.

First thought would be to keep the stats local on the intel_context and
for the client to chase collate them when the user reads the fd.
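
For illustration, that collate-on-read idea might look roughly like the
below (ctx_list, client_link and busy_ns are hypothetical names):

	/*
	 * Sketch only. Note that with this scheme contributions from
	 * contexts which have already been closed are lost.
	 */
	static u64 client_busy_collate(struct i915_drm_client *client,
				       unsigned int class)
	{
		struct i915_gem_context *ctx;
		u64 total = 0;

		rcu_read_lock();
		list_for_each_entry_rcu(ctx, &client->ctx_list, client_link)
			total += atomic64_read(&ctx->busy_ns[class]);
		rcu_read_unlock();

		return total;
	}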

Hmm, didn't you structure it like so earlier? What made you change your
mind?
-Chris

* Re: [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness
  2020-02-07 16:33   ` Chris Wilson
@ 2020-02-07 16:49     ` Tvrtko Ursulin
  2020-02-07 17:02       ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:49 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx


On 07/02/2020 16:33, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-02-07 16:13:30)
>>   static inline void
>> -__intel_context_stats_start(struct intel_context *ce, ktime_t now)
>> +__intel_context_stats_start(struct intel_context *ce,
>> +                           struct intel_engine_cs *engine,
>> +                           ktime_t now)
>>   {
>>          struct intel_context_stats *stats = &ce->stats;
>> -
>> -       if (!stats->active) {
>> -               stats->start = now;
>> -               stats->active = true;
>> +       struct i915_gem_context *ctx;
>> +
>> +       if (stats->active)
>> +               return;
>> +
>> +       stats->start = now;
>> +       stats->active = true;
>> +
>> +       rcu_read_lock();
>> +       ctx = rcu_dereference(ce->gem_context);
>> +       if (ctx && ctx->client) {
> 
> I'd rather avoid having to dig into the GEM context down here next to
> the HW.
> 
> First thought would be to keep the stats local on the intel_context and
> for the client to chase collate them when the user reads the fd.
> 
> Hmm, didn't you structure it like so earlier? What made you change your
> mind?

Yes, it's in the cover letter - we must not have disappearing 
contributions - a client can submit from one context for a bit, close 
it, and oops, the usage history is lost.

ce->drm_client? :)


Regards,

Tvrtko

* Re: [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness
  2020-02-07 16:49     ` Tvrtko Ursulin
@ 2020-02-07 17:02       ` Chris Wilson
  2020-02-07 17:24         ` Tvrtko Ursulin
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2020-02-07 17:02 UTC (permalink / raw)
  To: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2020-02-07 16:49:17)
> 
> On 07/02/2020 16:33, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-02-07 16:13:30)
> >>   static inline void
> >> -__intel_context_stats_start(struct intel_context *ce, ktime_t now)
> >> +__intel_context_stats_start(struct intel_context *ce,
> >> +                           struct intel_engine_cs *engine,
> >> +                           ktime_t now)
> >>   {
> >>          struct intel_context_stats *stats = &ce->stats;
> >> -
> >> -       if (!stats->active) {
> >> -               stats->start = now;
> >> -               stats->active = true;
> >> +       struct i915_gem_context *ctx;
> >> +
> >> +       if (stats->active)
> >> +               return;
> >> +
> >> +       stats->start = now;
> >> +       stats->active = true;
> >> +
> >> +       rcu_read_lock();
> >> +       ctx = rcu_dereference(ce->gem_context);
> >> +       if (ctx && ctx->client) {
> > 
> > I'd rather avoid having to dig into the GEM context down here next to
> > the HW.
> > 
> > First thought would be to keep the stats local on the intel_context and
> > for the client to chase collate them when the user reads the fd.
> > 
> > Hmm, didn't you structure it like so earlier? What made you change your
> > mind?
> 
> Yes, it's in the cover letter - we must not have disappearing 
> contributions - client can submit from one context for a bit, close it, 
> and oops usage history lost.

If only we had a mechanism for accumulating contributions from stale
engines and then from stale contexts?
 
> ce->drm_client? :)

Yeah that is equally icky for swimming upstream.

Ok, I have an idea for capturing the contributions at close rather than
having to do global accumulations at runtime.
-Chris

* Re: [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness
  2020-02-07 17:02       ` Chris Wilson
@ 2020-02-07 17:24         ` Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 17:24 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx


On 07/02/2020 17:02, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-02-07 16:49:17)
>>
>> On 07/02/2020 16:33, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-02-07 16:13:30)
>>>>    static inline void
>>>> -__intel_context_stats_start(struct intel_context *ce, ktime_t now)
>>>> +__intel_context_stats_start(struct intel_context *ce,
>>>> +                           struct intel_engine_cs *engine,
>>>> +                           ktime_t now)
>>>>    {
>>>>           struct intel_context_stats *stats = &ce->stats;
>>>> -
>>>> -       if (!stats->active) {
>>>> -               stats->start = now;
>>>> -               stats->active = true;
>>>> +       struct i915_gem_context *ctx;
>>>> +
>>>> +       if (stats->active)
>>>> +               return;
>>>> +
>>>> +       stats->start = now;
>>>> +       stats->active = true;
>>>> +
>>>> +       rcu_read_lock();
>>>> +       ctx = rcu_dereference(ce->gem_context);
>>>> +       if (ctx && ctx->client) {
>>>
>>> I'd rather avoid having to dig into the GEM context down here next to
>>> the HW.
>>>
>>> First thought would be to keep the stats local on the intel_context and
>>> for the client to chase collate them when the user reads the fd.
>>>
>>> Hmm, didn't you structure it like so earlier? What made you change your
>>> mind?
>>
>> Yes, it's in the cover letter - we must not have disappearing
>> contributions - a client can submit from one context for a bit, close
>> it, and oops, the usage history is lost.
> 
> If only we had a mechanism for accumulating contributions from stale
> engines and then from stale contexts?
>   
>> ce->drm_client? :)
> 
> Yeah that is equally icky for swimming upstream.
> 
> Ok, I have an idea for capturing the contributions at close rather than
> having to do global accumulations at runtime.

It's certainly possible - accumulate context runtimes as they are 
closed, then add the runtimes of still-active contexts at readout. I 
contemplated that solution as well.
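
For illustration, that accumulate-on-close variant could look roughly
like the below (past_runtime, ctx_list and stats.total are hypothetical
names, not taken from the series):

	static void client_context_closed(struct i915_drm_client *client,
					  struct intel_context *ce)
	{
		/* Fold the closed context's runtime into the client. */
		atomic64_add(ce->stats.total,
			     &client->past_runtime[ce->engine->uabi_class]);
	}

	static u64 client_busy_read(struct i915_drm_client *client,
				    unsigned int class)
	{
		struct intel_context *ce;
		u64 total = atomic64_read(&client->past_runtime[class]);

		/* Plus contributions from contexts which are still open. */
		rcu_read_lock();
		list_for_each_entry_rcu(ce, &client->ctx_list, client_link)
			total += ce->stats.total;
		rcu_read_unlock();

		return total;
	}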

Regards,

Tvrtko



* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Per client engine busyness (rev4)
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 6/6] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
@ 2020-02-07 20:19 ` Patchwork
  2020-02-07 20:24 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
  2020-02-07 20:44 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2020-02-07 20:19 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per client engine busyness (rev4)
URL   : https://patchwork.freedesktop.org/series/70977/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
f50257412ea0 drm/i915: Expose list of clients in sysfs
-:65: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#65: 
new file mode 100644

-:70: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#70: FILE: drivers/gpu/drm/i915/i915_drm_client.c:1:
+/*

-:71: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#71: FILE: drivers/gpu/drm/i915/i915_drm_client.c:2:
+ * SPDX-License-Identifier: MIT

-:228: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#228: FILE: drivers/gpu/drm/i915/i915_drm_client.h:1:
+/*

-:229: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#229: FILE: drivers/gpu/drm/i915/i915_drm_client.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 5 warnings, 0 checks, 340 lines checked
45b83964a478 drm/i915: Update client name on context create
-:174: ERROR:OPEN_BRACE: open brace '{' following struct go on the same line
#174: FILE: drivers/gpu/drm/i915/i915_drm_client.c:171:
+struct client_update_free
+{

-:216: WARNING:OOM_MESSAGE: Possible unnecessary 'out of memory' message
#216: FILE: drivers/gpu/drm/i915/i915_drm_client.c:213:
+	if (!name) {
+		drm_notice(&i915->drm,

-:258: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#258: FILE: drivers/gpu/drm/i915/i915_drm_client.h:29:
+	struct mutex update_lock;

total: 1 errors, 1 warnings, 1 checks, 219 lines checked
3b55d5589621 drm/i915: Make GEM contexts track DRM clients
02d2f600b3e6 drm/i915: Track per-context engine busyness
c1b02b71e21c drm/i915: Track per drm client engine class busyness
209d107b652f drm/i915: Expose per-engine client busyness
-:25: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#25: 
     Render/3D/0   63.73% |███████████████████           |      3%      0%

-:96: WARNING:STATIC_CONST_CHAR_ARRAY: static const char * array should probably be static const char * const
#96: FILE: drivers/gpu/drm/i915/i915_drm_client.c:78:
+static const char *uabi_class_names[] = {

total: 0 errors, 2 warnings, 0 checks, 172 lines checked


* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Per client engine busyness (rev4)
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2020-02-07 20:19 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Per client engine busyness (rev4) Patchwork
@ 2020-02-07 20:24 ` Patchwork
  2020-02-07 20:44 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2020-02-07 20:24 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per client engine busyness (rev4)
URL   : https://patchwork.freedesktop.org/series/70977/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.6.0
Commit: drm/i915: Expose list of clients in sysfs
Okay!

Commit: drm/i915: Update client name on context create
+drivers/gpu/drm/i915/i915_drm_client.c:102:23:    expected struct pid *pid
+drivers/gpu/drm/i915/i915_drm_client.c:102:23:    got struct pid [noderef] <asn:4> *pid
+drivers/gpu/drm/i915/i915_drm_client.c:102:23: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/i915_drm_client.c:208:17: error: incompatible types in comparison expression (different address spaces)

Commit: drm/i915: Make GEM contexts track DRM clients
Okay!

Commit: drm/i915: Track per-context engine busyness
Okay!

Commit: drm/i915: Track per drm client engine class busyness
Okay!

Commit: drm/i915: Expose per-engine client busyness
Okay!


* [Intel-gfx] ✓ Fi.CI.BAT: success for Per client engine busyness (rev4)
  2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2020-02-07 20:24 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-02-07 20:44 ` Patchwork
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2020-02-07 20:44 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Per client engine busyness (rev4)
URL   : https://patchwork.freedesktop.org/series/70977/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7887 -> Patchwork_16487
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/index.html

Known issues
------------

  Here are the changes found in Patchwork_16487 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live_execlists:
    - fi-icl-dsi:         [PASS][1] -> [DMESG-FAIL][2] ([fdo#108569])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-icl-dsi/igt@i915_selftest@live_execlists.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-icl-dsi/igt@i915_selftest@live_execlists.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-cfl-guc:         [PASS][3] -> [INCOMPLETE][4] ([fdo#106070] / [i915#424])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-cfl-guc/igt@i915_selftest@live_gem_contexts.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-cfl-guc/igt@i915_selftest@live_gem_contexts.html

  * igt@kms_chamelium@dp-crc-fast:
    - fi-kbl-7500u:       [PASS][5] -> [FAIL][6] ([fdo#109635] / [i915#217])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       [PASS][7] -> [DMESG-WARN][8] ([i915#44])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_blt:
    - fi-ivb-3770:        [DMESG-FAIL][9] ([i915#725]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-ivb-3770/igt@i915_selftest@live_blt.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-ivb-3770/igt@i915_selftest@live_blt.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-cfl-8700k:       [DMESG-FAIL][11] ([i915#623]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html
    - fi-byt-n2820:       [DMESG-FAIL][13] ([i915#1052]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-byt-n2820/igt@i915_selftest@live_gem_contexts.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-byt-n2820/igt@i915_selftest@live_gem_contexts.html

  * igt@i915_selftest@live_gtt:
    - fi-icl-guc:         [TIMEOUT][15] ([fdo#112271]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-icl-guc/igt@i915_selftest@live_gtt.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-icl-guc/igt@i915_selftest@live_gtt.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][17] ([fdo#111096] / [i915#323]) -> [PASS][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-skl-6770hq:      [INCOMPLETE][19] ([i915#69]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-skl-6770hq/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-skl-6770hq/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  
#### Warnings ####

  * igt@gem_exec_parallel@contexts:
    - fi-byt-n2820:       [TIMEOUT][21] ([fdo#112271]) -> [TIMEOUT][22] ([fdo#112271] / [i915#1084])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-byt-n2820/igt@gem_exec_parallel@contexts.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-byt-n2820/igt@gem_exec_parallel@contexts.html

  * igt@gem_exec_parallel@fds:
    - fi-byt-n2820:       [FAIL][23] ([i915#694]) -> [TIMEOUT][24] ([fdo#112271] / [i915#1084])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7887/fi-byt-n2820/igt@gem_exec_parallel@fds.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/fi-byt-n2820/igt@gem_exec_parallel@fds.html

  
  [fdo#106070]: https://bugs.freedesktop.org/show_bug.cgi?id=106070
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#109635]: https://bugs.freedesktop.org/show_bug.cgi?id=109635
  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#1052]: https://gitlab.freedesktop.org/drm/intel/issues/1052
  [i915#1084]: https://gitlab.freedesktop.org/drm/intel/issues/1084
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#323]: https://gitlab.freedesktop.org/drm/intel/issues/323
  [i915#424]: https://gitlab.freedesktop.org/drm/intel/issues/424
  [i915#44]: https://gitlab.freedesktop.org/drm/intel/issues/44
  [i915#623]: https://gitlab.freedesktop.org/drm/intel/issues/623
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#694]: https://gitlab.freedesktop.org/drm/intel/issues/694
  [i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725


Participating hosts (48 -> 45)
------------------------------

  Additional (4): fi-skl-lmem fi-blb-e6850 fi-bdw-5557u fi-skl-6600u 
  Missing    (7): fi-hsw-4200u fi-bsw-n3050 fi-bsw-cyan fi-ctg-p8600 fi-kbl-7560u fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7887 -> Patchwork_16487

  CI-20190529: 20190529
  CI_DRM_7887: b147ed9753265260d6380604de01c3d646a323cd @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5425: ad4542ef1adbaa1227bc9ba9e24bb0e0f6dd408d @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16487: 209d107b652fc1c8f58892f9284b5cb572e1d018 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

209d107b652f drm/i915: Expose per-engine client busyness
c1b02b71e21c drm/i915: Track per drm client engine class busyness
02d2f600b3e6 drm/i915: Track per-context engine busyness
3b55d5589621 drm/i915: Make GEM contexts track DRM clients
45b83964a478 drm/i915: Update client name on context create
f50257412ea0 drm/i915: Expose list of clients in sysfs

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16487/index.html

* Re: [Intel-gfx] [PATCH 4/6] drm/i915: Track per-context engine busyness
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 4/6] drm/i915: Track per-context engine busyness Tvrtko Ursulin
@ 2020-02-09 11:02   ` Chris Wilson
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2020-02-09 11:02 UTC (permalink / raw)
  To: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2020-02-07 16:13:29)
>                 } else {
> -                       GEM_BUG_ON(!*execlists->active);
> +                       rq = *execlists->active++;
> +                       GEM_BUG_ON(!rq);
> +
> +                       GEM_BUG_ON(execlists->active - execlists->inflight >
> +                                  execlists_num_ports(execlists));
>  
>                         /* port0 completed, advanced to port1 */
>                         trace_ports(execlists, "completed", execlists->active);
> @@ -2327,13 +2359,15 @@ static void process_csb(struct intel_engine_cs *engine)
>                          * coherent (visible from the CPU) before the
>                          * user interrupt and CSB is processed.
>                          */
> -                       GEM_BUG_ON(!i915_request_completed(*execlists->active) &&
> +                       GEM_BUG_ON(!i915_request_completed(rq) &&
>                                    !reset_in_progress(execlists));
> -                       execlists_schedule_out(*execlists->active++);
>  
> -                       GEM_BUG_ON(execlists->active - execlists->inflight >
> -                                  execlists_num_ports(execlists));
> +                       execlists_schedule_out(rq);

We don't need to touch this branch...

>                 }
> +
> +               rq = *execlists->active;
> +               if (rq)
> +                       intel_context_stats_start(rq->context);

This can be done after the loop instead. I think combining it with the
set_timeslice would be nice (both are 'time' related).

>         } while (head != tail);
>  
>         execlists->csb_head = head;
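
For illustration, roughly the shape of that suggestion (sketch only;
the helper and its placement are hypothetical):

	/*
	 * Hoist both 'time' updates to a single spot after the CSB loop:
	 * restart the timeslice and the busy clock for whatever request
	 * is now at the head of the ELSP.
	 */
	static void post_csb_time_update(struct intel_engine_cs *engine)
	{
		struct i915_request * const *active =
			engine->execlists.active;

		if (*active) {
			set_timeslice(engine);
			intel_context_stats_start((*active)->context);
		}
	}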

* [Intel-gfx] [RFC 0/8] Per client engine busyness
@ 2020-01-10 13:30 Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-01-10 13:30 UTC (permalink / raw)
  To: Intel-gfx; +Cc: kui.wen

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Two patches added on top of the series to provide the data in GuC mode.
   (For warnings and limitations please read the respective commit messages.)
 * Refactor to introduce a notion of i915_drm_client and move to separate source
   file.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, cgroups controller, dynamic SSEU tuning,...
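
The per-context bookkeeping is in essence a seqlock-protected
accumulator, roughly along these lines (reconstructed sketch from
snippets elsewhere in this thread; the exact field layout in the series
may differ):

	struct intel_context_stats {
		seqlock_t lock;	/* consistent snapshot of the fields below */
		bool active;	/* context currently running on an engine */
		ktime_t start;	/* start of the current run period */
		u64 total;	/* nanoseconds accumulated over past runs */
	};

	static u64 intel_context_stats_read(struct intel_context_stats *stats)
	{
		unsigned int seq;
		u64 total;

		do {
			seq = read_seqbegin(&stats->lock);
			total = stats->total;
			if (stats->active)
				total += ktime_to_ns(ktime_sub(ktime_get(),
							       stats->start));
		} while (read_seqretry(&stats->lock, seq));

		return total;
	}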

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

Sysfs interface enables us to implement a "top-like" tool for GPU tasks. Or with
a "screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation-wise we add a bunch of files in sysfs like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── 9
	    ├── busy
	    │   ├── 0
	    │   ├── 1
	    │   ├── 2
	    │   └── 3
	    ├── name
	    └── pid


Files in 'busy' directories are numbered using the engine class ABI values and
they contain accumulated nanoseconds each client spent on engines of the
respective class.
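
For example, sampling one such file (client id and value illustrative):

	# cat /sys/class/drm/card0/clients/7/busy/0
	3225716428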

Tvrtko Ursulin (8):
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Track per-context engine busyness
  drm/i915: Track all user contexts per client
  drm/i915: Contexts can use struct pid stored in the client
  drm/i915: Expose per-engine client busyness
  drm/i915: Track hw reported context runtime
  drm/i915: Fallback to hw context runtime when sw tracking is not
    available

 drivers/gpu/drm/i915/Makefile                 |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  60 +++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 ++
 drivers/gpu/drm/i915/gt/intel_context.h       |  18 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |  14 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  55 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |   7 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 298 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        |  83 +++++
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  35 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |   8 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 drivers/gpu/drm/i915/intel_device_info.c      |   2 +
 drivers/gpu/drm/i915/intel_device_info.h      |   1 +
 17 files changed, 606 insertions(+), 43 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

-- 
2.20.1


* [Intel-gfx] [RFC 0/8] Per client engine busyness
@ 2019-12-19 18:00 Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2019-12-19 18:00 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Now tracks GPU time for clients who exit with GPU work left running.
 * No more global toggle - it is now constantly on.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, cgroups controller, dynamic SSEU tuning,...

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

Sysfs interface enables us to implement a "top-like" tool for GPU tasks. Or with
a "screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation-wise we add a bunch of files in sysfs like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 9
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── enable_stats

Files in 'busy' directories are numbered using the engine class ABI values and
they contain accumulated nanoseconds each client spent on engines of the
respective class.

I will post the corresponding patch to intel_gpu_top for reference as well.

Tvrtko Ursulin (8):
  drm/i915: Switch context id allocation directly to xarray
  drm/i915: Reference count struct drm_i915_file_private
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Track per-context engine busyness
  drm/i915: Track all user contexts per client
  drm/i915: Contexts can use struct pid stored in the client
  drm/i915: Expose per-engine client busyness

 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 113 ++++++---
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
 .../gpu/drm/i915/gem/selftests/mock_context.c |   3 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 ++
 drivers/gpu/drm/i915/gt/intel_context.h       |  11 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  52 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |   7 +-
 drivers/gpu/drm/i915/i915_drv.c               |   4 -
 drivers/gpu/drm/i915/i915_drv.h               |  66 ++++-
 drivers/gpu/drm/i915/i915_gem.c               | 236 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |   6 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 14 files changed, 486 insertions(+), 81 deletions(-)

-- 
2.20.1

