* [RFC v2 0/8] CPU + GPU synchronised priority scheduling
@ 2021-10-04 14:36 Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 1/8] sched: Add nice value change notifier Tvrtko Ursulin
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx
  Cc: dri-devel, linux-kernel, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This is a somewhat early sketch of one of my ideas, intended to gather
feedback from the core scheduler experts. The first and last two patches in
the series are the most interesting ones for people outside of i915. (Note
that I did not copy everyone on all patches, just on the cover letter for
context; the rest should be available from the mailing list.)

The general idea is that the current processing landscape seems to be more
and more composed of pipelines where computations are done on multiple
hardware devices. Furthermore, some of the non-CPU devices, like in this case
many GPUs supported by the i915 driver, actually support priority based
scheduling which is currently rather inaccessible to the user (in terms of
being able to control it from the outside).

From these two statements a question arises of how to allow for a simple,
effective and consolidated user experience. In other words, why should a user
not be able to do something like:

 $ nice ffmpeg ...transcode my videos...
 $ my-favourite-game

And have the nice hint apply to the GPU parts of the transcode pipeline as well?

Another reason why I started thinking about this is that I noticed the Chrome
browser, for instance, uses nice to de-prioritise background tabs. So again,
having that decision propagate to the GPU rendering pipeline sounds like a big
plus to the overall user experience.
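
The same would ideally hold for adjustments made while the pipeline is
already running, for instance a plain:

 $ renice -n 10 -p $(pidof ffmpeg)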

This RFC implements this idea, with the hairy part being the notifier chain I
added to enable dynamic adjustments. It is a global notifier, which raises a
few questions, so I am very curious what the experts will think here. Please
see the opens in the first patch for more on this.

The last patch ("drm/i915: Connect with the process nice change notifier")
demonstrates how i915 can use the notifier, with a bit of notable tracking
being required and added in "drm/i915: Keep track of registered clients
indexed by task struct".

On a more positive note, the thing seems to work even as is. For instance, I
roughly simulated the above scenario by running a GPU hog at three nice levels
and a GfxBench TRex in parallel (as a game proxy). This is what I got:

   GPU hog nice |   TRex fps
  --------------+---------------
    not running |      48.9
         0      |      42.7
        10      |      47.9
       -10      |      29.0

When re-niced, the background GPU hog has a much smaller effect on the
performance of the game the user is running in the foreground. So it appears
the feature can indeed improve the user experience. The question is just
whether people are happy with this method of implementing it.

v2:
 * Moved notifier outside task_rq_lock.
 * Some improvements and restructuring on the i915 side of the series.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>

Tvrtko Ursulin (8):
  sched: Add nice value change notifier
  drm/i915: Explicitly track DRM clients
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track all user contexts per client
  drm/i915: Keep track of registered clients indexed by task struct
  drm/i915: Make some recently added vfuncs use full scheduling
    attribute
  drm/i915: Inherit process nice for context scheduling priority
  drm/i915: Connect with the process nice change notifier

 drivers/gpu/drm/i915/Makefile                 |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  20 +++
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   6 +
 .../drm/i915/gt/intel_execlists_submission.c  |   6 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   3 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 130 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        |  71 ++++++++++
 drivers/gpu/drm/i915/i915_drv.c               |   6 +
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  21 ++-
 drivers/gpu/drm/i915/i915_request.c           |   2 +-
 drivers/gpu/drm/i915/i915_request.h           |   5 +
 drivers/gpu/drm/i915/i915_scheduler.c         |  16 ++-
 drivers/gpu/drm/i915/i915_scheduler.h         |  14 ++
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  12 +-
 include/linux/sched.h                         |   5 +
 kernel/sched/core.c                           |  37 ++++-
 17 files changed, 346 insertions(+), 18 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

-- 
2.30.2



* [RFC 1/8] sched: Add nice value change notifier
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  2021-10-06  4:10   ` Wanghui (John)
  2021-10-04 14:36 ` [RFC 2/8] drm/i915: Explicitly track DRM clients Tvrtko Ursulin
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx
  Cc: dri-devel, linux-kernel, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Implement a simple notifier chain via which interested parties can track
when a process's nice value changes. Simple because it is global, so each
user would have to track which tasks it is interested in.

The first intended use case is GPU drivers using task nice as a priority
hint when scheduling GPU contexts belonging to their respective clients.

To use it, the register_user_nice_notifier and unregister_user_nice_notifier
functions are provided, and the new nice value together with a pointer to the
task_struct being modified are passed to the callbacks.
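
For illustration, a consumer of this interface would do roughly the
following (sketch only, with the driver specific reaction to the change
left out):

  #include <linux/notifier.h>
  #include <linux/sched.h>

  static int nice_changed(struct notifier_block *nb, unsigned long nice,
                          void *data)
  {
          struct task_struct *tsk = data;

          /* Filter for tasks the driver tracks and act on the new value. */
          pr_debug("task %d nice is now %ld\n",
                   task_pid_nr(tsk), (long)nice);

          return NOTIFY_DONE;
  }

  static struct notifier_block nice_nb = {
          .notifier_call = nice_changed,
  };

  /* register_user_nice_notifier(&nice_nb) at init time and
   * unregister_user_nice_notifier(&nice_nb) on teardown. */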

v2:
 * Move the notifier chain outside task_rq_lock. (Peter)

Opens:
 * Security. Would some sort of a per process mechanism be better and
   feasible?
     x Peter Zijlstra thinks it may be passable now that it is outside
       core scheduler locks.
 * Put it all behind kconfig to be selected by interested drivers?

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
---
 include/linux/sched.h |  5 +++++
 kernel/sched/core.c   | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index c1a927ddec64..1fcec88e5dbc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2309,4 +2309,9 @@ static inline void sched_core_free(struct task_struct *tsk) { }
 static inline void sched_core_fork(struct task_struct *p) { }
 #endif
 
+struct notifier_block;
+
+extern int register_user_nice_notifier(struct notifier_block *);
+extern int unregister_user_nice_notifier(struct notifier_block *);
+
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1bba4128a3e6..fc90b603bb6f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6864,10 +6864,42 @@ static inline int rt_effective_prio(struct task_struct *p, int prio)
 }
 #endif
 
+ATOMIC_NOTIFIER_HEAD(user_nice_notifier_list);
+
+/**
+ * register_user_nice_notifier - Register function to be called when task nice changes
+ * @nb: Info about notifier function to be called
+ *
+ * Registers a function with the list of functions to be called when task nice
+ * value changes.
+ *
+ * Currently always returns zero, as atomic_notifier_chain_register()
+ * always returns zero.
+ */
+int register_user_nice_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_register(&user_nice_notifier_list, nb);
+}
+EXPORT_SYMBOL(register_user_nice_notifier);
+
+/**
+ * unregister_user_nice_notifier - Unregister previously registered user nice notifier
+ * @nb: Hook to be unregistered
+ *
+ * Unregisters a previously registered user nice notifier function.
+ *
+ * Returns zero on success, or %-ENOENT on failure.
+ */
+int unregister_user_nice_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_unregister(&user_nice_notifier_list, nb);
+}
+EXPORT_SYMBOL(unregister_user_nice_notifier);
+
 void set_user_nice(struct task_struct *p, long nice)
 {
 	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
 	struct rq_flags rf;
 	struct rq *rq;
 
@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
 
 out_unlock:
 	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
 }
 EXPORT_SYMBOL(set_user_nice);
 
-- 
2.30.2



* [RFC 2/8] drm/i915: Explicitly track DRM clients
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 1/8] sched: Add nice value change notifier Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 3/8] drm/i915: Make GEM contexts " Tvrtko Ursulin
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx
  Cc: dri-devel, linux-kernel, Tvrtko Ursulin, Chris Wilson,
	Aravind Iddamsetty

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Tracking DRM clients more explicitly will allow later patches to
accumulate past and current GPU usage in a centralised place and also
consolidate access to the owning task's pid/name.

A unique client id is also assigned for the purpose of distinguishing/
consolidating between multiple file descriptors owned by the same process.

v2:
 Chris Wilson:
 * Enclose new members into dedicated structs.
 * Protect against failed sysfs registration.

v3:
 * sysfs_attr_init.

v4:
 * Fix for internal clients.

v5:
 * Use cyclic ida for client id. (Chris)
 * Do not leak pid reference. (Chris)
 * Tidy code with some locals.

v6:
 * Use xa_alloc_cyclic to simplify locking. (Chris)
 * No need to unregister individual sysfs files. (Chris)
 * Rebase on top of fpriv kref.
 * Track client closed status and reflect in sysfs.

v7:
 * Make drm_client more standalone concept.

v8:
 * Simplify sysfs show. (Chris)
 * Always track name and pid.

v9:
 * Fix cyclic id assignment.

v10:
 * No need for a mutex around xa_alloc_cyclic.
 * Refactor sysfs into own function.
 * Unregister sysfs before freeing pid and name.
 * Move clients setup into own function.

v11:
 * Call clients init directly from driver init. (Chris)

v12:
 * Do not fail client add on id wrap. (Maciej)

v13 (Lucas): Rebase.

v14:
 * Dropped sysfs bits.

v15:
 * Dropped tracking of pid and name.
 * Dropped RCU freeing of the client object.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v11
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v11
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile          |  5 +-
 drivers/gpu/drm/i915/i915_drm_client.c | 68 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h | 50 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.c        |  6 +++
 drivers/gpu/drm/i915/i915_drv.h        |  5 ++
 drivers/gpu/drm/i915/i915_gem.c        | 21 ++++++--
 6 files changed, 150 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 5c8e022a7383..005b5df425a1 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -32,8 +32,9 @@ subdir-ccflags-y += -I$(srctree)/$(src)
 # Please keep these build lists sorted!
 
 # core driver code
-i915-y += i915_drv.o \
-	  i915_config.o \
+i915-y += i915_config.o \
+	  i915_drm_client.o \
+	  i915_drv.o \
 	  i915_irq.o \
 	  i915_getparam.o \
 	  i915_mitigations.o \
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
new file mode 100644
index 000000000000..e61e9ba15256
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include "i915_drm_client.h"
+#include "i915_gem.h"
+#include "i915_utils.h"
+
+void i915_drm_clients_init(struct i915_drm_clients *clients,
+			   struct drm_i915_private *i915)
+{
+	clients->i915 = i915;
+	clients->next_id = 0;
+
+	xa_init_flags(&clients->xarray, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
+}
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
+{
+	struct i915_drm_client *client;
+	struct xarray *xa = &clients->xarray;
+	int ret;
+
+	client = kzalloc(sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return ERR_PTR(-ENOMEM);
+
+	xa_lock_irq(xa);
+	ret = __xa_alloc_cyclic(xa, &client->id, client, xa_limit_32b,
+				&clients->next_id, GFP_KERNEL);
+	xa_unlock_irq(xa);
+	if (ret < 0)
+		goto err;
+
+	kref_init(&client->kref);
+	client->clients = clients;
+
+	return client;
+
+err:
+	kfree(client);
+
+	return ERR_PTR(ret);
+}
+
+void __i915_drm_client_free(struct kref *kref)
+{
+	struct i915_drm_client *client =
+		container_of(kref, typeof(*client), kref);
+	struct xarray *xa = &client->clients->xarray;
+	unsigned long flags;
+
+	xa_lock_irqsave(xa, flags);
+	__xa_erase(xa, client->id);
+	xa_unlock_irqrestore(xa, flags);
+	kfree(client);
+}
+
+void i915_drm_clients_fini(struct i915_drm_clients *clients)
+{
+	GEM_BUG_ON(!xa_empty(&clients->xarray));
+	xa_destroy(&clients->xarray);
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
new file mode 100644
index 000000000000..e8986ad51176
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __I915_DRM_CLIENT_H__
+#define __I915_DRM_CLIENT_H__
+
+#include <linux/kref.h>
+#include <linux/xarray.h>
+
+struct drm_i915_private;
+
+struct i915_drm_clients {
+	struct drm_i915_private *i915;
+
+	struct xarray xarray;
+	u32 next_id;
+};
+
+struct i915_drm_client {
+	struct kref kref;
+
+	unsigned int id;
+
+	struct i915_drm_clients *clients;
+};
+
+void i915_drm_clients_init(struct i915_drm_clients *clients,
+			   struct drm_i915_private *i915);
+
+static inline struct i915_drm_client *
+i915_drm_client_get(struct i915_drm_client *client)
+{
+	kref_get(&client->kref);
+	return client;
+}
+
+void __i915_drm_client_free(struct kref *kref);
+
+static inline void i915_drm_client_put(struct i915_drm_client *client)
+{
+	kref_put(&client->kref, __i915_drm_client_free);
+}
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients);
+
+void i915_drm_clients_fini(struct i915_drm_clients *clients);
+
+#endif /* !__I915_DRM_CLIENT_H__ */
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1e0e4175b481..5e4d2ea1fae5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -69,6 +69,7 @@
 #include "gt/intel_rc6.h"
 
 #include "i915_debugfs.h"
+#include "i915_drm_client.h"
 #include "i915_drv.h"
 #include "i915_ioc32.h"
 #include "i915_irq.h"
@@ -345,6 +346,8 @@ static int i915_driver_early_probe(struct drm_i915_private *dev_priv)
 
 	intel_gt_init_early(&dev_priv->gt, dev_priv);
 
+	i915_drm_clients_init(&dev_priv->clients, dev_priv);
+
 	i915_gem_init_early(dev_priv);
 
 	/* This must be called before any calls to HAS_PCH_* */
@@ -364,6 +367,7 @@ static int i915_driver_early_probe(struct drm_i915_private *dev_priv)
 
 err_gem:
 	i915_gem_cleanup_early(dev_priv);
+	i915_drm_clients_fini(&dev_priv->clients);
 	intel_gt_driver_late_release(&dev_priv->gt);
 	intel_region_ttm_device_fini(dev_priv);
 err_ttm:
@@ -383,6 +387,7 @@ static void i915_driver_late_release(struct drm_i915_private *dev_priv)
 	intel_irq_fini(dev_priv);
 	intel_power_domains_cleanup(dev_priv);
 	i915_gem_cleanup_early(dev_priv);
+	i915_drm_clients_fini(&dev_priv->clients);
 	intel_gt_driver_late_release(&dev_priv->gt);
 	intel_region_ttm_device_fini(dev_priv);
 	vlv_suspend_cleanup(dev_priv);
@@ -999,6 +1004,7 @@ static void i915_driver_postclose(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 
 	i915_gem_context_close(file);
+	i915_drm_client_put(file_priv->client);
 
 	kfree_rcu(file_priv, rcu);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 63c939891f1c..de2ba95b81f6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -96,6 +96,7 @@
 #include "intel_wakeref.h"
 #include "intel_wopcm.h"
 
+#include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_gem_gtt.h"
 #include "i915_gpu_error.h"
@@ -284,6 +285,8 @@ struct drm_i915_file_private {
 	/** ban_score: Accumulated score of all ctx bans and fast hangs. */
 	atomic_t ban_score;
 	unsigned long hang_timestamp;
+
+	struct i915_drm_client *client;
 };
 
 /* Interface history:
@@ -1238,6 +1241,8 @@ struct drm_i915_private {
 
 	struct i915_pmu pmu;
 
+	struct i915_drm_clients clients;
+
 	struct i915_hdcp_comp_master *hdcp_master;
 	bool hdcp_comp_added;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 981e383d1a5d..3df78babe1a9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1177,25 +1177,40 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv;
-	int ret;
+	struct i915_drm_client *client;
+	int ret = -ENOMEM;
 
 	DRM_DEBUG("\n");
 
 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
-		return -ENOMEM;
+		goto err_alloc;
+
+	client = i915_drm_client_add(&i915->clients);
+	if (IS_ERR(client)) {
+		ret = PTR_ERR(client);
+		goto err_client;
+	}
 
 	file->driver_priv = file_priv;
 	file_priv->dev_priv = i915;
 	file_priv->file = file;
+	file_priv->client = client;
 
 	file_priv->bsd_engine = -1;
 	file_priv->hang_timestamp = jiffies;
 
 	ret = i915_gem_context_open(i915, file);
 	if (ret)
-		kfree(file_priv);
+		goto err_context;
+
+	return 0;
 
+err_context:
+	i915_drm_client_put(client);
+err_client:
+	kfree(file_priv);
+err_alloc:
 	return ret;
 }
 
-- 
2.30.2



* [RFC 3/8] drm/i915: Make GEM contexts track DRM clients
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 1/8] sched: Add nice value change notifier Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 2/8] drm/i915: Explicitly track DRM clients Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 4/8] drm/i915: Track all user contexts per client Tvrtko Ursulin
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx
  Cc: dri-devel, linux-kernel, Tvrtko Ursulin, Chris Wilson,
	Aravind Iddamsetty

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Make GEM contexts keep a reference to i915_drm_client for the whole of
their lifetime, which will come in handy in following patches.

v2: Don't bother supporting selftests contexts from debugfs. (Chris)
v3 (Lucas): Finish constructing ctx before adding it to the list
v4 (Ram): Rebase.
v5: Trivial rebase for proto ctx changes.
v6: Rebase after clients no longer track name and pid.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v5
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v5
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 5 +++++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 8208fd5b72c3..9296d69681d7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -956,6 +956,9 @@ static void i915_gem_context_release_work(struct work_struct *work)
 	if (vm)
 		i915_vm_put(vm);
 
+	if (ctx->client)
+		i915_drm_client_put(ctx->client);
+
 	mutex_destroy(&ctx->engines_mutex);
 	mutex_destroy(&ctx->lut_mutex);
 
@@ -1373,6 +1376,8 @@ static void gem_context_register(struct i915_gem_context *ctx,
 	ctx->file_priv = fpriv;
 
 	ctx->pid = get_task_pid(current, PIDTYPE_PID);
+	ctx->client = i915_drm_client_get(fpriv->client);
+
 	snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
 		 current->comm, pid_nr(ctx->pid));
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index c4617e4d9fa9..598c57ac5cdf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -277,6 +277,9 @@ struct i915_gem_context {
 	/** @link: place with &drm_i915_private.context_list */
 	struct list_head link;
 
+	/** @client: struct i915_drm_client */
+	struct i915_drm_client *client;
+
 	/**
 	 * @ref: reference count
 	 *
-- 
2.30.2



* [RFC 4/8] drm/i915: Track all user contexts per client
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2021-10-04 14:36 ` [RFC 3/8] drm/i915: Make GEM contexts " Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 5/8] drm/i915: Keep track of registered clients indexed by task struct Tvrtko Ursulin
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx
  Cc: dri-devel, linux-kernel, Tvrtko Ursulin, Aravind Iddamsetty,
	Chris Wilson

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We soon want to start answering questions like how much GPU time a context
belonging to a client which has exited is still using.

To enable this we start tracking all contexts belonging to a client on a
separate list.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 12 ++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h |  3 +++
 drivers/gpu/drm/i915/i915_drm_client.c            |  2 ++
 drivers/gpu/drm/i915/i915_drm_client.h            |  5 +++++
 4 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9296d69681d7..d1992ba59ed8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1169,6 +1169,7 @@ static void set_closed_name(struct i915_gem_context *ctx)
 
 static void context_close(struct i915_gem_context *ctx)
 {
+	struct i915_drm_client *client;
 	struct i915_address_space *vm;
 
 	/* Flush any concurrent set_engines() */
@@ -1205,6 +1206,13 @@ static void context_close(struct i915_gem_context *ctx)
 	list_del(&ctx->link);
 	spin_unlock(&ctx->i915->gem.contexts.lock);
 
+	client = ctx->client;
+	if (client) {
+		spin_lock(&client->ctx_lock);
+		list_del_rcu(&ctx->client_link);
+		spin_unlock(&client->ctx_lock);
+	}
+
 	mutex_unlock(&ctx->mutex);
 
 	/*
@@ -1385,6 +1393,10 @@ static void gem_context_register(struct i915_gem_context *ctx,
 	old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
 	WARN_ON(old);
 
+	spin_lock(&ctx->client->ctx_lock);
+	list_add_tail_rcu(&ctx->client_link, &ctx->client->ctx_list);
+	spin_unlock(&ctx->client->ctx_lock);
+
 	spin_lock(&i915->gem.contexts.lock);
 	list_add_tail(&ctx->link, &i915->gem.contexts.list);
 	spin_unlock(&i915->gem.contexts.lock);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 598c57ac5cdf..b878e1b13b38 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -280,6 +280,9 @@ struct i915_gem_context {
 	/** @client: struct i915_drm_client */
 	struct i915_drm_client *client;
 
+	/** link: &drm_client.context_list */
+	struct list_head client_link;
+
 	/**
 	 * @ref: reference count
 	 *
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index e61e9ba15256..91a8559bebf7 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -38,6 +38,8 @@ struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
 		goto err;
 
 	kref_init(&client->kref);
+	spin_lock_init(&client->ctx_lock);
+	INIT_LIST_HEAD(&client->ctx_list);
 	client->clients = clients;
 
 	return client;
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index e8986ad51176..0207dfad4568 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -7,6 +7,8 @@
 #define __I915_DRM_CLIENT_H__
 
 #include <linux/kref.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
 #include <linux/xarray.h>
 
 struct drm_i915_private;
@@ -23,6 +25,9 @@ struct i915_drm_client {
 
 	unsigned int id;
 
+	spinlock_t ctx_lock; /* For add/remove from ctx_list. */
+	struct list_head ctx_list; /* List of contexts belonging to client. */
+
 	struct i915_drm_clients *clients;
 };
 
-- 
2.30.2



* [RFC 5/8] drm/i915: Keep track of registered clients indexed by task struct
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2021-10-04 14:36 ` [RFC 4/8] drm/i915: Track all user contexts per client Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute Tvrtko Ursulin
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx; +Cc: dri-devel, linux-kernel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A simple hash table of registered clients indexed by the task struct
pointer is kept to be used in a following patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  2 ++
 drivers/gpu/drm/i915/i915_drm_client.c      | 31 ++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h      | 13 +++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d1992ba59ed8..8d4d687ab1d0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1932,6 +1932,8 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}
 
+	i915_drm_client_update_owner(ext_data.fpriv->client, current);
+
 	ext_data.pc = proto_context_create(i915, args->flags);
 	if (IS_ERR(ext_data.pc))
 		return PTR_ERR(ext_data.pc);
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 91a8559bebf7..82b9636482ef 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -18,6 +18,9 @@ void i915_drm_clients_init(struct i915_drm_clients *clients,
 	clients->next_id = 0;
 
 	xa_init_flags(&clients->xarray, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
+
+	rwlock_init(&clients->lock);
+	hash_init(clients->tasks);
 }
 
 struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
@@ -42,6 +45,8 @@ struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
 	INIT_LIST_HEAD(&client->ctx_list);
 	client->clients = clients;
 
+	i915_drm_client_update_owner(client, current);
+
 	return client;
 
 err:
@@ -54,9 +59,14 @@ void __i915_drm_client_free(struct kref *kref)
 {
 	struct i915_drm_client *client =
 		container_of(kref, typeof(*client), kref);
-	struct xarray *xa = &client->clients->xarray;
+	struct i915_drm_clients *clients = client->clients;
+	struct xarray *xa = &clients->xarray;
 	unsigned long flags;
 
+	write_lock(&clients->lock);
+	hash_del(&client->node);
+	write_unlock(&clients->lock);
+
 	xa_lock_irqsave(xa, flags);
 	__xa_erase(xa, client->id);
 	xa_unlock_irqrestore(xa, flags);
@@ -68,3 +78,22 @@ void i915_drm_clients_fini(struct i915_drm_clients *clients)
 	GEM_BUG_ON(!xa_empty(&clients->xarray));
 	xa_destroy(&clients->xarray);
 }
+
+void i915_drm_client_update_owner(struct i915_drm_client *client,
+				  struct task_struct *owner)
+{
+	struct i915_drm_clients *clients;
+
+	if (READ_ONCE(client->owner) == owner)
+		return;
+
+	clients = client->clients;
+	write_lock(&clients->lock);
+	if (READ_ONCE(client->owner) != owner) {
+		if (client->owner)
+			hash_del(&client->node);
+		client->owner = owner;
+		hash_add(clients->tasks, &client->node, (uintptr_t)owner);
+	}
+	write_unlock(&clients->lock);
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 0207dfad4568..42fd79f0558a 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -6,8 +6,11 @@
 #ifndef __I915_DRM_CLIENT_H__
 #define __I915_DRM_CLIENT_H__
 
+#include <linux/hashtable.h>
 #include <linux/kref.h>
 #include <linux/list.h>
+#include <linux/rwlock.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/xarray.h>
 
@@ -18,6 +21,9 @@ struct i915_drm_clients {
 
 	struct xarray xarray;
 	u32 next_id;
+
+	rwlock_t lock;
+	DECLARE_HASHTABLE(tasks, 6);
 };
 
 struct i915_drm_client {
@@ -28,6 +34,9 @@ struct i915_drm_client {
 	spinlock_t ctx_lock; /* For add/remove from ctx_list. */
 	struct list_head ctx_list; /* List of contexts belonging to client. */
 
+	struct task_struct *owner; /* No reference kept, never dereferenced. */
+	struct hlist_node node;
+
 	struct i915_drm_clients *clients;
 };
 
@@ -52,4 +61,8 @@ struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients);
 
 void i915_drm_clients_fini(struct i915_drm_clients *clients);
 
+void i915_drm_client_update_owner(struct i915_drm_client *client,
+				  struct task_struct *owner);
+
+
 #endif /* !__I915_DRM_CLIENT_H__ */
-- 
2.30.2



* [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2021-10-04 14:36 ` [RFC 5/8] drm/i915: Keep track of registered clients indexed by task struct Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  2021-10-06 17:12   ` Matthew Brost
  2021-10-04 14:36 ` [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority Tvrtko Ursulin
  2021-10-04 14:36 ` [RFC 8/8] drm/i915: Connect with the process nice change notifier Tvrtko Ursulin
  7 siblings, 1 reply; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx
  Cc: dri-devel, linux-kernel, Tvrtko Ursulin, Matthew Brost,
	Daniele Ceraolo Spurio

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
priority management") introduced some scheduling related vfuncs which
take an integer request priority as argument.

Make them instead take struct i915_sched_attr, which is the type
encapsulating this information, so it probably aligns better with the
design. It definitely enables extending the set of scheduling attributes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c    | 3 ++-
 drivers/gpu/drm/i915/i915_scheduler.c                | 4 ++--
 drivers/gpu/drm/i915/i915_scheduler_types.h          | 4 ++--
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7147fe80919e..e91d803a6453 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
 	return engine->class != RENDER_CLASS;
 }
 
-static void kick_execlists(const struct i915_request *rq, int prio)
+static void kick_execlists(const struct i915_request *rq,
+			   const struct i915_sched_attr *attr)
 {
 	struct intel_engine_cs *engine = rq->engine;
 	struct i915_sched_engine *sched_engine = engine->sched_engine;
 	const struct i915_request *inflight;
+	const int prio = attr->priority;
 
 	/*
 	 * We only need to kick the tasklet once for the high priority
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ba0de35f6323..b5883a4365ca 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
 }
 
 static void guc_bump_inflight_request_prio(struct i915_request *rq,
-					   int prio)
+					   const struct i915_sched_attr *attr)
 {
 	struct intel_context *ce = rq->context;
+	const int prio = attr->priority;
 	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
 
 	/* Short circuit function */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 762127dd56c5..534bab99fcdc 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 
 		/* Must be called before changing the nodes priority */
 		if (sched_engine->bump_inflight_request_prio)
-			sched_engine->bump_inflight_request_prio(from, prio);
+			sched_engine->bump_inflight_request_prio(from, attr);
 
 		WRITE_ONCE(node->attr.priority, prio);
 
@@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 
 		/* Defer (tasklet) submission until after all of our updates. */
 		if (sched_engine->kick_backend)
-			sched_engine->kick_backend(node_to_request(node), prio);
+			sched_engine->kick_backend(node_to_request(node), attr);
 	}
 
 	spin_unlock(&sched_engine->lock);
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index b0a1b58c7893..24b9ac1c2ce2 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -177,13 +177,13 @@ struct i915_sched_engine {
 	 * @kick_backend: kick backend after a request's priority has changed
 	 */
 	void	(*kick_backend)(const struct i915_request *rq,
-				int prio);
+				const struct i915_sched_attr *attr);
 
 	/**
 	 * @bump_inflight_request_prio: update priority of an inflight request
 	 */
 	void	(*bump_inflight_request_prio)(struct i915_request *rq,
-					      int prio);
+					      const struct i915_sched_attr *attr);
 
 	/**
 	 * @retire_inflight_request_prio: indicate request is retired to
-- 
2.30.2



* [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2021-10-04 14:36 ` [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  2021-10-06 17:16   ` [Intel-gfx] " Matthew Brost
  2021-10-06 17:24   ` Matthew Brost
  2021-10-04 14:36 ` [RFC 8/8] drm/i915: Connect with the process nice change notifier Tvrtko Ursulin
  7 siblings, 2 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx; +Cc: dri-devel, linux-kernel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Introduce the concept of a context nice value which matches the process
nice.

We do this by extending struct i915_sched_attr and adding a helper
(i915_sched_attr_priority) used to convert to an effective priority for
backend code and priority sorting.

Context nice is then inherited from the process which creates the GEM
context and used secondary to context priority, only when the latter has
been left at the default setting, in order to avoid disturbing any
application-made choices of low and high (batch processing and maybe
latency sensitive compositing). In those cases the nice value adjusts the
effective priority in the narrow band of -19 to +20 around
I915_CONTEXT_DEFAULT_PRIORITY.

This means that in theory userspace using the context priority uapi
directly has a wider range of possible adjustments (thought to be
beneficial), but in practice that only applies to execlists platforms.
With GuC there are only three priority buckets (less than zero is low
priority, zero is normal and greater than zero is high) which therefore
interact as expected with the nice adjustment. This makes the question of
whether nice should map to a sub-range of GEM priorities, or be stretched
across the whole range, a moot one.
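
To make the mapping concrete, a few example values as computed by the
i915_sched_attr_priority() helper added below:

  .priority = 0 (default),        .nice = 10   ->  effective priority -10
  .priority = 0 (default),        .nice = -11  ->  effective priority 11
  .priority = 200 (set via uapi), any .nice    ->  effective priority 200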

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c        |  1 +
 .../gpu/drm/i915/gt/intel_execlists_submission.c   |  4 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c  |  2 +-
 drivers/gpu/drm/i915/i915_request.c                |  2 +-
 drivers/gpu/drm/i915/i915_request.h                |  5 +++++
 drivers/gpu/drm/i915/i915_scheduler.c              | 12 ++++++++----
 drivers/gpu/drm/i915/i915_scheduler.h              | 14 ++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler_types.h        |  8 ++++++++
 8 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 8d4d687ab1d0..fed0733cb652 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -257,6 +257,7 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
 	if (i915->params.enable_hangcheck)
 		pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
 	pc->sched.priority = I915_PRIORITY_NORMAL;
+	pc->sched.nice = task_nice(current);
 
 	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
 		if (!HAS_EXECLISTS(i915)) {
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index e91d803a6453..1a02c65823a7 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -250,7 +250,7 @@ static struct i915_priolist *to_priolist(struct rb_node *rb)
 
 static int rq_prio(const struct i915_request *rq)
 {
-	return READ_ONCE(rq->sched.attr.priority);
+	return i915_request_priority(rq);
 }
 
 static int effective_prio(const struct i915_request *rq)
@@ -3221,8 +3221,8 @@ static void kick_execlists(const struct i915_request *rq,
 {
 	struct intel_engine_cs *engine = rq->engine;
 	struct i915_sched_engine *sched_engine = engine->sched_engine;
+	const int prio = i915_sched_attr_priority(attr);
 	const struct i915_request *inflight;
-	const int prio = attr->priority;
 
 	/*
 	 * We only need to kick the tasklet once for the high priority
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b5883a4365ca..f258607685a2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2417,7 +2417,7 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
 					   const struct i915_sched_attr *attr)
 {
 	struct intel_context *ce = rq->context;
-	const int prio = attr->priority;
+	const int prio = i915_sched_attr_priority(attr);
 	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
 
 	/* Short circuit function */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 79da5eca60af..a8c6f3a64895 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1930,7 +1930,7 @@ static int print_sched_attr(const struct i915_sched_attr *attr,
 		return x;
 
 	x += snprintf(buf + x, len - x,
-		      " prio=%d", attr->priority);
+		      " prio=%d nice=%d", attr->priority, attr->nice);
 
 	return x;
 }
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 7bd9ed20623e..c2c4c344837e 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -399,6 +399,11 @@ long i915_request_wait(struct i915_request *rq,
 #define I915_WAIT_PRIORITY	BIT(1) /* small priority bump for the request */
 #define I915_WAIT_ALL		BIT(2) /* used by i915_gem_object_wait() */
 
+static inline int i915_request_priority(const struct i915_request *rq)
+{
+	return i915_sched_attr_priority(&rq->sched.attr);
+}
+
 void i915_request_show(struct drm_printer *m,
 		       const struct i915_request *rq,
 		       const char *prefix,
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 534bab99fcdc..e75793e36454 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -155,7 +155,9 @@ lock_sched_engine(struct i915_sched_node *node,
 static void __i915_schedule(struct i915_sched_node *node,
 			    const struct i915_sched_attr *attr)
 {
-	const int prio = max(attr->priority, node->attr.priority);
+	const int prio =
+		max(i915_sched_attr_priority(attr),
+		    i915_sched_attr_priority(&node->attr));
 	struct i915_sched_engine *sched_engine;
 	struct i915_dependency *dep, *p;
 	struct i915_dependency stack;
@@ -209,7 +211,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 			if (node_signaled(p->signaler))
 				continue;
 
-			if (prio > READ_ONCE(p->signaler->attr.priority))
+			if (prio > i915_sched_attr_priority(&p->signaler->attr))
 				list_move_tail(&p->dfs_link, &dfs);
 		}
 	}
@@ -247,7 +249,8 @@ static void __i915_schedule(struct i915_sched_node *node,
 		lockdep_assert_held(&sched_engine->lock);
 
 		/* Recheck after acquiring the engine->timeline.lock */
-		if (prio <= node->attr.priority || node_signaled(node))
+		if (prio <= i915_sched_attr_priority(&node->attr) ||
+		    node_signaled(node))
 			continue;
 
 		GEM_BUG_ON(node_to_request(node)->engine->sched_engine !=
@@ -257,7 +260,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 		if (sched_engine->bump_inflight_request_prio)
 			sched_engine->bump_inflight_request_prio(from, attr);
 
-		WRITE_ONCE(node->attr.priority, prio);
+		WRITE_ONCE(node->attr, *attr);
 
 		/*
 		 * Once the request is ready, it will be placed into the
@@ -305,6 +308,7 @@ void i915_sched_node_init(struct i915_sched_node *node)
 void i915_sched_node_reinit(struct i915_sched_node *node)
 {
 	node->attr.priority = I915_PRIORITY_INVALID;
+	node->attr.nice = 0;
 	node->semaphores = 0;
 	node->flags = 0;
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 0b9b86af6c7f..75ccc9f55d14 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -38,6 +38,20 @@ void i915_sched_node_fini(struct i915_sched_node *node);
 void i915_schedule(struct i915_request *request,
 		   const struct i915_sched_attr *attr);
 
+static inline int i915_sched_attr_priority(const struct i915_sched_attr *attr)
+{
+	int prio = attr->priority;
+
+	/*
+	 * Only allow I915_CONTEXT_DEFAULT_PRIORITY to be affected by the
+	 * nice setting.
+	 */
+	if (!prio)
+		prio = -attr->nice;
+
+	return prio;
+}
+
 struct list_head *
 i915_sched_lookup_priolist(struct i915_sched_engine *sched_engine, int prio);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 24b9ac1c2ce2..159237aa7609 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -29,6 +29,14 @@ struct i915_sched_attr {
 	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
 	 */
 	int priority;
+
+	/**
+	 * @nice: context nice level
+	 *
+	 * Nice level follows the CPU scheduler nice value as set for the
+	 * process owning the GPU context.
+	 */
+	int nice;
 };
 
 /*
-- 
2.30.2



* [RFC 8/8] drm/i915: Connect with the process nice change notifier
  2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2021-10-04 14:36 ` [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority Tvrtko Ursulin
@ 2021-10-04 14:36 ` Tvrtko Ursulin
  7 siblings, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-04 14:36 UTC (permalink / raw)
  To: Intel-gfx; +Cc: dri-devel, linux-kernel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Connect i915 with the process nice change notifier so that our scheduling
can react to runtime adjustments, on top of previously added nice value
inheritance at context create time.

To achieve this we use the previously added map of clients per owning
task in combination with the list of GEM contexts per client.

To avoid possibly unnecessary complications the updated context nice value
will only apply to future submissions against the context.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drm_client.c | 31 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h |  3 +++
 2 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 82b9636482ef..e34c1228f65b 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -7,10 +7,35 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 
+#include "gem/i915_gem_context.h"
 #include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
 
+static int
+clients_notify(struct notifier_block *nb, unsigned long val, void *ptr)
+{
+	struct i915_drm_clients *clients =
+		container_of(nb, typeof(*clients), prio_notifier);
+	struct i915_drm_client *client;
+
+	rcu_read_lock();
+	read_lock(&clients->lock);
+	hash_for_each_possible(clients->tasks, client, node, (uintptr_t)ptr) {
+		struct i915_gem_context *ctx;
+
+		if (client->owner != ptr)
+			continue;
+
+		list_for_each_entry_rcu(ctx, &client->ctx_list, client_link)
+			ctx->sched.nice = (int)val;
+	}
+	read_unlock(&clients->lock);
+	rcu_read_unlock();
+
+	return NOTIFY_DONE;
+}
+
 void i915_drm_clients_init(struct i915_drm_clients *clients,
 			   struct drm_i915_private *i915)
 {
@@ -21,6 +46,10 @@ void i915_drm_clients_init(struct i915_drm_clients *clients,
 
 	rwlock_init(&clients->lock);
 	hash_init(clients->tasks);
+
+	memset(&clients->prio_notifier, 0, sizeof(clients->prio_notifier));
+	clients->prio_notifier.notifier_call = clients_notify;
+	register_user_nice_notifier(&clients->prio_notifier);
 }
 
 struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
@@ -75,6 +104,8 @@ void __i915_drm_client_free(struct kref *kref)
 
 void i915_drm_clients_fini(struct i915_drm_clients *clients)
 {
+	unregister_user_nice_notifier(&clients->prio_notifier);
+
 	GEM_BUG_ON(!xa_empty(&clients->xarray));
 	xa_destroy(&clients->xarray);
 }
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 42fd79f0558a..dda26aa42ac9 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -9,6 +9,7 @@
 #include <linux/hashtable.h>
 #include <linux/kref.h>
 #include <linux/list.h>
+#include <linux/notifier.h>
 #include <linux/rwlock.h>
 #include <linux/sched.h>
 #include <linux/spinlock.h>
@@ -24,6 +25,8 @@ struct i915_drm_clients {
 
 	rwlock_t lock;
 	DECLARE_HASHTABLE(tasks, 6);
+
+	struct notifier_block prio_notifier;
 };
 
 struct i915_drm_client {
-- 
2.30.2



* Re: [RFC 1/8] sched: Add nice value change notifier
  2021-10-04 14:36 ` [RFC 1/8] sched: Add nice value change notifier Tvrtko Ursulin
@ 2021-10-06  4:10   ` Wanghui (John)
  2021-10-06  7:58     ` Barry Song
  0 siblings, 1 reply; 23+ messages in thread
From: Wanghui (John) @ 2021-10-06  4:10 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx
  Cc: dri-devel, linux-kernel, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot

Hi Tvrtko,

On 2021/10/4 22:36, Tvrtko Ursulin wrote:
>   void set_user_nice(struct task_struct *p, long nice)
>   {
>   	bool queued, running;
> -	int old_prio;
> +	int old_prio, ret;
>   	struct rq_flags rf;
>   	struct rq *rq;
>   
> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
>   
>   out_unlock:
>   	task_rq_unlock(rq, p, &rf);
> +
> +	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
> +	WARN_ON_ONCE(ret != NOTIFY_DONE);
>   }
How about adding a new "io_nice" to task_struct and moving the call chain to
sched_setattr/getattr? There are two benefits:

1. It is decoupled from the fair scheduler. In our use case, high priority
   tasks often use the rt scheduler.
2. The range of values does not need to be bound to -20~19 or 0~139






* Re: [RFC 1/8] sched: Add nice value change notifier
  2021-10-06  4:10   ` Wanghui (John)
@ 2021-10-06  7:58     ` Barry Song
  2021-10-06 13:44       ` Tvrtko Ursulin
  0 siblings, 1 reply; 23+ messages in thread
From: Barry Song @ 2021-10-06  7:58 UTC (permalink / raw)
  To: Wanghui (John)
  Cc: Tvrtko Ursulin, Intel-gfx, dri-devel, LKML, Tvrtko Ursulin,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot

On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) <john.wanghui@huawei.com> wrote:
>
> HI Tvrtko
>
> On 2021/10/4 22:36, Tvrtko Ursulin wrote:
> >   void set_user_nice(struct task_struct *p, long nice)
> >   {
> >       bool queued, running;
> > -     int old_prio;
> > +     int old_prio, ret;
> >       struct rq_flags rf;
> >       struct rq *rq;
> >
> > @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
> >
> >   out_unlock:
> >       task_rq_unlock(rq, p, &rf);
> > +
> > +     ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
> > +     WARN_ON_ONCE(ret != NOTIFY_DONE);
> >   }
> How about adding a new "io_nice" to task_struct,and move the call chain to
> sched_setattr/getattr, there are two benefits:

We already have an ionice for the block io scheduler, and this new io_nice
can hardly be generic to all I/O. It seems the patchset is trying to link the
process' nice with the GPU's scheduler; to some extent, that makes more sense
than having a common ionice, because we have a lot of IO devices in our
systems and we don't know which I/O the ionice of task_struct should be
applied to.

Maybe we could have an ionice dedicated to the GPU, just like the ionice for
the CFQ bio/request scheduler.

>
> 1. Decoupled with fair scheduelr. In our use case, high priority tasks often
>     use rt scheduler.

Is it possible to tell the GPU about RT priority in the same way we are
telling it about CFS nice?

> 2. The range of value don't need to be bound to -20~19 or 0~139
>

We could build a mapping between process and GPU priorities. It seems not a
big deal.

Thanks
barry


* Re: [RFC 1/8] sched: Add nice value change notifier
  2021-10-06  7:58     ` Barry Song
@ 2021-10-06 13:44       ` Tvrtko Ursulin
  2021-10-06 20:21         ` Barry Song
  0 siblings, 1 reply; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-06 13:44 UTC (permalink / raw)
  To: Barry Song, Wanghui (John)
  Cc: Intel-gfx, dri-devel, LKML, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot


Hi,

On 06/10/2021 08:58, Barry Song wrote:
> On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) <john.wanghui@huawei.com> wrote:
>>
>> HI Tvrtko
>>
>> On 2021/10/4 22:36, Tvrtko Ursulin wrote:
>>>    void set_user_nice(struct task_struct *p, long nice)
>>>    {
>>>        bool queued, running;
>>> -     int old_prio;
>>> +     int old_prio, ret;
>>>        struct rq_flags rf;
>>>        struct rq *rq;
>>>
>>> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
>>>
>>>    out_unlock:
>>>        task_rq_unlock(rq, p, &rf);
>>> +
>>> +     ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
>>> +     WARN_ON_ONCE(ret != NOTIFY_DONE);
>>>    }
>> How about adding a new "io_nice" to task_struct,and move the call chain to
>> sched_setattr/getattr, there are two benefits:
> 
> We already have an ionice for block io scheduler. hardly can this new io_nice
> be generic to all I/O. it seems the patchset is trying to link
> process' nice with
> GPU's scheduler, to some extent, it makes more senses than having a
> common ionice because we have a lot of IO devices in the systems, we don't
> know which I/O the ionice of task_struct should be applied to.
> 
> Maybe we could have an ionice dedicated for GPU just like ionice for CFQ
> of bio/request scheduler.

The thought crossed my mind but I couldn't see the practicality of a 3rd
nice concept. I mean, even to start with I struggle a bit with the
usefulness of the existing ionice vs nice, like coming up with practical
examples of use cases where it makes sense to decouple the two priorities.

From a different angle, I did think inheriting CPU nice makes sense for
GPU workloads. This is because today, and more so in the future,
computations on the same data set do flow from one to the other.

Take for example simple batch image processing where the CPU decodes, the
GPU does a transform and then the CPU encodes. Or a different mix, it
doesn't really matter, since the main point is that it is one computing
pipeline from the user's point of view.

In this example perhaps everything could be handled in userspace, so
that's another argument to be had. Userspace could query the current
scheduling attributes before submitting work to the processing pipeline
and adjust them using the respective uapi.

The downside would be the inability to react to changes after the work is
already running, which may not be too serious a limitation outside the
world of multi-minute compute workloads. And the latter are probably
special-case enough that they would be configured explicitly.

>>
>> 1. Decoupled with fair scheduelr. In our use case, high priority tasks often
>>      use rt scheduler.
> 
> Is it possible to tell GPU RT as we are telling them CFS nice?

Yes of course. We could create a common notification "data packet" which 
would be sent from both entry points and provide more data than just the 
nice value. Consumers (of the notifier chain) could then decide for 
themselves what they want to do with the data.
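
Something along these lines perhaps (hypothetical names, just to sketch
the shape of it):

  struct user_prio_notification {
          struct task_struct *task;
          unsigned int policy;     /* SCHED_NORMAL, SCHED_FIFO, ... */
          int nice;                /* meaningful for fair tasks */
          int rt_priority;         /* meaningful for RT tasks */
  };

Both set_user_nice and the RT entry points would then fill this in and
pass a pointer down the chain, with consumers picking out the fields they
care about.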

Regards,

Tvrtko

> 
>> 2. The range of value don't need to be bound to -20~19 or 0~139
>>
> 
> could build a mapping between the priorities of process and GPU. It seems
> not a big deal.
> 
> Thanks
> barry
> 


* Re: [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute
  2021-10-04 14:36 ` [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute Tvrtko Ursulin
@ 2021-10-06 17:12   ` Matthew Brost
  2021-10-06 19:06     ` Tvrtko Ursulin
  2021-10-13 12:01     ` [Intel-gfx] " Daniel Vetter
  0 siblings, 2 replies; 23+ messages in thread
From: Matthew Brost @ 2021-10-06 17:12 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Intel-gfx, dri-devel, linux-kernel, Tvrtko Ursulin,
	Daniele Ceraolo Spurio

On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
> i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
> priority management") introduced some scheduling related vfuncs which
> take integer request priority as argument.
> 
> Make them instead take struct i915_sched_attr, which is the type
> encapsulating this information, so it probably aligns with the design
> better. It definitely enables extending the set of scheduling attributes.
> 

Understand the motivation here but the i915_scheduler is going to
disappear when we move to the DRM scheduler, or at least its functionality
of priority inheritance will be pushed into the DRM scheduler. I'd be
very careful making any changes here as the priority in the DRM
scheduler is defined as a single enum:

/* These are often used as an (initial) index
 * to an array, and as such should start at 0.
 */
enum drm_sched_priority {
        DRM_SCHED_PRIORITY_MIN,
        DRM_SCHED_PRIORITY_NORMAL,
        DRM_SCHED_PRIORITY_HIGH,
        DRM_SCHED_PRIORITY_KERNEL,

        DRM_SCHED_PRIORITY_COUNT,
        DRM_SCHED_PRIORITY_UNSET = -2
};

Adding a field to the i915_sched_attr is fairly easy as we already have
a structure, but changing the DRM scheduler might be a tougher sell. Is
there any way you can make this work without adding the 'nice' field to
i915_sched_attr? Might be worth exploring so that when we move to the DRM
scheduler this feature drops in a little cleaner.

Matt

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c    | 3 ++-
>  drivers/gpu/drm/i915/i915_scheduler.c                | 4 ++--
>  drivers/gpu/drm/i915/i915_scheduler_types.h          | 4 ++--
>  4 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 7147fe80919e..e91d803a6453 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
>  	return engine->class != RENDER_CLASS;
>  }
>  
> -static void kick_execlists(const struct i915_request *rq, int prio)
> +static void kick_execlists(const struct i915_request *rq,
> +			   const struct i915_sched_attr *attr)
>  {
>  	struct intel_engine_cs *engine = rq->engine;
>  	struct i915_sched_engine *sched_engine = engine->sched_engine;
>  	const struct i915_request *inflight;
> +	const int prio = attr->priority;
>  
>  	/*
>  	 * We only need to kick the tasklet once for the high priority
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index ba0de35f6323..b5883a4365ca 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
>  }
>  
>  static void guc_bump_inflight_request_prio(struct i915_request *rq,
> -					   int prio)
> +					   const struct i915_sched_attr *attr)
>  {
>  	struct intel_context *ce = rq->context;
> +	const int prio = attr->priority;
>  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>  
>  	/* Short circuit function */
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 762127dd56c5..534bab99fcdc 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>  
>  		/* Must be called before changing the nodes priority */
>  		if (sched_engine->bump_inflight_request_prio)
> -			sched_engine->bump_inflight_request_prio(from, prio);
> +			sched_engine->bump_inflight_request_prio(from, attr);
>  
>  		WRITE_ONCE(node->attr.priority, prio);
>  
> @@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>  
>  		/* Defer (tasklet) submission until after all of our updates. */
>  		if (sched_engine->kick_backend)
> -			sched_engine->kick_backend(node_to_request(node), prio);
> +			sched_engine->kick_backend(node_to_request(node), attr);
>  	}
>  
>  	spin_unlock(&sched_engine->lock);
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index b0a1b58c7893..24b9ac1c2ce2 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -177,13 +177,13 @@ struct i915_sched_engine {
>  	 * @kick_backend: kick backend after a request's priority has changed
>  	 */
>  	void	(*kick_backend)(const struct i915_request *rq,
> -				int prio);
> +				const struct i915_sched_attr *attr);
>  
>  	/**
>  	 * @bump_inflight_request_prio: update priority of an inflight request
>  	 */
>  	void	(*bump_inflight_request_prio)(struct i915_request *rq,
> -					      int prio);
> +					      const struct i915_sched_attr *attr);
>  
>  	/**
>  	 * @retire_inflight_request_prio: indicate request is retired to
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority
  2021-10-04 14:36 ` [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority Tvrtko Ursulin
@ 2021-10-06 17:16   ` Matthew Brost
  2021-10-06 17:24   ` Matthew Brost
  1 sibling, 0 replies; 23+ messages in thread
From: Matthew Brost @ 2021-10-06 17:16 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx, dri-devel, linux-kernel, Tvrtko Ursulin

On Mon, Oct 04, 2021 at 03:36:49PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Introduce the concept of context nice value which matches the process
> nice.
> 
> We do this by extending the struct i915_sched_attr and add a helper
> (i915_sched_attr_priority) to be used to convert to effective priority
> when used by backend code and for priority sorting.
> 
> Context nice is then inherited from the process which creates the GEM
> context and utilised secondary to context priority, only when the latter
> has been left at the default setting, in order to avoid disturbing any
> application made choices of low and high (batch processing and maybe
> latency sensitive compositing). In those cases nice value adjusts the
> effective priority in the narrow band of -19 to +20 around
> I915_CONTEXT_DEFAULT_PRIORITY.
> 
> This means that in theory userspace using the context priority uapi
> directly has a wider range of possible adjustments (thought to be
> beneficial), but in practice that only applies to execlists platforms.
> With GuC there are only three priority buckets (less than zero is low
> priority, zero is normal and greater than zero is high) which therefore
> interact as expected with the nice adjustment. It makes the question of
> should the nice be a sub-range of GEM priorities, or stretched across the
> whole, a moot one.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c        |  1 +
>  .../gpu/drm/i915/gt/intel_execlists_submission.c   |  4 ++--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c  |  2 +-
>  drivers/gpu/drm/i915/i915_request.c                |  2 +-
>  drivers/gpu/drm/i915/i915_request.h                |  5 +++++
>  drivers/gpu/drm/i915/i915_scheduler.c              | 12 ++++++++----
>  drivers/gpu/drm/i915/i915_scheduler.h              | 14 ++++++++++++++
>  drivers/gpu/drm/i915/i915_scheduler_types.h        |  8 ++++++++
>  8 files changed, 40 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 8d4d687ab1d0..fed0733cb652 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -257,6 +257,7 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
>  	if (i915->params.enable_hangcheck)
>  		pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
>  	pc->sched.priority = I915_PRIORITY_NORMAL;
> +	pc->sched.nice = task_nice(current);
>  
>  	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
>  		if (!HAS_EXECLISTS(i915)) {
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index e91d803a6453..1a02c65823a7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -250,7 +250,7 @@ static struct i915_priolist *to_priolist(struct rb_node *rb)
>  
>  static int rq_prio(const struct i915_request *rq)
>  {
> -	return READ_ONCE(rq->sched.attr.priority);
> +	return i915_request_priority(rq);
>  }
>  
>  static int effective_prio(const struct i915_request *rq)
> @@ -3221,8 +3221,8 @@ static void kick_execlists(const struct i915_request *rq,
>  {
>  	struct intel_engine_cs *engine = rq->engine;
>  	struct i915_sched_engine *sched_engine = engine->sched_engine;
> +	const int prio = i915_sched_attr_priority(attr);
>  	const struct i915_request *inflight;
> -	const int prio = attr->priority;
>  
>  	/*
>  	 * We only need to kick the tasklet once for the high priority
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index b5883a4365ca..f258607685a2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -2417,7 +2417,7 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  					   const struct i915_sched_attr *attr)
>  {
>  	struct intel_context *ce = rq->context;
> -	const int prio = attr->priority;
> +	const int prio = i915_sched_attr_priority(attr);
>  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>  
>  	/* Short circuit function */
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 79da5eca60af..a8c6f3a64895 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1930,7 +1930,7 @@ static int print_sched_attr(const struct i915_sched_attr *attr,
>  		return x;
>  
>  	x += snprintf(buf + x, len - x,
> -		      " prio=%d", attr->priority);
> +		      " prio=%d nice=%d", attr->priority, attr->nice);
>  
>  	return x;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 7bd9ed20623e..c2c4c344837e 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -399,6 +399,11 @@ long i915_request_wait(struct i915_request *rq,
>  #define I915_WAIT_PRIORITY	BIT(1) /* small priority bump for the request */
>  #define I915_WAIT_ALL		BIT(2) /* used by i915_gem_object_wait() */
>  
> +static inline int i915_request_priority(const struct i915_request *rq)
> +{
> +	return i915_sched_attr_priority(&rq->sched.attr);
> +}
> +
>  void i915_request_show(struct drm_printer *m,
>  		       const struct i915_request *rq,
>  		       const char *prefix,
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 534bab99fcdc..e75793e36454 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -155,7 +155,9 @@ lock_sched_engine(struct i915_sched_node *node,
>  static void __i915_schedule(struct i915_sched_node *node,
>  			    const struct i915_sched_attr *attr)
>  {
> -	const int prio = max(attr->priority, node->attr.priority);
> +	const int prio =
> +		max(i915_sched_attr_priority(attr),
> +		    i915_sched_attr_priority(&node->attr));
>  	struct i915_sched_engine *sched_engine;
>  	struct i915_dependency *dep, *p;
>  	struct i915_dependency stack;
> @@ -209,7 +211,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>  			if (node_signaled(p->signaler))
>  				continue;
>  
> -			if (prio > READ_ONCE(p->signaler->attr.priority))
> +			if (prio > i915_sched_attr_priority(&p->signaler->attr))
>  				list_move_tail(&p->dfs_link, &dfs);
>  		}
>  	}
> @@ -247,7 +249,8 @@ static void __i915_schedule(struct i915_sched_node *node,
>  		lockdep_assert_held(&sched_engine->lock);
>  
>  		/* Recheck after acquiring the engine->timeline.lock */
> -		if (prio <= node->attr.priority || node_signaled(node))
> +		if (prio <= i915_sched_attr_priority(&node->attr) ||
> +		    node_signaled(node))
>  			continue;
>  
>  		GEM_BUG_ON(node_to_request(node)->engine->sched_engine !=
> @@ -257,7 +260,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>  		if (sched_engine->bump_inflight_request_prio)
>  			sched_engine->bump_inflight_request_prio(from, attr);
>  
> -		WRITE_ONCE(node->attr.priority, prio);
> +		WRITE_ONCE(node->attr, *attr);
>  
>  		/*
>  		 * Once the request is ready, it will be placed into the
> @@ -305,6 +308,7 @@ void i915_sched_node_init(struct i915_sched_node *node)
>  void i915_sched_node_reinit(struct i915_sched_node *node)
>  {
>  	node->attr.priority = I915_PRIORITY_INVALID;
> +	node->attr.nice = 0;
>  	node->semaphores = 0;
>  	node->flags = 0;
>  
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 0b9b86af6c7f..75ccc9f55d14 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -38,6 +38,20 @@ void i915_sched_node_fini(struct i915_sched_node *node);
>  void i915_schedule(struct i915_request *request,
>  		   const struct i915_sched_attr *attr);
>  
> +static inline int i915_sched_attr_priority(const struct i915_sched_attr *attr)
> +{
> +	int prio = attr->priority;
> +
> +	/*
> +	 * Only allow I915_CONTEXT_DEFAULT_PRIORITY to be affected by the
> +	 * nice setting.
> +	 */
> +	if (!prio)
> +		prio = -attr->nice;
> +
> +	return prio;
> +}
> +
>  struct list_head *
>  i915_sched_lookup_priolist(struct i915_sched_engine *sched_engine, int prio);
>  
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 24b9ac1c2ce2..159237aa7609 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -29,6 +29,14 @@ struct i915_sched_attr {
>  	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
>  	 */
>  	int priority;
> +
> +	/**
> +	 * @nice: context nice level
> +	 *
> +	 * Nice level follows the CPU scheduler nice value as set for the
> +	 * process owning the GPU context.
> +	 */
> +	int nice;

Same comment as the previous patch, I'd avoid adding a field to this
structure as I don't think this will play with the DRM scheduler very
well. If anything I think we should drop i915_sched_attr completely and
just store these values directly in the request and pass around the
request rather than i915_sched_attr.

Matt

>  };
>  
>  /*
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority
  2021-10-04 14:36 ` [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority Tvrtko Ursulin
  2021-10-06 17:16   ` [Intel-gfx] " Matthew Brost
@ 2021-10-06 17:24   ` Matthew Brost
  2021-10-06 18:42     ` Tvrtko Ursulin
  1 sibling, 1 reply; 23+ messages in thread
From: Matthew Brost @ 2021-10-06 17:24 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx, dri-devel, linux-kernel, Tvrtko Ursulin

On Mon, Oct 04, 2021 at 03:36:49PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Introduce the concept of context nice value which matches the process
> nice.
> 
> We do this by extending the struct i915_sched_attr and add a helper
> (i915_sched_attr_priority) to be used to convert to effective priority
> when used by backend code and for priority sorting.
> 
> Context nice is then inherited from the process which creates the GEM
> context and utilised secondary to context priority, only when the latter
> has been left at the default setting, in order to avoid disturbing any
> application made choices of low and high (batch processing and maybe
> latency sensitive compositing). In those cases nice value adjusts the
> effective priority in the narrow band of -19 to +20 around
> I915_CONTEXT_DEFAULT_PRIORITY.
> 
> This means that in theory userspace using the context priority uapi
> directly has a wider range of possible adjustments (thought to be
> beneficial), but in practice that only applies to execlists platforms.
> With GuC there are only three priority buckets (less than zero is low
> priority, zero is normal and greater than zero is high) which therefore
> interact as expected with the nice adjustment. It makes the question of
> should the nice be a sub-range of GEM priorities, or stretched across the
> whole, a moot one.
> 

Oops, sorry for the double reply to this patch - for some reason I was
thinking the below comment would be on the next patch.

The GuC + DRM scheduler actually has 4 priorities with the highest
reserved for the KMD, so 3 user levels which is what I think you mean.
So how would this work with only 3 levels? 

The nice value can move a normal priority value to either low or high,
right?

Normal to low seems fine but would moving to high be an issue?

I thought a high level was reserved for elevated user processes (e.g. a
root user or a compositor)?

Would it be ok for a non-elevated user process to be the same priority
as an elevated process?

Matt

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c        |  1 +
>  .../gpu/drm/i915/gt/intel_execlists_submission.c   |  4 ++--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c  |  2 +-
>  drivers/gpu/drm/i915/i915_request.c                |  2 +-
>  drivers/gpu/drm/i915/i915_request.h                |  5 +++++
>  drivers/gpu/drm/i915/i915_scheduler.c              | 12 ++++++++----
>  drivers/gpu/drm/i915/i915_scheduler.h              | 14 ++++++++++++++
>  drivers/gpu/drm/i915/i915_scheduler_types.h        |  8 ++++++++
>  8 files changed, 40 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 8d4d687ab1d0..fed0733cb652 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -257,6 +257,7 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
>  	if (i915->params.enable_hangcheck)
>  		pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
>  	pc->sched.priority = I915_PRIORITY_NORMAL;
> +	pc->sched.nice = task_nice(current);
>  
>  	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
>  		if (!HAS_EXECLISTS(i915)) {
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index e91d803a6453..1a02c65823a7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -250,7 +250,7 @@ static struct i915_priolist *to_priolist(struct rb_node *rb)
>  
>  static int rq_prio(const struct i915_request *rq)
>  {
> -	return READ_ONCE(rq->sched.attr.priority);
> +	return i915_request_priority(rq);
>  }
>  
>  static int effective_prio(const struct i915_request *rq)
> @@ -3221,8 +3221,8 @@ static void kick_execlists(const struct i915_request *rq,
>  {
>  	struct intel_engine_cs *engine = rq->engine;
>  	struct i915_sched_engine *sched_engine = engine->sched_engine;
> +	const int prio = i915_sched_attr_priority(attr);
>  	const struct i915_request *inflight;
> -	const int prio = attr->priority;
>  
>  	/*
>  	 * We only need to kick the tasklet once for the high priority
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index b5883a4365ca..f258607685a2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -2417,7 +2417,7 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>  					   const struct i915_sched_attr *attr)
>  {
>  	struct intel_context *ce = rq->context;
> -	const int prio = attr->priority;
> +	const int prio = i915_sched_attr_priority(attr);
>  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>  
>  	/* Short circuit function */
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 79da5eca60af..a8c6f3a64895 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1930,7 +1930,7 @@ static int print_sched_attr(const struct i915_sched_attr *attr,
>  		return x;
>  
>  	x += snprintf(buf + x, len - x,
> -		      " prio=%d", attr->priority);
> +		      " prio=%d nice=%d", attr->priority, attr->nice);
>  
>  	return x;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 7bd9ed20623e..c2c4c344837e 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -399,6 +399,11 @@ long i915_request_wait(struct i915_request *rq,
>  #define I915_WAIT_PRIORITY	BIT(1) /* small priority bump for the request */
>  #define I915_WAIT_ALL		BIT(2) /* used by i915_gem_object_wait() */
>  
> +static inline int i915_request_priority(const struct i915_request *rq)
> +{
> +	return i915_sched_attr_priority(&rq->sched.attr);
> +}
> +
>  void i915_request_show(struct drm_printer *m,
>  		       const struct i915_request *rq,
>  		       const char *prefix,
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index 534bab99fcdc..e75793e36454 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -155,7 +155,9 @@ lock_sched_engine(struct i915_sched_node *node,
>  static void __i915_schedule(struct i915_sched_node *node,
>  			    const struct i915_sched_attr *attr)
>  {
> -	const int prio = max(attr->priority, node->attr.priority);
> +	const int prio =
> +		max(i915_sched_attr_priority(attr),
> +		    i915_sched_attr_priority(&node->attr));
>  	struct i915_sched_engine *sched_engine;
>  	struct i915_dependency *dep, *p;
>  	struct i915_dependency stack;
> @@ -209,7 +211,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>  			if (node_signaled(p->signaler))
>  				continue;
>  
> -			if (prio > READ_ONCE(p->signaler->attr.priority))
> +			if (prio > i915_sched_attr_priority(&p->signaler->attr))
>  				list_move_tail(&p->dfs_link, &dfs);
>  		}
>  	}
> @@ -247,7 +249,8 @@ static void __i915_schedule(struct i915_sched_node *node,
>  		lockdep_assert_held(&sched_engine->lock);
>  
>  		/* Recheck after acquiring the engine->timeline.lock */
> -		if (prio <= node->attr.priority || node_signaled(node))
> +		if (prio <= i915_sched_attr_priority(&node->attr) ||
> +		    node_signaled(node))
>  			continue;
>  
>  		GEM_BUG_ON(node_to_request(node)->engine->sched_engine !=
> @@ -257,7 +260,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>  		if (sched_engine->bump_inflight_request_prio)
>  			sched_engine->bump_inflight_request_prio(from, attr);
>  
> -		WRITE_ONCE(node->attr.priority, prio);
> +		WRITE_ONCE(node->attr, *attr);
>  
>  		/*
>  		 * Once the request is ready, it will be placed into the
> @@ -305,6 +308,7 @@ void i915_sched_node_init(struct i915_sched_node *node)
>  void i915_sched_node_reinit(struct i915_sched_node *node)
>  {
>  	node->attr.priority = I915_PRIORITY_INVALID;
> +	node->attr.nice = 0;
>  	node->semaphores = 0;
>  	node->flags = 0;
>  
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 0b9b86af6c7f..75ccc9f55d14 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -38,6 +38,20 @@ void i915_sched_node_fini(struct i915_sched_node *node);
>  void i915_schedule(struct i915_request *request,
>  		   const struct i915_sched_attr *attr);
>  
> +static inline int i915_sched_attr_priority(const struct i915_sched_attr *attr)
> +{
> +	int prio = attr->priority;
> +
> +	/*
> +	 * Only allow I915_CONTEXT_DEFAULT_PRIORITY to be affected by the
> +	 * nice setting.
> +	 */
> +	if (!prio)
> +		prio = -attr->nice;
> +
> +	return prio;
> +}
> +
>  struct list_head *
>  i915_sched_lookup_priolist(struct i915_sched_engine *sched_engine, int prio);
>  
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 24b9ac1c2ce2..159237aa7609 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -29,6 +29,14 @@ struct i915_sched_attr {
>  	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
>  	 */
>  	int priority;
> +
> +	/**
> +	 * @nice: context nice level
> +	 *
> +	 * Nice level follows the CPU scheduler nice value as set for the
> +	 * process owning the GPU context.
> +	 */
> +	int nice;
>  };
>  
>  /*
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority
  2021-10-06 17:24   ` Matthew Brost
@ 2021-10-06 18:42     ` Tvrtko Ursulin
  0 siblings, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-06 18:42 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Intel-gfx, dri-devel, linux-kernel, Tvrtko Ursulin


On 06/10/2021 18:24, Matthew Brost wrote:
> On Mon, Oct 04, 2021 at 03:36:49PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Introduce the concept of context nice value which matches the process
>> nice.
>>
>> We do this by extending the struct i915_sched_attr and add a helper
>> (i915_sched_attr_priority) to be used to convert to effective priority
>> when used by backend code and for priority sorting.
>>
>> Context nice is then inherited from the process which creates the GEM
>> context and utilised secondary to context priority, only when the latter
>> has been left at the default setting, in order to avoid disturbing any
>> application made choices of low and high (batch processing and maybe
>> latency sensitive compositing). In those cases nice value adjusts the
>> effective priority in the narrow band of -19 to +20 around
>> I915_CONTEXT_DEFAULT_PRIORITY.
>>
>> This means that in theory userspace using the context priority uapi
>> directly has a wider range of possible adjustments (thought to be
>> beneficial), but in practice that only applies to execlists platforms.
>> With GuC there are only three priority buckets (less than zero is low
>> priority, zero is normal and greater than zero is high) which therefore
>> interact as expected with the nice adjustment. It makes the question of
>> should the nice be a sub-range of GEM priorities, or stretched across the
>> whole, a moot one.
>>
> 
> Opps, sorry for the double reply to this patch - for some reason I
> thinking the below comment would be on the next patch.
> 
> The GuC + DRM scheduler actually has 4 priorities with the highest
> reserved for the KMD, so 3 user levels which is what I think you mean.
> So how would this work with only 3 levels?
 >
 > The nice value can move a normal priority value to either low or high,
 > right?

Yes. I was thinking about three user visible levels. The kernel one is 
reserved for internal stuff so I did not mention it.

Default, nice 0 = normal priority.
Positive nice = low priority.
Negative nice = high priority.

Same as the way the -1024 to 1023 range maps onto those three AFAIU. So 
I think it works.
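
Expressed against the drm_sched_priority enum quoted earlier in the 
thread, the collapse would look roughly like this (helper name made up 
for illustration):

static enum drm_sched_priority nice_prio_to_drm_sched(int prio)
{
	if (prio < 0)
		return DRM_SCHED_PRIORITY_MIN;		/* positive nice */
	if (prio > 0)
		return DRM_SCHED_PRIORITY_HIGH;		/* negative nice */
	return DRM_SCHED_PRIORITY_NORMAL;		/* default, nice 0 */
}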

> Normal to low seems fine but would moving to high be an issue?
> 
> I thought a high level was reserved for elevated user processes (e.g. a
> root user or a compositor)?

Yes, and lowering the nice (negative nice) also needs CAP_SYS_NICE, which 
is not available to user processes by default. So that matches the i915 
rules for going above default/normal.
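
For reference, the CPU side gate looks roughly like this (can_nice() in 
kernel/sched/core.c); note that a sufficient RLIMIT_NICE can also permit 
the lowering:

int can_nice(const struct task_struct *p, const int nice)
{
	/* Convert nice value [19,-20] to rlimit style value [1,40]: */
	int nice_rlim = nice_to_rlimit(nice);

	return (nice_rlim <= task_rlimit(p, RLIMIT_NICE) ||
		capable(CAP_SYS_NICE));
}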

> Would it be ok for a non-elevated user process to be the same priority
> as an elevated process?

Not sure exactly what you have in mind but in principle yes. Same as it 
is okay for all current greater-than-zero priorities to map to a single 
priority in the case of GuC.

Regards,

Tvrtko

> 
> Matt
> 
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_context.c        |  1 +
>>   .../gpu/drm/i915/gt/intel_execlists_submission.c   |  4 ++--
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c  |  2 +-
>>   drivers/gpu/drm/i915/i915_request.c                |  2 +-
>>   drivers/gpu/drm/i915/i915_request.h                |  5 +++++
>>   drivers/gpu/drm/i915/i915_scheduler.c              | 12 ++++++++----
>>   drivers/gpu/drm/i915/i915_scheduler.h              | 14 ++++++++++++++
>>   drivers/gpu/drm/i915/i915_scheduler_types.h        |  8 ++++++++
>>   8 files changed, 40 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> index 8d4d687ab1d0..fed0733cb652 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> @@ -257,6 +257,7 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
>>   	if (i915->params.enable_hangcheck)
>>   		pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
>>   	pc->sched.priority = I915_PRIORITY_NORMAL;
>> +	pc->sched.nice = task_nice(current);
>>   
>>   	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
>>   		if (!HAS_EXECLISTS(i915)) {
>> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> index e91d803a6453..1a02c65823a7 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> @@ -250,7 +250,7 @@ static struct i915_priolist *to_priolist(struct rb_node *rb)
>>   
>>   static int rq_prio(const struct i915_request *rq)
>>   {
>> -	return READ_ONCE(rq->sched.attr.priority);
>> +	return i915_request_priority(rq);
>>   }
>>   
>>   static int effective_prio(const struct i915_request *rq)
>> @@ -3221,8 +3221,8 @@ static void kick_execlists(const struct i915_request *rq,
>>   {
>>   	struct intel_engine_cs *engine = rq->engine;
>>   	struct i915_sched_engine *sched_engine = engine->sched_engine;
>> +	const int prio = i915_sched_attr_priority(attr);
>>   	const struct i915_request *inflight;
>> -	const int prio = attr->priority;
>>   
>>   	/*
>>   	 * We only need to kick the tasklet once for the high priority
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> index b5883a4365ca..f258607685a2 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> @@ -2417,7 +2417,7 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
>>   					   const struct i915_sched_attr *attr)
>>   {
>>   	struct intel_context *ce = rq->context;
>> -	const int prio = attr->priority;
>> +	const int prio = i915_sched_attr_priority(attr);
>>   	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>>   
>>   	/* Short circuit function */
>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>> index 79da5eca60af..a8c6f3a64895 100644
>> --- a/drivers/gpu/drm/i915/i915_request.c
>> +++ b/drivers/gpu/drm/i915/i915_request.c
>> @@ -1930,7 +1930,7 @@ static int print_sched_attr(const struct i915_sched_attr *attr,
>>   		return x;
>>   
>>   	x += snprintf(buf + x, len - x,
>> -		      " prio=%d", attr->priority);
>> +		      " prio=%d nice=%d", attr->priority, attr->nice);
>>   
>>   	return x;
>>   }
>> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
>> index 7bd9ed20623e..c2c4c344837e 100644
>> --- a/drivers/gpu/drm/i915/i915_request.h
>> +++ b/drivers/gpu/drm/i915/i915_request.h
>> @@ -399,6 +399,11 @@ long i915_request_wait(struct i915_request *rq,
>>   #define I915_WAIT_PRIORITY	BIT(1) /* small priority bump for the request */
>>   #define I915_WAIT_ALL		BIT(2) /* used by i915_gem_object_wait() */
>>   
>> +static inline int i915_request_priority(const struct i915_request *rq)
>> +{
>> +	return i915_sched_attr_priority(&rq->sched.attr);
>> +}
>> +
>>   void i915_request_show(struct drm_printer *m,
>>   		       const struct i915_request *rq,
>>   		       const char *prefix,
>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
>> index 534bab99fcdc..e75793e36454 100644
>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>> @@ -155,7 +155,9 @@ lock_sched_engine(struct i915_sched_node *node,
>>   static void __i915_schedule(struct i915_sched_node *node,
>>   			    const struct i915_sched_attr *attr)
>>   {
>> -	const int prio = max(attr->priority, node->attr.priority);
>> +	const int prio =
>> +		max(i915_sched_attr_priority(attr),
>> +		    i915_sched_attr_priority(&node->attr));
>>   	struct i915_sched_engine *sched_engine;
>>   	struct i915_dependency *dep, *p;
>>   	struct i915_dependency stack;
>> @@ -209,7 +211,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>>   			if (node_signaled(p->signaler))
>>   				continue;
>>   
>> -			if (prio > READ_ONCE(p->signaler->attr.priority))
>> +			if (prio > i915_sched_attr_priority(&p->signaler->attr))
>>   				list_move_tail(&p->dfs_link, &dfs);
>>   		}
>>   	}
>> @@ -247,7 +249,8 @@ static void __i915_schedule(struct i915_sched_node *node,
>>   		lockdep_assert_held(&sched_engine->lock);
>>   
>>   		/* Recheck after acquiring the engine->timeline.lock */
>> -		if (prio <= node->attr.priority || node_signaled(node))
>> +		if (prio <= i915_sched_attr_priority(&node->attr) ||
>> +		    node_signaled(node))
>>   			continue;
>>   
>>   		GEM_BUG_ON(node_to_request(node)->engine->sched_engine !=
>> @@ -257,7 +260,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>>   		if (sched_engine->bump_inflight_request_prio)
>>   			sched_engine->bump_inflight_request_prio(from, attr);
>>   
>> -		WRITE_ONCE(node->attr.priority, prio);
>> +		WRITE_ONCE(node->attr, *attr);
>>   
>>   		/*
>>   		 * Once the request is ready, it will be placed into the
>> @@ -305,6 +308,7 @@ void i915_sched_node_init(struct i915_sched_node *node)
>>   void i915_sched_node_reinit(struct i915_sched_node *node)
>>   {
>>   	node->attr.priority = I915_PRIORITY_INVALID;
>> +	node->attr.nice = 0;
>>   	node->semaphores = 0;
>>   	node->flags = 0;
>>   
>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
>> index 0b9b86af6c7f..75ccc9f55d14 100644
>> --- a/drivers/gpu/drm/i915/i915_scheduler.h
>> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
>> @@ -38,6 +38,20 @@ void i915_sched_node_fini(struct i915_sched_node *node);
>>   void i915_schedule(struct i915_request *request,
>>   		   const struct i915_sched_attr *attr);
>>   
>> +static inline int i915_sched_attr_priority(const struct i915_sched_attr *attr)
>> +{
>> +	int prio = attr->priority;
>> +
>> +	/*
>> +	 * Only allow I915_CONTEXT_DEFAULT_PRIORITY to be affected by the
>> +	 * nice setting.
>> +	 */
>> +	if (!prio)
>> +		prio = -attr->nice;
>> +
>> +	return prio;
>> +}
>> +
>>   struct list_head *
>>   i915_sched_lookup_priolist(struct i915_sched_engine *sched_engine, int prio);
>>   
>> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
>> index 24b9ac1c2ce2..159237aa7609 100644
>> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
>> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
>> @@ -29,6 +29,14 @@ struct i915_sched_attr {
>>   	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
>>   	 */
>>   	int priority;
>> +
>> +	/**
>> +	 * @nice: context nice level
>> +	 *
>> +	 * Nice level follows the CPU scheduler nice value as set for the
>> +	 * process owning the GPU context.
>> +	 */
>> +	int nice;
>>   };
>>   
>>   /*
>> -- 
>> 2.30.2
>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute
  2021-10-06 17:12   ` Matthew Brost
@ 2021-10-06 19:06     ` Tvrtko Ursulin
  2021-10-13 12:01     ` [Intel-gfx] " Daniel Vetter
  1 sibling, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-06 19:06 UTC (permalink / raw)
  To: Matthew Brost
  Cc: Intel-gfx, dri-devel, linux-kernel, Tvrtko Ursulin,
	Daniele Ceraolo Spurio


On 06/10/2021 18:12, Matthew Brost wrote:
> On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
>> i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
>> priority management") introduced some scheduling related vfuncs which
>> take integer request priority as argument.
>>
>> Make them instead take struct i915_sched_attr, which is the type
>> encapsulating this information, so it probably aligns with the design
>> better. It definitely enables extending the set of scheduling attributes.
>>
> 
> Understand the motivation here but the i915_scheduler is going to
> disapear when we move to the DRM scheduler or at least its functionality
> of priority inheritance will be pushed into the DRM scheduler. I'd be
> very careful making any changes here as the priority in the DRM
> scheduler is defined as single enum:
> 
> /* These are often used as an (initial) index
>   * to an array, and as such should start at 0.
>   */
> enum drm_sched_priority {
>          DRM_SCHED_PRIORITY_MIN,
>          DRM_SCHED_PRIORITY_NORMAL,
>          DRM_SCHED_PRIORITY_HIGH,
>          DRM_SCHED_PRIORITY_KERNEL,
> 
>          DRM_SCHED_PRIORITY_COUNT,
>          DRM_SCHED_PRIORITY_UNSET = -2
> };
> 
> Adding a field to the i915_sched_attr is fairly easy as we already have
> a structure but changing the DRM scheduler might be a tougher sell.
> Anyway you can make this work without adding the 'nice' field to
> i915_sched_attr? Might be worth exploring so when we move to the DRM
> scheduler this feature drops in a little cleaner.

/me rubs a crystal ball.. no idea how that would look. :)

The idea with a separate nice in the sched attr is that only default 
priority contexts are affected by nice. The thinking being, if i915-aware 
userspace has explicitly requested a different priority, then let it keep 
that. In other words, only if it did not bother and left it at 
default/normal do we inherit from the process nice.

I suppose keeping this idea in some future drm scheduler world would 
require some work. But in general the concept is of course open to 
discussion.

Implementation-wise, not having prio and nice separate opens up some 
other problems in how this interacts with GEM context priority, but there 
are probably solutions to those problems as well.

I suppose I could define a "nice range" in the existing priorities and 
nice adjustments would only apply if within that range. That would allow 
max and min user priority to still be undisturbed as requested via the 
i915 uapi.

(min) -1023 ----- (nice 20) ----- 0 ------ (nice -19) ------ +1023 (max)

And I'd say if some userspace set a priority in the -20 to 19 range then 
it is allowed to be adjusted via the process nice. That would keep it a 
single int priority as today.
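
As a sketch, one possible reading of that single-int variant is below; 
the band macros and the helper are made up for illustration and are not 
part of the series:

#define I915_PRIORITY_NICE_MIN	-20
#define I915_PRIORITY_NICE_MAX	 19

static int i915_prio_apply_nice(int prio, int nice)
{
	/* Explicit uapi priorities outside the nice band are left alone. */
	if (prio < I915_PRIORITY_NICE_MIN || prio > I915_PRIORITY_NICE_MAX)
		return prio;

	/* Inside the band, follow the process nice; lower nice means
	 * higher effective priority, as in i915_sched_attr_priority(). */
	return -nice;
}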

Regards,

Tvrtko

> 
> Matt
> 
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c    | 3 ++-
>>   drivers/gpu/drm/i915/i915_scheduler.c                | 4 ++--
>>   drivers/gpu/drm/i915/i915_scheduler_types.h          | 4 ++--
>>   4 files changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> index 7147fe80919e..e91d803a6453 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> @@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
>>   	return engine->class != RENDER_CLASS;
>>   }
>>   
>> -static void kick_execlists(const struct i915_request *rq, int prio)
>> +static void kick_execlists(const struct i915_request *rq,
>> +			   const struct i915_sched_attr *attr)
>>   {
>>   	struct intel_engine_cs *engine = rq->engine;
>>   	struct i915_sched_engine *sched_engine = engine->sched_engine;
>>   	const struct i915_request *inflight;
>> +	const int prio = attr->priority;
>>   
>>   	/*
>>   	 * We only need to kick the tasklet once for the high priority
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> index ba0de35f6323..b5883a4365ca 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> @@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
>>   }
>>   
>>   static void guc_bump_inflight_request_prio(struct i915_request *rq,
>> -					   int prio)
>> +					   const struct i915_sched_attr *attr)
>>   {
>>   	struct intel_context *ce = rq->context;
>> +	const int prio = attr->priority;
>>   	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>>   
>>   	/* Short circuit function */
>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
>> index 762127dd56c5..534bab99fcdc 100644
>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>> @@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>>   
>>   		/* Must be called before changing the nodes priority */
>>   		if (sched_engine->bump_inflight_request_prio)
>> -			sched_engine->bump_inflight_request_prio(from, prio);
>> +			sched_engine->bump_inflight_request_prio(from, attr);
>>   
>>   		WRITE_ONCE(node->attr.priority, prio);
>>   
>> @@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>>   
>>   		/* Defer (tasklet) submission until after all of our updates. */
>>   		if (sched_engine->kick_backend)
>> -			sched_engine->kick_backend(node_to_request(node), prio);
>> +			sched_engine->kick_backend(node_to_request(node), attr);
>>   	}
>>   
>>   	spin_unlock(&sched_engine->lock);
>> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
>> index b0a1b58c7893..24b9ac1c2ce2 100644
>> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
>> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
>> @@ -177,13 +177,13 @@ struct i915_sched_engine {
>>   	 * @kick_backend: kick backend after a request's priority has changed
>>   	 */
>>   	void	(*kick_backend)(const struct i915_request *rq,
>> -				int prio);
>> +				const struct i915_sched_attr *attr);
>>   
>>   	/**
>>   	 * @bump_inflight_request_prio: update priority of an inflight request
>>   	 */
>>   	void	(*bump_inflight_request_prio)(struct i915_request *rq,
>> -					      int prio);
>> +					      const struct i915_sched_attr *attr);
>>   
>>   	/**
>>   	 * @retire_inflight_request_prio: indicate request is retired to
>> -- 
>> 2.30.2
>>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 1/8] sched: Add nice value change notifier
  2021-10-06 13:44       ` Tvrtko Ursulin
@ 2021-10-06 20:21         ` Barry Song
  2021-10-07  8:50           ` Tvrtko Ursulin
  0 siblings, 1 reply; 23+ messages in thread
From: Barry Song @ 2021-10-06 20:21 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Wanghui (John),
	Intel-gfx, dri-devel, LKML, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot

On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> Hi,
>
> On 06/10/2021 08:58, Barry Song wrote:
> > On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) <john.wanghui@huawei.com> wrote:
> >>
> >> HI Tvrtko
> >>
> >> On 2021/10/4 22:36, Tvrtko Ursulin wrote:
> >>>    void set_user_nice(struct task_struct *p, long nice)
> >>>    {
> >>>        bool queued, running;
> >>> -     int old_prio;
> >>> +     int old_prio, ret;
> >>>        struct rq_flags rf;
> >>>        struct rq *rq;
> >>>
> >>> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
> >>>
> >>>    out_unlock:
> >>>        task_rq_unlock(rq, p, &rf);
> >>> +
> >>> +     ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
> >>> +     WARN_ON_ONCE(ret != NOTIFY_DONE);
> >>>    }
> >> How about adding a new "io_nice" to task_struct,and move the call chain to
> >> sched_setattr/getattr, there are two benefits:
> >
> > We already have an ionice for block io scheduler. hardly can this new io_nice
> > be generic to all I/O. it seems the patchset is trying to link
> > process' nice with
> > GPU's scheduler, to some extent, it makes more senses than having a
> > common ionice because we have a lot of IO devices in the systems, we don't
> > know which I/O the ionice of task_struct should be applied to.
> >
> > Maybe we could have an ionice dedicated for GPU just like ionice for CFQ
> > of bio/request scheduler.
>
> Thought crossed my mind but I couldn't see the practicality of a 3rd
> nice concept. I mean even to start with I struggle a bit with the
> usefulness of existing ionice vs nice. Like coming up with practical
> examples of usecases where it makes sense to decouple the two priorities.
>
>  From a different angle I did think inheriting CPU nice makes sense for
> GPU workloads. This is because today, and more so in the future,
> computations on a same data set do flow from one to the other.
>
> Like maybe a simple example of batch image processing where CPU decodes,
> GPU does a transform and then CPU encodes. Or a different mix, doesn't
> really matter, since the main point it is one computing pipeline from
> users point of view.
>

I am on it. But I am also seeing two problems here:
1. nice is not global in Linux. For example, if you have two cgroups and
cgroup A has more quota than cgroup B, tasks in B won't win even if they
have a lower nice. cgroups will run proportional-weight, time-based
division of CPU.

2. Historically, we had dynamic nice which was adjusted based on the average
sleep/running time; right now, we don't have dynamic nice, but virtual time
still makes tasks which sleep more preempt other tasks with the same or even
lower nice:
    virtual time += physical time / weight(nice)
So a static nice number doesn't always make sense for deciding preemption.

So it seems your patch only works in some simple situations, for example
no cgroups and tasks with similar sleep/running time.
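
In CFS terms the accounting is roughly the following (cf. calc_delta_fair()
in kernel/sched/fair.c, with NICE_0_LOAD being the weight of a nice-0 task):

	vruntime += delta_exec * NICE_0_LOAD / se->load.weight;

so a task which has run less, for example because it sleeps a lot, ends up
with a smaller vruntime and gets picked ahead of a same-nice or even
lower-nice task which has run more.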

> In this example perhaps everything could be handled in userspace so
> that's another argument to be had. Userspace could query the current
> scheduling attributes before submitting work to the processing pipeline
> and adjust using respective uapi.
>
> Downside would be inability to react to changes after the work is
> already running which may not be too serious limitation outside the
> world of multi-minute compute workloads. And latter are probably special
> case enough that would be configured explicitly.
>
> >>
> >> 1. Decoupled with fair scheduelr. In our use case, high priority tasks often
> >>      use rt scheduler.
> >
> > Is it possible to tell GPU RT as we are telling them CFS nice?
>
> Yes of course. We could create a common notification "data packet" which
> would be sent from both entry points and provide more data than just the
> nice value. Consumers (of the notifier chain) could then decide for
> themselves what they want to do with the data.

RT should have the same problem as CFS once we have cgroups.

>
> Regards,
>
> Tvrtko
>
> >
> >> 2. The range of value don't need to be bound to -20~19 or 0~139
> >>
> >
> > could build a mapping between the priorities of process and GPU. It seems
> > not a big deal.
> >
> > Thanks
> > barry
> >

Thanks
barry

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 1/8] sched: Add nice value change notifier
  2021-10-06 20:21         ` Barry Song
@ 2021-10-07  8:50           ` Tvrtko Ursulin
  2021-10-07  9:09             ` Tvrtko Ursulin
  0 siblings, 1 reply; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-07  8:50 UTC (permalink / raw)
  To: Barry Song
  Cc: Wanghui (John),
	Intel-gfx, dri-devel, LKML, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot


On 06/10/2021 21:21, Barry Song wrote:
> On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> Hi,
>>
>> On 06/10/2021 08:58, Barry Song wrote:
>>> On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) <john.wanghui@huawei.com> wrote:
>>>>
>>>> HI Tvrtko
>>>>
>>>> On 2021/10/4 22:36, Tvrtko Ursulin wrote:
>>>>>     void set_user_nice(struct task_struct *p, long nice)
>>>>>     {
>>>>>         bool queued, running;
>>>>> -     int old_prio;
>>>>> +     int old_prio, ret;
>>>>>         struct rq_flags rf;
>>>>>         struct rq *rq;
>>>>>
>>>>> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
>>>>>
>>>>>     out_unlock:
>>>>>         task_rq_unlock(rq, p, &rf);
>>>>> +
>>>>> +     ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
>>>>> +     WARN_ON_ONCE(ret != NOTIFY_DONE);
>>>>>     }
>>>> How about adding a new "io_nice" to task_struct,and move the call chain to
>>>> sched_setattr/getattr, there are two benefits:
>>>
>>> We already have an ionice for block io scheduler. hardly can this new io_nice
>>> be generic to all I/O. it seems the patchset is trying to link
>>> process' nice with
>>> GPU's scheduler, to some extent, it makes more senses than having a
>>> common ionice because we have a lot of IO devices in the systems, we don't
>>> know which I/O the ionice of task_struct should be applied to.
>>>
>>> Maybe we could have an ionice dedicated for GPU just like ionice for CFQ
>>> of bio/request scheduler.
>>
>> Thought crossed my mind but I couldn't see the practicality of a 3rd
>> nice concept. I mean even to start with I struggle a bit with the
>> usefulness of existing ionice vs nice. Like coming up with practical
>> examples of usecases where it makes sense to decouple the two priorities.
>>
>>   From a different angle I did think inheriting CPU nice makes sense for
>> GPU workloads. This is because today, and more so in the future,
>> computations on a same data set do flow from one to the other.
>>
>> Like maybe a simple example of batch image processing where CPU decodes,
>> GPU does a transform and then CPU encodes. Or a different mix, doesn't
>> really matter, since the main point it is one computing pipeline from
>> users point of view.
>>
> 
> I am on it. but I am also seeing two problems here:
> 1. nice is not global in linux. For example, if you have two cgroups, cgroup A
> has more quota then cgroup B. Tasks in B won't win even if it has a lower nice.
> cgroups will run proportional-weight time-based division of CPU.
> 
> 2. Historically, we had dynamic nice which was adjusted based on the average
> sleep/running time; right now, we don't have dynamic nice, but virtual time
> still make tasks which sleep more preempt other tasks with the same nice
> or even lower nice.
> virtual time += physical time/weight by nice
> so, static nice number doesn't always make sense to decide preemption.
> 
> So it seems your patch only works under some simple situation for example
> no cgroups, tasks have similar sleep/running time.

Yes, I broadly agree with your assessment. Although there are plans for 
adding cgroup support to i915 scheduling, I doubt that control as 
fine-grained, and semantics as exact, as on the CPU side will happen.

Mostly because the drive seems to be towards more micro-controller 
managed scheduling, which adds further challenges in connecting the two 
sides together.

But when you say it is a problem, I would characterize it more as a 
weakness in terms of being only a subset of possible control. It is 
still richer (better?) than what currently exists and, as demonstrated 
with the benchmarks in my cover letter, it can deliver improvements in 
user experience. If in the mid-term future we can extend it with cgroup 
support then the concept should still apply and get closer to how you 
described nice working in the CPU world.

The main question in my mind is whether the idea of adding the 
sched_attr/priority notifier to the kernel can be justified. Because, as 
mentioned before, everything apart from adjusting currently running GPU 
jobs could be done purely in userspace. The stack changes would be quite 
extensive and all, but that is not usually a good enough reason to put 
something in the kernel. That's why it is an RFC, an invitation to 
discuss.

Even ionice inherits from nice (see task_nice_ioprio()) so I think an 
argument can be made for drivers as well.
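
For reference, that inheritance is roughly a one-liner in 
include/linux/ioprio.h:

static inline int task_nice_ioprio(struct task_struct *task)
{
	return (task_nice(task) + 20) / 5;
}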

Regards,

Tvrtko

>> In this example perhaps everything could be handled in userspace so
>> that's another argument to be had. Userspace could query the current
>> scheduling attributes before submitting work to the processing pipeline
>> and adjust using respective uapi.
>>
>> Downside would be inability to react to changes after the work is
>> already running which may not be too serious limitation outside the
>> world of multi-minute compute workloads. And latter are probably special
>> case enough that would be configured explicitly.
>>
>>>>
>>>> 1. Decoupled with fair scheduelr. In our use case, high priority tasks often
>>>>       use rt scheduler.
>>>
>>> Is it possible to tell GPU RT as we are telling them CFS nice?
>>
>> Yes of course. We could create a common notification "data packet" which
>> would be sent from both entry points and provide more data than just the
>> nice value. Consumers (of the notifier chain) could then decide for
>> themselves what they want to do with the data.
> 
> RT should have the same problem with CFS once we have cgroups.
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>>
>>>> 2. The range of value don't need to be bound to -20~19 or 0~139
>>>>
>>>
>>> could build a mapping between the priorities of process and GPU. It seems
>>> not a big deal.
>>>
>>> Thanks
>>> barry
>>>
> 
> Thanks
> barry
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 1/8] sched: Add nice value change notifier
  2021-10-07  8:50           ` Tvrtko Ursulin
@ 2021-10-07  9:09             ` Tvrtko Ursulin
  2021-10-07 10:00               ` Barry Song
  0 siblings, 1 reply; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-07  9:09 UTC (permalink / raw)
  To: Barry Song
  Cc: Wanghui (John),
	Intel-gfx, dri-devel, LKML, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot


On 07/10/2021 09:50, Tvrtko Ursulin wrote:
> 
> On 06/10/2021 21:21, Barry Song wrote:
>> On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin
>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> On 06/10/2021 08:58, Barry Song wrote:
>>>> On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) 
>>>> <john.wanghui@huawei.com> wrote:
>>>>>
>>>>> HI Tvrtko
>>>>>
>>>>> On 2021/10/4 22:36, Tvrtko Ursulin wrote:
>>>>>>     void set_user_nice(struct task_struct *p, long nice)
>>>>>>     {
>>>>>>         bool queued, running;
>>>>>> -     int old_prio;
>>>>>> +     int old_prio, ret;
>>>>>>         struct rq_flags rf;
>>>>>>         struct rq *rq;
>>>>>>
>>>>>> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, 
>>>>>> long nice)
>>>>>>
>>>>>>     out_unlock:
>>>>>>         task_rq_unlock(rq, p, &rf);
>>>>>> +
>>>>>> +     ret = atomic_notifier_call_chain(&user_nice_notifier_list, 
>>>>>> nice, p);
>>>>>> +     WARN_ON_ONCE(ret != NOTIFY_DONE);
>>>>>>     }
>>>>> How about adding a new "io_nice" to task_struct,and move the call 
>>>>> chain to
>>>>> sched_setattr/getattr, there are two benefits:
>>>>
>>>> We already have an ionice for block io scheduler. hardly can this 
>>>> new io_nice
>>>> be generic to all I/O. it seems the patchset is trying to link
>>>> process' nice with
>>>> GPU's scheduler, to some extent, it makes more senses than having a
>>>> common ionice because we have a lot of IO devices in the systems, we 
>>>> don't
>>>> know which I/O the ionice of task_struct should be applied to.
>>>>
>>>> Maybe we could have an ionice dedicated for GPU just like ionice for 
>>>> CFQ
>>>> of bio/request scheduler.
>>>
>>> Thought crossed my mind but I couldn't see the practicality of a 3rd
>>> nice concept. I mean even to start with I struggle a bit with the
>>> usefulness of existing ionice vs nice. Like coming up with practical
>>> examples of usecases where it makes sense to decouple the two 
>>> priorities.
>>>
>>>   From a different angle I did think inheriting CPU nice makes sense for
>>> GPU workloads. This is because today, and more so in the future,
>>> computations on a same data set do flow from one to the other.
>>>
>>> Like maybe a simple example of batch image processing where CPU decodes,
>>> GPU does a transform and then CPU encodes. Or a different mix, doesn't
>>> really matter, since the main point it is one computing pipeline from
>>> users point of view.
>>>
>>
>> I am on it. but I am also seeing two problems here:
>> 1. nice is not global in linux. For example, if you have two cgroups, 
>> cgroup A
>> has more quota then cgroup B. Tasks in B won't win even if it has a 
>> lower nice.
>> cgroups will run proportional-weight time-based division of CPU.
>>
>> 2. Historically, we had dynamic nice which was adjusted based on the 
>> average
>> sleep/running time; right now, we don't have dynamic nice, but virtual 
>> time
>> still make tasks which sleep more preempt other tasks with the same nice
>> or even lower nice.
>> virtual time += physical time/weight by nice
>> so, static nice number doesn't always make sense to decide preemption.
>>
>> So it seems your patch only works under some simple situation for example
>> no cgroups, tasks have similar sleep/running time.
> 
> Yes, I broadly agree with your assessment. Although there are plans for 
> adding cgroup support to i915 scheduling, I doubt as fine grained 
> control and exact semantics as there are on the CPU side will happen.
> 
> Mostly because the drive seems to be for more micro-controller managed 
> scheduling which adds further challenges in connecting the two sides 
> together.
> 
> But when you say it is a problem, I would characterize it more a 
> weakness in terms of being only a subset of possible control. It is 
> still richer (better?) than what currently exists and as demonstrated 
> with benchmarks in my cover letter it can deliver improvements in user 
> experience. If in the mid term future we can extend it with cgroup 
> support then the concept should still apply and get closer to how you 
> described nice works in the CPU world.
> 
> Main question in my mind is whether the idea of adding the 
> sched_attr/priority notifier to the kernel can be justified. Because as 
> mentioned before, everything apart from adjusting currently running GPU 
> jobs could be done purely in userspace. Stack changes would be quite 
> extensive and all, but that is not usually a good enough reason to put 
> something in the kernel. That's why it is an RFC an invitation to discuss.
> 
> Even ionice inherits from nice (see task_nice_ioprio()) so I think 
> argument can be made for drivers as well.

Now that I have written this down, I had a bit of a light bulb moment. If 
I abandon the idea of adjusting the priority of already submitted work 
items, then I can do much of what I want purely within the confines of 
i915.

I can simply add code to inherit from the current task's nice on every 
new work item submission. This should probably capture the majority of 
the benefit I measured.
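
Something along these lines, purely as a sketch (the exact mapping and 
the point where it is applied are still to be decided; the uapi context 
priority range is +-1023 if I remember correctly):

/* Sketch: fold the submitting task's nice into the request priority.
 * Negative nice (higher CPU priority) maps to higher GPU priority;
 * the linear scaling is arbitrary. */
static int nice_to_request_prio(struct task_struct *tsk)
{
	/* nice -20..19 -> roughly +1023..-971 on the uapi priority scale */
	return -task_nice(tsk) * I915_CONTEXT_MAX_USER_PRIORITY / 20;
}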

Regards,

Tvrtko


* Re: [RFC 1/8] sched: Add nice value change notifier
  2021-10-07  9:09             ` Tvrtko Ursulin
@ 2021-10-07 10:00               ` Barry Song
  0 siblings, 0 replies; 23+ messages in thread
From: Barry Song @ 2021-10-07 10:00 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Wanghui (John),
	Intel-gfx, dri-devel, LKML, Tvrtko Ursulin, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot

On Thu, Oct 7, 2021 at 10:09 PM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 07/10/2021 09:50, Tvrtko Ursulin wrote:
> >
> > On 06/10/2021 21:21, Barry Song wrote:
> >> On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin
> >> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>
> >>>
> >>> Hi,
> >>>
> >>> On 06/10/2021 08:58, Barry Song wrote:
> >>>> On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John)
> >>>> <john.wanghui@huawei.com> wrote:
> >>>>>
> >>>>> HI Tvrtko
> >>>>>
> >>>>> On 2021/10/4 22:36, Tvrtko Ursulin wrote:
> >>>>>>     void set_user_nice(struct task_struct *p, long nice)
> >>>>>>     {
> >>>>>>         bool queued, running;
> >>>>>> -     int old_prio;
> >>>>>> +     int old_prio, ret;
> >>>>>>         struct rq_flags rf;
> >>>>>>         struct rq *rq;
> >>>>>>
> >>>>>> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p,
> >>>>>> long nice)
> >>>>>>
> >>>>>>     out_unlock:
> >>>>>>         task_rq_unlock(rq, p, &rf);
> >>>>>> +
> >>>>>> +     ret = atomic_notifier_call_chain(&user_nice_notifier_list,
> >>>>>> nice, p);
> >>>>>> +     WARN_ON_ONCE(ret != NOTIFY_DONE);
> >>>>>>     }
> >>>>> How about adding a new "io_nice" to task_struct,and move the call
> >>>>> chain to
> >>>>> sched_setattr/getattr, there are two benefits:
> >>>>
> >>>> We already have an ionice for block io scheduler. hardly can this
> >>>> new io_nice
> >>>> be generic to all I/O. it seems the patchset is trying to link
> >>>> process' nice with
> >>>> GPU's scheduler, to some extent, it makes more senses than having a
> >>>> common ionice because we have a lot of IO devices in the systems, we
> >>>> don't
> >>>> know which I/O the ionice of task_struct should be applied to.
> >>>>
> >>>> Maybe we could have an ionice dedicated for GPU just like ionice for
> >>>> CFQ
> >>>> of bio/request scheduler.
> >>>
> >>> Thought crossed my mind but I couldn't see the practicality of a 3rd
> >>> nice concept. I mean even to start with I struggle a bit with the
> >>> usefulness of existing ionice vs nice. Like coming up with practical
> >>> examples of usecases where it makes sense to decouple the two
> >>> priorities.
> >>>
> >>>   From a different angle I did think inheriting CPU nice makes sense for
> >>> GPU workloads. This is because today, and more so in the future,
> >>> computations on a same data set do flow from one to the other.
> >>>
> >>> Like maybe a simple example of batch image processing where CPU decodes,
> >>> GPU does a transform and then CPU encodes. Or a different mix, doesn't
> >>> really matter, since the main point it is one computing pipeline from
> >>> users point of view.
> >>>
> >>
> >> I am on it. but I am also seeing two problems here:
> >> 1. nice is not global in linux. For example, if you have two cgroups,
> >> cgroup A
> >> has more quota then cgroup B. Tasks in B won't win even if it has a
> >> lower nice.
> >> cgroups will run proportional-weight time-based division of CPU.
> >>
> >> 2. Historically, we had dynamic nice which was adjusted based on the
> >> average
> >> sleep/running time; right now, we don't have dynamic nice, but virtual
> >> time
> >> still make tasks which sleep more preempt other tasks with the same nice
> >> or even lower nice.
> >> virtual time += physical time/weight by nice
> >> so, static nice number doesn't always make sense to decide preemption.
> >>
> >> So it seems your patch only works under some simple situation for example
> >> no cgroups, tasks have similar sleep/running time.
> >
> > Yes, I broadly agree with your assessment. Although there are plans for
> > adding cgroup support to i915 scheduling, I doubt as fine grained
> > control and exact semantics as there are on the CPU side will happen.
> >
> > Mostly because the drive seems to be for more micro-controller managed
> > scheduling which adds further challenges in connecting the two sides
> > together.
> >
> > But when you say it is a problem, I would characterize it more a
> > weakness in terms of being only a subset of possible control. It is
> > still richer (better?) than what currently exists and as demonstrated
> > with benchmarks in my cover letter it can deliver improvements in user
> > experience. If in the mid term future we can extend it with cgroup
> > support then the concept should still apply and get closer to how you
> > described nice works in the CPU world.
> >
> > Main question in my mind is whether the idea of adding the
> > sched_attr/priority notifier to the kernel can be justified. Because as
> > mentioned before, everything apart from adjusting currently running GPU
> > jobs could be done purely in userspace. Stack changes would be quite
> > extensive and all, but that is not usually a good enough reason to put
> > something in the kernel. That's why it is an RFC an invitation to discuss.
> >
> > Even ionice inherits from nice (see task_nice_ioprio()) so I think
> > argument can be made for drivers as well.
>
> Now that I wrote this, I had a little bit of a light bulb moment. If I
> abandon the idea of adjusting the priority of already submitted work
> items, then I can do much of what I want purely from within the confines
> of i915.
>
> I simply add code to inherit from current task nice on every new work
> item submission. This should probably bring the majority of the benefit
> I measured.

I think the idea of linking the process's priority with the GPU's
scheduler makes sense. I have no doubt about this.
My question is more about what the best way to implement it is.

Android has a bg_non_interactive cgroup with a much lower weight for
background processes. Interactive tasks, on the other hand, are placed
in another cgroup with a much higher weight. So Android depends on
cgroups to improve the user experience.

The Chrome browser in your cover letter uses nice to de-prioritise
background tabs. That works perfectly as long as the whole of Chrome is
in the same cgroup, since changing nice will increase or decrease the
share of resources given to tasks within that cgroup. But once we have
two cgroups, feeding a nice value which only has meaning within its own
cgroup into the global scheduler of the GPU will somehow break the aim.

For example, if we have two cgroups A and B:
/sys/fs/cgroup/cpu$ sudo sh -c 'echo 4096 > A/cpu.shares'
/sys/fs/cgroup/cpu$ sudo sh -c 'echo 512 > B/cpu.shares'

A task in B with a lower nice will get more GPU than a task in A, yet
group A actually has 8x the weight of B. So the result seems wrong,
especially since real users like Android do depend on cgroups.
I don't know how to overcome this "weakness"; it seems not easy.
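
To put rough numbers on it (a plain userspace illustration, assuming one
runnable task per group): the cgroup shares alone decide the CPU split,
so the lower nice in B does not help at all on the CPU side, yet a GPU
scheduler keyed purely on nice would put B's task first.

#include <stdio.h>

int main(void)
{
	/* cpu.shares from the example above */
	double shares_a = 4096, shares_b = 512;
	double total = shares_a + shares_b;

	/* One runnable task per group: only the group weights matter
	 * for the CPU split, nice inside a group is irrelevant here. */
	printf("task in A (nice 0):   %.0f%% CPU\n", 100 * shares_a / total);
	printf("task in B (nice -10): %.0f%% CPU\n", 100 * shares_b / total);

	return 0;
}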

>
> Regards,
>
> Tvrtko

Thanks
barry


* Re: [Intel-gfx] [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute
  2021-10-06 17:12   ` Matthew Brost
  2021-10-06 19:06     ` Tvrtko Ursulin
@ 2021-10-13 12:01     ` Daniel Vetter
  2021-10-13 15:50       ` Tvrtko Ursulin
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Vetter @ 2021-10-13 12:01 UTC (permalink / raw)
  To: Matthew Brost
  Cc: Tvrtko Ursulin, Intel-gfx, dri-devel, linux-kernel,
	Tvrtko Ursulin, Daniele Ceraolo Spurio

On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:
> On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
> > i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
> > priority management") introduced some scheduling related vfuncs which
> > take integer request priority as argument.
> > 
> > Make them instead take struct i915_sched_attr, which is the type
> > encapsulating this information, so it probably aligns with the design
> > better. It definitely enables extending the set of scheduling attributes.
> > 
> 
> Understand the motivation here but the i915_scheduler is going to
> disappear when we move to the DRM scheduler or at least its functionality
> of priority inheritance will be pushed into the DRM scheduler. I'd be
> very careful making any changes here as the priority in the DRM
> scheduler is defined as single enum:

Yeah, I'm not sure it makes sense to build this and make the conversion to
drm/sched even harder. We've already merged a lot of code with a "we'll
totally convert to drm/sched right after" promise; there's not really room
for more fun like this built on top of i915-scheduler.
-Daniel

> 
> /* These are often used as an (initial) index
>  * to an array, and as such should start at 0.
>  */
> enum drm_sched_priority {
>         DRM_SCHED_PRIORITY_MIN,
>         DRM_SCHED_PRIORITY_NORMAL,
>         DRM_SCHED_PRIORITY_HIGH,
>         DRM_SCHED_PRIORITY_KERNEL,
> 
>         DRM_SCHED_PRIORITY_COUNT,
>         DRM_SCHED_PRIORITY_UNSET = -2
> };
> 
> Adding a field to the i915_sched_attr is fairly easy as we already have
> a structure but changing the DRM scheduler might be a tougher sell.
> Any way you can make this work without adding the 'nice' field to
> i915_sched_attr? Might be worth exploring so when we move to the DRM
> scheduler this feature drops in a little cleaner.
> 
> Matt
> 
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c    | 3 ++-
> >  drivers/gpu/drm/i915/i915_scheduler.c                | 4 ++--
> >  drivers/gpu/drm/i915/i915_scheduler_types.h          | 4 ++--
> >  4 files changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index 7147fe80919e..e91d803a6453 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
> >  	return engine->class != RENDER_CLASS;
> >  }
> >  
> > -static void kick_execlists(const struct i915_request *rq, int prio)
> > +static void kick_execlists(const struct i915_request *rq,
> > +			   const struct i915_sched_attr *attr)
> >  {
> >  	struct intel_engine_cs *engine = rq->engine;
> >  	struct i915_sched_engine *sched_engine = engine->sched_engine;
> >  	const struct i915_request *inflight;
> > +	const int prio = attr->priority;
> >  
> >  	/*
> >  	 * We only need to kick the tasklet once for the high priority
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index ba0de35f6323..b5883a4365ca 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
> >  }
> >  
> >  static void guc_bump_inflight_request_prio(struct i915_request *rq,
> > -					   int prio)
> > +					   const struct i915_sched_attr *attr)
> >  {
> >  	struct intel_context *ce = rq->context;
> > +	const int prio = attr->priority;
> >  	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
> >  
> >  	/* Short circuit function */
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > index 762127dd56c5..534bab99fcdc 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,
> >  
> >  		/* Must be called before changing the nodes priority */
> >  		if (sched_engine->bump_inflight_request_prio)
> > -			sched_engine->bump_inflight_request_prio(from, prio);
> > +			sched_engine->bump_inflight_request_prio(from, attr);
> >  
> >  		WRITE_ONCE(node->attr.priority, prio);
> >  
> > @@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node *node,
> >  
> >  		/* Defer (tasklet) submission until after all of our updates. */
> >  		if (sched_engine->kick_backend)
> > -			sched_engine->kick_backend(node_to_request(node), prio);
> > +			sched_engine->kick_backend(node_to_request(node), attr);
> >  	}
> >  
> >  	spin_unlock(&sched_engine->lock);
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> > index b0a1b58c7893..24b9ac1c2ce2 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> > +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> > @@ -177,13 +177,13 @@ struct i915_sched_engine {
> >  	 * @kick_backend: kick backend after a request's priority has changed
> >  	 */
> >  	void	(*kick_backend)(const struct i915_request *rq,
> > -				int prio);
> > +				const struct i915_sched_attr *attr);
> >  
> >  	/**
> >  	 * @bump_inflight_request_prio: update priority of an inflight request
> >  	 */
> >  	void	(*bump_inflight_request_prio)(struct i915_request *rq,
> > -					      int prio);
> > +					      const struct i915_sched_attr *attr);
> >  
> >  	/**
> >  	 * @retire_inflight_request_prio: indicate request is retired to
> > -- 
> > 2.30.2
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Intel-gfx] [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute
  2021-10-13 12:01     ` [Intel-gfx] " Daniel Vetter
@ 2021-10-13 15:50       ` Tvrtko Ursulin
  0 siblings, 0 replies; 23+ messages in thread
From: Tvrtko Ursulin @ 2021-10-13 15:50 UTC (permalink / raw)
  To: Matthew Brost, Intel-gfx, dri-devel, linux-kernel,
	Tvrtko Ursulin, Daniele Ceraolo Spurio


On 13/10/2021 13:01, Daniel Vetter wrote:
> On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:
>> On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
>>> i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
>>> priority management") introduced some scheduling related vfuncs which
>>> take integer request priority as argument.
>>>
>>> Make them instead take struct i915_sched_attr, which is the type
>>> encapsulating this information, so it probably aligns with the design
>>> better. It definitely enables extending the set of scheduling attributes.
>>>
>>
>> Understand the motivation here but the i915_scheduler is going to
>> disappear when we move to the DRM scheduler or at least its functionality
>> of priority inheritance will be pushed into the DRM scheduler. I'd be
>> very careful making any changes here as the priority in the DRM
>> scheduler is defined as single enum:
> 
> Yeah I'm not sure it makes sense to build this and make the conversion to
> drm/sched even harder. We've already merged a lot of code with a "we'll
> totally convert to drm/sched right after" promise, there's not really room
> for more fun like this built on top of i915-scheduler.

It is not really fun on top of the i915 scheduler. It is fun on top of 
the concept of uapi GEM context priority. As long as there is GEM context 
priority, and requests inherit from it, the concept works. This is 
demonstrated by the fact that it ties in with the GuC backend, which 
already reduces the priority range to three levels. The granularity is 
limited but it does something.
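
As an illustration of what I mean by reduced granularity, the backend 
ends up collapsing the wide integer priority range into a handful of 
levels, along these lines (names below are placeholders, not the real 
map_i915_prio_to_guc_prio()):

/* Placeholder illustration only. */
enum fw_prio { FW_PRIO_LOW, FW_PRIO_NORMAL, FW_PRIO_HIGH };

static enum fw_prio collapse_prio(int prio)
{
	if (prio < 0)
		return FW_PRIO_LOW;
	if (prio == 0)
		return FW_PRIO_NORMAL;
	return FW_PRIO_HIGH;
}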

Implementation details aside, the key question is the proposal to tie 
process nice to GPU scheduling priority. There seems to be interest 
from other parties, so there is probably something here.

But I do plan to simplify this RFC to not add anything to 
i915_sched_attr and also drop the task sched attr change notifier.

Regards,

Tvrtko

> -Daniel
> 
>>
>> /* These are often used as an (initial) index
>>   * to an array, and as such should start at 0.
>>   */
>> enum drm_sched_priority {
>>          DRM_SCHED_PRIORITY_MIN,
>>          DRM_SCHED_PRIORITY_NORMAL,
>>          DRM_SCHED_PRIORITY_HIGH,
>>          DRM_SCHED_PRIORITY_KERNEL,
>>
>>          DRM_SCHED_PRIORITY_COUNT,
>>          DRM_SCHED_PRIORITY_UNSET = -2
>> };
>>
>> Adding a field to the i915_sched_attr is fairly easy as we already have
>> a structure but changing the DRM scheduler might be a tougher sell.
>> Any way you can make this work without adding the 'nice' field to
>> i915_sched_attr? Might be worth exploring so when we move to the DRM
>> scheduler this feature drops in a little cleaner.
>>
>> Matt
>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c    | 3 ++-
>>>   drivers/gpu/drm/i915/i915_scheduler.c                | 4 ++--
>>>   drivers/gpu/drm/i915/i915_scheduler_types.h          | 4 ++--
>>>   4 files changed, 9 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>>> index 7147fe80919e..e91d803a6453 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>>> @@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
>>>   	return engine->class != RENDER_CLASS;
>>>   }
>>>   
>>> -static void kick_execlists(const struct i915_request *rq, int prio)
>>> +static void kick_execlists(const struct i915_request *rq,
>>> +			   const struct i915_sched_attr *attr)
>>>   {
>>>   	struct intel_engine_cs *engine = rq->engine;
>>>   	struct i915_sched_engine *sched_engine = engine->sched_engine;
>>>   	const struct i915_request *inflight;
>>> +	const int prio = attr->priority;
>>>   
>>>   	/*
>>>   	 * We only need to kick the tasklet once for the high priority
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>>> index ba0de35f6323..b5883a4365ca 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>>> @@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
>>>   }
>>>   
>>>   static void guc_bump_inflight_request_prio(struct i915_request *rq,
>>> -					   int prio)
>>> +					   const struct i915_sched_attr *attr)
>>>   {
>>>   	struct intel_context *ce = rq->context;
>>> +	const int prio = attr->priority;
>>>   	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
>>>   
>>>   	/* Short circuit function */
>>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
>>> index 762127dd56c5..534bab99fcdc 100644
>>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>>> @@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>>>   
>>>   		/* Must be called before changing the nodes priority */
>>>   		if (sched_engine->bump_inflight_request_prio)
>>> -			sched_engine->bump_inflight_request_prio(from, prio);
>>> +			sched_engine->bump_inflight_request_prio(from, attr);
>>>   
>>>   		WRITE_ONCE(node->attr.priority, prio);
>>>   
>>> @@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node *node,
>>>   
>>>   		/* Defer (tasklet) submission until after all of our updates. */
>>>   		if (sched_engine->kick_backend)
>>> -			sched_engine->kick_backend(node_to_request(node), prio);
>>> +			sched_engine->kick_backend(node_to_request(node), attr);
>>>   	}
>>>   
>>>   	spin_unlock(&sched_engine->lock);
>>> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
>>> index b0a1b58c7893..24b9ac1c2ce2 100644
>>> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
>>> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
>>> @@ -177,13 +177,13 @@ struct i915_sched_engine {
>>>   	 * @kick_backend: kick backend after a request's priority has changed
>>>   	 */
>>>   	void	(*kick_backend)(const struct i915_request *rq,
>>> -				int prio);
>>> +				const struct i915_sched_attr *attr);
>>>   
>>>   	/**
>>>   	 * @bump_inflight_request_prio: update priority of an inflight request
>>>   	 */
>>>   	void	(*bump_inflight_request_prio)(struct i915_request *rq,
>>> -					      int prio);
>>> +					      const struct i915_sched_attr *attr);
>>>   
>>>   	/**
>>>   	 * @retire_inflight_request_prio: indicate request is retired to
>>> -- 
>>> 2.30.2
>>>
> 


Thread overview: 23+ messages
2021-10-04 14:36 [RFC v2 0/8] CPU + GPU synchronised priority scheduling Tvrtko Ursulin
2021-10-04 14:36 ` [RFC 1/8] sched: Add nice value change notifier Tvrtko Ursulin
2021-10-06  4:10   ` Wanghui (John)
2021-10-06  7:58     ` Barry Song
2021-10-06 13:44       ` Tvrtko Ursulin
2021-10-06 20:21         ` Barry Song
2021-10-07  8:50           ` Tvrtko Ursulin
2021-10-07  9:09             ` Tvrtko Ursulin
2021-10-07 10:00               ` Barry Song
2021-10-04 14:36 ` [RFC 2/8] drm/i915: Explicitly track DRM clients Tvrtko Ursulin
2021-10-04 14:36 ` [RFC 3/8] drm/i915: Make GEM contexts " Tvrtko Ursulin
2021-10-04 14:36 ` [RFC 4/8] drm/i915: Track all user contexts per client Tvrtko Ursulin
2021-10-04 14:36 ` [RFC 5/8] drm/i915: Keep track of registered clients indexed by task struct Tvrtko Ursulin
2021-10-04 14:36 ` [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute Tvrtko Ursulin
2021-10-06 17:12   ` Matthew Brost
2021-10-06 19:06     ` Tvrtko Ursulin
2021-10-13 12:01     ` [Intel-gfx] " Daniel Vetter
2021-10-13 15:50       ` Tvrtko Ursulin
2021-10-04 14:36 ` [RFC 7/8] drm/i915: Inherit process nice for context scheduling priority Tvrtko Ursulin
2021-10-06 17:16   ` [Intel-gfx] " Matthew Brost
2021-10-06 17:24   ` Matthew Brost
2021-10-06 18:42     ` Tvrtko Ursulin
2021-10-04 14:36 ` [RFC 8/8] drm/i915: Connect with the process nice change notifier Tvrtko Ursulin
