linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints
@ 2020-02-28  9:07 Parth Shah
  2020-02-28  9:07 ` [PATCH v5 1/4] sched: Introduce latency-nice as a per-task attribute Parth Shah
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Parth Shah @ 2020-02-28  9:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, qais.yousef,
	chris.hyser, pkondeti, patrick.bellasi, valentin.schneider,
	David.Laight, pjt, pavel, tj, dhaval.giani, qperret, tim.c.chen

This is the 5th revision of the patch set to introduce latency_nice as a
per task attribute.

The previous version can be found at:
v1: https://lkml.org/lkml/2019/11/25/151
v2: https://lkml.org/lkml/2019/12/8/10
v3: https://lkml.org/lkml/2020/1/16/319
v4: https://lkml.org/lkml/2020/2/24/216

Changes in this revision are:
v4->v5:
- Added debugging prints in /proc/<pid>/sched for latency_nice ( based on
  suggestion from Pavan Kondeti )
- Initialized init_task with latency_nice = 0
- Collected review tag and added few minor fixes.
v3->v4:
- Based on Chris's comment, added security check to refrain non-root user
  set lower value than the current latency_nice values.
v2 -> v3:
- This series changes the longer attribute name to "latency_nice" as per
  the comment from Dietmar Eggemann https://lkml.org/lkml/2019/12/5/394
v1 -> v2:
- Addressed comments from Qais Yousef
- As per suggestion from Dietmar, moved content from newly created
  include/linux/sched/latency_tolerance.h to kernel/sched/sched.h
- Extend sched_setattr() to support latency_tolerance in tools headers UAPI


Introduction:
==============
This patch series introduces a new per-task attribute latency_nice to
provide the scheduler hints about the latency requirements of the task [1].

Latency_nice is a ranged attribute of a task with the value ranging
from [-20, 19] both inclusive which makes it align with the task nice
value.

The value should provide scheduler hints about the relative latency
requirements of tasks, meaning the task with "latency_nice = -20"
should have lower latency requirements than compared to those tasks with
higher values. Similarly a task with "latency_nice = 19" can have higher
latency and hence such tasks may not care much about latency.

The default value is set to 0. The usecases discussed below can use this
range of [-20, 19] for latency_nice for the specific purpose. This
patch does not implement any use cases for such attribute so that any
change in naming or range does not affect much to the other (future)
patches using this. The actual use of latency_nice during task wakeup
and load-balancing is yet to be coded for each of those usecases.

As per my view, this defined attribute can be used in following ways for a
some of the usecases:
1 Reduce search scan time for select_idle_cpu():
- Reduce search scans for finding idle CPU for a waking task with lower
  latency_nice values.

2 TurboSched:
- Classify the tasks with higher latency_nice values as a small
  background task given that its historic utilization is very low, for
  which the scheduler can search for more number of cores to do task
  packing.  A task with a latency_nice >= some_threshold (e.g, == 19)
  and util <= 12.5% can be background tasks.

3 Optimize AVX512 based workload:
- Bias scheduler to not put a task having (latency_nice == -20) on a
  core occupying AVX512 based workload.


Series Organization:
====================
- Patch 1-3: Add support for latency_nice attr in the task struct using
  	     sched_{set,get}attr syscall
- Patch 4  : Add permission checks for setting the value.


The patch series can be applied on tip/sched/core at the
commit a0f03b617c3b ("sched/numa: Stop an exhastive search if a reasonable swap candidate or idle CPU is found")


References:
============
[1]. Usecases for the per-task latency-nice attribute,
     https://lkml.org/lkml/2019/9/30/215
[2]. Task Latency-nice, "Subhra Mazumdar",
     https://lkml.org/lkml/2019/8/30/829
[3]. Introduce per-task latency_tolerance for scheduler hints,
     https://lkml.org/lkml/2019/12/8/10



Parth Shah (4):
  sched: Introduce latency-nice as a per-task attribute
  sched/core: Propagate parent task's latency requirements to the child
    task
  sched: Allow sched_{get,set}attr to change latency_nice of the task
  sched/core: Add permission checks for setting the latency_nice value

 include/linux/sched.h            |  1 +
 include/uapi/linux/sched.h       |  4 +++-
 include/uapi/linux/sched/types.h | 19 +++++++++++++++++++
 init/init_task.c                 |  1 +
 kernel/sched/core.c              | 26 ++++++++++++++++++++++++++
 kernel/sched/debug.c             |  1 +
 kernel/sched/sched.h             | 18 ++++++++++++++++++
 tools/include/uapi/linux/sched.h |  4 +++-
 8 files changed, 72 insertions(+), 2 deletions(-)

-- 
2.17.2


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 1/4] sched: Introduce latency-nice as a per-task attribute
  2020-02-28  9:07 [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints Parth Shah
@ 2020-02-28  9:07 ` Parth Shah
  2020-02-28  9:07 ` [PATCH v5 2/4] sched/core: Propagate parent task's latency requirements to the child task Parth Shah
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Parth Shah @ 2020-02-28  9:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, qais.yousef,
	chris.hyser, pkondeti, patrick.bellasi, valentin.schneider,
	David.Laight, pjt, pavel, tj, dhaval.giani, qperret, tim.c.chen

Latency-nice indicates the latency requirements of a task with respect
to the other tasks in the system. The value of the attribute can be within
the range of [-20, 19] both inclusive to be in-line with the values just
like task nice values.

latency_nice = -20 indicates the task to have the least latency as
compared to the tasks having latency_nice = +19.

The latency_nice may affect only the CFS SCHED_CLASS by getting
latency requirements from the userspace.

Additionally, add debugging bits for newly added latency_nice attribute.

Signed-off-by: Parth Shah <parth@linux.ibm.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
---
 include/linux/sched.h |  1 +
 kernel/sched/debug.c  |  1 +
 kernel/sched/sched.h  | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 2e9199bf947b..cf23c147dd36 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -679,6 +679,7 @@ struct task_struct {
 	int				static_prio;
 	int				normal_prio;
 	unsigned int			rt_priority;
+	int				latency_nice;
 
 	const struct sched_class	*sched_class;
 	struct sched_entity		se;
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 8331bc04aea2..2a96dadc9752 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -958,6 +958,7 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 #endif
 	P(policy);
 	P(prio);
+	P(latency_nice);
 	if (task_has_dl_policy(p)) {
 		P(dl.runtime);
 		P(dl.deadline);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 2a0caf394dd4..a857e1baebd7 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -101,6 +101,24 @@ extern long calc_load_fold_active(struct rq *this_rq, long adjust);
  */
 #define NS_TO_JIFFIES(TIME)	((unsigned long)(TIME) / (NSEC_PER_SEC / HZ))
 
+/*
+ * Latency nice is meant to provide scheduler hints about the relative
+ * latency requirements of a task with respect to other tasks.
+ * Thus a task with latency_nice == 19 can be hinted as the task with no
+ * latency requirements, in contrast to the task with latency_nice == -20
+ * which should be given priority in terms of lower latency.
+ */
+#define MAX_LATENCY_NICE	19
+#define MIN_LATENCY_NICE	-20
+
+#define LATENCY_NICE_WIDTH	\
+	(MAX_LATENCY_NICE - MIN_LATENCY_NICE + 1)
+
+/*
+ * Default tasks should be treated as a task with latency_nice = 0.
+ */
+#define DEFAULT_LATENCY_NICE	0
+
 /*
  * Increase resolution of nice-level calculations for 64-bit architectures.
  * The extra resolution improves shares distribution and load balancing of
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v5 2/4] sched/core: Propagate parent task's latency requirements to the child task
  2020-02-28  9:07 [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints Parth Shah
  2020-02-28  9:07 ` [PATCH v5 1/4] sched: Introduce latency-nice as a per-task attribute Parth Shah
@ 2020-02-28  9:07 ` Parth Shah
  2020-02-28  9:07 ` [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task Parth Shah
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Parth Shah @ 2020-02-28  9:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, qais.yousef,
	chris.hyser, pkondeti, patrick.bellasi, valentin.schneider,
	David.Laight, pjt, pavel, tj, dhaval.giani, qperret, tim.c.chen

Clone parent task's latency_nice attribute to the forked child task.

Reset the latency_nice value to default value when the child task is
set to sched_reset_on_fork.

Also, initialize init_task.latency_nice value with DEFAULT_LATENCY_NICE
value

Signed-off-by: Parth Shah <parth@linux.ibm.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
---
 init/init_task.c    | 1 +
 kernel/sched/core.c | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/init/init_task.c b/init/init_task.c
index 9e5cbe5eab7b..0d7d865519b8 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -69,6 +69,7 @@ struct task_struct init_task
 	.prio		= MAX_PRIO - 20,
 	.static_prio	= MAX_PRIO - 20,
 	.normal_prio	= MAX_PRIO - 20,
+	.latency_nice	= 0,
 	.policy		= SCHED_NORMAL,
 	.cpus_ptr	= &init_task.cpus_mask,
 	.cpus_mask	= CPU_MASK_ALL,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8e6f38073ab3..866ea3d2d284 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2847,6 +2847,9 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
 	 */
 	p->prio = current->normal_prio;
 
+	/* Propagate the parent's latency requirements to the child as well */
+	p->latency_nice = current->latency_nice;
+
 	uclamp_fork(p);
 
 	/*
@@ -2863,6 +2866,7 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
 		p->prio = p->normal_prio = __normal_prio(p);
 		set_load_weight(p, false);
 
+		p->latency_nice = DEFAULT_LATENCY_NICE;
 		/*
 		 * We don't need the reset flag anymore after the fork. It has
 		 * fulfilled its duty:
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task
  2020-02-28  9:07 [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints Parth Shah
  2020-02-28  9:07 ` [PATCH v5 1/4] sched: Introduce latency-nice as a per-task attribute Parth Shah
  2020-02-28  9:07 ` [PATCH v5 2/4] sched/core: Propagate parent task's latency requirements to the child task Parth Shah
@ 2020-02-28  9:07 ` Parth Shah
  2020-05-11 11:13   ` Dietmar Eggemann
  2020-02-28  9:07 ` [PATCH v5 4/4] sched/core: Add permission checks for setting the latency_nice value Parth Shah
  2020-03-06 16:36 ` [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints chris hyser
  4 siblings, 1 reply; 9+ messages in thread
From: Parth Shah @ 2020-02-28  9:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, qais.yousef,
	chris.hyser, pkondeti, patrick.bellasi, valentin.schneider,
	David.Laight, pjt, pavel, tj, dhaval.giani, qperret, tim.c.chen

Introduce the latency_nice attribute to sched_attr and provide a
mechanism to change the value with the use of sched_setattr/sched_getattr
syscall.

Also add new flag "SCHED_FLAG_LATENCY_NICE" to hint the change in
latency_nice of the task on every sched_setattr syscall.

Signed-off-by: Parth Shah <parth@linux.ibm.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
---
 include/uapi/linux/sched.h       |  4 +++-
 include/uapi/linux/sched/types.h | 19 +++++++++++++++++++
 kernel/sched/core.c              | 18 ++++++++++++++++++
 tools/include/uapi/linux/sched.h |  4 +++-
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 2e3bc22c6f20..8778c51c7953 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -127,6 +127,7 @@ struct clone_args {
 #define SCHED_FLAG_KEEP_PARAMS		0x10
 #define SCHED_FLAG_UTIL_CLAMP_MIN	0x20
 #define SCHED_FLAG_UTIL_CLAMP_MAX	0x40
+#define SCHED_FLAG_LATENCY_NICE		0x80
 
 #define SCHED_FLAG_KEEP_ALL	(SCHED_FLAG_KEEP_POLICY | \
 				 SCHED_FLAG_KEEP_PARAMS)
@@ -138,6 +139,7 @@ struct clone_args {
 			 SCHED_FLAG_RECLAIM		| \
 			 SCHED_FLAG_DL_OVERRUN		| \
 			 SCHED_FLAG_KEEP_ALL		| \
-			 SCHED_FLAG_UTIL_CLAMP)
+			 SCHED_FLAG_UTIL_CLAMP		| \
+			 SCHED_FLAG_LATENCY_NICE)
 
 #endif /* _UAPI_LINUX_SCHED_H */
diff --git a/include/uapi/linux/sched/types.h b/include/uapi/linux/sched/types.h
index c852153ddb0d..626ab61c1145 100644
--- a/include/uapi/linux/sched/types.h
+++ b/include/uapi/linux/sched/types.h
@@ -10,6 +10,7 @@ struct sched_param {
 
 #define SCHED_ATTR_SIZE_VER0	48	/* sizeof first published struct */
 #define SCHED_ATTR_SIZE_VER1	56	/* add: util_{min,max} */
+#define SCHED_ATTR_SIZE_VER2	60	/* add: latency_nice */
 
 /*
  * Extended scheduling parameters data structure.
@@ -96,6 +97,22 @@ struct sched_param {
  * on a CPU with a capacity big enough to fit the specified value.
  * A task with a max utilization value smaller than 1024 is more likely
  * scheduled on a CPU with no more capacity than the specified value.
+ *
+ * Latency Tolerance Attributes
+ * ===========================
+ *
+ * A subset of sched_attr attributes allows to specify the relative latency
+ * requirements of a task with respect to the other tasks running/queued in the
+ * system.
+ *
+ * @ sched_latency_nice	task's latency_nice value
+ *
+ * The latency_nice of a task can have any value in a range of
+ * [LATENCY_NICE_MIN..LATENCY_NICE_MAX].
+ *
+ * A task with latency_nice with the value of LATENCY_NICE_MIN can be
+ * taken for a task with lower latency requirements as opposed to the task with
+ * higher latency_nice.
  */
 struct sched_attr {
 	__u32 size;
@@ -118,6 +135,8 @@ struct sched_attr {
 	__u32 sched_util_min;
 	__u32 sched_util_max;
 
+	/* latency requirement hints */
+	__s32 sched_latency_nice;
 };
 
 #endif /* _UAPI_LINUX_SCHED_TYPES_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 866ea3d2d284..cd1fb9c8be26 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4710,6 +4710,9 @@ static void __setscheduler_params(struct task_struct *p,
 	p->rt_priority = attr->sched_priority;
 	p->normal_prio = normal_prio(p);
 	set_load_weight(p, true);
+
+	if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
+		p->latency_nice = attr->sched_latency_nice;
 }
 
 /* Actually do priority change: must hold pi & rq lock. */
@@ -4867,6 +4870,13 @@ static int __sched_setscheduler(struct task_struct *p,
 			return retval;
 	}
 
+	if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE) {
+		if (attr->sched_latency_nice > MAX_LATENCY_NICE)
+			return -EINVAL;
+		if (attr->sched_latency_nice < MIN_LATENCY_NICE)
+			return -EINVAL;
+	}
+
 	if (pi)
 		cpuset_read_lock();
 
@@ -4901,6 +4911,9 @@ static int __sched_setscheduler(struct task_struct *p,
 			goto change;
 		if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
 			goto change;
+		if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE &&
+		    attr->sched_latency_nice != p->latency_nice)
+			goto change;
 
 		p->sched_reset_on_fork = reset_on_fork;
 		retval = 0;
@@ -5149,6 +5162,9 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
 	    size < SCHED_ATTR_SIZE_VER1)
 		return -EINVAL;
 
+	if ((attr->sched_flags & SCHED_FLAG_LATENCY_NICE) &&
+	    size < SCHED_ATTR_SIZE_VER2)
+		return -EINVAL;
 	/*
 	 * XXX: Do we want to be lenient like existing syscalls; or do we want
 	 * to be strict and return an error on out-of-bounds values?
@@ -5378,6 +5394,8 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 	else
 		kattr.sched_nice = task_nice(p);
 
+	kattr.sched_latency_nice = p->latency_nice;
+
 #ifdef CONFIG_UCLAMP_TASK
 	kattr.sched_util_min = p->uclamp_req[UCLAMP_MIN].value;
 	kattr.sched_util_max = p->uclamp_req[UCLAMP_MAX].value;
diff --git a/tools/include/uapi/linux/sched.h b/tools/include/uapi/linux/sched.h
index 2e3bc22c6f20..94adc87df0e6 100644
--- a/tools/include/uapi/linux/sched.h
+++ b/tools/include/uapi/linux/sched.h
@@ -127,6 +127,7 @@ struct clone_args {
 #define SCHED_FLAG_KEEP_PARAMS		0x10
 #define SCHED_FLAG_UTIL_CLAMP_MIN	0x20
 #define SCHED_FLAG_UTIL_CLAMP_MAX	0x40
+#define SCHED_FLAG_LATENCY_NICE		0X80
 
 #define SCHED_FLAG_KEEP_ALL	(SCHED_FLAG_KEEP_POLICY | \
 				 SCHED_FLAG_KEEP_PARAMS)
@@ -138,6 +139,7 @@ struct clone_args {
 			 SCHED_FLAG_RECLAIM		| \
 			 SCHED_FLAG_DL_OVERRUN		| \
 			 SCHED_FLAG_KEEP_ALL		| \
-			 SCHED_FLAG_UTIL_CLAMP)
+			 SCHED_FLAG_UTIL_CLAMP		| \
+			 SCHED_FLAG_LATENCY_NICE)
 
 #endif /* _UAPI_LINUX_SCHED_H */
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v5 4/4] sched/core: Add permission checks for setting the latency_nice value
  2020-02-28  9:07 [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints Parth Shah
                   ` (2 preceding siblings ...)
  2020-02-28  9:07 ` [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task Parth Shah
@ 2020-02-28  9:07 ` Parth Shah
  2020-03-06 16:36 ` [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints chris hyser
  4 siblings, 0 replies; 9+ messages in thread
From: Parth Shah @ 2020-02-28  9:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, qais.yousef,
	chris.hyser, pkondeti, patrick.bellasi, valentin.schneider,
	David.Laight, pjt, pavel, tj, dhaval.giani, qperret, tim.c.chen

Since the latency_nice uses the similar infrastructure as NICE, use the
already existing CAP_SYS_NICE security checks for the latency_nice. This
should return -EPERM for the non-root user when trying to set the task
latency_nice value to any lower than the current value.

Signed-off-by: Parth Shah <parth@linux.ibm.com>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
---
 kernel/sched/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cd1fb9c8be26..564b3a2036d4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4875,6 +4875,10 @@ static int __sched_setscheduler(struct task_struct *p,
 			return -EINVAL;
 		if (attr->sched_latency_nice < MIN_LATENCY_NICE)
 			return -EINVAL;
+		/* Use the same security checks as NICE */
+		if (attr->sched_latency_nice < p->latency_nice &&
+		    !capable(CAP_SYS_NICE))
+			return -EPERM;
 	}
 
 	if (pi)
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints
  2020-02-28  9:07 [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints Parth Shah
                   ` (3 preceding siblings ...)
  2020-02-28  9:07 ` [PATCH v5 4/4] sched/core: Add permission checks for setting the latency_nice value Parth Shah
@ 2020-03-06 16:36 ` chris hyser
  4 siblings, 0 replies; 9+ messages in thread
From: chris hyser @ 2020-03-06 16:36 UTC (permalink / raw)
  To: Parth Shah, linux-kernel
  Cc: peterz, mingo, vincent.guittot, dietmar.eggemann, qais.yousef,
	pkondeti, patrick.bellasi, valentin.schneider, David.Laight, pjt,
	pavel, tj, dhaval.giani, qperret, tim.c.chen

On 2/28/20 4:07 AM, Parth Shah wrote:
> This is the 5th revision of the patch set to introduce latency_nice as a
> per task attribute.
> 
> The previous version can be found at:
> v1: https://lkml.org/lkml/2019/11/25/151
> v2: https://lkml.org/lkml/2019/12/8/10
> v3: https://lkml.org/lkml/2020/1/16/319
> v4: https://lkml.org/lkml/2020/2/24/216
> 
> Changes in this revision are:
> v4->v5:
> - Added debugging prints in /proc/<pid>/sched for latency_nice ( based on
>    suggestion from Pavan Kondeti )
> - Initialized init_task with latency_nice = 0
> - Collected review tag and added few minor fixes.
> v3->v4:
> - Based on Chris's comment, added security check to refrain non-root user
>    set lower value than the current latency_nice values.
> v2 -> v3:
> - This series changes the longer attribute name to "latency_nice" as per
>    the comment from Dietmar Eggemann https://lkml.org/lkml/2019/12/5/394
> v1 -> v2:
> - Addressed comments from Qais Yousef
> - As per suggestion from Dietmar, moved content from newly created
>    include/linux/sched/latency_tolerance.h to kernel/sched/sched.h
> - Extend sched_setattr() to support latency_tolerance in tools headers UAPI
> 
> 
> Introduction:
> ==============
> This patch series introduces a new per-task attribute latency_nice to
> provide the scheduler hints about the latency requirements of the task [1].
> 
> Latency_nice is a ranged attribute of a task with the value ranging
> from [-20, 19] both inclusive which makes it align with the task nice
> value.
> 
> The value should provide scheduler hints about the relative latency
> requirements of tasks, meaning the task with "latency_nice = -20"
> should have lower latency requirements than compared to those tasks with
> higher values. Similarly a task with "latency_nice = 19" can have higher
> latency and hence such tasks may not care much about latency.
> 
> The default value is set to 0. The usecases discussed below can use this
> range of [-20, 19] for latency_nice for the specific purpose. This
> patch does not implement any use cases for such attribute so that any
> change in naming or range does not affect much to the other (future)
> patches using this. The actual use of latency_nice during task wakeup
> and load-balancing is yet to be coded for each of those usecases.
> 
> As per my view, this defined attribute can be used in following ways for a
> some of the usecases:
> 1 Reduce search scan time for select_idle_cpu():
> - Reduce search scans for finding idle CPU for a waking task with lower
>    latency_nice values.
> 
> 2 TurboSched:
> - Classify the tasks with higher latency_nice values as a small
>    background task given that its historic utilization is very low, for
>    which the scheduler can search for more number of cores to do task
>    packing.  A task with a latency_nice >= some_threshold (e.g, == 19)
>    and util <= 12.5% can be background tasks.
> 
> 3 Optimize AVX512 based workload:
> - Bias scheduler to not put a task having (latency_nice == -20) on a
>    core occupying AVX512 based workload.
> 
> 
> Series Organization:
> ====================
> - Patch 1-3: Add support for latency_nice attr in the task struct using
>    	     sched_{set,get}attr syscall
> - Patch 4  : Add permission checks for setting the value.
> 
> 
> The patch series can be applied on tip/sched/core at the
> commit a0f03b617c3b ("sched/numa: Stop an exhastive search if a reasonable swap candidate or idle CPU is found")
> 
> 
> References:
> ============
> [1]. Usecases for the per-task latency-nice attribute,
>       https://lkml.org/lkml/2019/9/30/215
> [2]. Task Latency-nice, "Subhra Mazumdar",
>       https://lkml.org/lkml/2019/8/30/829
> [3]. Introduce per-task latency_tolerance for scheduler hints,
>       https://lkml.org/lkml/2019/12/8/10
> 
> 
> 
> Parth Shah (4):
>    sched: Introduce latency-nice as a per-task attribute
>    sched/core: Propagate parent task's latency requirements to the child
>      task
>    sched: Allow sched_{get,set}attr to change latency_nice of the task
>    sched/core: Add permission checks for setting the latency_nice value
> 
>   include/linux/sched.h            |  1 +
>   include/uapi/linux/sched.h       |  4 +++-
>   include/uapi/linux/sched/types.h | 19 +++++++++++++++++++
>   init/init_task.c                 |  1 +
>   kernel/sched/core.c              | 26 ++++++++++++++++++++++++++
>   kernel/sched/debug.c             |  1 +
>   kernel/sched/sched.h             | 18 ++++++++++++++++++
>   tools/include/uapi/linux/sched.h |  4 +++-
>   8 files changed, 72 insertions(+), 2 deletions(-)

I've now had a chance to test and play with this a fair amount. It looks good, meets the need. Thanks Parth.

-chrish

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task
  2020-02-28  9:07 ` [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task Parth Shah
@ 2020-05-11 11:13   ` Dietmar Eggemann
  2020-05-13  9:41     ` Parth Shah
  0 siblings, 1 reply; 9+ messages in thread
From: Dietmar Eggemann @ 2020-05-11 11:13 UTC (permalink / raw)
  To: Parth Shah, linux-kernel
  Cc: peterz, mingo, vincent.guittot, qais.yousef, chris.hyser,
	pkondeti, patrick.bellasi, valentin.schneider, David.Laight, pjt,
	pavel, tj, dhaval.giani, qperret, tim.c.chen

On 28/02/2020 10:07, Parth Shah wrote:
> Introduce the latency_nice attribute to sched_attr and provide a
> mechanism to change the value with the use of sched_setattr/sched_getattr
> syscall.
> 
> Also add new flag "SCHED_FLAG_LATENCY_NICE" to hint the change in
> latency_nice of the task on every sched_setattr syscall.
> 
> Signed-off-by: Parth Shah <parth@linux.ibm.com>
> Reviewed-by: Qais Yousef <qais.yousef@arm.com>

[...]

ndif /* _UAPI_LINUX_SCHED_TYPES_H */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 866ea3d2d284..cd1fb9c8be26 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4710,6 +4710,9 @@ static void __setscheduler_params(struct task_struct *p,
>  	p->rt_priority = attr->sched_priority;
>  	p->normal_prio = normal_prio(p);
>  	set_load_weight(p, true);
> +
> +	if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
> +		p->latency_nice = attr->sched_latency_nice;
>  }

How do you make sure that p->latency_nice can be set independently from
p->static_prio?

AFAICS, util_clamp achieves this by relying on SCHED_FLAG_KEEP_PARAMS,
so completely bypassing __setscheduler_params() and using it's own
__setscheduler_uclamp().

[...]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task
  2020-05-11 11:13   ` Dietmar Eggemann
@ 2020-05-13  9:41     ` Parth Shah
  2020-05-13 11:14       ` Parth Shah
  0 siblings, 1 reply; 9+ messages in thread
From: Parth Shah @ 2020-05-13  9:41 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-kernel
  Cc: peterz, mingo, vincent.guittot, qais.yousef, chris.hyser,
	pkondeti, patrick.bellasi, valentin.schneider, David.Laight, pjt,
	pavel, tj, dhaval.giani, qperret, tim.c.chen



On 5/11/20 4:43 PM, Dietmar Eggemann wrote:
> On 28/02/2020 10:07, Parth Shah wrote:
>> Introduce the latency_nice attribute to sched_attr and provide a
>> mechanism to change the value with the use of sched_setattr/sched_getattr
>> syscall.
>>
>> Also add new flag "SCHED_FLAG_LATENCY_NICE" to hint the change in
>> latency_nice of the task on every sched_setattr syscall.
>>
>> Signed-off-by: Parth Shah <parth@linux.ibm.com>
>> Reviewed-by: Qais Yousef <qais.yousef@arm.com>
> 
> [...]
> 
> ndif /* _UAPI_LINUX_SCHED_TYPES_H */
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 866ea3d2d284..cd1fb9c8be26 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -4710,6 +4710,9 @@ static void __setscheduler_params(struct task_struct *p,
>>  	p->rt_priority = attr->sched_priority;
>>  	p->normal_prio = normal_prio(p);
>>  	set_load_weight(p, true);
>> +
>> +	if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
>> +		p->latency_nice = attr->sched_latency_nice;
>>  }
> 
> How do you make sure that p->latency_nice can be set independently from
> p->static_prio?
> 
> AFAICS, util_clamp achieves this by relying on SCHED_FLAG_KEEP_PARAMS,
> so completely bypassing __setscheduler_params() and using it's own
> __setscheduler_uclamp().
> 

Right. good catch.
Use of SCHED_FLAG_LATENCY_NICE/SCHED_FLAG_ALL is must to change
latency_nice value, but currently setting latency_nice value also changes
static_prio.

One possible solution here is to move the above code to _setscheduler():

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6031ec58c7ae..44bcbf060718 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4731,9 +4731,6 @@ static void __setscheduler_params(struct task_struct *p,
        p->rt_priority = attr->sched_priority;
        p->normal_prio = normal_prio(p);
        set_load_weight(p, true);
-
-       if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
-               p->latency_nice = attr->sched_latency_nice;
 }

 /* Actually do priority change: must hold pi & rq lock. */
@@ -4749,6 +4746,13 @@ static void __setscheduler(struct rq *rq, struct
task_struct *p,

        __setscheduler_params(p, attr);

+       /*
+        * Change latency_nice value only when SCHED_FLAG_LATENCY_NICE or
+        * SCHED_FLAG_ALL sched_flag is set.
+        */
+       if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
+               p->latency_nice = attr->sched_latency_nice;
+

This should allow setting value only on above flags, also restricts setting
the value when SCHED_FLAG_KEEP_PARAMS/SCHED_FLAG_KEEP_ALL is passed.


Thanks,
Parth

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task
  2020-05-13  9:41     ` Parth Shah
@ 2020-05-13 11:14       ` Parth Shah
  0 siblings, 0 replies; 9+ messages in thread
From: Parth Shah @ 2020-05-13 11:14 UTC (permalink / raw)
  To: Dietmar Eggemann, linux-kernel
  Cc: peterz, mingo, vincent.guittot, qais.yousef, chris.hyser,
	pkondeti, patrick.bellasi, valentin.schneider, David.Laight, pjt,
	pavel, tj, dhaval.giani, qperret, tim.c.chen



On 5/13/20 3:11 PM, Parth Shah wrote:
> 
> 
> On 5/11/20 4:43 PM, Dietmar Eggemann wrote:
>> On 28/02/2020 10:07, Parth Shah wrote:
>>> Introduce the latency_nice attribute to sched_attr and provide a
>>> mechanism to change the value with the use of sched_setattr/sched_getattr
>>> syscall.
>>>
>>> Also add new flag "SCHED_FLAG_LATENCY_NICE" to hint the change in
>>> latency_nice of the task on every sched_setattr syscall.
>>>
>>> Signed-off-by: Parth Shah <parth@linux.ibm.com>
>>> Reviewed-by: Qais Yousef <qais.yousef@arm.com>
>>
>> [...]
>>
>> ndif /* _UAPI_LINUX_SCHED_TYPES_H */
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 866ea3d2d284..cd1fb9c8be26 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -4710,6 +4710,9 @@ static void __setscheduler_params(struct task_struct *p,
>>>  	p->rt_priority = attr->sched_priority;
>>>  	p->normal_prio = normal_prio(p);
>>>  	set_load_weight(p, true);
>>> +
>>> +	if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
>>> +		p->latency_nice = attr->sched_latency_nice;
>>>  }
>>
>> How do you make sure that p->latency_nice can be set independently from
>> p->static_prio?
>>
>> AFAICS, util_clamp achieves this by relying on SCHED_FLAG_KEEP_PARAMS,
>> so completely bypassing __setscheduler_params() and using it's own
>> __setscheduler_uclamp().
>>
> 
> Right. good catch.
> Use of SCHED_FLAG_LATENCY_NICE/SCHED_FLAG_ALL is must to change
> latency_nice value, but currently setting latency_nice value also changes
> static_prio.
> 
> One possible solution here is to move the above code to _setscheduler():
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 6031ec58c7ae..44bcbf060718 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4731,9 +4731,6 @@ static void __setscheduler_params(struct task_struct *p,
>         p->rt_priority = attr->sched_priority;
>         p->normal_prio = normal_prio(p);
>         set_load_weight(p, true);
> -
> -       if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
> -               p->latency_nice = attr->sched_latency_nice;
>  }
> 
>  /* Actually do priority change: must hold pi & rq lock. */
> @@ -4749,6 +4746,13 @@ static void __setscheduler(struct rq *rq, struct
> task_struct *p,
> 
>         __setscheduler_params(p, attr);
> 
> +       /*
> +        * Change latency_nice value only when SCHED_FLAG_LATENCY_NICE or
> +        * SCHED_FLAG_ALL sched_flag is set.
> +        */
> +       if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE)
> +               p->latency_nice = attr->sched_latency_nice;
> +
> 
> This should allow setting value only on above flags, also restricts setting
> the value when SCHED_FLAG_KEEP_PARAMS/SCHED_FLAG_KEEP_ALL is passed.

and also get rid of __setscheduler_params(p, attr) when
attr->sched_flags == SCHED_FLAG_LATENCY_NICE

Other way is surely to bypass keep_param check just like UCLAMP.

> 
> 
> Thanks,
> Parth
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-05-13 11:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-28  9:07 [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints Parth Shah
2020-02-28  9:07 ` [PATCH v5 1/4] sched: Introduce latency-nice as a per-task attribute Parth Shah
2020-02-28  9:07 ` [PATCH v5 2/4] sched/core: Propagate parent task's latency requirements to the child task Parth Shah
2020-02-28  9:07 ` [PATCH v5 3/4] sched: Allow sched_{get,set}attr to change latency_nice of the task Parth Shah
2020-05-11 11:13   ` Dietmar Eggemann
2020-05-13  9:41     ` Parth Shah
2020-05-13 11:14       ` Parth Shah
2020-02-28  9:07 ` [PATCH v5 4/4] sched/core: Add permission checks for setting the latency_nice value Parth Shah
2020-03-06 16:36 ` [PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints chris hyser

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).