linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function
@ 2016-05-27 15:16 Julien Desfossez
  2016-05-27 15:16 ` [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint Julien Desfossez
  2016-05-30 13:18 ` [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function Mathieu Desnoyers
  0 siblings, 2 replies; 9+ messages in thread
From: Julien Desfossez @ 2016-05-27 15:16 UTC (permalink / raw)
  To: tglx, rostedt, mathieu.desnoyers; +Cc: linux-kernel, Julien Desfossez

Currently, the priority of tasks is modified directly in the scheduling
functions. Encapsulate priority updates to enable instrumentation of
priority changes. This will enable analysis of real-time scheduling
delays per thread priority, which cannot be performed accurately if we
only trace the priority of the currently scheduled processes.

The call sites that modify the priority of a task are mostly system
calls: sched_setscheduler, sched_setattr, sched_process_fork and
set_user_nice. Priority can also be dynamically boosted through
priority inheritance of rt_mutex by rt_mutex_setprio.

Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
---
 include/linux/sched.h |  3 ++-
 kernel/sched/core.c   | 19 +++++++++++++------
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 52c4847..48b35c0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1409,7 +1409,8 @@ struct task_struct {
 #endif
 	int on_rq;
 
-	int prio, static_prio, normal_prio;
+	int prio; /* Updated through sched_set_prio() */
+	int static_prio, normal_prio;
 	unsigned int rt_priority;
 	const struct sched_class *sched_class;
 	struct sched_entity se;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d1f7149..6946b8f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2230,6 +2230,11 @@ int sysctl_schedstats(struct ctl_table *table, int write,
 #endif
 #endif
 
+static void sched_set_prio(struct task_struct *p, int prio)
+{
+	p->prio = prio;
+}
+
 /*
  * fork()/clone()-time setup:
  */
@@ -2249,7 +2254,7 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
 	/*
 	 * Make sure we do not leak PI boosting priority to the child.
 	 */
-	p->prio = current->normal_prio;
+	sched_set_prio(p, current->normal_prio);
 
 	/*
 	 * Revert to default priority/policy on fork if requested.
@@ -2262,7 +2267,8 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
 		} else if (PRIO_TO_NICE(p->static_prio) < 0)
 			p->static_prio = NICE_TO_PRIO(0);
 
-		p->prio = p->normal_prio = __normal_prio(p);
+		p->normal_prio = __normal_prio(p);
+		sched_set_prio(p, p->normal_prio);
 		set_load_weight(p);
 
 		/*
@@ -3477,7 +3483,7 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 		p->sched_class = &fair_sched_class;
 	}
 
-	p->prio = prio;
+	sched_set_prio(p, prio);
 
 	if (running)
 		p->sched_class->set_curr_task(rq);
@@ -3524,7 +3530,7 @@ void set_user_nice(struct task_struct *p, long nice)
 	p->static_prio = NICE_TO_PRIO(nice);
 	set_load_weight(p);
 	old_prio = p->prio;
-	p->prio = effective_prio(p);
+	sched_set_prio(p, effective_prio(p));
 	delta = p->prio - old_prio;
 
 	if (queued) {
@@ -3731,9 +3737,10 @@ static void __setscheduler(struct rq *rq, struct task_struct *p,
 	 * sched_setscheduler().
 	 */
 	if (keep_boost)
-		p->prio = rt_mutex_get_effective_prio(p, normal_prio(p));
+		sched_set_prio(p, rt_mutex_get_effective_prio(p,
+					normal_prio(p)));
 	else
-		p->prio = normal_prio(p);
+		sched_set_prio(p, normal_prio(p));
 
 	if (dl_prio(p->prio))
 		p->sched_class = &dl_sched_class;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint
  2016-05-27 15:16 [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function Julien Desfossez
@ 2016-05-27 15:16 ` Julien Desfossez
  2016-05-30 13:18   ` Mathieu Desnoyers
  2016-05-30 13:18 ` [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function Mathieu Desnoyers
  1 sibling, 1 reply; 9+ messages in thread
From: Julien Desfossez @ 2016-05-27 15:16 UTC (permalink / raw)
  To: tglx, rostedt, mathieu.desnoyers; +Cc: linux-kernel, Julien Desfossez

This tracepoint allows to keep track of all priority changes made by all
sites that can change this value. The impacted system calls are
sched_setscheduler, sched_setattr, sched_process_fork and set_user_nice.
The priority inheritance mechanism from rt_mutex gets also instrumented
with this tracepoint even though there is a dedicated tracepoint for it
(sched_pi_setprio).

This allows analysis of real-time scheduling delays per thread priority,
which cannot be performed accurately if we only trace the priority of
the currently scheduled processes.

Here is an example of what is output by ftrace when we change the
priority of a running process:
sys_sched_setscheduler(pid: 1c52, policy: 2, param: 7ffc22e20980)
sched_set_prio: comm=burnP6 pid=7250 oldprio=120 newprio=39
sys_sched_setscheduler -> 0x0
sched_switch: prev_comm=chrt prev_pid=7268 prev_prio=120
              prev_state=R ==> next_comm=burnP6 next_pid=7250
              next_prio=39

Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
---
 include/trace/events/sched.h | 21 ++++++++++++++++-----
 kernel/sched/core.c          |  1 +
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 9b90c57..3b83ddb 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -407,11 +407,7 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
 	     TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
 	     TP_ARGS(tsk, runtime, vruntime));
 
-/*
- * Tracepoint for showing priority inheritance modifying a tasks
- * priority.
- */
-TRACE_EVENT(sched_pi_setprio,
+DECLARE_EVENT_CLASS(sched_prio_template,
 
 	TP_PROTO(struct task_struct *tsk, int newprio),
 
@@ -436,6 +432,21 @@ TRACE_EVENT(sched_pi_setprio,
 			__entry->oldprio, __entry->newprio)
 );
 
+/*
+ * Tracepoint for showing priority inheritance modifying a tasks
+ * priority.
+ */
+DEFINE_EVENT(sched_prio_template, sched_pi_setprio,
+		TP_PROTO(struct task_struct *tsk, int newprio),
+		TP_ARGS(tsk, newprio));
+
+/*
+ * Tracepoint for priority changes of a task.
+ */
+DEFINE_EVENT(sched_prio_template, sched_set_prio,
+		TP_PROTO(struct task_struct *tsk, int newprio),
+		TP_ARGS(tsk, newprio));
+
 #ifdef CONFIG_DETECT_HUNG_TASK
 TRACE_EVENT(sched_process_hang,
 	TP_PROTO(struct task_struct *tsk),
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6946b8f..45fbaab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2232,6 +2232,7 @@ int sysctl_schedstats(struct ctl_table *table, int write,
 
 static void sched_set_prio(struct task_struct *p, int prio)
 {
+	trace_sched_set_prio(p, prio);
 	p->prio = prio;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function
  2016-05-27 15:16 [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function Julien Desfossez
  2016-05-27 15:16 ` [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint Julien Desfossez
@ 2016-05-30 13:18 ` Mathieu Desnoyers
  2016-06-06 21:47   ` Mathieu Desnoyers
  1 sibling, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2016-05-30 13:18 UTC (permalink / raw)
  To: Julien Desfossez; +Cc: Thomas Gleixner, rostedt, linux-kernel

----- On May 27, 2016, at 5:16 PM, Julien Desfossez jdesfossez@efficios.com wrote:

> Currently, the priority of tasks is modified directly in the scheduling
> functions. Encapsulate priority updates to enable instrumentation of
> priority changes. This will enable analysis of real-time scheduling
> delays per thread priority, which cannot be performed accurately if we
> only trace the priority of the currently scheduled processes.
> 
> The call sites that modify the priority of a task are mostly system
> calls: sched_setscheduler, sched_setattr, sched_process_fork and
> set_user_nice. Priority can also be dynamically boosted through
> priority inheritance of rt_mutex by rt_mutex_setprio.
> 
> Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

> ---
> include/linux/sched.h |  3 ++-
> kernel/sched/core.c   | 19 +++++++++++++------
> 2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 52c4847..48b35c0 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1409,7 +1409,8 @@ struct task_struct {
> #endif
> 	int on_rq;
> 
> -	int prio, static_prio, normal_prio;
> +	int prio; /* Updated through sched_set_prio() */
> +	int static_prio, normal_prio;
> 	unsigned int rt_priority;
> 	const struct sched_class *sched_class;
> 	struct sched_entity se;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d1f7149..6946b8f 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2230,6 +2230,11 @@ int sysctl_schedstats(struct ctl_table *table, int write,
> #endif
> #endif
> 
> +static void sched_set_prio(struct task_struct *p, int prio)
> +{
> +	p->prio = prio;
> +}
> +
> /*
>  * fork()/clone()-time setup:
>  */
> @@ -2249,7 +2254,7 @@ int sched_fork(unsigned long clone_flags, struct
> task_struct *p)
> 	/*
> 	 * Make sure we do not leak PI boosting priority to the child.
> 	 */
> -	p->prio = current->normal_prio;
> +	sched_set_prio(p, current->normal_prio);
> 
> 	/*
> 	 * Revert to default priority/policy on fork if requested.
> @@ -2262,7 +2267,8 @@ int sched_fork(unsigned long clone_flags, struct
> task_struct *p)
> 		} else if (PRIO_TO_NICE(p->static_prio) < 0)
> 			p->static_prio = NICE_TO_PRIO(0);
> 
> -		p->prio = p->normal_prio = __normal_prio(p);
> +		p->normal_prio = __normal_prio(p);
> +		sched_set_prio(p, p->normal_prio);
> 		set_load_weight(p);
> 
> 		/*
> @@ -3477,7 +3483,7 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
> 		p->sched_class = &fair_sched_class;
> 	}
> 
> -	p->prio = prio;
> +	sched_set_prio(p, prio);
> 
> 	if (running)
> 		p->sched_class->set_curr_task(rq);
> @@ -3524,7 +3530,7 @@ void set_user_nice(struct task_struct *p, long nice)
> 	p->static_prio = NICE_TO_PRIO(nice);
> 	set_load_weight(p);
> 	old_prio = p->prio;
> -	p->prio = effective_prio(p);
> +	sched_set_prio(p, effective_prio(p));
> 	delta = p->prio - old_prio;
> 
> 	if (queued) {
> @@ -3731,9 +3737,10 @@ static void __setscheduler(struct rq *rq, struct
> task_struct *p,
> 	 * sched_setscheduler().
> 	 */
> 	if (keep_boost)
> -		p->prio = rt_mutex_get_effective_prio(p, normal_prio(p));
> +		sched_set_prio(p, rt_mutex_get_effective_prio(p,
> +					normal_prio(p)));
> 	else
> -		p->prio = normal_prio(p);
> +		sched_set_prio(p, normal_prio(p));
> 
> 	if (dl_prio(p->prio))
> 		p->sched_class = &dl_sched_class;
> --
> 1.9.1

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint
  2016-05-27 15:16 ` [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint Julien Desfossez
@ 2016-05-30 13:18   ` Mathieu Desnoyers
  2016-06-06 19:52     ` Mathieu Desnoyers
  0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2016-05-30 13:18 UTC (permalink / raw)
  To: Julien Desfossez; +Cc: Thomas Gleixner, rostedt, linux-kernel

----- On May 27, 2016, at 5:16 PM, Julien Desfossez jdesfossez@efficios.com wrote:

> This tracepoint allows to keep track of all priority changes made by all
> sites that can change this value. The impacted system calls are
> sched_setscheduler, sched_setattr, sched_process_fork and set_user_nice.
> The priority inheritance mechanism from rt_mutex gets also instrumented
> with this tracepoint even though there is a dedicated tracepoint for it
> (sched_pi_setprio).
> 
> This allows analysis of real-time scheduling delays per thread priority,
> which cannot be performed accurately if we only trace the priority of
> the currently scheduled processes.
> 
> Here is an example of what is output by ftrace when we change the
> priority of a running process:
> sys_sched_setscheduler(pid: 1c52, policy: 2, param: 7ffc22e20980)
> sched_set_prio: comm=burnP6 pid=7250 oldprio=120 newprio=39
> sys_sched_setscheduler -> 0x0
> sched_switch: prev_comm=chrt prev_pid=7268 prev_prio=120
>              prev_state=R ==> next_comm=burnP6 next_pid=7250
>              next_prio=39
> 
> Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

> ---
> include/trace/events/sched.h | 21 ++++++++++++++++-----
> kernel/sched/core.c          |  1 +
> 2 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index 9b90c57..3b83ddb 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -407,11 +407,7 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
> 	     TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
> 	     TP_ARGS(tsk, runtime, vruntime));
> 
> -/*
> - * Tracepoint for showing priority inheritance modifying a tasks
> - * priority.
> - */
> -TRACE_EVENT(sched_pi_setprio,
> +DECLARE_EVENT_CLASS(sched_prio_template,
> 
> 	TP_PROTO(struct task_struct *tsk, int newprio),
> 
> @@ -436,6 +432,21 @@ TRACE_EVENT(sched_pi_setprio,
> 			__entry->oldprio, __entry->newprio)
> );
> 
> +/*
> + * Tracepoint for showing priority inheritance modifying a tasks
> + * priority.
> + */
> +DEFINE_EVENT(sched_prio_template, sched_pi_setprio,
> +		TP_PROTO(struct task_struct *tsk, int newprio),
> +		TP_ARGS(tsk, newprio));
> +
> +/*
> + * Tracepoint for priority changes of a task.
> + */
> +DEFINE_EVENT(sched_prio_template, sched_set_prio,
> +		TP_PROTO(struct task_struct *tsk, int newprio),
> +		TP_ARGS(tsk, newprio));
> +
> #ifdef CONFIG_DETECT_HUNG_TASK
> TRACE_EVENT(sched_process_hang,
> 	TP_PROTO(struct task_struct *tsk),
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 6946b8f..45fbaab 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2232,6 +2232,7 @@ int sysctl_schedstats(struct ctl_table *table, int write,
> 
> static void sched_set_prio(struct task_struct *p, int prio)
> {
> +	trace_sched_set_prio(p, prio);
> 	p->prio = prio;
> }
> 
> --
> 1.9.1

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint
  2016-05-30 13:18   ` Mathieu Desnoyers
@ 2016-06-06 19:52     ` Mathieu Desnoyers
  2016-06-06 21:03       ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2016-06-06 19:52 UTC (permalink / raw)
  To: Julien Desfossez, Ingo Molnar, Peter Zijlstra
  Cc: Thomas Gleixner, rostedt, linux-kernel

----- On May 30, 2016, at 9:18 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On May 27, 2016, at 5:16 PM, Julien Desfossez jdesfossez@efficios.com
> wrote:
> 
>> This tracepoint allows to keep track of all priority changes made by all
>> sites that can change this value. The impacted system calls are
>> sched_setscheduler, sched_setattr, sched_process_fork and set_user_nice.
>> The priority inheritance mechanism from rt_mutex gets also instrumented
>> with this tracepoint even though there is a dedicated tracepoint for it
>> (sched_pi_setprio).
>> 
>> This allows analysis of real-time scheduling delays per thread priority,
>> which cannot be performed accurately if we only trace the priority of
>> the currently scheduled processes.
>> 
>> Here is an example of what is output by ftrace when we change the
>> priority of a running process:
>> sys_sched_setscheduler(pid: 1c52, policy: 2, param: 7ffc22e20980)
>> sched_set_prio: comm=burnP6 pid=7250 oldprio=120 newprio=39
>> sys_sched_setscheduler -> 0x0
>> sched_switch: prev_comm=chrt prev_pid=7268 prev_prio=120
>>              prev_state=R ==> next_comm=burnP6 next_pid=7250
>>              next_prio=39
>> 
>> Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
> 
> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Adding Ingo and Peter in CC, considering that it touches to tracing and
the scheduler.

Thanks,

Mathieu

> 
>> ---
>> include/trace/events/sched.h | 21 ++++++++++++++++-----
>> kernel/sched/core.c          |  1 +
>> 2 files changed, 17 insertions(+), 5 deletions(-)
>> 
>> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
>> index 9b90c57..3b83ddb 100644
>> --- a/include/trace/events/sched.h
>> +++ b/include/trace/events/sched.h
>> @@ -407,11 +407,7 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
>> 	     TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
>> 	     TP_ARGS(tsk, runtime, vruntime));
>> 
>> -/*
>> - * Tracepoint for showing priority inheritance modifying a tasks
>> - * priority.
>> - */
>> -TRACE_EVENT(sched_pi_setprio,
>> +DECLARE_EVENT_CLASS(sched_prio_template,
>> 
>> 	TP_PROTO(struct task_struct *tsk, int newprio),
>> 
>> @@ -436,6 +432,21 @@ TRACE_EVENT(sched_pi_setprio,
>> 			__entry->oldprio, __entry->newprio)
>> );
>> 
>> +/*
>> + * Tracepoint for showing priority inheritance modifying a tasks
>> + * priority.
>> + */
>> +DEFINE_EVENT(sched_prio_template, sched_pi_setprio,
>> +		TP_PROTO(struct task_struct *tsk, int newprio),
>> +		TP_ARGS(tsk, newprio));
>> +
>> +/*
>> + * Tracepoint for priority changes of a task.
>> + */
>> +DEFINE_EVENT(sched_prio_template, sched_set_prio,
>> +		TP_PROTO(struct task_struct *tsk, int newprio),
>> +		TP_ARGS(tsk, newprio));
>> +
>> #ifdef CONFIG_DETECT_HUNG_TASK
>> TRACE_EVENT(sched_process_hang,
>> 	TP_PROTO(struct task_struct *tsk),
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 6946b8f..45fbaab 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2232,6 +2232,7 @@ int sysctl_schedstats(struct ctl_table *table, int write,
>> 
>> static void sched_set_prio(struct task_struct *p, int prio)
>> {
>> +	trace_sched_set_prio(p, prio);
>> 	p->prio = prio;
>> }
>> 
>> --
>> 1.9.1
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint
  2016-06-06 19:52     ` Mathieu Desnoyers
@ 2016-06-06 21:03       ` Peter Zijlstra
  2016-06-08  0:18         ` Daniel Bristot de Oliveira
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2016-06-06 21:03 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Julien Desfossez, Ingo Molnar, Thomas Gleixner, rostedt, linux-kernel

On Mon, Jun 06, 2016 at 07:52:00PM +0000, Mathieu Desnoyers wrote:
> ----- On May 30, 2016, at 9:18 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> Adding Ingo and Peter in CC, considering that it touches to tracing and
> the scheduler.

> >> +/*
> >> + * Tracepoint for showing priority inheritance modifying a tasks
> >> + * priority.
> >> + */
> >> +DEFINE_EVENT(sched_prio_template, sched_pi_setprio,
> >> +		TP_PROTO(struct task_struct *tsk, int newprio),
> >> +		TP_ARGS(tsk, newprio));
> >> +
> >> +/*
> >> + * Tracepoint for priority changes of a task.
> >> + */
> >> +DEFINE_EVENT(sched_prio_template, sched_set_prio,
> >> +		TP_PROTO(struct task_struct *tsk, int newprio),
> >> +		TP_ARGS(tsk, newprio));
> >> +

Nak on anything that cannot fundamentally deal with SCHED_DEADLINE.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function
  2016-05-30 13:18 ` [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function Mathieu Desnoyers
@ 2016-06-06 21:47   ` Mathieu Desnoyers
  2016-06-07  7:11     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2016-06-06 21:47 UTC (permalink / raw)
  To: Julien Desfossez, Peter Zijlstra, Ingo Molnar
  Cc: Thomas Gleixner, rostedt, linux-kernel

----- On May 30, 2016, at 9:18 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On May 27, 2016, at 5:16 PM, Julien Desfossez jdesfossez@efficios.com
> wrote:
> 
>> Currently, the priority of tasks is modified directly in the scheduling
>> functions. Encapsulate priority updates to enable instrumentation of
>> priority changes. This will enable analysis of real-time scheduling
>> delays per thread priority, which cannot be performed accurately if we
>> only trace the priority of the currently scheduled processes.
>> 
>> The call sites that modify the priority of a task are mostly system
>> calls: sched_setscheduler, sched_setattr, sched_process_fork and
>> set_user_nice. Priority can also be dynamically boosted through
>> priority inheritance of rt_mutex by rt_mutex_setprio.
>> 
>> Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
> 
> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

CCing Ingo and Peter on the first patch of the series too,
so they can let us know if we missed anything fundamental
related to sched_deadline.

Thanks,

Mathieu

> 
>> ---
>> include/linux/sched.h |  3 ++-
>> kernel/sched/core.c   | 19 +++++++++++++------
>> 2 files changed, 15 insertions(+), 7 deletions(-)
>> 
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 52c4847..48b35c0 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1409,7 +1409,8 @@ struct task_struct {
>> #endif
>> 	int on_rq;
>> 
>> -	int prio, static_prio, normal_prio;
>> +	int prio; /* Updated through sched_set_prio() */
>> +	int static_prio, normal_prio;
>> 	unsigned int rt_priority;
>> 	const struct sched_class *sched_class;
>> 	struct sched_entity se;
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index d1f7149..6946b8f 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2230,6 +2230,11 @@ int sysctl_schedstats(struct ctl_table *table, int write,
>> #endif
>> #endif
>> 
>> +static void sched_set_prio(struct task_struct *p, int prio)
>> +{
>> +	p->prio = prio;
>> +}
>> +
>> /*
>>  * fork()/clone()-time setup:
>>  */
>> @@ -2249,7 +2254,7 @@ int sched_fork(unsigned long clone_flags, struct
>> task_struct *p)
>> 	/*
>> 	 * Make sure we do not leak PI boosting priority to the child.
>> 	 */
>> -	p->prio = current->normal_prio;
>> +	sched_set_prio(p, current->normal_prio);
>> 
>> 	/*
>> 	 * Revert to default priority/policy on fork if requested.
>> @@ -2262,7 +2267,8 @@ int sched_fork(unsigned long clone_flags, struct
>> task_struct *p)
>> 		} else if (PRIO_TO_NICE(p->static_prio) < 0)
>> 			p->static_prio = NICE_TO_PRIO(0);
>> 
>> -		p->prio = p->normal_prio = __normal_prio(p);
>> +		p->normal_prio = __normal_prio(p);
>> +		sched_set_prio(p, p->normal_prio);
>> 		set_load_weight(p);
>> 
>> 		/*
>> @@ -3477,7 +3483,7 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
>> 		p->sched_class = &fair_sched_class;
>> 	}
>> 
>> -	p->prio = prio;
>> +	sched_set_prio(p, prio);
>> 
>> 	if (running)
>> 		p->sched_class->set_curr_task(rq);
>> @@ -3524,7 +3530,7 @@ void set_user_nice(struct task_struct *p, long nice)
>> 	p->static_prio = NICE_TO_PRIO(nice);
>> 	set_load_weight(p);
>> 	old_prio = p->prio;
>> -	p->prio = effective_prio(p);
>> +	sched_set_prio(p, effective_prio(p));
>> 	delta = p->prio - old_prio;
>> 
>> 	if (queued) {
>> @@ -3731,9 +3737,10 @@ static void __setscheduler(struct rq *rq, struct
>> task_struct *p,
>> 	 * sched_setscheduler().
>> 	 */
>> 	if (keep_boost)
>> -		p->prio = rt_mutex_get_effective_prio(p, normal_prio(p));
>> +		sched_set_prio(p, rt_mutex_get_effective_prio(p,
>> +					normal_prio(p)));
>> 	else
>> -		p->prio = normal_prio(p);
>> +		sched_set_prio(p, normal_prio(p));
>> 
>> 	if (dl_prio(p->prio))
>> 		p->sched_class = &dl_sched_class;
>> --
>> 1.9.1
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function
  2016-06-06 21:47   ` Mathieu Desnoyers
@ 2016-06-07  7:11     ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2016-06-07  7:11 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Julien Desfossez, Ingo Molnar, Thomas Gleixner, rostedt, linux-kernel

On Mon, Jun 06, 2016 at 09:47:52PM +0000, Mathieu Desnoyers wrote:
> >> +++ b/kernel/sched/core.c
> >> @@ -2230,6 +2230,11 @@ int sysctl_schedstats(struct ctl_table *table, int write,
> >> #endif
> >> #endif
> >> 
> >> +static void sched_set_prio(struct task_struct *p, int prio)
> >> +{
> >> +	p->prio = prio;
> >> +}

Urgh; so there's a bunch of patches to the PI code that I've been
ignoring that'll completely break all this.

Let me dig them up and get this all sorted.

Basically, looking at @prio is broken.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint
  2016-06-06 21:03       ` Peter Zijlstra
@ 2016-06-08  0:18         ` Daniel Bristot de Oliveira
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Bristot de Oliveira @ 2016-06-08  0:18 UTC (permalink / raw)
  To: Peter Zijlstra, Mathieu Desnoyers
  Cc: Julien Desfossez, Ingo Molnar, Thomas Gleixner, rostedt, linux-kernel

On 06/06/2016 06:03 PM, Peter Zijlstra wrote:
>>>> > >> +/*
>>>> > >> + * Tracepoint for priority changes of a task.
>>>> > >> + */
>>>> > >> +DEFINE_EVENT(sched_prio_template, sched_set_prio,
>>>> > >> +		TP_PROTO(struct task_struct *tsk, int newprio),
>>>> > >> +		TP_ARGS(tsk, newprio));
>>>> > >> +
> Nak on anything that cannot fundamentally deal with SCHED_DEADLINE.

My 2 cents: I liked the idea... But, I agree with peterz. This cannot
fundamentally deal with deadline scheduler.

In the deadline scheduler, the priority is set at every new sporadic
activation of the task (it is a fixed job priority scheduler - not a
fixed priority scheduler). The priority is the deadline of the new job.
The deadline is an u64, that does not fit in the int prio - so it cannot
be reused.

-- Daniel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-06-08  0:18 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-27 15:16 [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function Julien Desfossez
2016-05-27 15:16 ` [RFC PATCH 2/2] tracing: add sched_set_prio tracepoint Julien Desfossez
2016-05-30 13:18   ` Mathieu Desnoyers
2016-06-06 19:52     ` Mathieu Desnoyers
2016-06-06 21:03       ` Peter Zijlstra
2016-06-08  0:18         ` Daniel Bristot de Oliveira
2016-05-30 13:18 ` [RFC PATCH 1/2] sched: encapsulate priority changes in a sched_set_prio static function Mathieu Desnoyers
2016-06-06 21:47   ` Mathieu Desnoyers
2016-06-07  7:11     ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).