* [PATCH V2 0/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration
@ 2017-06-26 15:07 Daniel Bristot de Oliveira
  2017-06-26 15:07 ` [PATCH V2 1/2] sched/debug: Inform the number of rt/dl task that can migrate Daniel Bristot de Oliveira
  2017-06-26 15:07 ` [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
  0 siblings, 2 replies; 10+ messages in thread
From: Daniel Bristot de Oliveira @ 2017-06-26 15:07 UTC (permalink / raw)
  To: linux-rt-users
  Cc: Luis Claudio R . Goncalves, Clark Williams, Luiz Capitulino,
	Sebastian Andrzej Siewior, Thomas Gleixner, Steven Rostedt,
	Peter Zijlstra, Ingo Molnar, LKML

This is a continuation of:
  [RFC] rt: Some fixes for migrate_disable/enable

However, migrate_disable/enable was reworked during the
4.11-rt window, and that rework fixed 2 of the 3 problems.
Good!

One problem remains: the dl/rt_nr_migratory inc/dec.

The problem is reproducible with the following command (on a 4-CPU box):

  # chrt -f 1 taskset -c 3 cat /dev/full | taskset -c 0-2 grep 'batman'

With only patch 1/2 applied, the problem is visible with the following
command:

  # cat /proc/sched_debug | grep rt_nr_migratory
    .rt_nr_migratory               : 18446744073709542849
    .rt_nr_migratory               : 18446744073709538566
    .rt_nr_migratory               : 18446744073709548257
    .rt_nr_migratory               : 0

A detailed description of the bug and of the fix is in the changelog
of patch 2/2.

Changes from V1:
 - Print .dl/rt_nr_migratory only if CONFIG_SMP is set (Ingo Molnar)
 - Use helper functions to reduce duplicated code (Ingo Molnar)

Changes from RFC:
 - The problems addressed in the patches:
   x  rt: Update nr_cpus_allowed if the affinity of a task changes while its
      migration is disabled
   x  rt: Checks if task needs migration when re-enabling migration

  were fixed, so these patches are not needed anymore, while patch:

   x  rt: Increase/decrease the nr of migratory tasks when
      enabling/disabling migration

  is still needed, so it was reworked for the new implementation.

 - The patch showing the rt/dl_nr_migratory was added.

Daniel Bristot de Oliveira (2):
  sched/debug: Inform the number of rt/dl task that can migrate
  rt: Increase/decrease the nr of migratory tasks when 
    enabling/disabling migration

 kernel/sched/core.c  | 49 ++++++++++++++++++++++++++++++++++++++++++++-----
 kernel/sched/debug.c | 17 +++++++++++++++--
 2 files changed, 59 insertions(+), 7 deletions(-)

-- 
2.9.4


* [PATCH V2 1/2] sched/debug: Inform the number of rt/dl task that can migrate
  2017-06-26 15:07 [PATCH V2 0/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
@ 2017-06-26 15:07 ` Daniel Bristot de Oliveira
  2017-06-30 13:09   ` [tip:sched/core] sched/debug: Expose the number of RT/DL tasks " tip-bot for Daniel Bristot de Oliveira
  2017-06-26 15:07 ` [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Bristot de Oliveira @ 2017-06-26 15:07 UTC (permalink / raw)
  To: linux-rt-users
  Cc: Luis Claudio R . Goncalves, Clark Williams, Luiz Capitulino,
	Sebastian Andrzej Siewior, Thomas Gleixner, Steven Rostedt,
	Peter Zijlstra, Ingo Molnar, LKML

Add the value of the rt_rq.rt_nr_migratory and dl_rq.dl_nr_migratory
to the sched_debug output, for instance:

rt_rq[0]:
  .rt_nr_running                 : 2
  .rt_nr_migratory               : 1     <--- Like this
  .rt_throttled                  : 0
  .rt_time                       : 828.645877
  .rt_runtime                    : 1000.000000

This is useful to debug problems related to the dl/rt schedulers.

This also fixes the format of some variables that are unsigned but
were printed as signed.

Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
---
 kernel/sched/debug.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 0e2af53..287615a 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -552,15 +552,21 @@ void print_rt_rq(struct seq_file *m, int cpu, struct rt_rq *rt_rq)
 
 #define P(x) \
 	SEQ_printf(m, "  .%-30s: %Ld\n", #x, (long long)(rt_rq->x))
+#define PU(x) \
+	SEQ_printf(m, "  .%-30s: %lu\n", #x, (unsigned long)(rt_rq->x))
 #define PN(x) \
 	SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", #x, SPLIT_NS(rt_rq->x))
 
-	P(rt_nr_running);
+	PU(rt_nr_running);
+#ifdef CONFIG_SMP
+	PU(rt_nr_migratory);
+#endif
 	P(rt_throttled);
 	PN(rt_time);
 	PN(rt_runtime);
 
 #undef PN
+#undef PU
 #undef P
 }
 
@@ -569,14 +575,21 @@ void print_dl_rq(struct seq_file *m, int cpu, struct dl_rq *dl_rq)
 	struct dl_bw *dl_bw;
 
 	SEQ_printf(m, "\ndl_rq[%d]:\n", cpu);
-	SEQ_printf(m, "  .%-30s: %ld\n", "dl_nr_running", dl_rq->dl_nr_running);
+
+#define PU(x) \
+	SEQ_printf(m, "  .%-30s: %lu\n", #x, (unsigned long)(dl_rq->x))
+
+	PU(dl_nr_running);
 #ifdef CONFIG_SMP
+	PU(dl_nr_migratory);
 	dl_bw = &cpu_rq(cpu)->rd->dl_bw;
 #else
 	dl_bw = &dl_rq->dl_bw;
 #endif
 	SEQ_printf(m, "  .%-30s: %lld\n", "dl_bw->bw", dl_bw->bw);
 	SEQ_printf(m, "  .%-30s: %lld\n", "dl_bw->total_bw", dl_bw->total_bw);
+
+#undef PU
 }
 
 extern __read_mostly int sched_clock_running;
-- 
2.9.4


* [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when  enabling/disabling migration
  2017-06-26 15:07 [PATCH V2 0/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
  2017-06-26 15:07 ` [PATCH V2 1/2] sched/debug: Inform the number of rt/dl task that can migrate Daniel Bristot de Oliveira
@ 2017-06-26 15:07 ` Daniel Bristot de Oliveira
  2017-06-27 14:55   ` Henri Roosen
                     ` (2 more replies)
  1 sibling, 3 replies; 10+ messages in thread
From: Daniel Bristot de Oliveira @ 2017-06-26 15:07 UTC (permalink / raw)
  To: linux-rt-users
  Cc: Luis Claudio R . Goncalves, Clark Williams, Luiz Capitulino,
	Sebastian Andrzej Siewior, Thomas Gleixner, Steven Rostedt,
	Peter Zijlstra, Ingo Molnar, LKML

There is a problem in the migrate_disable()/enable() implementation
regarding the number of migratory tasks in the rt/dl RQs. The problem
is the following:

When a task is attached to the rt runqueue, it is checked whether it
can run on only one CPU or has migration disabled. If either check is
true, the rt_rq->rt_nr_migratory counter is not increased. The counter
is increased otherwise.

When the task is detached, the same check is done. If either check is
true, the rt_rq->rt_nr_migratory counter is not decreased. The counter
decreases otherwise. The same check is done in the dl scheduler.

Importantly, migrate disable/enable does not touch this counter for
tasks attached to the rt rq. Consider the following chain of events.

Assumptions:
Task A is the only runnable task on CPU A  Task B runs on the CPU B
Task A runs on CFS (non-rt)                Task B has RT priority
Thus, rt_nr_migratory is 0                 B is running
Task A can run on all CPUS.

Timeline:
        CPU A/TASK A                        CPU B/TASK B
A takes the rt mutex X                           .
A disables migration                             .
           .                          B tries to take the rt mutex X
           .                          As it is held by A {
           .                            A inherits the rt priority of B
           .                            A is dequeued from CFS RQ of CPU A
           .                            A is enqueued in the RT RQ of CPU A
           .                            As migration is disabled
           .                              rt_nr_migratory in A is not increased
           .                          }
A enables migration
A releases the rt mutex X {
  A returns to its original priority
  A asks to be dequeued from the RT RQ {
    As migration is now enabled and it can run on all CPUS {
       rt_nr_migratory should be decreased
       As rt_nr_migratory is 0, rt_nr_migratory underflows
    }
  }
}

This variable is important because it indicates whether there is more
than one runnable & migratory task in the runqueue. If there is more
than one such task, the rt_rq is set as overloaded and the scheduler
tries to migrate some tasks. This rule is important to keep the
scheduler work conserving, that is, in a system with M CPUs, the M
highest priority tasks should be running.

As rt_nr_migratory is unsigned, the underflow wraps it to a huge
nonzero value, signaling that the RQ is overloaded and activating the
push mechanism needlessly.

This patch fixes this problem by decreasing/increasing the
rt/dl_nr_migratory in the migrate disable/enable operations.

Reported-by: Pei Zhang <pezhang@redhat.com>
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
---
 kernel/sched/core.c | 49 ++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 44 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ce34e4f..7d3565e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7566,6 +7566,47 @@ const u32 sched_prio_to_wmult[40] = {
 
 #if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
 
+static inline void
+update_nr_migratory(struct task_struct *p, long delta)
+{
+	if (unlikely((p->sched_class == &rt_sched_class ||
+		      p->sched_class == &dl_sched_class) &&
+		      p->nr_cpus_allowed > 1)) {
+		if (p->sched_class == &rt_sched_class)
+			task_rq(p)->rt.rt_nr_migratory += delta;
+		else
+			task_rq(p)->dl.dl_nr_migratory += delta;
+	}
+}
+
+static inline void
+migrate_disable_update_cpus_allowed(struct task_struct *p)
+{
+	struct rq *rq;
+	struct rq_flags rf;
+
+	p->cpus_ptr = cpumask_of(smp_processor_id());
+
+	rq = task_rq_lock(p, &rf);
+	update_nr_migratory(p, -1);
+	p->nr_cpus_allowed = 1;
+	task_rq_unlock(rq, p, &rf);
+}
+
+static inline void
+migrate_enable_update_cpus_allowed(struct task_struct *p)
+{
+	struct rq *rq;
+	struct rq_flags rf;
+
+	p->cpus_ptr = &p->cpus_mask;
+
+	rq = task_rq_lock(p, &rf);
+	p->nr_cpus_allowed = cpumask_weight(&p->cpus_mask);
+	update_nr_migratory(p, 1);
+	task_rq_unlock(rq, p, &rf);
+}
+
 void migrate_disable(void)
 {
 	struct task_struct *p = current;
@@ -7593,10 +7634,9 @@ void migrate_disable(void)
 	preempt_disable();
 	preempt_lazy_disable();
 	pin_current_cpu();
-	p->migrate_disable = 1;
 
-	p->cpus_ptr = cpumask_of(smp_processor_id());
-	p->nr_cpus_allowed = 1;
+	migrate_disable_update_cpus_allowed(p);
+	p->migrate_disable = 1;
 
 	preempt_enable();
 }
@@ -7628,9 +7668,8 @@ void migrate_enable(void)
 
 	preempt_disable();
 
-	p->cpus_ptr = &p->cpus_mask;
-	p->nr_cpus_allowed = cpumask_weight(&p->cpus_mask);
 	p->migrate_disable = 0;
+	migrate_enable_update_cpus_allowed(p);
 
 	if (p->migrate_disable_update) {
 		struct rq *rq;
-- 
2.9.4


* Re: [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration
  2017-06-26 15:07 ` [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
@ 2017-06-27 14:55   ` Henri Roosen
  2017-06-27 16:32     ` Daniel Bristot de Oliveira
  2017-06-30  7:30   ` Ingo Molnar
  2017-08-07 15:46   ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 10+ messages in thread
From: Henri Roosen @ 2017-06-27 14:55 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira, linux-rt-users
  Cc: Luis Claudio R . Goncalves, Clark Williams, Luiz Capitulino,
	Sebastian Andrzej Siewior, Thomas Gleixner, Steven Rostedt,
	Peter Zijlstra, Ingo Molnar, LKML

On 06/26/2017 05:07 PM, Daniel Bristot de Oliveira wrote:
> There is a problem in the migrate_disable()/enable() implementation
> regarding the number of migratory tasks in the rt/dl RQs. The problem
> is the following:
>
> When a task is attached to the rt runqueue, it is checked whether it
> can run on only one CPU or has migration disabled. If either check is
> true, the rt_rq->rt_nr_migratory counter is not increased. The counter
> is increased otherwise.
>
> When the task is detached, the same check is done. If either check is
> true, the rt_rq->rt_nr_migratory counter is not decreased. The counter
> decreases otherwise. The same check is done in the dl scheduler.
>
> Importantly, migrate disable/enable does not touch this counter for
> tasks attached to the rt rq. Consider the following chain of events.
>
> Assumptions:
> Task A is the only runnable task on CPU A  Task B runs on the CPU B
> Task A runs on CFS (non-rt)                Task B has RT priority
> Thus, rt_nr_migratory is 0                 B is running
> Task A can run on all CPUS.
>
> Timeline:
>         CPU A/TASK A                        CPU B/TASK B
> A takes the rt mutex X                           .
> A disables migration                             .
>            .                          B tries to take the rt mutex X
>            .                          As it is held by A {
>            .                            A inherits the rt priority of B
>            .                            A is dequeued from CFS RQ of CPU A
>            .                            A is enqueued in the RT RQ of CPU A
>            .                            As migration is disabled
>            .                              rt_nr_migratory in A is not increased
>            .                          }
> A enables migration
> A releases the rt mutex X {
>   A returns to its original priority
>   A asks to be dequeued from the RT RQ {
>     As migration is now enabled and it can run on all CPUS {
>        rt_nr_migratory should be decreased
>        As rt_nr_migratory is 0, rt_nr_migratory underflows
>     }
>   }
> }
>
> This variable is important because it indicates whether there is more
> than one runnable & migratory task in the runqueue. If there is more
> than one such task, the rt_rq is set as overloaded and the scheduler
> tries to migrate some tasks. This rule is important to keep the
> scheduler work conserving, that is, in a system with M CPUs, the M
> highest priority tasks should be running.
>
> As rt_nr_migratory is unsigned, the underflow wraps it to a huge
> nonzero value, signaling that the RQ is overloaded and activating the
> push mechanism needlessly.

What kind of symptoms might be triggered by this? I'm currently facing a 
problem with a continuous-reboot-test where the kernel seems to hang 
sometimes at a (seemingly) random place during kernel boot, on 
4.9.33-rt23 with iMX6Q. A back-port of this patch to 4.9-rt seems to fix 
it. Or is it covering up a different problem?

Thanks,
Henri
-- 

>
> This patch fixes this problem by decreasing/increasing the
> rt/dl_nr_migratory in the migrate disable/enable operations.
>
> Reported-by: Pei Zhang <pezhang@redhat.com>
> Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: Luiz Capitulino <lcapitulino@redhat.com>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
> ---
>  kernel/sched/core.c | 49 ++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ce34e4f..7d3565e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7566,6 +7566,47 @@ const u32 sched_prio_to_wmult[40] = {
>
>  #if defined(CONFIG_PREEMPT_COUNT) && defined(CONFIG_SMP)
>
> +static inline void
> +update_nr_migratory(struct task_struct *p, long delta)
> +{
> +	if (unlikely((p->sched_class == &rt_sched_class ||
> +		      p->sched_class == &dl_sched_class) &&
> +		      p->nr_cpus_allowed > 1)) {
> +		if (p->sched_class == &rt_sched_class)
> +			task_rq(p)->rt.rt_nr_migratory += delta;
> +		else
> +			task_rq(p)->dl.dl_nr_migratory += delta;
> +	}
> +}
> +
> +static inline void
> +migrate_disable_update_cpus_allowed(struct task_struct *p)
> +{
> +	struct rq *rq;
> +	struct rq_flags rf;
> +
> +	p->cpus_ptr = cpumask_of(smp_processor_id());
> +
> +	rq = task_rq_lock(p, &rf);
> +	update_nr_migratory(p, -1);
> +	p->nr_cpus_allowed = 1;
> +	task_rq_unlock(rq, p, &rf);
> +}
> +
> +static inline void
> +migrate_enable_update_cpus_allowed(struct task_struct *p)
> +{
> +	struct rq *rq;
> +	struct rq_flags rf;
> +
> +	p->cpus_ptr = &p->cpus_mask;
> +
> +	rq = task_rq_lock(p, &rf);
> +	p->nr_cpus_allowed = cpumask_weight(&p->cpus_mask);
> +	update_nr_migratory(p, 1);
> +	task_rq_unlock(rq, p, &rf);
> +}
> +
>  void migrate_disable(void)
>  {
>  	struct task_struct *p = current;
> @@ -7593,10 +7634,9 @@ void migrate_disable(void)
>  	preempt_disable();
>  	preempt_lazy_disable();
>  	pin_current_cpu();
> -	p->migrate_disable = 1;
>
> -	p->cpus_ptr = cpumask_of(smp_processor_id());
> -	p->nr_cpus_allowed = 1;
> +	migrate_disable_update_cpus_allowed(p);
> +	p->migrate_disable = 1;
>
>  	preempt_enable();
>  }
> @@ -7628,9 +7668,8 @@ void migrate_enable(void)
>
>  	preempt_disable();
>
> -	p->cpus_ptr = &p->cpus_mask;
> -	p->nr_cpus_allowed = cpumask_weight(&p->cpus_mask);
>  	p->migrate_disable = 0;
> +	migrate_enable_update_cpus_allowed(p);
>
>  	if (p->migrate_disable_update) {
>  		struct rq *rq;
>


* Re: [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration
  2017-06-27 14:55   ` Henri Roosen
@ 2017-06-27 16:32     ` Daniel Bristot de Oliveira
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Bristot de Oliveira @ 2017-06-27 16:32 UTC (permalink / raw)
  To: Henri Roosen, Daniel Bristot de Oliveira, linux-rt-users
  Cc: Luis Claudio R . Goncalves, Clark Williams, Luiz Capitulino,
	Sebastian Andrzej Siewior, Thomas Gleixner, Steven Rostedt,
	Peter Zijlstra, Ingo Molnar, LKML



On 06/27/2017 04:55 PM, Henri Roosen wrote:
>>
>> As rt_nr_migratory is unsigned, it will become > 0, notifying that the
>> RQ is overloaded, activating pushing mechanism without need.
> 
> What kind of symptoms might be triggered by this? I'm currently facing a
> problem with a continuous-reboot-test where the kernel seems to hang
> sometimes at a (seemingly) random place during kernel boot, on
> 4.9.33-rt23 with iMX6Q. A back-port of this patch to 4.9-rt seems to fix
> it. Or is it covering up a different problem?

The side effect is that the RQ is reported as overloaded, which
activates the push mechanism needlessly.

I have not seen system freezes caused by it, but just in case: do
you have the console output from your board?

-- Daniel


* Re: [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when  enabling/disabling migration
  2017-06-26 15:07 ` [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
  2017-06-27 14:55   ` Henri Roosen
@ 2017-06-30  7:30   ` Ingo Molnar
  2017-06-30  8:51     ` Daniel Bristot de Oliveira
  2017-08-07 15:46   ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2017-06-30  7:30 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira
  Cc: linux-rt-users, Luis Claudio R . Goncalves, Clark Williams,
	Luiz Capitulino, Sebastian Andrzej Siewior, Thomas Gleixner,
	Steven Rostedt, Peter Zijlstra, LKML


* Daniel Bristot de Oliveira <bristot@redhat.com> wrote:

> There is a problem in the migrate_disable()/enable() implementation
> regarding the number of migratory tasks in the rt/dl RQs. The problem
> is the following:
> 
> When a task is attached to the rt runqueue, it is checked whether it
> can run on only one CPU or has migration disabled. If either check is
> true, the rt_rq->rt_nr_migratory counter is not increased. The counter
> is increased otherwise.
> 
> When the task is detached, the same check is done. If either check is
> true, the rt_rq->rt_nr_migratory counter is not decreased. The counter
> decreases otherwise. The same check is done in the dl scheduler.
> 
> Importantly, migrate disable/enable does not touch this counter for
> tasks attached to the rt rq. Consider the following chain of events.
> 
> Assumptions:
> Task A is the only runnable task on CPU A  Task B runs on the CPU B
> Task A runs on CFS (non-rt)                Task B has RT priority
> Thus, rt_nr_migratory is 0                 B is running
> Task A can run on all CPUS.
> 
> Timeline:
>         CPU A/TASK A                        CPU B/TASK B
> A takes the rt mutex X                           .
> A disables migration                             .
>            .                          B tries to take the rt mutex X
>            .                          As it is held by A {
>            .                            A inherits the rt priority of B
>            .                            A is dequeued from CFS RQ of CPU A
>            .                            A is enqueued in the RT RQ of CPU A
>            .                            As migration is disabled
>            .                              rt_nr_migratory in A is not increased
>            .                          }
> A enables migration
> A releases the rt mutex X {
>   A returns to its original priority
>   A asks to be dequeued from the RT RQ {
>     As migration is now enabled and it can run on all CPUS {
>        rt_nr_migratory should be decreased
>        As rt_nr_migratory is 0, rt_nr_migratory underflows
>     }
>   }
> }
> 
> This variable is important because it indicates whether there is more
> than one runnable & migratory task in the runqueue. If there is more
> than one such task, the rt_rq is set as overloaded and the scheduler
> tries to migrate some tasks. This rule is important to keep the
> scheduler work conserving, that is, in a system with M CPUs, the M
> highest priority tasks should be running.
> 
> As rt_nr_migratory is unsigned, the underflow wraps it to a huge
> nonzero value, signaling that the RQ is overloaded and activating the
> push mechanism needlessly.
> 
> This patch fixes this problem by decreasing/increasing the
> rt/dl_nr_migratory in the migrate disable/enable operations.
> 
> Reported-by: Pei Zhang <pezhang@redhat.com>
> Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: Luiz Capitulino <lcapitulino@redhat.com>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
> ---
>  kernel/sched/core.c | 49 ++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 44 insertions(+), 5 deletions(-)

This second patch does not apply to the latest scheduler tree (tip:master) cleanly 
- which tree is it against?

Thanks,

	Ingo


* Re: [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration
  2017-06-30  7:30   ` Ingo Molnar
@ 2017-06-30  8:51     ` Daniel Bristot de Oliveira
  2017-06-30  9:41       ` Ingo Molnar
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Bristot de Oliveira @ 2017-06-30  8:51 UTC (permalink / raw)
  To: Ingo Molnar, Daniel Bristot de Oliveira
  Cc: linux-rt-users, Luis Claudio R . Goncalves, Clark Williams,
	Luiz Capitulino, Sebastian Andrzej Siewior, Thomas Gleixner,
	Steven Rostedt, Peter Zijlstra, LKML



On 06/30/2017 09:30 AM, Ingo Molnar wrote:
> 
> * Daniel Bristot de Oliveira <bristot@redhat.com> wrote:
> 
>> There is a problem in the migrate_disable()/enable() implementation
>> regarding the number of migratory tasks in the rt/dl RQs. The problem
>> is the following:
>>
>> When a task is attached to the rt runqueue, it is checked whether it
>> can run on only one CPU or has migration disabled. If either check is
>> true, the rt_rq->rt_nr_migratory counter is not increased. The counter
>> is increased otherwise.
>>
>> When the task is detached, the same check is done. If either check is
>> true, the rt_rq->rt_nr_migratory counter is not decreased. The counter
>> decreases otherwise. The same check is done in the dl scheduler.
>>
>> Importantly, migrate disable/enable does not touch this counter for
>> tasks attached to the rt rq. Consider the following chain of events.
>>
>> Assumptions:
>> Task A is the only runnable task on CPU A  Task B runs on the CPU B
>> Task A runs on CFS (non-rt)                Task B has RT priority
>> Thus, rt_nr_migratory is 0                 B is running
>> Task A can run on all CPUS.
>>
>> Timeline:
>>         CPU A/TASK A                        CPU B/TASK B
>> A takes the rt mutex X                           .
>> A disables migration                             .
>>            .                          B tries to take the rt mutex X
>>            .                          As it is held by A {
>>            .                            A inherits the rt priority of B
>>            .                            A is dequeued from CFS RQ of CPU A
>>            .                            A is enqueued in the RT RQ of CPU A
>>            .                            As migration is disabled
>>            .                              rt_nr_migratory in A is not increased
>>            .                          }
>> A enables migration
>> A releases the rt mutex X {
>>   A returns to its original priority
>>   A asks to be dequeued from the RT RQ {
>>     As migration is now enabled and it can run on all CPUS {
>>        rt_nr_migratory should be decreased
>>        As rt_nr_migratory is 0, rt_nr_migratory underflows
>>     }
>>   }
>> }
>>
>> This variable is important because it indicates whether there is more
>> than one runnable & migratory task in the runqueue. If there is more
>> than one such task, the rt_rq is set as overloaded and the scheduler
>> tries to migrate some tasks. This rule is important to keep the
>> scheduler work conserving, that is, in a system with M CPUs, the M
>> highest priority tasks should be running.
>>
>> As rt_nr_migratory is unsigned, the underflow wraps it to a huge
>> nonzero value, signaling that the RQ is overloaded and activating the
>> push mechanism needlessly.
>>
>> This patch fixes this problem by decreasing/increasing the
>> rt/dl_nr_migratory in the migrate disable/enable operations.
>>
>> Reported-by: Pei Zhang <pezhang@redhat.com>
>> Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
>> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
>> Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
>> Cc: Clark Williams <williams@redhat.com>
>> Cc: Luiz Capitulino <lcapitulino@redhat.com>
>> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@kernel.org>
>> Cc: LKML <linux-kernel@vger.kernel.org>
>> Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
>> ---
>>  kernel/sched/core.c | 49 ++++++++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 44 insertions(+), 5 deletions(-)
> 
> This second patch does not apply to the latest scheduler tree (tip:master) cleanly 
> - which tree is it against?

Hi Ingo,

migrate_disable/enable() are PREEMPT_RT specific, so patch 2/2
applies only to the PREEMPT_RT patch set.

I was working on the 4.11-rt tree.

The first one is not -rt specific, though.

Sorry for the possible miscommunication...

-- Daniel


> Thanks,
> 
> 	Ingo


* Re: [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration
  2017-06-30  8:51     ` Daniel Bristot de Oliveira
@ 2017-06-30  9:41       ` Ingo Molnar
  0 siblings, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2017-06-30  9:41 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira
  Cc: linux-rt-users, Luis Claudio R . Goncalves, Clark Williams,
	Luiz Capitulino, Sebastian Andrzej Siewior, Thomas Gleixner,
	Steven Rostedt, Peter Zijlstra, LKML


* Daniel Bristot de Oliveira <bristot@redhat.com> wrote:

> The first one is not -rt specific, though.

Ok, then all is good, and I've applied the debug patch to the upstream sched/core 
tree.

Thanks,

	Ingo


* [tip:sched/core] sched/debug: Expose the number of RT/DL tasks that can migrate
  2017-06-26 15:07 ` [PATCH V2 1/2] sched/debug: Inform the number of rt/dl task that can migrate Daniel Bristot de Oliveira
@ 2017-06-30 13:09   ` tip-bot for Daniel Bristot de Oliveira
  0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Daniel Bristot de Oliveira @ 2017-06-30 13:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: rostedt, mingo, tglx, bigeasy, williams, hpa, lgoncalv,
	lcapitulino, linux-kernel, peterz, linux-rt-users, torvalds,
	bristot

Commit-ID:  48365b38849fdb1ee6dc65beac044ca59f669683
Gitweb:     http://git.kernel.org/tip/48365b38849fdb1ee6dc65beac044ca59f669683
Author:     Daniel Bristot de Oliveira <bristot@redhat.com>
AuthorDate: Mon, 26 Jun 2017 17:07:14 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 30 Jun 2017 09:32:07 +0200

sched/debug: Expose the number of RT/DL tasks that can migrate

Add the value of the rt_rq.rt_nr_migratory and dl_rq.dl_nr_migratory
to the sched_debug output, for instance:

 rt_rq[0]:
   .rt_nr_running                 : 2
   .rt_nr_migratory               : 1     <--- Like this
   .rt_throttled                  : 0
   .rt_time                       : 828.645877
   .rt_runtime                    : 1000.000000

This is useful to debug problems related to the RT/DL schedulers.

This also fixes the format of some variables that are unsigned but
were printed as signed.

Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Link: http://lkml.kernel.org/r/7896f71cada54ee7dd8507bb666063a2e051c3d4.1498482127.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/debug.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 38f0193..4fa66de 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -552,15 +552,21 @@ void print_rt_rq(struct seq_file *m, int cpu, struct rt_rq *rt_rq)
 
 #define P(x) \
 	SEQ_printf(m, "  .%-30s: %Ld\n", #x, (long long)(rt_rq->x))
+#define PU(x) \
+	SEQ_printf(m, "  .%-30s: %lu\n", #x, (unsigned long)(rt_rq->x))
 #define PN(x) \
 	SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", #x, SPLIT_NS(rt_rq->x))
 
-	P(rt_nr_running);
+	PU(rt_nr_running);
+#ifdef CONFIG_SMP
+	PU(rt_nr_migratory);
+#endif
 	P(rt_throttled);
 	PN(rt_time);
 	PN(rt_runtime);
 
 #undef PN
+#undef PU
 #undef P
 }
 
@@ -569,14 +575,21 @@ void print_dl_rq(struct seq_file *m, int cpu, struct dl_rq *dl_rq)
 	struct dl_bw *dl_bw;
 
 	SEQ_printf(m, "\ndl_rq[%d]:\n", cpu);
-	SEQ_printf(m, "  .%-30s: %ld\n", "dl_nr_running", dl_rq->dl_nr_running);
+
+#define PU(x) \
+	SEQ_printf(m, "  .%-30s: %lu\n", #x, (unsigned long)(dl_rq->x))
+
+	PU(dl_nr_running);
 #ifdef CONFIG_SMP
+	PU(dl_nr_migratory);
 	dl_bw = &cpu_rq(cpu)->rd->dl_bw;
 #else
 	dl_bw = &dl_rq->dl_bw;
 #endif
 	SEQ_printf(m, "  .%-30s: %lld\n", "dl_bw->bw", dl_bw->bw);
 	SEQ_printf(m, "  .%-30s: %lld\n", "dl_bw->total_bw", dl_bw->total_bw);
+
+#undef PU
 }
 
 extern __read_mostly int sched_clock_running;
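
With the counters exposed as above, a suspicious value can be spotted from
userspace. The following is a rough sketch, not part of the patch: the
helpers parse_counter() and looks_wrapped() are hypothetical, and the rule
"a value above LLONG_MAX has probably wrapped below zero" is an assumption
used only for illustration. It is consistent with the value
18446744073709542849 quoted in the cover letter of this series, which
equals 2^64 - 8767.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse one sched_debug line of the form
 * "  .name    : value". Returns true and stores the value if the line
 * contains the given field name followed by ':' and a decimal number.
 */
static bool parse_counter(const char *line, const char *name,
			  unsigned long long *value)
{
	const char *p = strstr(line, name);
	char *end;

	if (!p)
		return false;

	p = strchr(p + strlen(name), ':');
	if (!p)
		return false;

	errno = 0;
	*value = strtoull(p + 1, &end, 10);

	/* Reject lines with no digits after ':' or with an out-of-range value. */
	return errno == 0 && end != p + 1;
}

/*
 * Heuristic (assumption): an unsigned task counter larger than LLONG_MAX
 * is far more likely to have been decremented below zero than to be a
 * real count.
 */
static bool looks_wrapped(unsigned long long value)
{
	return value > (unsigned long long)LLONG_MAX;
}
```

A caller would read /proc/sched_debug line by line, pass each line to
parse_counter() with "rt_nr_migratory" or "dl_nr_migratory", and report the
lines for which looks_wrapped() returns true.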

* Re: [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration
  2017-06-26 15:07 ` [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
  2017-06-27 14:55   ` Henri Roosen
  2017-06-30  7:30   ` Ingo Molnar
@ 2017-08-07 15:46   ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-08-07 15:46 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira
  Cc: linux-rt-users, Luis Claudio R . Goncalves, Clark Williams,
	Luiz Capitulino, Thomas Gleixner, Steven Rostedt, Peter Zijlstra,
	Ingo Molnar, LKML

On 2017-06-26 17:07:15 [+0200], Daniel Bristot de Oliveira wrote:
> There is a problem in the migrate_disable()/enable() implementation
> regarding the number of migratory tasks in the rt/dl RQs. The problem
> is the following:
> This patch fixes this problem by decreasing/increasing the
> rt/dl_nr_migratory in the migrate disable/enable operations.
> 
> Reported-by: Pei Zhang <pezhang@redhat.com>
> Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: Luiz Capitulino <lcapitulino@redhat.com>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: linux-rt-users <linux-rt-users@vger.kernel.org>

Applied.

Sebastian

Thread overview: 10+ messages
2017-06-26 15:07 [PATCH V2 0/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
2017-06-26 15:07 ` [PATCH V2 1/2] sched/debug: Inform the number of rt/dl task that can migrate Daniel Bristot de Oliveira
2017-06-30 13:09   ` [tip:sched/core] sched/debug: Expose the number of RT/DL tasks " tip-bot for Daniel Bristot de Oliveira
2017-06-26 15:07 ` [PATCH V2 2/2] rt: Increase/decrease the nr of migratory tasks when enabling/disabling migration Daniel Bristot de Oliveira
2017-06-27 14:55   ` Henri Roosen
2017-06-27 16:32     ` Daniel Bristot de Oliveira
2017-06-30  7:30   ` Ingo Molnar
2017-06-30  8:51     ` Daniel Bristot de Oliveira
2017-06-30  9:41       ` Ingo Molnar
2017-08-07 15:46   ` Sebastian Andrzej Siewior
