linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks
@ 2022-07-13 13:48 Schspa Shi
  2022-07-13 13:48 ` [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating Schspa Shi
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Schspa Shi @ 2022-07-13 13:48 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid
  Cc: linux-kernel, zhaohui.shi, Schspa Shi

Commit 95158a89dd50 ("sched,rt: Use the full cpumask for balancing")
allow find_lock_lowest_rq to pick a task with migration disabled.
This commit is intended to push the current running task on this CPU
away.

There is a race scenario, which allows a migration disabled task to
be migrated to another CPU.

When there is a RT task with higher priority, rt sched class was
intended to migrate higher priority task to lowest rq via push_rt_tasks,
this BUG will happen here.

With the system running on PREEMPT_RT, rt_spin_lock will disable
migration, this will make the problem easier to reproduce.

I have seen this crash on PREEMPT_RT, from the logs, there is a race
when trying to migrate higher priority tasks to the lowest rq.

Please refer to the following scenarios.

           CPU0                                  CPU1
------------------------------------------------------------------
push_rt_task
  check is_migration_disabled(next_task)
                                        task not running and
                                        migration_disabled == 0
  find_lock_lowest_rq(next_task, rq);
    _double_lock_balance(this_rq, busiest);
      raw_spin_rq_unlock(this_rq);
      double_rq_lock(this_rq, busiest);
        <<wait for busiest rq>>
                                            <wakeup>
                                        task become running
                                        migrate_disable();
                                          <context out>
  deactivate_task(rq, next_task, 0);
  set_task_cpu(next_task, lowest_rq->cpu);
    WARN_ON_ONCE(is_migration_disabled(p));
      ---------OOPS-------------

Crash logs are as follows:
[123671.996430] WARNING: CPU: 2 PID: 13470 at kernel/sched/core.c:2485
set_task_cpu+0x8c/0x108
[123671.996800] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--)
[123671.996811] pc : set_task_cpu+0x8c/0x108
[123671.996820] lr : set_task_cpu+0x7c/0x108
[123671.996828] sp : ffff80001268bd30
[123671.996832] pmr_save: 00000060
[123671.996835] x29: ffff80001268bd30 x28: ffff0001a3d68e80
[123671.996844] x27: ffff80001225f4a8 x26: ffff800010ab62cb
[123671.996854] x25: ffff80026d95e000 x24: 0000000000000005
[123671.996864] x23: ffff00019746c1b0 x22: 0000000000000000
[123671.996873] x21: ffff00027ee33a80 x20: 0000000000000000
[123671.996882] x19: ffff00019746ba00 x18: 0000000000000000
[123671.996890] x17: 0000000000000000 x16: 0000000000000000
[123671.996899] x15: 000000000000000a x14: 000000000000349e
[123671.996908] x13: ffff800012f4503d x12: 0000000000000001
[123671.996916] x11: 0000000000000000 x10: 0000000000000000
[123671.996925] x9 : 00000000000c0000 x8 : ffff00027ee58700
[123671.996933] x7 : ffff00027ee8da80 x6 : ffff00027ee8e580
[123671.996942] x5 : ffff00027ee8dcc0 x4 : 0000000000000005
[123671.996951] x3 : ffff00027ee8e338 x2 : 0000000000000000
[123671.996959] x1 : 00000000000000ff x0 : 0000000000000002
[123671.996969] Call trace:
[123671.996975]  set_task_cpu+0x8c/0x108
[123671.996984]  push_rt_task.part.0+0x144/0x184
[123671.996995]  push_rt_tasks+0x28/0x3c
[123671.997002]  task_woken_rt+0x58/0x68
[123671.997009]  ttwu_do_wakeup+0x5c/0xd0
[123671.997019]  ttwu_do_activate+0xc0/0xd4
[123671.997028]  try_to_wake_up+0x244/0x288
[123671.997036]  wake_up_process+0x18/0x24
[123671.997045]  __irq_wake_thread+0x64/0x80
[123671.997056]  __handle_irq_event_percpu+0x110/0x124
[123671.997064]  handle_irq_event_percpu+0x50/0xac
[123671.997072]  handle_irq_event+0x84/0xfc

To fix it, we need to check migration_disabled flag again to avoid
bad migration.

Fixes: 95158a89dd50 ("sched,rt: Use the full cpumask for balancing")

Signed-off-by: Schspa Shi <schspa@gmail.com>

--

Changelog:
v1 -> v2:
        - Modify commit message to add fixed commit information.
        - Going to retry to push the current running task on this CPU
          away, instead doing nothing for this migrate disabled task.
v2 -> v3:
        - Change migration disabled check to the correct position
v3 -> v4:
        - Check migrate disabled in find_lock_lowest_rq to avoid not
        necessary check when task rq is not released as Steven advised.
v4 -> v5:
        - Adjust the comment as Steve advised to make it clear.
v5 -> v6:
        - Adjust the comment again as Steve advised.
v6 -> v7:
        - Add missing put_task_struct && add this task migration
        disable check to deadline scheduler too as Dietmar advised.
---
 kernel/sched/deadline.c | 1 +
 kernel/sched/rt.c       | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b5152961b743..cb3b886a081c 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2238,6 +2238,7 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
 				     !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
 				     task_running(rq, task) ||
 				     !dl_task(task) ||
+				     is_migration_disabled(task) ||
 				     !task_on_rq_queued(task))) {
 				double_unlock_balance(rq, later_rq);
 				later_rq = NULL;
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 8c9ed9664840..7bd3e6ecbe45 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1998,11 +1998,15 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
 			 * the mean time, task could have
 			 * migrated already or had its affinity changed.
 			 * Also make sure that it wasn't scheduled on its rq.
+			 * It is possible the task was scheduled, set
+			 * "migrate_disabled" and then got preempted, so we must
+			 * check the task migration disable flag here too.
 			 */
 			if (unlikely(task_rq(task) != rq ||
 				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
 				     task_running(rq, task) ||
 				     !rt_task(task) ||
+				     is_migration_disabled(task) ||
 				     !task_on_rq_queued(task))) {
 
 				double_unlock_balance(rq, lowest_rq);
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating
  2022-07-13 13:48 [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks Schspa Shi
@ 2022-07-13 13:48 ` Schspa Shi
  2022-08-26 18:46   ` Dietmar Eggemann
  2022-08-24  6:04 ` [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks Schspa Shi
  2022-08-26 18:45 ` Dietmar Eggemann
  2 siblings, 1 reply; 7+ messages in thread
From: Schspa Shi @ 2022-07-13 13:48 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid
  Cc: linux-kernel, zhaohui.shi, Schspa Shi

When the task to push disable migration, retry to push the current
running task on this CPU away, instead doing nothing for this migrate
disabled task.

Signed-off-by: Schspa Shi <schspa@gmail.com>
---
 kernel/sched/core.c     | 13 ++++++++++++-
 kernel/sched/deadline.c |  9 +++++++++
 kernel/sched/rt.c       |  8 ++++++++
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index da0bf6fe9ecd..af90cc558b8e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2509,8 +2509,19 @@ int push_cpu_stop(void *arg)
 	if (p->sched_class->find_lock_rq)
 		lowest_rq = p->sched_class->find_lock_rq(p, rq);
 
-	if (!lowest_rq)
+	if (!lowest_rq) {
+		/*
+		 * The find_lock_rq function above could have released the rq
+		 * lock and allow p to schedule and be preempted again, and
+		 * that lowest_rq could be NULL because p now has the
+		 * migrate_disable flag set and not because it could not find
+		 * the lowest rq. So we must check task migration flag again.
+		 */
+		if (unlikely(is_migration_disabled(p)))
+			p->migration_flags |= MDF_PUSH;
+
 		goto out_unlock;
+	}
 
 	// XXX validate p is still the highest prio task
 	if (task_rq(p) == rq) {
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cb3b886a081c..21af20445e7f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2335,6 +2335,15 @@ static int push_dl_task(struct rq *rq)
 		 */
 		task = pick_next_pushable_dl_task(rq);
 		if (task == next_task) {
+			/*
+			 * If next task has now disabled migrating, see if we
+			 * can push the current task.
+			 */
+			if (unlikely(is_migration_disabled(task))) {
+				put_task_struct(next_task);
+				goto retry;
+			}
+
 			/*
 			 * The task is still there. We don't try
 			 * again, some other CPU will pull it when ready.
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7bd3e6ecbe45..316088e2fee2 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2136,6 +2136,14 @@ static int push_rt_task(struct rq *rq, bool pull)
 		 */
 		task = pick_next_pushable_task(rq);
 		if (task == next_task) {
+			/*
+			 * If next task has now disabled migrating, see if we
+			 * can push the current task.
+			 */
+			if (unlikely(is_migration_disabled(task))) {
+				put_task_struct(next_task);
+				goto retry;
+			}
 			/*
 			 * The task hasn't migrated, and is still the next
 			 * eligible task, but we failed to find a run-queue
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks
  2022-07-13 13:48 [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks Schspa Shi
  2022-07-13 13:48 ` [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating Schspa Shi
@ 2022-08-24  6:04 ` Schspa Shi
  2022-08-26 18:45 ` Dietmar Eggemann
  2 siblings, 0 replies; 7+ messages in thread
From: Schspa Shi @ 2022-08-24  6:04 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, vschneid
  Cc: linux-kernel, zhaohui.shi, Schspa Shi

Hi Valentin, Steven, Dietmar:

Could you help to review this V7 patch again ?

I have been testing on our platform for more than a month, and it
seems that the system has no adverse reactions.

Thank you very much.


--
BRs
Schspa Shi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks
  2022-07-13 13:48 [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks Schspa Shi
  2022-07-13 13:48 ` [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating Schspa Shi
  2022-08-24  6:04 ` [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks Schspa Shi
@ 2022-08-26 18:45 ` Dietmar Eggemann
  2022-08-28 15:45   ` Schspa Shi
  2 siblings, 1 reply; 7+ messages in thread
From: Dietmar Eggemann @ 2022-08-26 18:45 UTC (permalink / raw)
  To: Schspa Shi, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
	bsegall, mgorman, bristot, vschneid
  Cc: linux-kernel, zhaohui.shi

On 13/07/2022 15:48, Schspa Shi wrote:
> Commit 95158a89dd50 ("sched,rt: Use the full cpumask for balancing")
> allow find_lock_lowest_rq to pick a task with migration disabled.
> This commit is intended to push the current running task on this CPU
> away.
> 
> There is a race scenario, which allows a migration disabled task to
> be migrated to another CPU.
> 
> When there is a RT task with higher priority, rt sched class was
> intended to migrate higher priority task to lowest rq via push_rt_tasks,
> this BUG will happen here.
       ^^^

You mean the warning in set_task_cpu()?

> With the system running on PREEMPT_RT, rt_spin_lock will disable
> migration, this will make the problem easier to reproduce.
> 
> I have seen this crash on PREEMPT_RT, from the logs, there is a race
                   ^^^^^
                     ?

We still talking about the set_task_cpu() warning, right?

[...]

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

You should carry the Reviewed-by:'s you got in previous versions.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating
  2022-07-13 13:48 ` [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating Schspa Shi
@ 2022-08-26 18:46   ` Dietmar Eggemann
  2022-08-28 15:54     ` Schspa Shi
  0 siblings, 1 reply; 7+ messages in thread
From: Dietmar Eggemann @ 2022-08-26 18:46 UTC (permalink / raw)
  To: Schspa Shi, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
	bsegall, mgorman, bristot, vschneid
  Cc: linux-kernel, zhaohui.shi

On 13/07/2022 15:48, Schspa Shi wrote:
> When the task to push disable migration, retry to push the current
> running task on this CPU away, instead doing nothing for this migrate
> disabled task.
> 
> Signed-off-by: Schspa Shi <schspa@gmail.com>

Unfortunately, I can't recreate this issue on my Arm64 6 CPUs system on
mainline or PREEMPT_RT (linux-5.19.y-rt and v5.10.59-rt52) (the one you
mentioned in v6.)

With an rt-app rt workload of 12-18 periodic rt-tasks (4/16ms) all with
different priorities I never ran into a `is_migration_disabled(task)`
situation. I only ever get `task_rq(task) != rq` or `task_running(rq,
task)` under the `if (double_lock_balance(rq, lowest_rq))` condition in
find_lock_lowest_rq().

[...]

>  	// XXX validate p is still the highest prio task
>  	if (task_rq(p) == rq) {
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index cb3b886a081c..21af20445e7f 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2335,6 +2335,15 @@ static int push_dl_task(struct rq *rq)
>  		 */
>  		task = pick_next_pushable_dl_task(rq);
>  		if (task == next_task) {
> +			/*
> +			 * If next task has now disabled migrating, see if we
> +			 * can push the current task.
> +			 */
> +			if (unlikely(is_migration_disabled(task))) {
> +				put_task_struct(next_task);
> +				goto retry;
> +			}
> +

Looks like for DL this makes no sense since we're not pushing rq->curr
in `retry:` like for RT in case `is_migration_disabled(next_task)`.

[...]

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks
  2022-08-26 18:45 ` Dietmar Eggemann
@ 2022-08-28 15:45   ` Schspa Shi
  0 siblings, 0 replies; 7+ messages in thread
From: Schspa Shi @ 2022-08-28 15:45 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: mingo, peterz, juri.lelli, vincent.guittot, rostedt, bsegall,
	mgorman, bristot, vschneid, linux-kernel


Dietmar Eggemann <dietmar.eggemann@arm.com> writes:

> On 13/07/2022 15:48, Schspa Shi wrote:
>> Commit 95158a89dd50 ("sched,rt: Use the full cpumask for balancing")
>> allow find_lock_lowest_rq to pick a task with migration disabled.
>> This commit is intended to push the current running task on this CPU
>> away.
>> 
>> There is a race scenario, which allows a migration disabled task to
>> be migrated to another CPU.
>> 
>> When there is a RT task with higher priority, rt sched class was
>> intended to migrate higher priority task to lowest rq via push_rt_tasks,
>> this BUG will happen here.
>        ^^^
>
> You mean the warning in set_task_cpu()?
>

Yes, this is the warning in set_task_cpu()?

void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
{
#ifdef CONFIG_SCHED_DEBUG
	...
	WARN_ON_ONCE(is_migration_disabled(p));
#endif
	...
}

>> With the system running on PREEMPT_RT, rt_spin_lock will disable
>> migration, this will make the problem easier to reproduce.
>> 
>> I have seen this crash on PREEMPT_RT, from the logs, there is a race
>                    ^^^^^
>                      ?
>
> We still talking about the set_task_cpu() warning, right?
>

Yes.

> [...]
>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>
> You should carry the Reviewed-by:'s you got in previous versions.

Thanks for the reminder, I'll pay attention to this next time.

I will upload a new patch version to carry it. and change the BUG in
comment message to WARN.

-- 
BRs
Schspa Shi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating
  2022-08-26 18:46   ` Dietmar Eggemann
@ 2022-08-28 15:54     ` Schspa Shi
  0 siblings, 0 replies; 7+ messages in thread
From: Schspa Shi @ 2022-08-28 15:54 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: mingo, peterz, juri.lelli, vincent.guittot, rostedt, bsegall,
	mgorman, bristot, vschneid, linux-kernel


Dietmar Eggemann <dietmar.eggemann@arm.com> writes:

> On 13/07/2022 15:48, Schspa Shi wrote:
>> When the task to push disable migration, retry to push the current
>> running task on this CPU away, instead doing nothing for this migrate
>> disabled task.
>> 
>> Signed-off-by: Schspa Shi <schspa@gmail.com>
>
> Unfortunately, I can't recreate this issue on my Arm64 6 CPUs system on
> mainline or PREEMPT_RT (linux-5.19.y-rt and v5.10.59-rt52) (the one you
> mentioned in v6.)
>
> With an rt-app rt workload of 12-18 periodic rt-tasks (4/16ms) all with
> different priorities I never ran into a `is_migration_disabled(task)`
> situation. I only ever get `task_rq(task) != rq` or `task_running(rq,
> task)` under the `if (double_lock_balance(rq, lowest_rq))` condition in
> find_lock_lowest_rq().
>

I think we need to write a kernel module to add more hard irq context to
increase the probability of recurrence.

I never recreate this issue with my test case too. But our test team can
reproduce the problem, they have more machines to reproduce the problem,
and the problem is easier to reproduce when the CPU is hotplugging.


> [...]
>
>>  	// XXX validate p is still the highest prio task
>>  	if (task_rq(p) == rq) {
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index cb3b886a081c..21af20445e7f 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -2335,6 +2335,15 @@ static int push_dl_task(struct rq *rq)
>>  		 */
>>  		task = pick_next_pushable_dl_task(rq);
>>  		if (task == next_task) {
>> +			/*
>> +			 * If next task has now disabled migrating, see if we
>> +			 * can push the current task.
>> +			 */
>> +			if (unlikely(is_migration_disabled(task))) {
>> +				put_task_struct(next_task);
>> +				goto retry;
>> +			}
>> +
>
> Looks like for DL this makes no sense since we're not pushing rq->curr
> in `retry:` like for RT in case `is_migration_disabled(next_task)`.
>

It seems we have the opportunity to execute resched_curr, which will
have a similar effect. I should change the comments for this for next
patch version.

> [...]
>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

-- 
BRs
Schspa Shi

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-08-28 16:28 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-13 13:48 [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks Schspa Shi
2022-07-13 13:48 ` [PATCH v7 2/2] sched/rt: Trying to push current task when target disable migrating Schspa Shi
2022-08-26 18:46   ` Dietmar Eggemann
2022-08-28 15:54     ` Schspa Shi
2022-08-24  6:04 ` [PATCH v7 1/2] sched/rt: fix bad task migration for rt tasks Schspa Shi
2022-08-26 18:45 ` Dietmar Eggemann
2022-08-28 15:45   ` Schspa Shi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).