linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
@ 2019-09-12  1:55 shikemeng
  2019-09-12 22:09 ` Valentin Schneider
  0 siblings, 1 reply; 11+ messages in thread
From: shikemeng @ 2019-09-12  1:55 UTC (permalink / raw)
  To: mingo, peterz; +Cc: linux-kernel

From 089dbf0216628ac6ae98742ab90725ca9c2bf201 Mon Sep 17 00:00:00 2001
From:  <shikemeng@huawei.com>
Date: Tue, 10 Sep 2019 09:44:58 -0400
Subject: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr

reason: migration to invalid cpu in __set_cpus_allowed_ptr
archive path: patches/euleros/sched

Oops occur when running qemu on arm64:
 Unable to handle kernel paging request at virtual address ffff000008effe40
 Internal error: Oops: 96000007 [#1] SMP
 Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
 pstate: 20000085 (nzCv daIf -PAN -UAO)
 pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
 lr : move_queued_task.isra.21+0x124/0x298
 ...
 Call trace:
  __ll_sc___cmpxchg_case_acq_4+0x4/0x20
  __migrate_task+0xc8/0xe0
  migration_cpu_stop+0x170/0x180
  cpu_stopper_thread+0xec/0x178
  smpboot_thread_fn+0x1ac/0x1e8
  kthread+0x134/0x138
  ret_from_fork+0x10/0x18

__set_cpus_allowed_ptr will choose an active dest_cpu in the affinity mask to migrate the process if the process is not
currently running on any of the CPUs specified in the affinity mask. __set_cpus_allowed_ptr will choose an invalid
dest_cpu (>= nr_cpu_ids, 1024 in my virtual machine) if the CPUs in the affinity mask are deactivated by cpu_down after
the cpumask_intersects check. The cpumask_test_cpu of dest_cpu afterwards reads beyond the cpumask and may pass if the
corresponding bit happens to be set. As a consequence, the kernel will access an invalid rq address associated with the
invalid cpu in migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. The following sequence can trigger the Oops:
1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling sched_setaffinity (see the sketch below)
2) A shell script repeatedly runs "echo 0 > /sys/devices/system/cpu/cpu1/online" and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn
3) The Oops appears if the invalid cpu's bit happens to be set in the memory just past the tested cpumask.
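
A minimal sketch of the user-space side of step 1 (illustrative only, not
the exact test program used for this report):

/* Repeatedly ping-pong the affinity of the current task between cpu0 and
 * cpu1 while a separate shell loop (step 2) toggles cpu1 online/offline. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t set;
	int cpu = 0;

	for (;;) {
		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		/* May fail with EINVAL while cpu1 is offline; keep looping. */
		if (sched_setaffinity(0, sizeof(set), &set))
			perror("sched_setaffinity");
		cpu ^= 1;
	}
	return 0;
}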


Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
---
 kernel/sched/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4b63fef..5181ea9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
    if (cpumask_equal(&p->cpus_allowed, new_mask))
        goto out;

-   if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
+   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
+   if (dest_cpu >= nr_cpu_ids) {
        ret = -EINVAL;
        goto out;
    }
@@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
    if (cpumask_test_cpu(task_cpu(p), new_mask))
        goto out;

-   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
    if (task_running(rq, p) || p->state == TASK_WAKING) {
        struct migration_arg arg = { p, dest_cpu };
        /* Need help from migration thread: drop lock and wait. */
--
1.8.5.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-12  1:55 [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr shikemeng
@ 2019-09-12 22:09 ` Valentin Schneider
  2019-09-15  6:13   ` shikemeng
  2019-09-15  8:21   ` shikemeng
  0 siblings, 2 replies; 11+ messages in thread
From: Valentin Schneider @ 2019-09-12 22:09 UTC (permalink / raw)
  To: shikemeng, mingo, peterz; +Cc: linux-kernel

On 12/09/2019 02:55, shikemeng wrote:
> From 089dbf0216628ac6ae98742ab90725ca9c2bf201 Mon Sep 17 00:00:00 2001
> From:  <shikemeng@huawei.com>
> Date: Tue, 10 Sep 2019 09:44:58 -0400
> Subject: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
> 
> reason: migration to invalid cpu in __set_cpus_allowed_ptr
> archive path: patches/euleros/sched
> 

The above should probably be trimmed from the log.

> Oops occur when running qemu on arm64:
>  Unable to handle kernel paging request at virtual address ffff000008effe40
>  Internal error: Oops: 96000007 [#1] SMP
>  Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
>  pstate: 20000085 (nzCv daIf -PAN -UAO)
>  pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>  lr : move_queued_task.isra.21+0x124/0x298
>  ...
>  Call trace:
>   __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>   __migrate_task+0xc8/0xe0
>   migration_cpu_stop+0x170/0x180
>   cpu_stopper_thread+0xec/0x178
>   smpboot_thread_fn+0x1ac/0x1e8
>   kthread+0x134/0x138
>   ret_from_fork+0x10/0x18
> 
> __set_cpus_allowed_ptr will choose an active dest_cpu in the affinity mask to migrate the process if the process is not
> currently running on any of the CPUs specified in the affinity mask. __set_cpus_allowed_ptr will choose an invalid
> dest_cpu (>= nr_cpu_ids, 1024 in my virtual machine) if the CPUs in the affinity mask are deactivated by cpu_down after
> the cpumask_intersects check.

Right, cpumask_any_and() returns >= nr_cpu_ids when there isn't any valid CPU
bit set.

> The cpumask_test_cpu of dest_cpu afterwards reads beyond the cpumask and may pass if the
> corresponding bit happens to be set.

Ouch. I was going to say the cpu_active_mask check from is_cpu_allowed()
should've stopped the whole thing there, but AFAICT there's no safeguard
against > nr_cpu_ids bit accesses. I see CONFIG_DEBUG_PER_CPU_MAPS should
fire a warning for such accesses, but we don't enable it by default.

Would it make sense to do something like

	return test_bit(...) && cpu < nr_cpu_ids;

for cpumask_test_cpu()? We still get the warn with the right config, but we
prevent sneaky mistakes like this one. And it seems it's not the only one
according to:

--
virtual patch
virtual report

@nocheck@
expression E;
identifier any_func =~ "^cpumask_any";
position pos;
@@

E@pos = any_func(...);
... when != E >= nr_cpu_ids
    when != E < nr_cpu_ids

@script:python depends on nocheck && report@
p << nocheck.pos;
@@
coccilib.report.print_report(p[0], "Missing cpumask_any_*() return value check!")
---

Some of those seem benign since they are on e.g. cpu_online_mask, some others
are somewhat dubious (e.g. deadline.c::dl_task_can_attach()).
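
For illustration, the guarded helper could look something like this (just a
sketch on top of the current test_bit()-based cpumask_test_cpu(), not a
tested patch):

static inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask)
{
	/*
	 * cpumask_check() still fires the CONFIG_DEBUG_PER_CPU_MAPS warning;
	 * the extra comparison just turns an out-of-range cpu into "not set".
	 */
	return test_bit(cpumask_check(cpu), cpumask_bits(cpumask)) &&
	       cpu < nr_cpu_ids;
}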

> As a consequence, the kernel will access an invalid rq address associated with the invalid cpu in
> migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. The following sequence can trigger the Oops:
> 1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling sched_setaffinity
> 2) A shell script repeatedly runs "echo 0 > /sys/devices/system/cpu/cpu1/online" and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn
> 3) The Oops appears if the invalid cpu's bit happens to be set in the memory just past the tested cpumask.
> 
> 
> Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40

- This doesn't respect the 75 char per line limit
- Change IDs don't belong here (we're not using Gerrit)
- You're missing a Signed-off-by.

You'll find all the guidelines you need for formatting patches in
Documentation/process/submitting-patches.rst.

> ---
>  kernel/sched/core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 4b63fef..5181ea9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>     if (cpumask_equal(&p->cpus_allowed, new_mask))
>         goto out;
> 
> -   if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
> +   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
> +   if (dest_cpu >= nr_cpu_ids) {
>         ret = -EINVAL;
>         goto out;
>     }
> @@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>     if (cpumask_test_cpu(task_cpu(p), new_mask))
>         goto out;
> 
> -   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
>     if (task_running(rq, p) || p->state == TASK_WAKING) {
>         struct migration_arg arg = { p, dest_cpu };
>         /* Need help from migration thread: drop lock and wait. */
> --
> 1.8.5.6
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-12 22:09 ` Valentin Schneider
@ 2019-09-15  6:13   ` shikemeng
  2019-09-15  8:21   ` shikemeng
  1 sibling, 0 replies; 11+ messages in thread
From: shikemeng @ 2019-09-15  6:13 UTC (permalink / raw)
  To: mingo, peterz, valentin.schneider; +Cc: linux-kernel

On 13/09/2019 6:09, Valentin Schneider wrote:
>> From 089dbf0216628ac6ae98742ab90725ca9c2bf201 Mon Sep 17 00:00:00 2001
>> From:  <shikemeng@huawei.com>
>> Date: Tue, 10 Sep 2019 09:44:58 -0400
>> Subject: [PATCH] sched: fix migration to invalid cpu in 
>> __set_cpus_allowed_ptr
>> 
>> reason: migration to invalid cpu in __set_cpus_allowed_ptr archive 
>> path: patches/euleros/sched
>> 
>
>The above should probably be trimmed from the log.
>

Thanks for reminding me. I will remove this part in a new patch.

>> Oops occur when running qemu on arm64:
>>  Unable to handle kernel paging request at virtual address 
>> ffff000008effe40  Internal error: Oops: 96000007 [#1] SMP  Process 
>> migration/0 (pid: 12, stack limit = 0x00000000084e3736)
>>  pstate: 20000085 (nzCv daIf -PAN -UAO)  pc : 
>> __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>>  lr : move_queued_task.isra.21+0x124/0x298
>>  ...
>>  Call trace:
>>   __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>>   __migrate_task+0xc8/0xe0
>>   migration_cpu_stop+0x170/0x180
>>   cpu_stopper_thread+0xec/0x178
>>   smpboot_thread_fn+0x1ac/0x1e8
>>   kthread+0x134/0x138
>>   ret_from_fork+0x10/0x18
>> 
>> __set_cpus_allowed_ptr will choose an active dest_cpu in affinity mask 
>> to migrage the process if process is not currently running on any one 
>> of the CPUs specified in affinity mask.__set_cpus_allowed_ptr will 
>> choose an invalid dest_cpu(>= nr_cpu_ids, 1024 in my virtual machine) if CPUS in affinity mask are deactived by cpu_down after cpumask_intersects check.
>
>Right, cpumask_any_and() returns >= nr_cpu_ids when there isn't any valid CPU bit set.
>
>> Cpumask_test_cpu of dest_cpu afterwards is overflow and may passes if 
>> corresponding bit is coincidentally set.
>
>Ouch. I was going to say the cpu_active_mask check from is_cpu_allowed() should've stopped the whole thing there, but AFAICT there's no safeguard against > nr_cpu_ids bit accesses. I see CONFIG_DEBUG_PER_CPU_MAPS should fire a warning for such accesses, but we don't enable it by default.
>
>Would it make sense to do something like
>
>	return test_bit(...) && cpu < nr_cpu_ids;
>
>for cpumask_test_cpu()? We still get the warn with the right config, but we prevent sneaky mistakes like this one. And it seems it's not the only one according to:
>
>--
>virtual patch
>virtual report
>
>@nocheck@
>expression E;
>identifier any_func =~ "^cpumask_any";
>position pos;
>@@
>
>E@pos = any_func(...);
>... when != E >= nr_cpu_ids
>    when != E < nr_cpu_ids
>
>@script:python depends on nocheck && report@ p << nocheck.pos; @@ coccilib.report.print_report(p[0], "Missing cpumask_any_*() return value check!")
>---
>
>Some of those seem benign since they are on e.g. cpu_online_mask, some other are somewhat dubious (e.g. deadline.c::dl_task_can_attach()).
>
>As a consequence, kernel will access a invalid rq address associate with the invalid cpu in

It is more robust to add the check in cpumask_test_cpu. It solves this problem and can prevent other potential bugs. I will test it and resend
a new patch.

>> migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. Process as follows may trigger the Oops:
>> 1) A process repeatedly bind itself to cpu0 and cpu1 in turn by 
>> calling sched_setaffinity
>> 2) A shell script repeatedly "echo 0 > 
>> /sys/devices/system/cpu/cpu1/online" and "echo 1 > 
>> /sys/devices/system/cpu/cpu1/online" in turn
>> 3) Oops appears if the invalid cpu is set in memory after tested cpumask.
>> 
>> 
>> Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
>
>- This doesn't respect the 75 char per line limit
>- Change IDs don't belong here (we're not using Gerrit)
>- You're missing a Signed-off-by.
>
>You'll find all the guidelines you need for formatting patches in Documentation/process/submitting-patches.rst.
>

Thanks for the guide. I will read it carefully before sending the next patch.

>> ---
>>  kernel/sched/core.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 
>> 4b63fef..5181ea9 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>>     if (cpumask_equal(&p->cpus_allowed, new_mask))
>>         goto out;
>> 
>> -   if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
>> +   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
>> +   if (dest_cpu >= nr_cpu_ids) {
>>         ret = -EINVAL;
>>         goto out;
>>     }
>> @@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>>     if (cpumask_test_cpu(task_cpu(p), new_mask))
>>         goto out;
>> 
>> -   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
>>     if (task_running(rq, p) || p->state == TASK_WAKING) {
>>         struct migration_arg arg = { p, dest_cpu };
>>         /* Need help from migration thread: drop lock and wait. */
>> --
>> 1.8.5.6
>> 

Thanks for your review and advice.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Re: Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-12 22:09 ` Valentin Schneider
  2019-09-15  6:13   ` shikemeng
@ 2019-09-15  8:21   ` shikemeng
  2019-09-15 14:33     ` Valentin Schneider
  1 sibling, 1 reply; 11+ messages in thread
From: shikemeng @ 2019-09-15  8:21 UTC (permalink / raw)
  To: mingo, peterz, valentin.schneider; +Cc: linux-kernel

On 15/09/2019 6:13, shikemeng wrote:
>>> From 089dbf0216628ac6ae98742ab90725ca9c2bf201 Mon Sep 17 00:00:00 2001
>>> From:  <shikemeng@huawei.com>
>>> Date: Tue, 10 Sep 2019 09:44:58 -0400
>>> Subject: [PATCH] sched: fix migration to invalid cpu in 
>>> __set_cpus_allowed_ptr
>>> 
>>> reason: migration to invalid cpu in __set_cpus_allowed_ptr archive 
>>> path: patches/euleros/sched
>>> 
>>
>>The above should probably be trimmed from the log.
>>
>
>Thanks for reminding me. I will remove this part in a new patch.
>
>>> Oops occur when running qemu on arm64:
>>>  Unable to handle kernel paging request at virtual address 
>>> ffff000008effe40  Internal error: Oops: 96000007 [#1] SMP  Process 
>>> migration/0 (pid: 12, stack limit = 0x00000000084e3736)
>>>  pstate: 20000085 (nzCv daIf -PAN -UAO)  pc : 
>>> __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>>>  lr : move_queued_task.isra.21+0x124/0x298
>>>  ...
>>>  Call trace:
>>>   __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>>>   __migrate_task+0xc8/0xe0
>>>   migration_cpu_stop+0x170/0x180
>>>   cpu_stopper_thread+0xec/0x178
>>>   smpboot_thread_fn+0x1ac/0x1e8
>>>   kthread+0x134/0x138
>>>   ret_from_fork+0x10/0x18
>>> 
>>> __set_cpus_allowed_ptr will choose an active dest_cpu in affinity mask 
>>> to migrage the process if process is not currently running on any one 
>>> of the CPUs specified in affinity mask.__set_cpus_allowed_ptr will 
>>> choose an invalid dest_cpu(>= nr_cpu_ids, 1024 in my virtual machine) if CPUS in affinity mask are deactived by cpu_down after cpumask_intersects check.
>>
>>Right, cpumask_any_and() returns >= nr_cpu_ids when there isn't any valid CPU bit set.
>>
>>> Cpumask_test_cpu of dest_cpu afterwards is overflow and may passes if 
>>> corresponding bit is coincidentally set.
>>
>>Ouch. I was going to say the cpu_active_mask check from is_cpu_allowed() should've stopped the whole thing there, but AFAICT there's no safeguard against > nr_cpu_ids bit accesses. I see CONFIG_DEBUG_PER_CPU_MAPS should fire a warning for such accesses, but we don't enable it by default.
>>
>>Would it make sense to do something like
>>
>>	return test_bit(...) && cpu < nr_cpu_ids;
>>
>>for cpumask_test_cpu()? We still get the warn with the right config, but we prevent sneaky mistakes like this one. And it seems it's not the only one according to:
>>
>>--
>>virtual patch
>>virtual report
>>
>>@nocheck@
>>expression E;
>>identifier any_func =~ "^cpumask_any";
>>position pos;
>>@@
>>
>>E@pos = any_func(...);
>>... when != E >= nr_cpu_ids
>>    when != E < nr_cpu_ids
>>
>>@script:python depends on nocheck && report@ p << nocheck.pos; @@ coccilib.report.print_report(p[0], "Missing cpumask_any_*() return value check!")
>>---
>>
>>Some of those seem benign since they are on e.g. cpu_online_mask, some other are somewhat dubious (e.g. deadline.c::dl_task_can_attach()).
>>
>>As a consequence, kernel will access a invalid rq address associate with the invalid cpu in
>
>It is more robust to add the check in cpumask_test_cpu. It solves this problem and can prevent other potential bugs. I will test it and resend
>a new patch.
>

Thinking about it again: as cpumask_check will fire a warning if cpu >= nr_cpu_ids, it seems that cpumask_check only expects cpu < nr_cpu_ids and it is
the caller's responsibility to verify that cpu is in the valid range. Interfaces like cpumask_test_and_set_cpu, cpumask_test_and_clear_cpu and so on do
not check cpu < nr_cpu_ids either and may cause damage if cpu is out of range.
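
For reference, the check only warns under CONFIG_DEBUG_PER_CPU_MAPS and does
not clamp the cpu, roughly along these lines (paraphrased from
include/linux/cpumask.h):

static inline void cpu_max_bits_warn(unsigned int cpu, unsigned int bits)
{
#ifdef CONFIG_DEBUG_PER_CPU_MAPS
	WARN_ON_ONCE(cpu >= bits);
#endif
}

/* verify cpu argument to cpumask_* operators */
static inline unsigned int cpumask_check(unsigned int cpu)
{
	cpu_max_bits_warn(cpu, nr_cpumask_bits);
	return cpu;
}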

>>> migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. Process as follows may trigger the Oops:
>>> 1) A process repeatedly bind itself to cpu0 and cpu1 in turn by 
>>> calling sched_setaffinity
>>> 2) A shell script repeatedly "echo 0 > 
>>> /sys/devices/system/cpu/cpu1/online" and "echo 1 > 
>>> /sys/devices/system/cpu/cpu1/online" in turn
>>> 3) Oops appears if the invalid cpu is set in memory after tested cpumask.
>>> 
>>> 
>>> Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
>>
>>- This doesn't respect the 75 char per line limit
>>- Change IDs don't belong here (we're not using Gerrit)
>>- You're missing a Signed-off-by.
>>
>>You'll find all the guidelines you need for formatting patches in Documentation/process/submitting-patches.rst.
>>
>
>Thanks for the guide. I will read it carefully before sending the next patch.
>
>>> ---
>>>  kernel/sched/core.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 
>>> 4b63fef..5181ea9 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>>>     if (cpumask_equal(&p->cpus_allowed, new_mask))
>>>         goto out;
>>> 
>>> -   if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
>>> +   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
>>> +   if (dest_cpu >= nr_cpu_ids) {
>>>         ret = -EINVAL;
>>>         goto out;
>>>     }
>>> @@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>>>     if (cpumask_test_cpu(task_cpu(p), new_mask))
>>>         goto out;
>>> 
>>> -   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
>>>     if (task_running(rq, p) || p->state == TASK_WAKING) {
>>>         struct migration_arg arg = { p, dest_cpu };
>>>         /* Need help from migration thread: drop lock and wait. */
>>> --
>>> 1.8.5.6
>>> 
>
>Thanks for your review and advice.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-15  8:21   ` shikemeng
@ 2019-09-15 14:33     ` Valentin Schneider
  2019-09-23 15:43       ` Dietmar Eggemann
  0 siblings, 1 reply; 11+ messages in thread
From: Valentin Schneider @ 2019-09-15 14:33 UTC (permalink / raw)
  To: shikemeng, mingo, peterz; +Cc: linux-kernel

On 15/09/2019 09:21, shikemeng wrote:
>> It is more robust to add the check in cpumask_test_cpu. It solves this problem and can prevent other potential bugs. I will test it and resend
>> a new patch.
>>
> 
> Thinking about it again: as cpumask_check will fire a warning if cpu >= nr_cpu_ids, it seems that cpumask_check only expects cpu < nr_cpu_ids and it is
> the caller's responsibility to verify that cpu is in the valid range. Interfaces like cpumask_test_and_set_cpu, cpumask_test_and_clear_cpu and so on do
> not check cpu < nr_cpu_ids either and may cause damage if cpu is out of range.
> 

cpumask operations clearly should never be fed CPU numbers > nr_cpu_ids,
but we can get some sneaky mishaps like the one you're fixing. The answer
might just be to have more folks turn on DEBUG_PER_CPU_MAPS in their test
runs (I don't for instance - will do from now on), since I get the feeling
people like to be able to disable these checks for production kernels.

In any case, don't feel like you have to fix this globally - your fix is
fine on its own.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-15 14:33     ` Valentin Schneider
@ 2019-09-23 15:43       ` Dietmar Eggemann
  2019-09-23 16:06         ` Valentin Schneider
  0 siblings, 1 reply; 11+ messages in thread
From: Dietmar Eggemann @ 2019-09-23 15:43 UTC (permalink / raw)
  To: Valentin Schneider, shikemeng, mingo, peterz; +Cc: linux-kernel

On 9/15/19 4:33 PM, Valentin Schneider wrote:
> On 15/09/2019 09:21, shikemeng wrote:
>>> It is more robust to add the check in cpumask_test_cpu. It solves this problem and can prevent other potential bugs. I will test it and resend
>>> a new patch.
>>>
>>
>> Thinking about it again: as cpumask_check will fire a warning if cpu >= nr_cpu_ids, it seems that cpumask_check only expects cpu < nr_cpu_ids and it is
>> the caller's responsibility to verify that cpu is in the valid range. Interfaces like cpumask_test_and_set_cpu, cpumask_test_and_clear_cpu and so on do
>> not check cpu < nr_cpu_ids either and may cause damage if cpu is out of range.
>>
> 
> cpumask operations clearly should never be fed CPU numbers > nr_cpu_ids,
> but we can get some sneaky mishaps like the one you're fixing. The answer
> might just be to have more folks turn on DEBUG_PER_CPU_MAPS in their test
> runs (I don't for instance - will do from now on), since I get the feeling
> people like to be able to disable these checks for production kernels.
> 
> In any case, don't feel like you have to fix this globally - your fix is
> fine on its own.

I'm not sure that CONFIG_DEBUG_PER_CPU_MAPS=y will help you here.

__set_cpus_allowed_ptr(...)
{
    ...
    dest_cpu = cpumask_any_and(...)
    ...
}

With:

#define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
#define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p),
(src2p))

cpumask_next_and() is called with n = -1 and in this case does not
invoke cpumask_check().
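
So the range check is skipped for the n == -1 case, roughly (paraphrased
from lib/cpumask.c):

int cpumask_next_and(int n, const struct cpumask *src1p,
		     const struct cpumask *src2p)
{
	/* -1 is a legal arg here. */
	if (n != -1)
		cpumask_check(n);
	return find_next_and_bit(cpumask_bits(src1p), cpumask_bits(src2p),
		nr_cpumask_bits, n + 1);
}

and the dest_cpu value it hands back (possibly >= nr_cpu_ids) is not
range-checked at this point.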

---

BTW, I can recreate the issue quite easily with:

  qemu-system-x86_64 ... -smp cores=64 ... -enable-kvm

with the default kernel config.




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-23 15:43       ` Dietmar Eggemann
@ 2019-09-23 16:06         ` Valentin Schneider
  2019-09-24 14:09           ` Dietmar Eggemann
  0 siblings, 1 reply; 11+ messages in thread
From: Valentin Schneider @ 2019-09-23 16:06 UTC (permalink / raw)
  To: Dietmar Eggemann, shikemeng, mingo, peterz; +Cc: linux-kernel

On 23/09/2019 16:43, Dietmar Eggemann wrote:
> I'm not sure that CONFIG_DEBUG_PER_CPU_MAPS=y will help you here.
> 
> __set_cpus_allowed_ptr(...)
> {
>     ...
>     dest_cpu = cpumask_any_and(...)
>     ...
> }
> 
> With:
> 
> #define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
> #define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p),
> (src2p))
> 
> cpumask_next_and() is called with n = -1 and in this case does not
> invoke cpumask_check().
> 

It won't warn here because it's still a valid return value, but it should
warn in the cpumask_test_cpu() that follows (in is_cpu_allowed()) because
it would be passed a value >= nr_cpu_ids. So at the very least this config
does catch cpumask_any*() return values being blindly passed to
cpumask_test_cpu().
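
(For context, is_cpu_allowed() starts with a cpumask_test_cpu() on the
destination CPU - roughly the below on the kernel this patch is against -
which is where the warning would fire for a dest_cpu >= nr_cpu_ids.)

static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
{
	if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
		return false;

	if (is_per_cpu_kthread(p))
		return cpu_online(cpu);

	return cpu_active(cpu);
}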

Calls to cpumask_any*() without relevant return value check can easily be
spotted by the coccinelle snippet I sent earlier. If this one fix gets
merged, I'll go and stare at / fixup the others (and maybe add the semantic
patch to coccicheck).

> ---
> 
> BTW, I can recreate the issue quite easily with:
> 
>   qemu-system-x86_64 ... -smp cores=64 ... -enable-kvm
> 
> with the default kernel config.
> 
> 

Might want to send your tested-by to [1] then :)

[1]: https://lkml.kernel.org/r/1568616808-16808-1-git-send-email-shikemeng@huawei.com

> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-23 16:06         ` Valentin Schneider
@ 2019-09-24 14:09           ` Dietmar Eggemann
  2019-09-24 16:12             ` Valentin Schneider
  0 siblings, 1 reply; 11+ messages in thread
From: Dietmar Eggemann @ 2019-09-24 14:09 UTC (permalink / raw)
  To: Valentin Schneider, shikemeng, mingo, peterz; +Cc: linux-kernel

On 9/23/19 6:06 PM, Valentin Schneider wrote:
> On 23/09/2019 16:43, Dietmar Eggemann wrote:
>> I'm not sure that CONFIG_DEBUG_PER_CPU_MAPS=y will help you here.
>>
>> __set_cpus_allowed_ptr(...)
>> {
>>     ...
>>     dest_cpu = cpumask_any_and(...)
>>     ...
>> }
>>
>> With:
>>
>> #define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
>> #define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p),
>> (src2p))
>>
>> cpumask_next_and() is called with n = -1 and in this case does not
>> invoke cpumask_check().
>>
> 
> It won't warn here because it's still a valid return value, but it should
> warn in the cpumask_test_cpu() that follows (in is_cpu_allowed()) because
> it would be passed a value >= nr_cpu_ids. So at the very least this config
> does catch cpumask_any*() return values being blindly passed to
> cpumask_test_cpu().

OK, I see and agree.

But IMHO, we still don't call cpumask_test_cpu(dest_cpu, ...) here, right?

What the patch fixes is that it closes the window between two reads of
cpu_active_mask in which cpuhp can potentially punch a hole into the
cpu_active_mask.

If p is not running or queued and it's state is unequal to TASK_WAKING,
a 'dest_cpu == nr_cpu_ids' goes unnoticed. Otherwise we see an 'unable
to handle kernel paging request' or 'unable to handle page fault for
address' bug in migration_cpu_stop() or move_queued_task().

Do I miss something?

[...]


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-24 14:09           ` Dietmar Eggemann
@ 2019-09-24 16:12             ` Valentin Schneider
  0 siblings, 0 replies; 11+ messages in thread
From: Valentin Schneider @ 2019-09-24 16:12 UTC (permalink / raw)
  To: Dietmar Eggemann, shikemeng, mingo, peterz; +Cc: linux-kernel

On 24/09/2019 15:09, Dietmar Eggemann wrote:
> On 9/23/19 6:06 PM, Valentin Schneider wrote:
>> On 23/09/2019 16:43, Dietmar Eggemann wrote:
>>> I'm not sure that CONFIG_DEBUG_PER_CPU_MAPS=y will help you here.
>>>
>>> __set_cpus_allowed_ptr(...)
>>> {
>>>     ...
>>>     dest_cpu = cpumask_any_and(...)
>>>     ...
>>> }
>>>
>>> With:
>>>
>>> #define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
>>> #define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p),
>>> (src2p))
>>>
>>> cpumask_next_and() is called with n = -1 and in this case does not
>>> invoke cpumask_check().
>>>
>>
>> It won't warn here because it's still a valid return value, but it should
>> warn in the cpumask_test_cpu() that follows (in is_cpu_allowed()) because
>> it would be passed a value >= nr_cpu_ids. So at the very least this config
>> does catch cpumask_any*() return values being blindly passed to
>> cpumask_test_cpu().
> 
> OK, I see and agree.
> 
> But IMHO, we still don't call cpumask_test_cpu(dest_cpu, ...), right.
> 
> What the patch fixes is that it closes the window between two reads of
> cpu_active_mask in which cpuhp can potentially punch a hole into the
> cpu_active_mask.
> 
> If p is not running or queued and its state is not TASK_WAKING,
> a 'dest_cpu == nr_cpu_ids' goes unnoticed.

In this case we don't need to force it off to another CPU, since that will
get sorted out at its next wakeup. However, the patch still catches that,
since it does an early

if (dest_cpu >= nr_cpu_ids) {
        ret = -EINVAL;
        goto out;

and that's regardless of the task's state.

> Otherwise we see an 'unable
> to handle kernel paging request' or 'unable to handle page fault for
> address' bug in migration_cpu_stop() or move_queued_task().
> 
> Do I miss something?
> 
> [...]
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
  2019-09-15  3:07 shikemeng
@ 2019-09-15 17:00 ` Valentin Schneider
  0 siblings, 0 replies; 11+ messages in thread
From: Valentin Schneider @ 2019-09-15 17:00 UTC (permalink / raw)
  To: shikemeng, mingo, peterz; +Cc: linux-kernel

On 15/09/2019 04:07, shikemeng wrote:
> From: <shikemeng@huawei.com>
> 
> reason: migration to invalid cpu in __set_cpus_allowed_ptr
> archive path: patches/euleros/sched
> 
> Oops occur when running qemu on arm64:
>  Unable to handle kernel paging request at virtual address ffff000008effe40
>  Internal error: Oops: 96000007 [#1] SMP
>  Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
>  pstate: 20000085 (nzCv daIf -PAN -UAO)
>  pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>  lr : move_queued_task.isra.21+0x124/0x298
>  ...
>  Call trace:
>   __ll_sc___cmpxchg_case_acq_4+0x4/0x20
>   __migrate_task+0xc8/0xe0
>   migration_cpu_stop+0x170/0x180
>   cpu_stopper_thread+0xec/0x178
>   smpboot_thread_fn+0x1ac/0x1e8
>   kthread+0x134/0x138
>   ret_from_fork+0x10/0x18
> 
> __set_cpus_allowed_ptr will choose an active dest_cpu in the affinity mask to migrate the process if the process is not
> currently running on any of the CPUs specified in the affinity mask. __set_cpus_allowed_ptr will choose an invalid
> dest_cpu (>= nr_cpu_ids, 1024 in my virtual machine) if the CPUs in the affinity mask are deactivated by cpu_down after
> the cpumask_intersects check. The cpumask_test_cpu of dest_cpu afterwards reads beyond the cpumask and may pass if the
> corresponding bit happens to be set. As a consequence, the kernel will access an invalid rq address associated with the
> invalid cpu in migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. The following sequence can trigger the Oops:
> 1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling sched_setaffinity
> 2) A shell script repeatedly runs "echo 0 > /sys/devices/system/cpu/cpu1/online" and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn
> 3) The Oops appears if the invalid cpu's bit happens to be set in the memory just past the tested cpumask.
> 
> Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
> Signed-off-by: <shikemeng@huawei.com>

The log still isn't wrapped to 75 chars, and the change-id still hasn't been
removed.

The subject should also mention that this is v2 of the patch, again this is
all in the process documentation.

The fix itself looks fine though, so once the log respects the rules:
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
@ 2019-09-15  3:07 shikemeng
  2019-09-15 17:00 ` Valentin Schneider
  0 siblings, 1 reply; 11+ messages in thread
From: shikemeng @ 2019-09-15  3:07 UTC (permalink / raw)
  To: mingo, peterz; +Cc: linux-kernel

From: <shikemeng@huawei.com>

reason: migration to invalid cpu in __set_cpus_allowed_ptr
archive path: patches/euleros/sched

Oops occur when running qemu on arm64:
 Unable to handle kernel paging request at virtual address ffff000008effe40
 Internal error: Oops: 96000007 [#1] SMP
 Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
 pstate: 20000085 (nzCv daIf -PAN -UAO)
 pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
 lr : move_queued_task.isra.21+0x124/0x298
 ...
 Call trace:
  __ll_sc___cmpxchg_case_acq_4+0x4/0x20
  __migrate_task+0xc8/0xe0
  migration_cpu_stop+0x170/0x180
  cpu_stopper_thread+0xec/0x178
  smpboot_thread_fn+0x1ac/0x1e8
  kthread+0x134/0x138
  ret_from_fork+0x10/0x18

__set_cpus_allowed_ptr will choose an active dest_cpu in the affinity mask to migrate the process if the process is not
currently running on any of the CPUs specified in the affinity mask. __set_cpus_allowed_ptr will choose an invalid
dest_cpu (>= nr_cpu_ids, 1024 in my virtual machine) if the CPUs in the affinity mask are deactivated by cpu_down after
the cpumask_intersects check. The cpumask_test_cpu of dest_cpu afterwards reads beyond the cpumask and may pass if the
corresponding bit happens to be set. As a consequence, the kernel will access an invalid rq address associated with the
invalid cpu in migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. The following sequence can trigger the Oops:
1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling sched_setaffinity
2) A shell script repeatedly runs "echo 0 > /sys/devices/system/cpu/cpu1/online" and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn
3) The Oops appears if the invalid cpu's bit happens to be set in the memory just past the tested cpumask.

Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
Signed-off-by: <shikemeng@huawei.com>
---
 kernel/sched/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4b63fef..5181ea9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 	if (cpumask_equal(&p->cpus_allowed, new_mask))
 		goto out;
 
-	if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
+	dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
+	if (dest_cpu >= nr_cpu_ids) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 	if (cpumask_test_cpu(task_cpu(p), new_mask))
 		goto out;
 
-	dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
 	if (task_running(rq, p) || p->state == TASK_WAKING) {
 		struct migration_arg arg = { p, dest_cpu };
 		/* Need help from migration thread: drop lock and wait. */
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-09-24 16:12 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-12  1:55 [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr shikemeng
2019-09-12 22:09 ` Valentin Schneider
2019-09-15  6:13   ` shikemeng
2019-09-15  8:21   ` shikemeng
2019-09-15 14:33     ` Valentin Schneider
2019-09-23 15:43       ` Dietmar Eggemann
2019-09-23 16:06         ` Valentin Schneider
2019-09-24 14:09           ` Dietmar Eggemann
2019-09-24 16:12             ` Valentin Schneider
2019-09-15  3:07 shikemeng
2019-09-15 17:00 ` Valentin Schneider
