* [PATCH] mips: Fix race on setting and getting cpu_online_mask
@ 2017-08-03 6:20 Matija Glavinic Pecotic
2017-08-07 8:23 ` Ralf Baechle
0 siblings, 1 reply; 3+ messages in thread
From: Matija Glavinic Pecotic @ 2017-08-03 6:20 UTC (permalink / raw)
To: linux-mips, Sverdlin, Alexander (Nokia - DE/Ulm)
While testing cpu hoptlug (cpu down and up in loops) on kernel 4.4, it was
observed that occasionally check for cpu online will fail in kernel/cpu.c,
_cpu_up:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/kernel/cpu.c?h=v4.4.79#n485
518 /* Arch-specific enabling code. */
519 ret = __cpu_up(cpu, idle);
520
521 if (ret != 0)
522 goto out_notify;
523 BUG_ON(!cpu_online(cpu));
Reason is race between start_secondary and _cpu_up. cpu_callin_map is set
before cpu_online_mask. In __cpu_up, cpu_callin_map is waited for, but cpu
online mask is not, resulting in race in which secondary processor started
and set cpu_callin_map, but not yet set the online mask,resulting in above
BUG being hit.
Upstream differs in the area. cpu_online check is in bringup_wait_for_ap,
which is after cpu reached AP_ONLINE_IDLE,where secondary passed its start
function. Nonetheless, fix makes start_secondary safe and not depending on
other locks throughout the code. It protects as well against cpu_online
checks put in between sometimes in the future.
Fix this by moving completion after all flags are set.
Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
---
arch/mips/kernel/smp.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index 770d4d1..6bace76 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -376,9 +376,6 @@ asmlinkage void start_secondary(void)
cpumask_set_cpu(cpu, &cpu_coherent_mask);
notify_cpu_starting(cpu);
- complete(&cpu_running);
- synchronise_count_slave(cpu);
-
set_cpu_online(cpu, true);
set_cpu_sibling_map(cpu);
@@ -386,6 +383,9 @@ asmlinkage void start_secondary(void)
calculate_cpu_foreign_map();
+ complete(&cpu_running);
+ synchronise_count_slave(cpu);
+
/*
* irq will be enabled in ->smp_finish(), enabling it too early
* is dangerous.
--
2.1.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] mips: Fix race on setting and getting cpu_online_mask
2017-08-03 6:20 [PATCH] mips: Fix race on setting and getting cpu_online_mask Matija Glavinic Pecotic
@ 2017-08-07 8:23 ` Ralf Baechle
2017-08-14 15:34 ` Matija Glavinic Pecotic
0 siblings, 1 reply; 3+ messages in thread
From: Ralf Baechle @ 2017-08-07 8:23 UTC (permalink / raw)
To: Matija Glavinic Pecotic; +Cc: linux-mips, Sverdlin, Alexander (Nokia - DE/Ulm)
On Thu, Aug 03, 2017 at 08:20:22AM +0200, Matija Glavinic Pecotic wrote:
> While testing cpu hoptlug (cpu down and up in loops) on kernel 4.4, it was
> observed that occasionally check for cpu online will fail in kernel/cpu.c,
> _cpu_up:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/kernel/cpu.c?h=v4.4.79#n485
> 518 /* Arch-specific enabling code. */
> 519 ret = __cpu_up(cpu, idle);
> 520
> 521 if (ret != 0)
> 522 goto out_notify;
> 523 BUG_ON(!cpu_online(cpu));
>
> Reason is race between start_secondary and _cpu_up. cpu_callin_map is set
> before cpu_online_mask. In __cpu_up, cpu_callin_map is waited for, but cpu
> online mask is not, resulting in race in which secondary processor started
> and set cpu_callin_map, but not yet set the online mask,resulting in above
> BUG being hit.
>
> Upstream differs in the area. cpu_online check is in bringup_wait_for_ap,
> which is after cpu reached AP_ONLINE_IDLE,where secondary passed its start
> function. Nonetheless, fix makes start_secondary safe and not depending on
> other locks throughout the code. It protects as well against cpu_online
> checks put in between sometimes in the future.
>
> Fix this by moving completion after all flags are set.
>
> Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
> ---
> arch/mips/kernel/smp.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
> index 770d4d1..6bace76 100644
> --- a/arch/mips/kernel/smp.c
> +++ b/arch/mips/kernel/smp.c
> @@ -376,9 +376,6 @@ asmlinkage void start_secondary(void)
> cpumask_set_cpu(cpu, &cpu_coherent_mask);
> notify_cpu_starting(cpu);
>
> - complete(&cpu_running);
> - synchronise_count_slave(cpu);
> -
> set_cpu_online(cpu, true);
>
> set_cpu_sibling_map(cpu);
> @@ -386,6 +383,9 @@ asmlinkage void start_secondary(void)
>
> calculate_cpu_foreign_map();
>
> + complete(&cpu_running);
> + synchronise_count_slave(cpu);
> +
> /*
> * irq will be enabled in ->smp_finish(), enabling it too early
> * is dangerous.
Makes sense, applied - but after applying I was wondering if
synchronise_count_slave() should be before the complete, not after.
Ralf
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] mips: Fix race on setting and getting cpu_online_mask
2017-08-07 8:23 ` Ralf Baechle
@ 2017-08-14 15:34 ` Matija Glavinic Pecotic
0 siblings, 0 replies; 3+ messages in thread
From: Matija Glavinic Pecotic @ 2017-08-14 15:34 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Sverdlin, Alexander (Nokia - DE/Ulm)
On 08/07/2017 10:23 AM, Ralf Baechle wrote:
> Makes sense, applied - but after applying I was wondering if
> synchronise_count_slave() should be before the complete, not after.
I believe it would be proper thing to do. Do you want me to send updated patch, new one, or would you like to do it?
Regards,
Matija
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-08-14 15:35 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-03 6:20 [PATCH] mips: Fix race on setting and getting cpu_online_mask Matija Glavinic Pecotic
2017-08-07 8:23 ` Ralf Baechle
2017-08-14 15:34 ` Matija Glavinic Pecotic
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.