All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask"
@ 2017-08-23  8:21 ` Matt Redfearn
  0 siblings, 0 replies; 6+ messages in thread
From: Matt Redfearn @ 2017-08-23  8:21 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Matt Redfearn, Matija Glavinic Pecotic,
	Marcin Nowakowski, Paul Gortmaker, linux-kernel, Ingo Molnar,
	Paul Burton

Commit 6f542ebeaee0 ("MIPS: Fix race on setting and getting
cpu_online_mask") effectively reverted commit 8f46cca1e6c06 ("MIPS: SMP:
Fix possibility of deadlock when bringing CPUs online") and thus has
reinstated the possibility of deadlock.

The commit was based on testing of kernel v4.4, where the CPU hotplug
code issued a BUG() if the starting CPU is not marked online when the
boot CPU returns from __cpu_up. The commit fixes this race, but
re-introduces the deadlock situation.

As noted in the commit message, upstream differs in this area. The
hotplug code now waits on a completion event in bringup_wait_for_ap,
which is set by the starting CPU in cpuhp_online_idle once it calls
cpu_startup_entry. Thus there is no possibility of a race in upstream,
and this commit has only re-introduced the deadlock condition, which can
be observed on multiple platforms when running a heavy load test at the
same time as hotplugging CPUs. See commit 8f46cca1e6c06 ("MIPS: SMP: Fix
possibility of deadlock when bringing CPUs online") for details.

This reverts commit 6f542ebeaee0ee552a902ce3892220fc22c7ec8e.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
CC: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
---

 arch/mips/kernel/smp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index 25f6877ce464..603963e1d8e7 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -383,6 +383,9 @@ asmlinkage void start_secondary(void)
 	cpumask_set_cpu(cpu, &cpu_coherent_mask);
 	notify_cpu_starting(cpu);
 
+	complete(&cpu_running);
+	synchronise_count_slave(cpu);
+
 	set_cpu_online(cpu, true);
 
 	set_cpu_sibling_map(cpu);
@@ -390,9 +393,6 @@ asmlinkage void start_secondary(void)
 
 	calculate_cpu_foreign_map();
 
-	complete(&cpu_running);
-	synchronise_count_slave(cpu);
-
 	/*
 	 * irq will be enabled in ->smp_finish(), enabling it too early
 	 * is dangerous.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask"
@ 2017-08-23  8:21 ` Matt Redfearn
  0 siblings, 0 replies; 6+ messages in thread
From: Matt Redfearn @ 2017-08-23  8:21 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Matt Redfearn, Matija Glavinic Pecotic,
	Marcin Nowakowski, Paul Gortmaker, linux-kernel, Ingo Molnar,
	Paul Burton

Commit 6f542ebeaee0 ("MIPS: Fix race on setting and getting
cpu_online_mask") effectively reverted commit 8f46cca1e6c06 ("MIPS: SMP:
Fix possibility of deadlock when bringing CPUs online") and thus has
reinstated the possibility of deadlock.

The commit was based on testing of kernel v4.4, where the CPU hotplug
code issued a BUG() if the starting CPU is not marked online when the
boot CPU returns from __cpu_up. The commit fixes this race, but
re-introduces the deadlock situation.

As noted in the commit message, upstream differs in this area. The
hotplug code now waits on a completion event in bringup_wait_for_ap,
which is set by the starting CPU in cpuhp_online_idle once it calls
cpu_startup_entry. Thus there is no possibility of a race in upstream,
and this commit has only re-introduced the deadlock condition, which can
be observed on multiple platforms when running a heavy load test at the
same time as hotplugging CPUs. See commit 8f46cca1e6c06 ("MIPS: SMP: Fix
possibility of deadlock when bringing CPUs online") for details.

This reverts commit 6f542ebeaee0ee552a902ce3892220fc22c7ec8e.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
CC: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
---

 arch/mips/kernel/smp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index 25f6877ce464..603963e1d8e7 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -383,6 +383,9 @@ asmlinkage void start_secondary(void)
 	cpumask_set_cpu(cpu, &cpu_coherent_mask);
 	notify_cpu_starting(cpu);
 
+	complete(&cpu_running);
+	synchronise_count_slave(cpu);
+
 	set_cpu_online(cpu, true);
 
 	set_cpu_sibling_map(cpu);
@@ -390,9 +393,6 @@ asmlinkage void start_secondary(void)
 
 	calculate_cpu_foreign_map();
 
-	complete(&cpu_running);
-	synchronise_count_slave(cpu);
-
 	/*
 	 * irq will be enabled in ->smp_finish(), enabling it too early
 	 * is dangerous.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask"
  2017-08-23  8:21 ` Matt Redfearn
  (?)
@ 2017-08-28 10:07 ` Matija Glavinic Pecotic
  2017-08-29  1:43   ` Huacai Chen
  -1 siblings, 1 reply; 6+ messages in thread
From: Matija Glavinic Pecotic @ 2017-08-28 10:07 UTC (permalink / raw)
  To: Matt Redfearn, Ralf Baechle
  Cc: linux-mips, Marcin Nowakowski, Paul Gortmaker, linux-kernel,
	Ingo Molnar, Paul Burton

On 08/23/2017 10:21 AM, Matt Redfearn wrote:
> As noted in the commit message, upstream differs in this area. The
> hotplug code now waits on a completion event in bringup_wait_for_ap,
> which is set by the starting CPU in cpuhp_online_idle once it calls
> cpu_startup_entry. Thus there is no possibility of a race in upstream,
> and this commit has only re-introduced the deadlock condition, which can
> be observed on multiple platforms when running a heavy load test at the
> same time as hotplugging CPUs. See commit 8f46cca1e6c06 ("MIPS: SMP: Fix
> possibility of deadlock when bringing CPUs online") for details.

I personally do not like the fact that synchronization is implicitly done by the callers, it is the reason why the patch was proposed. As noted before, it is enough someone checks cpu online mask somewhere in between and there is race again.

How about moving synchronise_count_slave before setting the cpu online? Is there dependency it has to be done after completion?

Regards,

Matija

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask"
  2017-08-28 10:07 ` Matija Glavinic Pecotic
@ 2017-08-29  1:43   ` Huacai Chen
  2017-08-30 13:24     ` Matt Redfearn
  0 siblings, 1 reply; 6+ messages in thread
From: Huacai Chen @ 2017-08-29  1:43 UTC (permalink / raw)
  To: Matija Glavinic Pecotic
  Cc: Matt Redfearn, Ralf Baechle, Linux MIPS Mailing List,
	Marcin Nowakowski, Paul Gortmaker, linux-kernel, Ingo Molnar,
	Paul Burton

I suggest to drop sync_r4k completely, because it is inaccurate. You
can use IPI to synchronize count/compare instead, as Loongson-3 does.

Huacai

On Mon, Aug 28, 2017 at 6:07 PM, Matija Glavinic Pecotic
<matija.glavinic-pecotic.ext@nokia.com> wrote:
> On 08/23/2017 10:21 AM, Matt Redfearn wrote:
>> As noted in the commit message, upstream differs in this area. The
>> hotplug code now waits on a completion event in bringup_wait_for_ap,
>> which is set by the starting CPU in cpuhp_online_idle once it calls
>> cpu_startup_entry. Thus there is no possibility of a race in upstream,
>> and this commit has only re-introduced the deadlock condition, which can
>> be observed on multiple platforms when running a heavy load test at the
>> same time as hotplugging CPUs. See commit 8f46cca1e6c06 ("MIPS: SMP: Fix
>> possibility of deadlock when bringing CPUs online") for details.
>
> I personally do not like the fact that synchronization is implicitly done by the callers, it is the reason why the patch was proposed. As noted before, it is enough someone checks cpu online mask somewhere in between and there is race again.
>
> How about moving synchronise_count_slave before setting the cpu online? Is there dependency it has to be done after completion?
>
> Regards,
>
> Matija
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask"
  2017-08-29  1:43   ` Huacai Chen
@ 2017-08-30 13:24     ` Matt Redfearn
  2017-08-31  7:03       ` Huacai Chen
  0 siblings, 1 reply; 6+ messages in thread
From: Matt Redfearn @ 2017-08-30 13:24 UTC (permalink / raw)
  To: Huacai Chen, Matija Glavinic Pecotic
  Cc: Ralf Baechle, Linux MIPS Mailing List, Marcin Nowakowski,
	Paul Gortmaker, linux-kernel, Ingo Molnar, Paul Burton

Hi Huacai,

On 29/08/17 02:43, Huacai Chen wrote:
> I suggest to drop sync_r4k completely, because it is inaccurate. You
> can use IPI to synchronize count/compare instead, as Loongson-3 does.

I am all for a better fix, such as this - but that would be a much more 
invasive change than what I propose. Currently 4.13 is broken by the 
patch that this is attempting to revert. It is easy to deadlock the 
system by hotplugging a CPU while it is busy. That was what my patch 
8f46cca1e6c06 originally fixed. Even though it is, perhaps, not 
stylistically great to have the synchronisation done by callers, the 
fact is that it *is* done (added in 8df3e07e7f21f), so the behavior for 
4.13 would be safe and deadlocks not possible. We can then look at more 
invasive changes that are acceptable to everyone during the 4.14 cycle.

Thanks,
Matt

>
> Huacai
>
> On Mon, Aug 28, 2017 at 6:07 PM, Matija Glavinic Pecotic
> <matija.glavinic-pecotic.ext@nokia.com> wrote:
>> On 08/23/2017 10:21 AM, Matt Redfearn wrote:
>>> As noted in the commit message, upstream differs in this area. The
>>> hotplug code now waits on a completion event in bringup_wait_for_ap,
>>> which is set by the starting CPU in cpuhp_online_idle once it calls
>>> cpu_startup_entry. Thus there is no possibility of a race in upstream,
>>> and this commit has only re-introduced the deadlock condition, which can
>>> be observed on multiple platforms when running a heavy load test at the
>>> same time as hotplugging CPUs. See commit 8f46cca1e6c06 ("MIPS: SMP: Fix
>>> possibility of deadlock when bringing CPUs online") for details.
>> I personally do not like the fact that synchronization is implicitly done by the callers, it is the reason why the patch was proposed. As noted before, it is enough someone checks cpu online mask somewhere in between and there is race again.
>>
>> How about moving synchronise_count_slave before setting the cpu online? Is there dependency it has to be done after completion?
>>
>> Regards,
>>
>> Matija
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask"
  2017-08-30 13:24     ` Matt Redfearn
@ 2017-08-31  7:03       ` Huacai Chen
  0 siblings, 0 replies; 6+ messages in thread
From: Huacai Chen @ 2017-08-31  7:03 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: Matija Glavinic Pecotic, Ralf Baechle, Linux MIPS Mailing List,
	Marcin Nowakowski, Paul Gortmaker, linux-kernel, Ingo Molnar,
	Paul Burton

Yes, your original patch (8f46cca1e6c06) is needed in 4.12/4.13, but
they should be reverted in 4.1/4.4-stable branch.

Huacai

On Wed, Aug 30, 2017 at 9:24 PM, Matt Redfearn <matt.redfearn@imgtec.com> wrote:
> Hi Huacai,
>
> On 29/08/17 02:43, Huacai Chen wrote:
>>
>> I suggest to drop sync_r4k completely, because it is inaccurate. You
>> can use IPI to synchronize count/compare instead, as Loongson-3 does.
>
>
> I am all for a better fix, such as this - but that would be a much more
> invasive change than what I propose. Currently 4.13 is broken by the patch
> that this is attempting to revert. It is easy to deadlock the system by
> hotplugging a CPU while it is busy. That was what my patch 8f46cca1e6c06
> originally fixed. Even though it is, perhaps, not stylistically great to
> have the synchronisation done by callers, the fact is that it *is* done
> (added in 8df3e07e7f21f), so the behavior for 4.13 would be safe and
> deadlocks not possible. We can then look at more invasive changes that are
> acceptable to everyone during the 4.14 cycle.
>
> Thanks,
> Matt
>
>
>>
>> Huacai
>>
>> On Mon, Aug 28, 2017 at 6:07 PM, Matija Glavinic Pecotic
>> <matija.glavinic-pecotic.ext@nokia.com> wrote:
>>>
>>> On 08/23/2017 10:21 AM, Matt Redfearn wrote:
>>>>
>>>> As noted in the commit message, upstream differs in this area. The
>>>> hotplug code now waits on a completion event in bringup_wait_for_ap,
>>>> which is set by the starting CPU in cpuhp_online_idle once it calls
>>>> cpu_startup_entry. Thus there is no possibility of a race in upstream,
>>>> and this commit has only re-introduced the deadlock condition, which can
>>>> be observed on multiple platforms when running a heavy load test at the
>>>> same time as hotplugging CPUs. See commit 8f46cca1e6c06 ("MIPS: SMP: Fix
>>>> possibility of deadlock when bringing CPUs online") for details.
>>>
>>> I personally do not like the fact that synchronization is implicitly done
>>> by the callers, it is the reason why the patch was proposed. As noted
>>> before, it is enough someone checks cpu online mask somewhere in between and
>>> there is race again.
>>>
>>> How about moving synchronise_count_slave before setting the cpu online?
>>> Is there dependency it has to be done after completion?
>>>
>>> Regards,
>>>
>>> Matija
>>>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-08-31  7:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-23  8:21 [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask" Matt Redfearn
2017-08-23  8:21 ` Matt Redfearn
2017-08-28 10:07 ` Matija Glavinic Pecotic
2017-08-29  1:43   ` Huacai Chen
2017-08-30 13:24     ` Matt Redfearn
2017-08-31  7:03       ` Huacai Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.