All of lore.kernel.org
 help / color / mirror / Atom feed
* smp_call_function_single with wait=0 considered harmful
@ 2013-12-04 16:46 Christoph Hellwig
  2013-12-05 21:43 ` Bjorn Helgaas
  2014-02-28 12:26 ` Peter Zijlstra
  0 siblings, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2013-12-04 16:46 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar, Thomas Gleixner, Tony Luck,
	Robert Richter, Bjorn Helgaas, Aaro Koskinen, David Daney
  Cc: linux-kernel

While doing my recent work on the generic smp function calls I noticed
that smp_call_function_single without the wait flag can't work, as
it allocates struct call_single_data on stack, and without the wait
flag will happily return before the IPI has been executed.

This affects the following callers:

  arch/ia64/kernel/mca.c:mca_cpu_callback()
  arch/ia64/kernel/smpboot.c:ia64_sync_itc()
  arch/x86/kernel/kvm.c:kvm_cpu_notify()
  arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier()
  arch/x86/pci/amd_bus.c:amd_cpu_notify()
  drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu()
  kernel/stop_machine.c:stop_two_cpus()

It would be good to get these fixed so that we could remove the
parameter.  Either convert them to wait, or use a preallocated
call_single_data and __smp_call_function_single.

After that I'd like to remove the wait argument to prevent further
abuses.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: smp_call_function_single with wait=0 considered harmful
  2013-12-04 16:46 smp_call_function_single with wait=0 considered harmful Christoph Hellwig
@ 2013-12-05 21:43 ` Bjorn Helgaas
  2013-12-06 10:56   ` Christoph Hellwig
  2014-02-28 12:26 ` Peter Zijlstra
  1 sibling, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2013-12-05 21:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Ingo Molnar, Thomas Gleixner, Tony Luck,
	Robert Richter, Aaro Koskinen, David Daney, linux-kernel

On Wed, Dec 4, 2013 at 9:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
> While doing my recent work on the generic smp function calls I noticed
> that smp_call_function_single without the wait flag can't work, as
> it allocates struct call_single_data on stack, and without the wait
> flag will happily return before the IPI has been executed.

I don't understand the problem yet.  With wait==0,
smp_call_function_single() sets "csd = &__get_cpu_var(csd_data)", so
it's not using a struct on the stack.  We'll queue up "func" and
likely will return before it is executed, but that should be fine
because nobody will overwrite csd_data until it *is* executed and
csd_unlock() has been called.

> This affects the following callers:
>
>   arch/ia64/kernel/mca.c:mca_cpu_callback()
>   arch/ia64/kernel/smpboot.c:ia64_sync_itc()
>   arch/x86/kernel/kvm.c:kvm_cpu_notify()
>   arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier()
>   arch/x86/pci/amd_bus.c:amd_cpu_notify()

I don't see any reason why amd_cpu_notify() needs to use wait==0.

>   drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu()
>   kernel/stop_machine.c:stop_two_cpus()
>
> It would be good to get these fixed so that we could remove the
> parameter.  Either convert them to wait, or use a preallocated
> call_single_data and __smp_call_function_single.
>
> After that I'd like to remove the wait argument to prevent further
> abuses.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: smp_call_function_single with wait=0 considered harmful
  2013-12-05 21:43 ` Bjorn Helgaas
@ 2013-12-06 10:56   ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2013-12-06 10:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Christoph Hellwig, Andrew Morton, Ingo Molnar, Thomas Gleixner,
	Tony Luck, Robert Richter, Aaro Koskinen, David Daney,
	linux-kernel

On Thu, Dec 05, 2013 at 02:43:03PM -0700, Bjorn Helgaas wrote:
> smp_call_function_single() sets "csd = &__get_cpu_var(csd_data)", so
> it's not using a struct on the stack.  We'll queue up "func" and
> likely will return before it is executed, but that should be fine
> because nobody will overwrite csd_data until it *is* executed and
> csd_unlock() has been called.

You're right, I missed the usage of the per-cpu data later in the
function.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: smp_call_function_single with wait=0 considered harmful
  2013-12-04 16:46 smp_call_function_single with wait=0 considered harmful Christoph Hellwig
  2013-12-05 21:43 ` Bjorn Helgaas
@ 2014-02-28 12:26 ` Peter Zijlstra
  2014-02-28 12:39   ` Peter Zijlstra
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2014-02-28 12:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Ingo Molnar, Thomas Gleixner, Tony Luck,
	Robert Richter, Bjorn Helgaas, Aaro Koskinen, David Daney,
	linux-kernel

On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
> While doing my recent work on the generic smp function calls I noticed
> that smp_call_function_single without the wait flag can't work, as
> it allocates struct call_single_data on stack, and without the wait
> flag will happily return before the IPI has been executed.

It doesn't actually; it uses a per-cpu one in the !wait case.

The subsequent csd_lock() ensures it will wait for any prior user to
complete, so only if you're doing multiple smp_call_function_single()
invocations back-to-back will they queue up.

> This affects the following callers:

<snip>

>   kernel/stop_machine.c:stop_two_cpus()

That site should work with .wait=1 just fine, but given the above, the
.wait=0 doesn't appear problematic at all.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: smp_call_function_single with wait=0 considered harmful
  2014-02-28 12:26 ` Peter Zijlstra
@ 2014-02-28 12:39   ` Peter Zijlstra
  2014-02-28 17:06     ` Rik van Riel
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Peter Zijlstra @ 2014-02-28 12:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Ingo Molnar, Thomas Gleixner, Tony Luck,
	Robert Richter, Bjorn Helgaas, Aaro Koskinen, David Daney,
	linux-kernel, Prarit Bhargava, Rik van Riel, Mel Gorman

On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
> >   kernel/stop_machine.c:stop_two_cpus()
> 
> That site should work with .wait=1 just fine, but given the above, the
> .wait=0 doesn't appear problematic at all.

Scratch that; its broken, but not because of smp_call_function_single().

---
Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()

We must use smp_call_function_single(.wait=1) for the
irq_cpu_stop_queue_work() to ensure the queueing is actually done under
stop_cpus_lock. Without this we could have dropped the lock by the time
we do the queueing and get the race we tried to fix.

Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()")
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---
 kernel/stop_machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 84571e09c907..01fbae5b97b7 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -293,7 +293,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *
 	 */
 	smp_call_function_single(min(cpu1, cpu2),
 				 &irq_cpu_stop_queue_work,
-				 &call_args, 0);
+				 &call_args, 1);
 	lg_local_unlock(&stop_cpus_lock);
 	preempt_enable();
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: smp_call_function_single with wait=0 considered harmful
  2014-02-28 12:39   ` Peter Zijlstra
@ 2014-02-28 17:06     ` Rik van Riel
  2014-02-28 17:34     ` Prarit Bhargava
  2014-03-11 12:36     ` [tip:sched/core] stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2014-02-28 17:06 UTC (permalink / raw)
  To: Peter Zijlstra, Christoph Hellwig
  Cc: Andrew Morton, Ingo Molnar, Thomas Gleixner, Tony Luck,
	Robert Richter, Bjorn Helgaas, Aaro Koskinen, David Daney,
	linux-kernel, Prarit Bhargava, Mel Gorman

On 02/28/2014 07:39 AM, Peter Zijlstra wrote:

> Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
> 
> We must use smp_call_function_single(.wait=1) for the
> irq_cpu_stop_queue_work() to ensure the queueing is actually done under
> stop_cpus_lock. Without this we could have dropped the lock by the time
> we do the queueing and get the race we tried to fix.
> 
> Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()")
> Cc: Prarit Bhargava <prarit@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: smp_call_function_single with wait=0 considered harmful
  2014-02-28 12:39   ` Peter Zijlstra
  2014-02-28 17:06     ` Rik van Riel
@ 2014-02-28 17:34     ` Prarit Bhargava
  2014-03-11 12:36     ` [tip:sched/core] stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 8+ messages in thread
From: Prarit Bhargava @ 2014-02-28 17:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Andrew Morton, Ingo Molnar, Thomas Gleixner,
	Tony Luck, Robert Richter, Bjorn Helgaas, Aaro Koskinen,
	David Daney, linux-kernel, Rik van Riel, Mel Gorman



On 02/28/2014 07:39 AM, Peter Zijlstra wrote:
> On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote:
>> On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
>>>   kernel/stop_machine.c:stop_two_cpus()
>>
>> That site should work with .wait=1 just fine, but given the above, the
>> .wait=0 doesn't appear problematic at all.
> 
> Scratch that; its broken, but not because of smp_call_function_single().
> 
> ---
> Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
> 
> We must use smp_call_function_single(.wait=1) for the
> irq_cpu_stop_queue_work() to ensure the queueing is actually done under
> stop_cpus_lock. Without this we could have dropped the lock by the time
> we do the queueing and get the race we tried to fix.
> 
> Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()")
> Cc: Prarit Bhargava <prarit@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>

Reviewed-by: Prarit Bhargava <prarit@redhat.com>

P.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tip:sched/core] stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
  2014-02-28 12:39   ` Peter Zijlstra
  2014-02-28 17:06     ` Rik van Riel
  2014-02-28 17:34     ` Prarit Bhargava
@ 2014-03-11 12:36     ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 8+ messages in thread
From: tip-bot for Peter Zijlstra @ 2014-03-11 12:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, peterz, hch, riel, akpm, mgorman, tglx, prarit

Commit-ID:  177c53d943368fc97644ebc0a250dc8e2d124250
Gitweb:     http://git.kernel.org/tip/177c53d943368fc97644ebc0a250dc8e2d124250
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Fri, 28 Feb 2014 13:39:05 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 11 Mar 2014 11:33:47 +0100

stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()

We must use smp_call_function_single(.wait=1) for the
irq_cpu_stop_queue_work() to ensure the queueing is actually done under
stop_cpus_lock. Without this we could have dropped the lock by the time
we do the queueing and get the race we tried to fix.

Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()")

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140228123905.GK3104@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/stop_machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 84571e0..01fbae5 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -293,7 +293,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *
 	 */
 	smp_call_function_single(min(cpu1, cpu2),
 				 &irq_cpu_stop_queue_work,
-				 &call_args, 0);
+				 &call_args, 1);
 	lg_local_unlock(&stop_cpus_lock);
 	preempt_enable();
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-03-11 12:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-04 16:46 smp_call_function_single with wait=0 considered harmful Christoph Hellwig
2013-12-05 21:43 ` Bjorn Helgaas
2013-12-06 10:56   ` Christoph Hellwig
2014-02-28 12:26 ` Peter Zijlstra
2014-02-28 12:39   ` Peter Zijlstra
2014-02-28 17:06     ` Rik van Riel
2014-02-28 17:34     ` Prarit Bhargava
2014-03-11 12:36     ` [tip:sched/core] stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() tip-bot for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.