From: "Jan Beulich" <JBeulich@novell.com>
To: "Milton Miller" <miltonm@bga.com>
Cc: <xiaoguangrong@cn.fujitsu.com>, <mingo@elte.hu>,
	<jaxboe@fusionio.com>, <npiggin@gmail.com>,
	"Mike Galbraith" <efault@gmx.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Tony Luck" <tony.luck@intel.com>, <benh@kernel.crashing.org>,
	<akpm@linux-foundation.org>, <torvalds@linux-foundation.org>,
	<paulmck@linux.vnet.ibm.com>, <rusty@rustcorp.com.au>,
	"Anton Blanchard" <anton@samba.org>,
	"Dimitri Sivanich" <sivanich@sgi.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/4 v3] smp_call_function_many: handle concurrent clearing of mask
Date: Wed, 16 Mar 2011 07:52:19 +0000
Message-ID: <4D807A430200007800036BDF@vpn.id2.novell.com>
In-Reply-To: <smp-ipi-mask-clear-race-v3@mdm.bga.com>

>>> On 15.03.11 at 20:27, Milton Miller <miltonm@bga.com> wrote:
> Mike Galbraith reported finding a lockup ("perma-spin bug") where the
> cpumask passed to smp_call_function_many was cleared by other cpu(s)
> while a cpu was preparing its call_data block, resulting in no cpu to
> clear the last ref and unlock the block.
> 
> Having cpus clear their bit asynchronously could be useful on a mask of
> cpus that might have a translation context, or cpus that need a push to
> complete an rcu window.
> 
> Instead of adding a BUG_ON and requiring yet another cpumask copy, just
> detect the race and handle it.
> 
> Note: arch_send_call_function_ipi_mask must still handle an empty
> cpumask because the data block is globally visible before that
> arch callback is made.  And (obviously) there are no guarantees as to
> which cpus are notified if the mask is changed during the call;
> only cpus that were online and had their mask bit set during the
> whole call are guaranteed to be called.
> 
> Reported-by: Mike Galbraith <efault@gmx.de>
> Reported-by: Jan Beulich <JBeulich@novell.com>
> Signed-off-by: Milton Miller <miltonm@bga.com>

Acked-by: Jan Beulich <jbeulich@novell.com>

> ---
> v3: try to clarify which mask in comment
> v2: rediff for v2 of call_function_many: fix list delete vs add race
> 
> The arch code not expecting the race to empty the mask is the cause
> of https://bugzilla.kernel.org/show_bug.cgi?id=23042 that Andrew pointed
> out.
> 
> 
> Index: common/kernel/smp.c
> ===================================================================
> --- common.orig/kernel/smp.c	2011-03-15 05:22:26.000000000 -0500
> +++ common/kernel/smp.c	2011-03-15 06:22:26.000000000 -0500
> @@ -450,7 +450,7 @@ void smp_call_function_many(const struct
>  {
>  	struct call_function_data *data;
>  	unsigned long flags;
> -	int cpu, next_cpu, this_cpu = smp_processor_id();
> +	int refs, cpu, next_cpu, this_cpu = smp_processor_id();
>  
>  	/*
>  	 * Can deadlock when called with interrupts disabled.
> @@ -461,7 +461,7 @@ void smp_call_function_many(const struct
>  	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>  		     && !oops_in_progress && !early_boot_irqs_disabled);
>  
> -	/* So, what's a CPU they want? Ignoring this one. */
> +	/* Try to fastpath.  So, what's a CPU they want? Ignoring this one. */
>  	cpu = cpumask_first_and(mask, cpu_online_mask);
>  	if (cpu == this_cpu)
>  		cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
> @@ -519,6 +519,13 @@ void smp_call_function_many(const struct
>  	/* We rely on the "and" being processed before the store */
>  	cpumask_and(data->cpumask, mask, cpu_online_mask);
>  	cpumask_clear_cpu(this_cpu, data->cpumask);
> +	refs = cpumask_weight(data->cpumask);
> +
> +	/* Some callers race with other cpus changing the passed mask */
> +	if (unlikely(!refs)) {
> +		csd_unlock(&data->csd);
> +		return;
> +	}
>  
>  	raw_spin_lock_irqsave(&call_function.lock, flags);
>  	/*
> @@ -532,7 +539,7 @@ void smp_call_function_many(const struct
>  	 * to the cpumask before this write to refs, which indicates
>  	 * data is on the list and is ready to be processed.
>  	 */
> -	atomic_set(&data->refs, cpumask_weight(data->cpumask));
> +	atomic_set(&data->refs, refs);
>  	raw_spin_unlock_irqrestore(&call_function.lock, flags);
>  
>  	/*
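
For anyone following along, the detect-and-handle step in the third hunk
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not the kernel code: model_cpumask, model_call_data,
model_weight and model_prepare are hypothetical names, a 64-bit word stands
in for the cpumask, and a plain bool stands in for the lock that
csd_unlock() releases. It does not reproduce the timing of the race, only
what happens once the private copy of the mask turns out to be empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are not the kernel types. */
typedef uint64_t model_cpumask;		/* one bit per cpu, up to 64 cpus */

struct model_call_data {
	bool locked;			/* stands in for the lock csd_unlock() drops */
	model_cpumask cpumask;		/* private snapshot of the target cpus */
	int refs;			/* cpus that still have to run the callback */
};

/* Count the set bits, in the spirit of cpumask_weight(). */
static int model_weight(model_cpumask m)
{
	int n = 0;

	for (; m; m &= m - 1)
		n++;
	return n;
}

/*
 * Returns the number of target cpus, or 0 if the snapshot came up empty
 * and the block was released without ever being published.
 */
static int model_prepare(struct model_call_data *data,
			 const volatile model_cpumask *mask,
			 model_cpumask online, unsigned int this_cpu)
{
	int refs;

	assert(this_cpu < 64);
	data->locked = true;

	/* Read the caller's mask once; later changes do not affect the copy. */
	data->cpumask = *mask & online;
	data->cpumask &= ~((model_cpumask)1 << this_cpu);
	refs = model_weight(data->cpumask);

	/* Empty snapshot: no other cpu would ever drop the last reference. */
	if (!refs) {
		data->locked = false;
		return 0;
	}

	/* Publish the same count that was checked, not a recomputed one. */
	data->refs = refs;
	return refs;
}
```

The point of computing refs once and reusing it is that the value tested
against zero and the value stored are the same, so a block cannot be
published with a zero count and then spin forever waiting for an unlock.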



