From: Mike Galbraith <efault@gmx.de>
To: Milton Miller <miltonm@bga.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	akpm@linux-foundation.org, Anton Blanchard <anton@samba.org>,
	xiaoguangrong@cn.fujitsu.com, mingo@elte.hu, jaxboe@fusionio.com,
	npiggin@gmail.com, rusty@rustcorp.com.au,
	torvalds@linux-foundation.org, paulmck@linux.vnet.ibm.com,
	benh@kernel.crashing.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] smp_call_function_many: handle concurrent clearing of mask
Date: Tue, 01 Feb 2011 04:15:36 +0100
Message-ID: <1296530136.7862.22.camel@marge.simson.net>
In-Reply-To: <smp-ipi-mike@mdm.bga.com>

On Mon, 2011-01-31 at 14:26 -0600, Milton Miller wrote:
> On Mon, 31 Jan 2011 about 08:21:22 +0100,  Mike Galbraith wrote:
> > Wondering if a final sanity check makes sense.  I've got a perma-spin
> > bug where comment apparently happened.  Another CPU's diddle the mask
> > IPI may make this CPU do horrible things to itself as it's setting up to
> > IPI others with that mask.
> > 
> > ---
> >  kernel/smp.c |    3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > Index: linux-2.6.38.git/kernel/smp.c
> > ===================================================================
> > --- linux-2.6.38.git.orig/kernel/smp.c
> > +++ linux-2.6.38.git/kernel/smp.c
> > @@ -490,6 +490,9 @@ void smp_call_function_many(const struct
> >  	cpumask_and(data->cpumask, mask, cpu_online_mask);
> >  	cpumask_clear_cpu(this_cpu, data->cpumask);
> >  
> > +	/* Did you pass me a mask that can be changed/emptied under me? */
> > +	BUG_ON(cpumask_empty(data->cpumask));
> > +
> 
> I was thinking of this as "the ipi cpumask was cleared", but I realize now
> you are saying the caller passed in a cpumask, but between the cpu_first/
> cpu_next calls above and the cpumask_and another cpu cleared all the cpus?
> 
> I could see how that could happen on say a mask of cpus that might have a
> translation context, or cpus that need a push to complete an rcu window.
> Instead of the BUG_ON, we can handle the mask being cleared.
> 
> The arch code to send the IPI must handle an empty mask, as the other
> cpus are racing to clear their bit while it's trying to send the IPI.
> In fact that expected race is the cause of the x86 warning in bz 23042
> https://bugzilla.kernel.org/show_bug.cgi?id=23042  that Andrew pointed
> out.
> 
> 
> How about this [untested] patch?
> 
> Mike Galbraith reported finding a lockup where apparently the passed in
> cpumask was cleared on other cpu(s) while this cpu was preparing its
> smp_call_function_many block.   Detect this race and unlock the call
> data block.  Note: arch_send_call_function_ipi_mask must still handle an
> empty mask because the element is globally visible before it is called.
> And obviously there are no guarantees as to which cpus are notified if
> the mask is changed during the call.

Yes, that would work.  In my case, it was passed mm_cpumask(mm).  What
is unclear is whether the mask at call time was what the programmer needed
action on, i.e. the mask changing may be intolerable information loss/gain.

	-Mike
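
For illustration only, the "detect the race and unlock the call data
block" idea from the quoted patch description can be sketched in userspace
C.  This is not the kernel patch: toy_cpumask_t, struct toy_call_data and
toy_prepare_call() are hypothetical names, the mask is a single 64-bit
word read in one atomic load (the real cpumask_and() works on a
multi-word bitmap), and "locked" is a plain flag standing in for the
call data lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
typedef _Atomic uint64_t toy_cpumask_t;

/* Hypothetical call data block, loosely modelled on the one in kernel/smp.c. */
struct toy_call_data {
	uint64_t cpumask;	/* private snapshot of the target CPUs */
	bool locked;		/* true while the block is in use */
};

/*
 * Prepare 'data' for a cross-CPU call.
 *
 * Returns true if at least one target CPU remains.  Returns false, with
 * the block unlocked again, if the caller's mask turned out to be empty
 * by the time it was read (for example because other CPUs cleared
 * their bits concurrently).
 */
static bool toy_prepare_call(struct toy_call_data *data, toy_cpumask_t *mask,
			     uint64_t online, unsigned int this_cpu)
{
	assert(this_cpu < 64);

	data->locked = true;

	/* Read the shared mask once; later changes do not affect the snapshot. */
	data->cpumask = atomic_load(mask) & online;
	data->cpumask &= ~(UINT64_C(1) << this_cpu);

	if (data->cpumask == 0) {
		/* Nothing left to signal: release the block instead of spinning. */
		data->locked = false;
		return false;
	}
	return true;
}
```

The snapshot only records the mask as it was when it was read, so the
question above still stands: a caller that needs action on the CPUs set
at call time cannot rely on it if the mask may change underneath.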


