linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Deadlock on poweroff
@ 2012-10-07  2:47 Kirill A. Shutemov
       [not found] ` <20121007160311.GE2485@linux.vnet.ibm.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2012-10-07  2:47 UTC (permalink / raw)
  To: Paul E. McKenney, linux-kernel
  Cc: Dipankar Sarma, Thomas Gleixner, Andrew Morton, Steffen Klassert,
	.linux-crypto

Hi Paul and all,

With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
poweroff.

It guess it happens because of race for cpu_hotplug.lock:

	CPU A					CPU B
disable_nonboot_cpus()
_cpu_down()
cpu_hotplug_begin()
 mutex_lock(&cpu_hotplug.lock);
__cpu_notify()
padata_cpu_callback()
__padata_remove_cpu()
padata_replace()
synchronize_rcu()
					rcu_gp_kthread()
					get_online_cpus();
					mutex_lock(&cpu_hotplug.lock);

Have you seen the issue before?

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Deadlock on poweroff
       [not found] ` <20121007160311.GE2485@linux.vnet.ibm.com>
@ 2012-10-07 16:50   ` Kirill A. Shutemov
  2012-10-07 17:05     ` Srivatsa S. Bhat
  2012-10-08  4:41     ` Paul E. McKenney
  0 siblings, 2 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2012-10-07 16:50 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Dipankar Sarma, Thomas Gleixner, Andrew Morton,
	Steffen Klassert, linux-crypto

On Sun, Oct 07, 2012 at 09:03:11AM -0700, Paul E. McKenney wrote:
> On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
> > Hi Paul and all,
> > 
> > With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> > poweroff.
> > 
> > It guess it happens because of race for cpu_hotplug.lock:
> > 
> > 	CPU A					CPU B
> > disable_nonboot_cpus()
> > _cpu_down()
> > cpu_hotplug_begin()
> >  mutex_lock(&cpu_hotplug.lock);
> > __cpu_notify()
> > padata_cpu_callback()
> > __padata_remove_cpu()
> > padata_replace()
> > synchronize_rcu()
> > 					rcu_gp_kthread()
> > 					get_online_cpus();
> > 					mutex_lock(&cpu_hotplug.lock);
> > 
> > Have you seen the issue before?
> 
> This is a new one for me.  Does the following (very lightly tested)
> patch help?

Works for me. Thanks.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Deadlock on poweroff
  2012-10-07 16:50   ` Kirill A. Shutemov
@ 2012-10-07 17:05     ` Srivatsa S. Bhat
  2012-10-07 17:11       ` Kirill A. Shutemov
  2012-10-08  4:41     ` Paul E. McKenney
  1 sibling, 1 reply; 8+ messages in thread
From: Srivatsa S. Bhat @ 2012-10-07 17:05 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Paul E. McKenney, linux-kernel, Dipankar Sarma, Thomas Gleixner,
	Andrew Morton, Steffen Klassert, linux-crypto, Srivatsa S. Bhat

On 10/07/2012 10:20 PM, Kirill A. Shutemov wrote:
> On Sun, Oct 07, 2012 at 09:03:11AM -0700, Paul E. McKenney wrote:
>> On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
>>> Hi Paul and all,
>>>
>>> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
>>> poweroff.
>>>
>>> It guess it happens because of race for cpu_hotplug.lock:
>>>
>>> 	CPU A					CPU B
>>> disable_nonboot_cpus()
>>> _cpu_down()
>>> cpu_hotplug_begin()
>>>  mutex_lock(&cpu_hotplug.lock);
>>> __cpu_notify()
>>> padata_cpu_callback()
>>> __padata_remove_cpu()
>>> padata_replace()
>>> synchronize_rcu()
>>> 					rcu_gp_kthread()
>>> 					get_online_cpus();
>>> 					mutex_lock(&cpu_hotplug.lock);
>>>
>>> Have you seen the issue before?
>>
>> This is a new one for me.  Does the following (very lightly tested)
>> patch help?
> 
> Works for me. Thanks.
> 

Could you share the patch please? Looks like it didn't hit the mailing
lists.. 

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Deadlock on poweroff
  2012-10-07 17:05     ` Srivatsa S. Bhat
@ 2012-10-07 17:11       ` Kirill A. Shutemov
  2012-10-07 17:16         ` Srivatsa S. Bhat
  0 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2012-10-07 17:11 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Paul E. McKenney, linux-kernel, Dipankar Sarma, Thomas Gleixner,
	Andrew Morton, Steffen Klassert, linux-crypto

On Sun, Oct 07, 2012 at 10:35:01PM +0530, Srivatsa S. Bhat wrote:
> On 10/07/2012 10:20 PM, Kirill A. Shutemov wrote:
> > On Sun, Oct 07, 2012 at 09:03:11AM -0700, Paul E. McKenney wrote:
> >> On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
> >>> Hi Paul and all,
> >>>
> >>> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> >>> poweroff.
> >>>
> >>> It guess it happens because of race for cpu_hotplug.lock:
> >>>
> >>> 	CPU A					CPU B
> >>> disable_nonboot_cpus()
> >>> _cpu_down()
> >>> cpu_hotplug_begin()
> >>>  mutex_lock(&cpu_hotplug.lock);
> >>> __cpu_notify()
> >>> padata_cpu_callback()
> >>> __padata_remove_cpu()
> >>> padata_replace()
> >>> synchronize_rcu()
> >>> 					rcu_gp_kthread()
> >>> 					get_online_cpus();
> >>> 					mutex_lock(&cpu_hotplug.lock);
> >>>
> >>> Have you seen the issue before?
> >>
> >> This is a new one for me.  Does the following (very lightly tested)
> >> patch help?
> > 
> > Works for me. Thanks.
> > 
> 
> Could you share the patch please? Looks like it didn't hit the mailing
> lists.. 

Sure. Here's original  mail from Paul:

Date: Sun, 7 Oct 2012 09:03:11 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: linux-kernel@vger.kernel.org, Dipankar Sarma <dipankar@in.ibm.com>,
        Thomas Gleixner <tglx@linutronix.de>,
        Andrew Morton <akpm@linux-foundation.org>,
        Steffen Klassert <steffen.klassert@secunet.com>,
        ".linux-crypto"@vger.kernel.org
Subject: Re: Deadlock on poweroff
Message-ID: <20121007160311.GE2485@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20121007024711.GA21403@shutemov.name>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20121007024711.GA21403@shutemov.name>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12100716-7408-0000-0000-000009194B17
Status: RO
Content-Length: 6055
Lines: 173

On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
> Hi Paul and all,
> 
> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> poweroff.
> 
> It guess it happens because of race for cpu_hotplug.lock:
> 
> 	CPU A					CPU B
> disable_nonboot_cpus()
> _cpu_down()
> cpu_hotplug_begin()
>  mutex_lock(&cpu_hotplug.lock);
> __cpu_notify()
> padata_cpu_callback()
> __padata_remove_cpu()
> padata_replace()
> synchronize_rcu()
> 					rcu_gp_kthread()
> 					get_online_cpus();
> 					mutex_lock(&cpu_hotplug.lock);
> 
> Have you seen the issue before?

This is a new one for me.  Does the following (very lightly tested)
patch help?

							Thanx, Paul

------------------------------------------------------------------------

rcu: Grace-period initialization excludes only RCU notifier

Kirill noted the following deadlock cycle on shutdown involving padata:

> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> poweroff.
>
> It guess it happens because of race for cpu_hotplug.lock:
>
>       CPU A                                   CPU B
> disable_nonboot_cpus()
> _cpu_down()
> cpu_hotplug_begin()
>  mutex_lock(&cpu_hotplug.lock);
> __cpu_notify()
> padata_cpu_callback()
> __padata_remove_cpu()
> padata_replace()
> synchronize_rcu()
>                                       rcu_gp_kthread()
>                                       get_online_cpus();
>                                       mutex_lock(&cpu_hotplug.lock);

It would of course be good to eliminate grace-period delays from
CPU-hotplug notifiers, but that is a separate issue.  Deadlock is
not an appropriate diagnostic for excessive CPU-hotplug latency.

Fortunately, grace-period initialization does not actually need to
exclude all of the CPU-hotplug operation, but rather only RCU's own
CPU_UP_PREPARE and CPU_DEAD CPU-hotplug notifiers.  This commit therefore
introduces a new per-rcu_state onoff_mutex that provides the required
concurrency control in place of the get_online_cpus() that was previously
in rcu_gp_init().

Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index fb63d7b..5eece12 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -74,6 +74,7 @@ static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
 	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
 	.orphan_donetail = &sname##_state.orphan_donelist, \
 	.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
+	.onoff_mutex = __MUTEX_INITIALIZER(sname##_state.onoff_mutex), \
 	.name = #sname, \
 }
 
@@ -1229,7 +1230,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 	raw_spin_unlock_irq(&rnp->lock);
 
 	/* Exclude any concurrent CPU-hotplug operations. */
-	get_online_cpus();
+	mutex_lock(&rsp->onoff_mutex);
 
 	/*
 	 * Set the quiescent-state-needed bits in all the rcu_node
@@ -1266,7 +1267,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 		cond_resched();
 	}
 
-	put_online_cpus();
+	mutex_unlock(&rsp->onoff_mutex);
 	return 1;
 }
 
@@ -1754,6 +1755,7 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
 	/* Remove the dead CPU from the bitmasks in the rcu_node hierarchy. */
 
 	/* Exclude any attempts to start a new grace period. */
+	mutex_lock(&rsp->onoff_mutex);
 	raw_spin_lock_irqsave(&rsp->onofflock, flags);
 
 	/* Orphan the dead CPU's callbacks, and adopt them if appropriate. */
@@ -1798,6 +1800,7 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
 	init_callback_list(rdp);
 	/* Disallow further callbacks on this CPU. */
 	rdp->nxttail[RCU_NEXT_TAIL] = NULL;
+	mutex_unlock(&rsp->onoff_mutex);
 }
 
 #else /* #ifdef CONFIG_HOTPLUG_CPU */
@@ -2708,6 +2711,9 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 	struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	/* Exclude new grace periods. */
+	mutex_lock(&rsp->onoff_mutex);
+
 	/* Set up local state, ensuring consistent view of global state. */
 	raw_spin_lock_irqsave(&rnp->lock, flags);
 	rdp->beenonline = 1;	 /* We have now been online. */
@@ -2722,14 +2728,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 	rcu_prepare_for_idle_init(cpu);
 	raw_spin_unlock(&rnp->lock);		/* irqs remain disabled. */
 
-	/*
-	 * A new grace period might start here.  If so, we won't be part
-	 * of it, but that is OK, as we are currently in a quiescent state.
-	 */
-
-	/* Exclude any attempts to start a new GP on large systems. */
-	raw_spin_lock(&rsp->onofflock);		/* irqs already disabled. */
-
 	/* Add CPU to rcu_node bitmasks. */
 	rnp = rdp->mynode;
 	mask = rdp->grpmask;
@@ -2753,8 +2751,9 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
 		raw_spin_unlock(&rnp->lock); /* irqs already disabled. */
 		rnp = rnp->parent;
 	} while (rnp != NULL && !(rnp->qsmaskinit & mask));
+	local_irq_restore(flags);
 
-	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
+	mutex_unlock(&rsp->onoff_mutex);
 }
 
 static void __cpuinit rcu_prepare_cpu(int cpu)
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 5faf05d..a240f03 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -394,11 +394,17 @@ struct rcu_state {
 	struct rcu_head **orphan_donetail;	/* Tail of above. */
 	long qlen_lazy;				/* Number of lazy callbacks. */
 	long qlen;				/* Total number of callbacks. */
+	/* End of fields guarded by onofflock. */
+
+	struct mutex onoff_mutex;		/* Coordinate hotplug & GPs. */
+
 	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
 	unsigned long n_barrier_done;		/* ++ at start and end of */
 						/*  _rcu_barrier(). */
+	/* End of fields guarded by barrier_mutex. */
+
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
 						/*  force_quiescent_state(). */
 	unsigned long n_force_qs;		/* Number of calls to */

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Deadlock on poweroff
  2012-10-07 17:11       ` Kirill A. Shutemov
@ 2012-10-07 17:16         ` Srivatsa S. Bhat
  2012-10-07 21:08           ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: Srivatsa S. Bhat @ 2012-10-07 17:16 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Paul E. McKenney, linux-kernel, Dipankar Sarma, Thomas Gleixner,
	Andrew Morton, Steffen Klassert, linux-crypto

On 10/07/2012 10:41 PM, Kirill A. Shutemov wrote:
> On Sun, Oct 07, 2012 at 10:35:01PM +0530, Srivatsa S. Bhat wrote:
>> On 10/07/2012 10:20 PM, Kirill A. Shutemov wrote:
>>> On Sun, Oct 07, 2012 at 09:03:11AM -0700, Paul E. McKenney wrote:
>>>> On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
>>>>> Hi Paul and all,
>>>>>
>>>>> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
>>>>> poweroff.
>>>>>
>>>>> It guess it happens because of race for cpu_hotplug.lock:
>>>>>
>>>>> 	CPU A					CPU B
>>>>> disable_nonboot_cpus()
>>>>> _cpu_down()
>>>>> cpu_hotplug_begin()
>>>>>  mutex_lock(&cpu_hotplug.lock);
>>>>> __cpu_notify()
>>>>> padata_cpu_callback()
>>>>> __padata_remove_cpu()
>>>>> padata_replace()
>>>>> synchronize_rcu()
>>>>> 					rcu_gp_kthread()
>>>>> 					get_online_cpus();
>>>>> 					mutex_lock(&cpu_hotplug.lock);
>>>>>
>>>>> Have you seen the issue before?
>>>>
>>>> This is a new one for me.  Does the following (very lightly tested)
>>>> patch help?
>>>
>>> Works for me. Thanks.
>>>
>>
>> Could you share the patch please? Looks like it didn't hit the mailing
>> lists.. 
> 
> Sure. Here's original  mail from Paul:
>

Ah, great! Thanks!

Regards,
Srivatsa S. Bhat
 
> Date: Sun, 7 Oct 2012 09:03:11 -0700
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> To: "Kirill A. Shutemov" <kirill@shutemov.name>
> Cc: linux-kernel@vger.kernel.org, Dipankar Sarma <dipankar@in.ibm.com>,
>         Thomas Gleixner <tglx@linutronix.de>,
>         Andrew Morton <akpm@linux-foundation.org>,
>         Steffen Klassert <steffen.klassert@secunet.com>,
>         ".linux-crypto"@vger.kernel.org
> Subject: Re: Deadlock on poweroff
> Message-ID: <20121007160311.GE2485@linux.vnet.ibm.com>
> Reply-To: paulmck@linux.vnet.ibm.com
> References: <20121007024711.GA21403@shutemov.name>
> MIME-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> In-Reply-To: <20121007024711.GA21403@shutemov.name>
> User-Agent: Mutt/1.5.21 (2010-09-15)
> X-Content-Scanned: Fidelis XPS MAILER
> x-cbid: 12100716-7408-0000-0000-000009194B17
> Status: RO
> Content-Length: 6055
> Lines: 173
> 
> On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
>> Hi Paul and all,
>>
>> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
>> poweroff.
>>
>> It guess it happens because of race for cpu_hotplug.lock:
>>
>> 	CPU A					CPU B
>> disable_nonboot_cpus()
>> _cpu_down()
>> cpu_hotplug_begin()
>>  mutex_lock(&cpu_hotplug.lock);
>> __cpu_notify()
>> padata_cpu_callback()
>> __padata_remove_cpu()
>> padata_replace()
>> synchronize_rcu()
>> 					rcu_gp_kthread()
>> 					get_online_cpus();
>> 					mutex_lock(&cpu_hotplug.lock);
>>
>> Have you seen the issue before?
> 
> This is a new one for me.  Does the following (very lightly tested)
> patch help?
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> rcu: Grace-period initialization excludes only RCU notifier
> 
> Kirill noted the following deadlock cycle on shutdown involving padata:
> 
>> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
>> poweroff.
>>
>> It guess it happens because of race for cpu_hotplug.lock:
>>
>>       CPU A                                   CPU B
>> disable_nonboot_cpus()
>> _cpu_down()
>> cpu_hotplug_begin()
>>  mutex_lock(&cpu_hotplug.lock);
>> __cpu_notify()
>> padata_cpu_callback()
>> __padata_remove_cpu()
>> padata_replace()
>> synchronize_rcu()
>>                                       rcu_gp_kthread()
>>                                       get_online_cpus();
>>                                       mutex_lock(&cpu_hotplug.lock);
> 
> It would of course be good to eliminate grace-period delays from
> CPU-hotplug notifiers, but that is a separate issue.  Deadlock is
> not an appropriate diagnostic for excessive CPU-hotplug latency.
> 
> Fortunately, grace-period initialization does not actually need to
> exclude all of the CPU-hotplug operation, but rather only RCU's own
> CPU_UP_PREPARE and CPU_DEAD CPU-hotplug notifiers.  This commit therefore
> introduces a new per-rcu_state onoff_mutex that provides the required
> concurrency control in place of the get_online_cpus() that was previously
> in rcu_gp_init().
> 
> Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index fb63d7b..5eece12 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -74,6 +74,7 @@ static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
>  	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
>  	.orphan_donetail = &sname##_state.orphan_donelist, \
>  	.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
> +	.onoff_mutex = __MUTEX_INITIALIZER(sname##_state.onoff_mutex), \
>  	.name = #sname, \
>  }
>  
> @@ -1229,7 +1230,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
>  	raw_spin_unlock_irq(&rnp->lock);
>  
>  	/* Exclude any concurrent CPU-hotplug operations. */
> -	get_online_cpus();
> +	mutex_lock(&rsp->onoff_mutex);
>  
>  	/*
>  	 * Set the quiescent-state-needed bits in all the rcu_node
> @@ -1266,7 +1267,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
>  		cond_resched();
>  	}
>  
> -	put_online_cpus();
> +	mutex_unlock(&rsp->onoff_mutex);
>  	return 1;
>  }
>  
> @@ -1754,6 +1755,7 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
>  	/* Remove the dead CPU from the bitmasks in the rcu_node hierarchy. */
>  
>  	/* Exclude any attempts to start a new grace period. */
> +	mutex_lock(&rsp->onoff_mutex);
>  	raw_spin_lock_irqsave(&rsp->onofflock, flags);
>  
>  	/* Orphan the dead CPU's callbacks, and adopt them if appropriate. */
> @@ -1798,6 +1800,7 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
>  	init_callback_list(rdp);
>  	/* Disallow further callbacks on this CPU. */
>  	rdp->nxttail[RCU_NEXT_TAIL] = NULL;
> +	mutex_unlock(&rsp->onoff_mutex);
>  }
>  
>  #else /* #ifdef CONFIG_HOTPLUG_CPU */
> @@ -2708,6 +2711,9 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
>  	struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
> +	/* Exclude new grace periods. */
> +	mutex_lock(&rsp->onoff_mutex);
> +
>  	/* Set up local state, ensuring consistent view of global state. */
>  	raw_spin_lock_irqsave(&rnp->lock, flags);
>  	rdp->beenonline = 1;	 /* We have now been online. */
> @@ -2722,14 +2728,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
>  	rcu_prepare_for_idle_init(cpu);
>  	raw_spin_unlock(&rnp->lock);		/* irqs remain disabled. */
>  
> -	/*
> -	 * A new grace period might start here.  If so, we won't be part
> -	 * of it, but that is OK, as we are currently in a quiescent state.
> -	 */
> -
> -	/* Exclude any attempts to start a new GP on large systems. */
> -	raw_spin_lock(&rsp->onofflock);		/* irqs already disabled. */
> -
>  	/* Add CPU to rcu_node bitmasks. */
>  	rnp = rdp->mynode;
>  	mask = rdp->grpmask;
> @@ -2753,8 +2751,9 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
>  		raw_spin_unlock(&rnp->lock); /* irqs already disabled. */
>  		rnp = rnp->parent;
>  	} while (rnp != NULL && !(rnp->qsmaskinit & mask));
> +	local_irq_restore(flags);
>  
> -	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
> +	mutex_unlock(&rsp->onoff_mutex);
>  }
>  
>  static void __cpuinit rcu_prepare_cpu(int cpu)
> diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> index 5faf05d..a240f03 100644
> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -394,11 +394,17 @@ struct rcu_state {
>  	struct rcu_head **orphan_donetail;	/* Tail of above. */
>  	long qlen_lazy;				/* Number of lazy callbacks. */
>  	long qlen;				/* Total number of callbacks. */
> +	/* End of fields guarded by onofflock. */
> +
> +	struct mutex onoff_mutex;		/* Coordinate hotplug & GPs. */
> +
>  	struct mutex barrier_mutex;		/* Guards barrier fields. */
>  	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
>  	struct completion barrier_completion;	/* Wake at barrier end. */
>  	unsigned long n_barrier_done;		/* ++ at start and end of */
>  						/*  _rcu_barrier(). */
> +	/* End of fields guarded by barrier_mutex. */
> +
>  	unsigned long jiffies_force_qs;		/* Time at which to invoke */
>  						/*  force_quiescent_state(). */
>  	unsigned long n_force_qs;		/* Number of calls to */
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Deadlock on poweroff
  2012-10-07 17:16         ` Srivatsa S. Bhat
@ 2012-10-07 21:08           ` Paul E. McKenney
  0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2012-10-07 21:08 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Kirill A. Shutemov, linux-kernel, Dipankar Sarma,
	Thomas Gleixner, Andrew Morton, Steffen Klassert, linux-crypto

On Sun, Oct 07, 2012 at 10:46:54PM +0530, Srivatsa S. Bhat wrote:
> On 10/07/2012 10:41 PM, Kirill A. Shutemov wrote:
> > On Sun, Oct 07, 2012 at 10:35:01PM +0530, Srivatsa S. Bhat wrote:
> >> On 10/07/2012 10:20 PM, Kirill A. Shutemov wrote:
> >>> On Sun, Oct 07, 2012 at 09:03:11AM -0700, Paul E. McKenney wrote:
> >>>> On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
> >>>>> Hi Paul and all,
> >>>>>
> >>>>> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> >>>>> poweroff.
> >>>>>
> >>>>> It guess it happens because of race for cpu_hotplug.lock:
> >>>>>
> >>>>> 	CPU A					CPU B
> >>>>> disable_nonboot_cpus()
> >>>>> _cpu_down()
> >>>>> cpu_hotplug_begin()
> >>>>>  mutex_lock(&cpu_hotplug.lock);
> >>>>> __cpu_notify()
> >>>>> padata_cpu_callback()
> >>>>> __padata_remove_cpu()
> >>>>> padata_replace()
> >>>>> synchronize_rcu()
> >>>>> 					rcu_gp_kthread()
> >>>>> 					get_online_cpus();
> >>>>> 					mutex_lock(&cpu_hotplug.lock);
> >>>>>
> >>>>> Have you seen the issue before?
> >>>>
> >>>> This is a new one for me.  Does the following (very lightly tested)
> >>>> patch help?
> >>>
> >>> Works for me. Thanks.
> >>>
> >>
> >> Could you share the patch please? Looks like it didn't hit the mailing
> >> lists.. 
> > 
> > Sure. Here's original  mail from Paul:
> 
> Ah, great! Thanks!

Thank you, Kirill.

I wonder what the mailing list doesn't like about me...  ;-)

							Thanx, Paul

> Regards,
> Srivatsa S. Bhat
> 
> > Date: Sun, 7 Oct 2012 09:03:11 -0700
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > To: "Kirill A. Shutemov" <kirill@shutemov.name>
> > Cc: linux-kernel@vger.kernel.org, Dipankar Sarma <dipankar@in.ibm.com>,
> >         Thomas Gleixner <tglx@linutronix.de>,
> >         Andrew Morton <akpm@linux-foundation.org>,
> >         Steffen Klassert <steffen.klassert@secunet.com>,
> >         ".linux-crypto"@vger.kernel.org
> > Subject: Re: Deadlock on poweroff
> > Message-ID: <20121007160311.GE2485@linux.vnet.ibm.com>
> > Reply-To: paulmck@linux.vnet.ibm.com
> > References: <20121007024711.GA21403@shutemov.name>
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=us-ascii
> > Content-Disposition: inline
> > In-Reply-To: <20121007024711.GA21403@shutemov.name>
> > User-Agent: Mutt/1.5.21 (2010-09-15)
> > X-Content-Scanned: Fidelis XPS MAILER
> > x-cbid: 12100716-7408-0000-0000-000009194B17
> > Status: RO
> > Content-Length: 6055
> > Lines: 173
> > 
> > On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
> >> Hi Paul and all,
> >>
> >> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> >> poweroff.
> >>
> >> It guess it happens because of race for cpu_hotplug.lock:
> >>
> >> 	CPU A					CPU B
> >> disable_nonboot_cpus()
> >> _cpu_down()
> >> cpu_hotplug_begin()
> >>  mutex_lock(&cpu_hotplug.lock);
> >> __cpu_notify()
> >> padata_cpu_callback()
> >> __padata_remove_cpu()
> >> padata_replace()
> >> synchronize_rcu()
> >> 					rcu_gp_kthread()
> >> 					get_online_cpus();
> >> 					mutex_lock(&cpu_hotplug.lock);
> >>
> >> Have you seen the issue before?
> > 
> > This is a new one for me.  Does the following (very lightly tested)
> > patch help?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > rcu: Grace-period initialization excludes only RCU notifier
> > 
> > Kirill noted the following deadlock cycle on shutdown involving padata:
> > 
> >> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> >> poweroff.
> >>
> >> It guess it happens because of race for cpu_hotplug.lock:
> >>
> >>       CPU A                                   CPU B
> >> disable_nonboot_cpus()
> >> _cpu_down()
> >> cpu_hotplug_begin()
> >>  mutex_lock(&cpu_hotplug.lock);
> >> __cpu_notify()
> >> padata_cpu_callback()
> >> __padata_remove_cpu()
> >> padata_replace()
> >> synchronize_rcu()
> >>                                       rcu_gp_kthread()
> >>                                       get_online_cpus();
> >>                                       mutex_lock(&cpu_hotplug.lock);
> > 
> > It would of course be good to eliminate grace-period delays from
> > CPU-hotplug notifiers, but that is a separate issue.  Deadlock is
> > not an appropriate diagnostic for excessive CPU-hotplug latency.
> > 
> > Fortunately, grace-period initialization does not actually need to
> > exclude all of the CPU-hotplug operation, but rather only RCU's own
> > CPU_UP_PREPARE and CPU_DEAD CPU-hotplug notifiers.  This commit therefore
> > introduces a new per-rcu_state onoff_mutex that provides the required
> > concurrency control in place of the get_online_cpus() that was previously
> > in rcu_gp_init().
> > 
> > Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index fb63d7b..5eece12 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -74,6 +74,7 @@ static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
> >  	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
> >  	.orphan_donetail = &sname##_state.orphan_donelist, \
> >  	.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
> > +	.onoff_mutex = __MUTEX_INITIALIZER(sname##_state.onoff_mutex), \
> >  	.name = #sname, \
> >  }
> >  
> > @@ -1229,7 +1230,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
> >  	raw_spin_unlock_irq(&rnp->lock);
> >  
> >  	/* Exclude any concurrent CPU-hotplug operations. */
> > -	get_online_cpus();
> > +	mutex_lock(&rsp->onoff_mutex);
> >  
> >  	/*
> >  	 * Set the quiescent-state-needed bits in all the rcu_node
> > @@ -1266,7 +1267,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
> >  		cond_resched();
> >  	}
> >  
> > -	put_online_cpus();
> > +	mutex_unlock(&rsp->onoff_mutex);
> >  	return 1;
> >  }
> >  
> > @@ -1754,6 +1755,7 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
> >  	/* Remove the dead CPU from the bitmasks in the rcu_node hierarchy. */
> >  
> >  	/* Exclude any attempts to start a new grace period. */
> > +	mutex_lock(&rsp->onoff_mutex);
> >  	raw_spin_lock_irqsave(&rsp->onofflock, flags);
> >  
> >  	/* Orphan the dead CPU's callbacks, and adopt them if appropriate. */
> > @@ -1798,6 +1800,7 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
> >  	init_callback_list(rdp);
> >  	/* Disallow further callbacks on this CPU. */
> >  	rdp->nxttail[RCU_NEXT_TAIL] = NULL;
> > +	mutex_unlock(&rsp->onoff_mutex);
> >  }
> >  
> >  #else /* #ifdef CONFIG_HOTPLUG_CPU */
> > @@ -2708,6 +2711,9 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
> >  	struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
> >  	struct rcu_node *rnp = rcu_get_root(rsp);
> >  
> > +	/* Exclude new grace periods. */
> > +	mutex_lock(&rsp->onoff_mutex);
> > +
> >  	/* Set up local state, ensuring consistent view of global state. */
> >  	raw_spin_lock_irqsave(&rnp->lock, flags);
> >  	rdp->beenonline = 1;	 /* We have now been online. */
> > @@ -2722,14 +2728,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
> >  	rcu_prepare_for_idle_init(cpu);
> >  	raw_spin_unlock(&rnp->lock);		/* irqs remain disabled. */
> >  
> > -	/*
> > -	 * A new grace period might start here.  If so, we won't be part
> > -	 * of it, but that is OK, as we are currently in a quiescent state.
> > -	 */
> > -
> > -	/* Exclude any attempts to start a new GP on large systems. */
> > -	raw_spin_lock(&rsp->onofflock);		/* irqs already disabled. */
> > -
> >  	/* Add CPU to rcu_node bitmasks. */
> >  	rnp = rdp->mynode;
> >  	mask = rdp->grpmask;
> > @@ -2753,8 +2751,9 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
> >  		raw_spin_unlock(&rnp->lock); /* irqs already disabled. */
> >  		rnp = rnp->parent;
> >  	} while (rnp != NULL && !(rnp->qsmaskinit & mask));
> > +	local_irq_restore(flags);
> >  
> > -	raw_spin_unlock_irqrestore(&rsp->onofflock, flags);
> > +	mutex_unlock(&rsp->onoff_mutex);
> >  }
> >  
> >  static void __cpuinit rcu_prepare_cpu(int cpu)
> > diff --git a/kernel/rcutree.h b/kernel/rcutree.h
> > index 5faf05d..a240f03 100644
> > --- a/kernel/rcutree.h
> > +++ b/kernel/rcutree.h
> > @@ -394,11 +394,17 @@ struct rcu_state {
> >  	struct rcu_head **orphan_donetail;	/* Tail of above. */
> >  	long qlen_lazy;				/* Number of lazy callbacks. */
> >  	long qlen;				/* Total number of callbacks. */
> > +	/* End of fields guarded by onofflock. */
> > +
> > +	struct mutex onoff_mutex;		/* Coordinate hotplug & GPs. */
> > +
> >  	struct mutex barrier_mutex;		/* Guards barrier fields. */
> >  	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
> >  	struct completion barrier_completion;	/* Wake at barrier end. */
> >  	unsigned long n_barrier_done;		/* ++ at start and end of */
> >  						/*  _rcu_barrier(). */
> > +	/* End of fields guarded by barrier_mutex. */
> > +
> >  	unsigned long jiffies_force_qs;		/* Time at which to invoke */
> >  						/*  force_quiescent_state(). */
> >  	unsigned long n_force_qs;		/* Number of calls to */
> > 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Deadlock on poweroff
  2012-10-07 16:50   ` Kirill A. Shutemov
  2012-10-07 17:05     ` Srivatsa S. Bhat
@ 2012-10-08  4:41     ` Paul E. McKenney
  2012-10-08  5:30       ` Kirill A. Shutemov
  1 sibling, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2012-10-08  4:41 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, Dipankar Sarma, Thomas Gleixner, Andrew Morton,
	Steffen Klassert, linux-crypto

On Sun, Oct 07, 2012 at 07:50:12PM +0300, Kirill A. Shutemov wrote:
> On Sun, Oct 07, 2012 at 09:03:11AM -0700, Paul E. McKenney wrote:
> > On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
> > > Hi Paul and all,
> > > 
> > > With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> > > poweroff.
> > > 
> > > It guess it happens because of race for cpu_hotplug.lock:
> > > 
> > > 	CPU A					CPU B
> > > disable_nonboot_cpus()
> > > _cpu_down()
> > > cpu_hotplug_begin()
> > >  mutex_lock(&cpu_hotplug.lock);
> > > __cpu_notify()
> > > padata_cpu_callback()
> > > __padata_remove_cpu()
> > > padata_replace()
> > > synchronize_rcu()
> > > 					rcu_gp_kthread()
> > > 					get_online_cpus();
> > > 					mutex_lock(&cpu_hotplug.lock);
> > > 
> > > Have you seen the issue before?
> > 
> > This is a new one for me.  Does the following (very lightly tested)
> > patch help?
> 
> Works for me. Thanks.

May I add your Tested-by?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Deadlock on poweroff
  2012-10-08  4:41     ` Paul E. McKenney
@ 2012-10-08  5:30       ` Kirill A. Shutemov
  0 siblings, 0 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2012-10-08  5:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Dipankar Sarma, Thomas Gleixner, Andrew Morton,
	Steffen Klassert, linux-crypto

On Sun, Oct 07, 2012 at 09:41:28PM -0700, Paul E. McKenney wrote:
> On Sun, Oct 07, 2012 at 07:50:12PM +0300, Kirill A. Shutemov wrote:
> > On Sun, Oct 07, 2012 at 09:03:11AM -0700, Paul E. McKenney wrote:
> > > On Sun, Oct 07, 2012 at 05:47:11AM +0300, Kirill A. Shutemov wrote:
> > > > Hi Paul and all,
> > > > 
> > > > With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> > > > poweroff.
> > > > 
> > > > It guess it happens because of race for cpu_hotplug.lock:
> > > > 
> > > > 	CPU A					CPU B
> > > > disable_nonboot_cpus()
> > > > _cpu_down()
> > > > cpu_hotplug_begin()
> > > >  mutex_lock(&cpu_hotplug.lock);
> > > > __cpu_notify()
> > > > padata_cpu_callback()
> > > > __padata_remove_cpu()
> > > > padata_replace()
> > > > synchronize_rcu()
> > > > 					rcu_gp_kthread()
> > > > 					get_online_cpus();
> > > > 					mutex_lock(&cpu_hotplug.lock);
> > > > 
> > > > Have you seen the issue before?
> > > 
> > > This is a new one for me.  Does the following (very lightly tested)
> > > patch help?
> > 
> > Works for me. Thanks.
> 
> May I add your Tested-by?

Yep.

Tested-by: Kirill A. Shutemov <kirill@shutemov.name>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-10-08  5:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-07  2:47 Deadlock on poweroff Kirill A. Shutemov
     [not found] ` <20121007160311.GE2485@linux.vnet.ibm.com>
2012-10-07 16:50   ` Kirill A. Shutemov
2012-10-07 17:05     ` Srivatsa S. Bhat
2012-10-07 17:11       ` Kirill A. Shutemov
2012-10-07 17:16         ` Srivatsa S. Bhat
2012-10-07 21:08           ` Paul E. McKenney
2012-10-08  4:41     ` Paul E. McKenney
2012-10-08  5:30       ` Kirill A. Shutemov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).