From: Hugh Dickins <hughd@google.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	"Paul E. McKenney" <paul.mckenney@linaro.org>,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: linux-next ppc64: RCU mods cause __might_sleep BUGs
Date: Wed, 2 May 2012 13:25:30 -0700 (PDT)
Message-ID: <alpine.LSU.2.00.1205021324430.24246@eggly.anvils>
In-Reply-To: <20120501232516.GR2441@linux.vnet.ibm.com>

On Tue, 1 May 2012, Paul E. McKenney wrote:
> > > > > On Mon, 2012-04-30 at 15:37 -0700, Hugh Dickins wrote:
> > > > > > 
> > > > > > BUG: sleeping function called from invalid context at include/linux/pagemap.h:354
> > > > > > in_atomic(): 0, irqs_disabled(): 0, pid: 6886, name: cc1
> > > > > > Call Trace:
> > > > > > [c0000001a99f78e0] [c00000000000f34c] .show_stack+0x6c/0x16c (unreliable)
> > > > > > [c0000001a99f7990] [c000000000077b40] .__might_sleep+0x11c/0x134
> > > > > > [c0000001a99f7a10] [c0000000000c6228] .filemap_fault+0x1fc/0x494
> > > > > > [c0000001a99f7af0] [c0000000000e7c9c] .__do_fault+0x120/0x684
> > > > > > [c0000001a99f7c00] [c000000000025790] .do_page_fault+0x458/0x664
> > > > > > [c0000001a99f7e30] [c000000000005868] handle_page_fault+0x10/0x30

Got it at last.  Embarrassingly obvious.  __rcu_read_lock() and
__rcu_read_unlock() cannot safely use __this_cpu operations: the
task may be preempted and resume on a different cpu between the
read and the write of the read-modify-write.  They should be using
this_cpu operations instead (or, as I did in __rcu_read_unlock()
below, wrap the __this_cpu operations in preempt_disable/enable).
The __this_cpu operations work out fine on x86, which was given good
instructions to use; but not so well on PowerPC.
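
To make the window concrete, here is a small userspace model of it.
This is only a sketch under assumptions: reset_model(), migrate_to(),
inc_unprotected(), inc_protected() and show() are hypothetical names,
not kernel symbols, and the model assumes the per-cpu address is
resolved once before the load, so that the store after migration
still targets the old cpu's slot.  The save and restore in
migrate_to() loosely stands in for what the context switch does with
the nesting count.

```c
#include <stdio.h>

#define NR_CPUS 2

/* Hypothetical stand-ins, not kernel symbols. */
static int nesting[NR_CPUS];	/* models the per-cpu rcu_read_lock_nesting */
static int nesting_save;	/* models the count saved in the task */
static int cur_cpu;		/* cpu the modelled task is running on */

static void reset_model(void)
{
	nesting[0] = nesting[1] = 0;
	nesting_save = 0;
	cur_cpu = 0;
}

/*
 * Context switch plus migration: save the count, hand the old cpu to
 * a task with no read-side section open, restore on the new cpu.
 */
static void migrate_to(int cpu)
{
	nesting_save = nesting[cur_cpu];
	nesting[cur_cpu] = 0;
	cur_cpu = cpu;
	nesting[cur_cpu] = nesting_save;
}

/* Increment with no protection: migration hits between load and store. */
static void inc_unprotected(int new_cpu)
{
	int *slot = &nesting[cur_cpu];	/* address resolved on the old cpu */
	int val = *slot;		/* load */

	migrate_to(new_cpu);		/* preempted here */
	*slot = val + 1;		/* store lands in the old cpu's slot */
}

/* Increment with preemption held off: migration waits for the store. */
static void inc_protected(int new_cpu)
{
	nesting[cur_cpu] += 1;		/* load, add, store on one cpu */
	migrate_to(new_cpu);		/* preemption only after the store */
}

/* Print both slots and the cpu the task ended up on. */
static void show(const char *what)
{
	printf("%s: cpu0=%d cpu1=%d task on cpu%d\n",
	       what, nesting[0], nesting[1], cur_cpu);
}
```

Starting from reset_model(), inc_unprotected(1) leaves the task on
cpu 1 with a count of 0 and a stray count of 1 on cpu 0, so the task
has lost its increment and whatever runs next on cpu 0 appears to be
inside a read-side critical section.  inc_protected(1) from the same
starting point leaves the count of 1 travelling with the task.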

I've been running successfully for an hour now with the patch below;
but I expect you'll want to consider the tradeoffs, and may choose a
different solution.

Hugh

--- 3.4-rc4-next-20120427/include/linux/rcupdate.h	2012-04-28 09:26:38.000000000 -0700
+++ testing/include/linux/rcupdate.h	2012-05-02 11:46:06.000000000 -0700
@@ -159,7 +159,7 @@ DECLARE_PER_CPU(struct task_struct *, rc
  */
 static inline void __rcu_read_lock(void)
 {
-	__this_cpu_inc(rcu_read_lock_nesting);
+	this_cpu_inc(rcu_read_lock_nesting);
 	barrier(); /* Keep code within RCU read-side critical section. */
 }
 
--- 3.4-rc4-next-20120427/kernel/rcupdate.c	2012-04-28 09:26:40.000000000 -0700
+++ testing/kernel/rcupdate.c	2012-05-02 11:44:13.000000000 -0700
@@ -72,6 +72,7 @@ DEFINE_PER_CPU(struct task_struct *, rcu
  */
 void __rcu_read_unlock(void)
 {
+	preempt_disable();
 	if (__this_cpu_read(rcu_read_lock_nesting) != 1)
 		__this_cpu_dec(rcu_read_lock_nesting);
 	else {
@@ -83,13 +84,14 @@ void __rcu_read_unlock(void)
 		barrier();  /* ->rcu_read_unlock_special load before assign */
 		__this_cpu_write(rcu_read_lock_nesting, 0);
 	}
-#ifdef CONFIG_PROVE_LOCKING
+#if 1 /* CONFIG_PROVE_LOCKING */
 	{
 		int rln = __this_cpu_read(rcu_read_lock_nesting);
 
-		WARN_ON_ONCE(rln < 0 && rln > INT_MIN / 2);
+		BUG_ON(rln < 0 && rln > INT_MIN / 2);
 	}
 #endif /* #ifdef CONFIG_PROVE_LOCKING */
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(__rcu_read_unlock);
 
--- 3.4-rc4-next-20120427/kernel/sched/core.c	2012-04-28 09:26:40.000000000 -0700
+++ testing/kernel/sched/core.c	2012-05-01 22:40:46.000000000 -0700
@@ -2024,7 +2024,7 @@ asmlinkage void schedule_tail(struct tas
 {
 	struct rq *rq = this_rq();
 
-	rcu_switch_from(prev);
+	/* rcu_switch_from(prev); */
 	rcu_switch_to();
 	finish_task_switch(rq, prev);
 
@@ -7093,6 +7093,10 @@ void __might_sleep(const char *file, int
 		"BUG: sleeping function called from invalid context at %s:%d\n",
 			file, line);
 	printk(KERN_ERR
+		"cpu=%d preempt_count=%x preempt_offset=%x rcu_nesting=%x nesting_save=%x\n",
+		raw_smp_processor_id(), preempt_count(), preempt_offset,
+		rcu_preempt_depth(), current->rcu_read_lock_nesting_save); 
+	printk(KERN_ERR
 		"in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
 			in_atomic(), irqs_disabled(),
 			current->pid, current->comm);
