From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>, linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL rcu/next] rcu commits for 2.6.40
Date: Mon, 16 May 2011 00:08:49 -0700
Message-ID: <20110516070849.GA20580@linux.vnet.ibm.com>
In-Reply-To: <20110515065949.GA11330@linux.vnet.ibm.com>

On Sat, May 14, 2011 at 11:59:49PM -0700, Paul E. McKenney wrote:
> On Sat, May 14, 2011 at 11:04:15PM -0700, Paul E. McKenney wrote:
> > On Sat, May 14, 2011 at 10:49:31PM -0700, Yinghai Lu wrote:
> > > On 05/14/2011 10:41 PM, Yinghai Lu wrote:
> > > > On 05/14/2011 09:14 PM, Yinghai Lu wrote:
> > > >> On 05/14/2011 11:34 AM, Paul E. McKenney wrote:
> > > >>>> and do the inspection afterwards.
> > > >>>
> > > >>> And here is a lightly-tested patch, which applies on tip/core/rcu.
> > > >>>
> > > >>> This problem could account for both the long delays seen with e59fb312
> > > >>> (Decrease memory-barrier usage based on semi-formal proof) and the
> > > >>> shorter delays seen with a26ac245 (move TREE_RCU from softirq to kthread).
> > > >>
> > > >> yes. it fixes the problem.
> > > >>
> > > >> for 1024g system when hotadd mem enabled in kernel config
> > > >>
> > > >> [   31.814803] cpu_dev_init done
> > > >> [   35.437163] memory_dev_init done
> > > >>
> > > >> even it is with gcc from opensuse 11.3
> > > > 
> > > > got:
> > > > 
> > > > [   86.931217] Switched to NOHz mode on CPU #0
> > > > [   86.931272] Switched to NOHz mode on CPU #25
> > > > [   86.931278] ------------[ cut here ]------------
> > > > [   86.931290] WARNING: at kernel/rcutree.c:364 rcu_enter_nohz+0x44/0x76()
> > > > [   86.931294] Hardware name: Sun Fire X4800 M2 
> > > > [   86.931297] Modules linked in:
> > > > [   86.931303] Pid: 0, comm: swapper Not tainted 2.6.39-rc7-tip-yh-04836-g5e42dc2-dirty #3
> > > > [   86.931307] Call Trace:
> > > > [   86.931333]  [<ffffffff81080280>] warn_slowpath_common+0x85/0x9d
> > > > [   86.931338] Switched to NOHz mode on CPU #74
> > > > [   86.931346]  [<ffffffff810802b2>] warn_slowpath_null+0x1a/0x1c
> > > > [   86.931356]  [<ffffffff810d3615>] rcu_enter_nohz+0x44/0x76
> > > > [   86.931370]  [<ffffffff810ab3cb>] tick_nohz_stop_sched_tick+0x27d/0x366
> > > > [   86.931381]  [<ffffffff810391bc>] cpu_idle+0x7a/0xcc
> > > > [   86.931397]  [<ffffffff81bd1aa3>] rest_init+0xb7/0xbe
> > > > [   86.931408]  [<ffffffff81bd19ec>] ? csum_partial_copy_generic+0x16c/0x16c
> > > > [   86.931423]  [<ffffffff82738e39>] start_kernel+0x3b2/0x3bd
> > > > [   86.931428] Switched to NOHz mode on CPU #94
> > > > [   86.931436]  [<ffffffff827382cc>] x86_64_start_reservations+0x9c/0xa0
> > > > [   86.931446]  [<ffffffff827384a8>] x86_64_start_kernel+0x1d8/0x1e3
> > > > [   86.931463] ---[ end trace 2cfc591bf7de931f ]---
> > > > [   86.931598] Switched to NOHz mode on CPU #151
> > > > [   86.931613] Switched to NOHz mode on CPU #152
> > > 
> > > it seems gcc from Fedora 14 is not happy with this patch.
> > > 
> > > [   35.113696] cpu_dev_init done
> > > [  155.963662] memory_dev_init done
> > 
> > Hmmm...  It looks like my attempts to make RCU recover from misnesting are
> > not completely foolproof.  I will be especially happy to look into this
> > if you could look for the source of the irq_enter()/irq_exit() misnesting.
> > 
> > (And yes, it still might be a bug in my code -- I will be looking at that
> > yet again as well.)
> 
> And the way you can prove that it is my code rather than the arch
> code is to show that the warning happens on your system when the
> irq_enter()/irq_exit() calls are perfectly nested.

So I took another look at the RCU debugfs stats you provided earlier,
and realized that your system gets a lot more NMIs than do the ones
that I have access to.  So as a diagnostic patch, I used "#if 0" to
disable the bodies of rcu_nmi_enter() and rcu_nmi_exit().

If everything works perfectly with this patch applied, that would point
to a race in those two functions.  Please feel free to apply on top
of my earlier diagnostic patches or directly on tip/core/rcu -- either
would provide good information.

							Thanx, Paul

------------------------------------------------------------------------

 rcutree.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 4a9e4aa..0d4a5b5 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -430,6 +430,7 @@ void rcu_exit_nohz(void)
  */
 void rcu_nmi_enter(void)
 {
+#if 0
 	struct rcu_dynticks *rdtp = &__get_cpu_var(rcu_dynticks);
 
 	if (rdtp->dynticks_nmi_nesting == 0 &&
@@ -443,6 +444,7 @@ void rcu_nmi_enter(void)
 	/* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
 	smp_mb__after_atomic_inc();  /* See above. */
 	WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
+#endif
 }
 
 /**
@@ -454,6 +456,7 @@ void rcu_nmi_enter(void)
  */
 void rcu_nmi_exit(void)
 {
+#if 0
 	struct rcu_dynticks *rdtp = &__get_cpu_var(rcu_dynticks);
 
 	if (rdtp->dynticks_nmi_nesting == 0 ||
@@ -466,6 +469,7 @@ void rcu_nmi_exit(void)
 	atomic_inc(&rdtp->dynticks);
 	smp_mb__after_atomic_inc();  /* Force delay to next write. */
 	WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
+#endif
 }
 
 /**
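
For readers following the patch above, here is a minimal userspace sketch
of the even/odd counter convention that the two WARN_ON_ONCE() checks in
the diff enforce: the dynticks counter is expected to be odd after NMI
entry and even after NMI exit from an idle CPU.  Everything named model_*
is hypothetical, C11 atomics stand in for the kernel's atomic_t and memory
barriers, and the conditions hidden between the diff hunks (in particular
the second half of each "if") are assumptions rather than the kernel's
actual code.  Nested NMIs are deliberately outside this model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-CPU struct rcu_dynticks. */
struct dynticks_model {
	atomic_int dynticks;	/* even: CPU is idle to RCU; odd: RCU is watching */
	int nmi_nesting;	/* nonzero while an NMI has flipped the counter */
	int warnings;		/* counts what WARN_ON_ONCE() would have reported */
};

static void model_init(struct dynticks_model *m, int initial)
{
	assert(m != NULL);
	atomic_init(&m->dynticks, initial);
	m->nmi_nesting = 0;
	m->warnings = 0;
}

/* True when the counter is odd, that is, RCU must pay attention to this CPU. */
static bool model_rcu_watching(struct dynticks_model *m)
{
	return atomic_load(&m->dynticks) & 0x1;
}

/* NMI entry: if the CPU was dynticks-idle, make the counter odd. */
static void model_nmi_enter(struct dynticks_model *m)
{
	/* ASSUMPTION: nothing to do if the CPU is already non-idle. */
	if (m->nmi_nesting == 0 && model_rcu_watching(m))
		return;
	m->nmi_nesting++;
	/* Sequentially consistent increment stands in for the smp_mb__*() pair. */
	atomic_fetch_add(&m->dynticks, 1);
	if (!model_rcu_watching(m))
		m->warnings++;	/* mirrors WARN_ON_ONCE(!(dynticks & 0x1)) */
}

/* NMI exit: undo the flip made by the matching model_nmi_enter(). */
static void model_nmi_exit(struct dynticks_model *m)
{
	/* ASSUMPTION: only the NMI that flipped the counter flips it back. */
	if (m->nmi_nesting == 0 || --m->nmi_nesting != 0)
		return;
	atomic_fetch_add(&m->dynticks, 1);
	if (model_rcu_watching(m))
		m->warnings++;	/* mirrors WARN_ON_ONCE(dynticks & 0x1) */
}
```

Calling model_init(&m, 0), then model_nmi_enter(&m) and model_nmi_exit(&m),
walks the counter from even to odd and back to even without counting a
warning.  With both bodies disabled, as in the diagnostic patch, the
counter never changes during an NMI, which is why a clean run with the
patch applied would point at these two functions.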
