netdev.vger.kernel.org archive mirror
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Yanko Kaneti <yaneti@declera.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Cong Wang <cwang@twopensource.com>, Kevin Fenzi <kevin@scrye.com>,
	netdev <netdev@vger.kernel.org>,
	"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>
Subject: Re: localed stuck in recent 3.18 git in copy_net_ns?
Date: Thu, 23 Oct 2014 13:05:07 -0700
Message-ID: <20141023200507.GC4977@linux.vnet.ibm.com>
In-Reply-To: <20141023195159.GA2331@declera.com>

On Thu, Oct 23, 2014 at 10:51:59PM +0300, Yanko Kaneti wrote:
> On Thu-10/23/14-2014 08:33, Paul E. McKenney wrote:
> > On Thu, Oct 23, 2014 at 05:27:50AM -0700, Paul E. McKenney wrote:
> > > On Thu, Oct 23, 2014 at 09:09:26AM +0300, Yanko Kaneti wrote:
> > > > On Wed, 2014-10-22 at 16:24 -0700, Paul E. McKenney wrote:
> > > > > On Thu, Oct 23, 2014 at 01:40:32AM +0300, Yanko Kaneti wrote:
> > > > > > On Wed-10/22/14-2014 15:33, Josh Boyer wrote:
> > > > > > > On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
> > > > > > > <paulmck@linux.vnet.ibm.com> wrote:
> > > > > 
> > > > > [ . . . ]
> > > > > 
> > > > > > > > > Don't get me wrong -- the fact that this kthread appears to have
> > > > > > > > > blocked within rcu_barrier() for 120 seconds means that something is
> > > > > > > > > most definitely wrong here.  I am surprised that there are no RCU CPU
> > > > > > > > > stall warnings, but perhaps the blockage is in the callback execution
> > > > > > > > > rather than grace-period completion.  Or something is preventing this
> > > > > > > > > kthread from starting up after the wake-up callback executes.  Or...
> > > > > > > > 
> > > > > > > > Is this thing reproducible?
> > > > > > > 
> > > > > > > I've added Yanko on CC, who reported the backtrace above and can
> > > > > > > recreate it reliably.  Apparently reverting the RCU merge commit
> > > > > > > (d6dd50e) and rebuilding the latest after that does not show the
> > > > > > > > issue.  I'll let Yanko explain more and answer any questions you have.
> > > > > > 
> > > > > > - It is reproducible
> > > > > > > - I've done another build here to double check and it's definitely
> > > > > > >   the rcu merge that's causing it.
> > > > > > > 
> > > > > > > Don't think I'll be able to dig deeper, but I can do testing if needed.
> > > > > 
> > > > > Please!  Does the following patch help?
> > > > 
> > > > > Nope, doesn't seem to make a difference to the modprobe ppp_generic test
> > > 
> > > Well, I was hoping.  I will take a closer look at the RCU merge commit
> > > and see what suggests itself.  I am likely to ask you to revert specific
> > > commits, if that works for you.
> > 
> > Well, rather than reverting commits, could you please try testing the
> > following commits?
> > 
> > 11ed7f934cb8 (rcu: Make nocb leader kthreads process pending callbacks after spawning)
> > 
> > 73a860cd58a1 (rcu: Replace flush_signals() with WARN_ON(signal_pending()))
> > 
> > c847f14217d5 (rcu: Avoid misordering in nocb_leader_wait())
> > 
> > 	For whatever it is worth, I am guessing this one.
> 
> Indeed, c847f14217d5 it is.
> 
> Much to my embarrassment I just noticed that in addition to the
> rcu merge, triggering the bug "requires" my specific Fedora rawhide network
> setup. Booting in single mode and modprobe ppp_generic is fine. The bug
> appears when starting with my regular Fedora network setup, which in my case
> includes 3 ethernet adapters and a libvirt bridge+nat setup.
> 
> Hope that helps. 
> 
> I am attaching the config.

It does help a lot, thank you!!!

The following patch is a bit of a shot in the dark, and assumes that
commit 1772947bd012 (rcu: Handle NOCB callbacks from irq-disabled idle
code) introduced the problem.  Does this patch fix things up?

							Thanx, Paul

------------------------------------------------------------------------

rcu: Kick rcuo kthreads after their CPU goes offline

If a no-CBs CPU were to post an RCU callback with interrupts disabled
after it entered the idle loop for the last time, there might be no
deferred wakeup for the corresponding rcuo kthreads.  This commit
therefore adds a set of calls to do_nocb_deferred_wakeup() after the
CPU has gone completely offline.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 84b41b3c6ebd..4f3d25a58786 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3493,8 +3493,10 @@ static int rcu_cpu_notify(struct notifier_block *self,
 	case CPU_DEAD_FROZEN:
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
-		for_each_rcu_flavor(rsp)
+		for_each_rcu_flavor(rsp) {
 			rcu_cleanup_dead_cpu(cpu, rsp);
+			do_nocb_deferred_wakeup(this_cpu_ptr(rsp->rda));
+		}
 		break;
 	default:
 		break;
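
The failure mode described in the commit log above can be modeled in user
space.  The sketch below is not kernel code and is not part of the patch:
struct nocb_cpu, post_callback(), deferred_wakeup(), cpu_dead() and
callbacks_stranded() are hypothetical names invented for illustration, and
the three fields are an assumed simplification of the per-CPU state.  It only
shows why a wakeup that was deferred because interrupts were disabled can be
lost if nothing replays it once the CPU is dead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state that matters in this scenario. */
struct nocb_cpu {
	int  pending_cbs;    /* callbacks queued for the rcuo kthread */
	bool defer_wakeup;   /* a wakeup was requested but could not be issued */
	bool kthread_awake;  /* models whether the rcuo kthread has been woken */
};

/* Queue a callback; with interrupts disabled the wakeup can only be deferred. */
static void post_callback(struct nocb_cpu *c, bool irqs_disabled)
{
	assert(c);
	c->pending_cbs++;
	if (irqs_disabled)
		c->defer_wakeup = true;
	else
		c->kthread_awake = true;
}

/* Models the role of do_nocb_deferred_wakeup(): replay a deferred wakeup. */
static void deferred_wakeup(struct nocb_cpu *c)
{
	if (!c->defer_wakeup)
		return;
	c->defer_wakeup = false;
	c->kthread_awake = true;
}

/* Models the CPU_DEAD path; 'kick' selects whether the extra call is made. */
static void cpu_dead(struct nocb_cpu *c, bool kick)
{
	if (kick)
		deferred_wakeup(c);
}

/* True if callbacks are queued but the kthread was never woken. */
static bool callbacks_stranded(const struct nocb_cpu *c)
{
	return c->pending_cbs > 0 && !c->kthread_awake;
}
```

In this model, posting a callback with irqs_disabled set and then calling
cpu_dead() with kick false leaves callbacks_stranded() true, which is the
situation in which rcu_barrier() would wait indefinitely.  With kick true
the deferred wakeup is replayed and the callbacks are no longer stranded.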
