From: Oleg Nesterov <oleg@redhat.com>
To: Dave Jones <davej@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Andrey Vagin <avagin@openvz.org>
Subject: Re: frequent softlockups with 3.10rc6.
Date: Fri, 21 Jun 2013 21:59:49 +0200
Message-ID: <20130621195949.GA15519@redhat.com>
In-Reply-To: <20130621151119.GA1596@redhat.com>

On 06/21, Dave Jones wrote:
>
> On Thu, Jun 20, 2013 at 09:16:52AM -0700, Paul E. McKenney wrote:
>  > >  > >  > I've been hitting this a lot the last few days.
>  > >  > >  > This is the same machine that I was also seeing lockups during sync()
>  > >  > >
>  > >  > > On a whim, I reverted 971394f389992f8462c4e5ae0e3b49a10a9534a3
>  > >  > > (As I started seeing these just after that rcu merge).
>  > >  > >
>  > >  > > It's only been 30 minutes, but it seems stable again. Normally I would
>  > >  > > hit these within 5 minutes.
>  > >  > >
>  > >  > > I think this may be the same root cause for http://www.spinics.net/lists/kernel/msg1551503.html too.
>  > >  >
>  > > Dammit. Paul, you're off the hook (for now).
>  > > It just took longer to hit.
>  >
>  > Well, this commit could significantly increase CPU overhead, which might
>  > make the bug more likely to occur.  (Hey, I can rationalize -anything-!!!)
>
> I spent yesterday bisecting this (twice, on two systems in parallel to be sure).
> It came down to 8aac62706adaaf0fab02c4327761561c8bda9448
> Before I turned in last night, I pulled Linus current, and reverted that,

I hope you didn't pull some unrelated fix in between ;)

> and it's been fine during overnight stress testing.
>
> Oleg ?

I am puzzled. And I do not really understand

	hardirqs last  enabled at (2380318): [<ffffffff816ed220>] restore_args+0x0/0x30
	hardirqs last disabled at (2380319): [<ffffffff816f5d2a>] apic_timer_interrupt+0x6a/0x80
	softirqs last  enabled at (196990): [<ffffffff810542d4>] __do_softirq+0x194/0x440
	softirqs last disabled at (197479): [<ffffffff8105473d>] irq_exit+0xcd/0xe0

below. How can they differ that much...

Dave, any chance you can reproduce the hang with the debugging patch at
the end? Just in case: the warnings themselves do not indicate a problem,
they are only there to give a bit more info.

>  > > [19886.451044] BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:12994]
>  > > [19886.452659] Modules linked in: bridge stp snd_seq_dummy fuse tun hidp rfcomm ipt_ULOG bnep nfnetlink can_raw ipx p8023 p8022 pppoe pppox ppp_generic slhc scsi_transport_iscsi can_bcm rds bluetooth can nfc
>  > > rfkill af_key af_802154 netrom irda af_rxrpc appletalk phonet psnap caif_socket caif x25 crc_ccitt llc2 llc rose ax25 atm coretemp hwmon kvm_intel kvm crc32c_intel ghash_clmulni_intel snd_hda_codec_realtek
>  > > microcode pcspkr snd_hda_codec_hdmi usb_debug snd_hda_intel snd_seq e1000e snd_hda_codec snd_hwdep ptp pps_core snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore xfs libcrc32c
>  > > [19886.464209] irq event stamp: 2380319
>  > > [19886.465510] hardirqs last  enabled at (2380318): [<ffffffff816ed220>] restore_args+0x0/0x30
>  > > [19886.467446] hardirqs last disabled at (2380319): [<ffffffff816f5d2a>] apic_timer_interrupt+0x6a/0x80
>  > > [19886.469464] softirqs last  enabled at (196990): [<ffffffff810542d4>] __do_softirq+0x194/0x440
>  > > [19886.471395] softirqs last disabled at (197479): [<ffffffff8105473d>] irq_exit+0xcd/0xe0
>  > > [19886.473238] CPU: 0 PID: 12994 Comm: trinity-child0 Not tainted 3.10.0-rc6+ #17
>  > > [19886.477881] task: ffff8801a8222520 ti: ffff880228d98000 task.ti: ffff880228d98000
>  > > [19886.479712] RIP: 0010:[<ffffffff810541f1>]  [<ffffffff810541f1>] __do_softirq+0xb1/0x440
>  > > [19886.481706] RSP: 0018:ffff880244803f08  EFLAGS: 00000202
>  > > [19886.483298] RAX: ffff8801a8222520 RBX: ffffffff816ed220 RCX: 0000000000000002
>  > > [19886.485094] RDX: 00000000000045b0 RSI: ffff8801a8222ca0 RDI: ffff8801a8222520
>  > > [19886.486896] RBP: ffff880244803f70 R08: 0000000000000001 R09: 0000000000000000
>  > > [19886.488687] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880244803e78
>  > > [19886.490507] R13: ffffffff816f5d2f R14: ffff880244803f70 R15: 0000000000000000
>  > > [19886.492325] FS:  00007f0bcf727740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
>  > > [19886.494184] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  > > [19886.495781] CR2: 0000000000000001 CR3: 00000001a56d0000 CR4: 00000000001407f0
>  > > [19886.497545] DR0: 00007f21e7713000 DR1: 0000000000000000 DR2: 0000000000000000
>  > > [19886.499304] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>  > > [19886.501070] Stack:
>  > > [19886.502247]  0000000a00406040 00000001001ddcf4 ffff880228d99fd8 ffff880228d99fd8
>  > > [19886.504104]  ffff880228d99fd8 ffff8801a8222918 ffff880228d99fd8 ffff880200000000
>  > > [19886.505996]  ffff8801a8222520 0000000000000000 0000000000002e82 00000000162dccb4
>  > > [19886.507859] Call Trace:
>  > > [19886.509095]  <IRQ>
>  > > [19886.509334]  [<ffffffff8105473d>] irq_exit+0xcd/0xe0
>  > > [19886.511879]  [<ffffffff816f6bcb>] smp_apic_timer_interrupt+0x6b/0x9b
>  > > [19886.513586]  [<ffffffff816f5d2f>] apic_timer_interrupt+0x6f/0x80
>  > > [19886.515243]  <EOI>
>  > > [19886.515482]  [<ffffffff812fe80d>] ? idr_find_slowpath+0x4d/0x150
>  > > [19886.518126]  [<ffffffff812a2cb9>] ipcget+0x89/0x380
>  > > [19886.519656]  [<ffffffff810b72dd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
>  > > [19886.521409]  [<ffffffff812a43b6>] SyS_msgget+0x56/0x60
>  > > [19886.522971]  [<ffffffff812a39a0>] ? rcu_read_lock+0x80/0x80
>  > > [19886.524585]  [<ffffffff812a37e0>] ? sysvipc_msg_proc_show+0xd0/0xd0
>  > > [19886.526274]  [<ffffffff816f52d4>] tracesys+0xdd/0xe2
>  > > [19886.527804]  [<ffffffff8100ffff>] ? enable_step+0x3f/0x1d0
>  > > [19886.529397] Code: 48 89 45 b8 48 89 45 b0 48 89 45 a8 66 0f 1f 44 00 00 65 c7 04 25 80 0f 1d 00 00 00 00 00 e8 b7 31 06 00 fb 49 c7 c6 00 41 c0 81 <eb> 0e 0f 1f 44 00 00 49 83 c6 08 41 d1 ef 74 6c 41 f6 c7 01 74
>  > >
> ---end quoted text---


diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index 10e5947..bbeb128 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -59,7 +59,9 @@ extern struct nsproxy init_nsproxy;
 
 static inline struct nsproxy *task_nsproxy(struct task_struct *tsk)
 {
-	return rcu_dereference(tsk->nsproxy);
+	struct nsproxy *ret = rcu_dereference(tsk->nsproxy);
+	WARN_ON(!ret);
+	return ret;
 }
 
 int copy_namespaces(unsigned long flags, struct task_struct *tsk);
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 364ceab..77a20e9 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -220,6 +220,7 @@ void switch_task_namespaces(struct task_struct *p, struct nsproxy *new)
 	rcu_assign_pointer(p->nsproxy, new);
 
 	if (ns && atomic_dec_and_test(&ns->count)) {
+		pr_info("YESTHISHAPPENS new=%p\n", new);
 		/*
 		 * wait for others to get what they want from this nsproxy.
 		 *
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 65bd3c9..917764e 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -9,6 +9,8 @@ task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
 {
 	struct callback_head *head;
 
+	WARN_ON(task == current && !task->nsproxy);
+
 	do {
 		head = ACCESS_ONCE(task->task_works);
 		if (unlikely(head == &work_exited))

