From: Dave Jones <davej@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andrey Vagin <avagin@openvz.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	axboe@kernel.dk
Subject: block layer softlockup
Date: Mon, 1 Jul 2013 13:57:34 -0400
Message-ID: <20130701175734.GA13641@redhat.com>
In-Reply-To: <20130628035437.GB29338@dastard>

On Fri, Jun 28, 2013 at 01:54:37PM +1000, Dave Chinner wrote:
 > On Thu, Jun 27, 2013 at 04:54:53PM -1000, Linus Torvalds wrote:
 > > On Thu, Jun 27, 2013 at 3:18 PM, Dave Chinner <david@fromorbit.com> wrote:
 > > >
 > > > Right, that will be what is happening - the entire system will go
 > > > unresponsive when a sync call happens, so it's entirely possible
 > > > to see the soft lockups on inode_sb_list_add()/inode_sb_list_del()
 > > > trying to get the lock because of the way ticket spinlocks work...
 > > 
 > > So what made it all start happening now? I don't recall us having had
 > > these kinds of issues before..
 > 
 > Not sure - it's a sudden surprise for me, too. Then again, I haven't
 > been looking at sync from a performance or lock contention point of
 > view any time recently.  The algorithm that wait_sb_inodes() is
 > effectively unchanged since at least 2009, so it's probably a case
 > of it having been protected from contention by some external factor
 > we've fixed/removed recently.  Perhaps the bdi-flusher thread
 > replacement in -rc1 has changed the timing sufficiently that it no
 > longer serialises concurrent sync calls as much....

This morning's new trace reminded me of this last sentence. Related?

BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child1:7219]
Modules linked in: lec sctp dlci 8021q garp mpoa dccp_ipv4 dccp bridge stp tun snd_seq_dummy fuse bnep rfcomm nfnetlink scsi_transport_iscsi hidp ipt_ULOG can_raw can_bcm af_key af_rxrpc rose ipx p8023 p8022 atm llc2 pppoe pppox ppp_generic slhc bluetooth rds af_802154 appletalk nfc psnap phonet llc rfkill netrom x25 ax25 irda can caif_socket caif crc_ccitt coretemp hwmon kvm_intel kvm snd_hda_codec_realtek crc32c_intel ghash_clmulni_intel snd_hda_codec_hdmi microcode pcspkr snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device usb_debug snd_pcm e1000e snd_page_alloc snd_timer snd ptp pps_core soundcore xfs libcrc32c
irq event stamp: 3181543
hardirqs last  enabled at (3181542): [<ffffffff816edc60>] restore_args+0x0/0x30
hardirqs last disabled at (3181543): [<ffffffff816f676a>] apic_timer_interrupt+0x6a/0x80
softirqs last  enabled at (1794686): [<ffffffff810542e4>] __do_softirq+0x194/0x440
softirqs last disabled at (1794689): [<ffffffff8105474d>] irq_exit+0xcd/0xe0
CPU: 0 PID: 7219 Comm: trinity-child1 Not tainted 3.10.0+ #38
task: ffff8801d3a0ca40 ti: ffff88022e07e000 task.ti: ffff88022e07e000
RIP: 0010:[<ffffffff816ed037>]  [<ffffffff816ed037>] _raw_spin_unlock_irqrestore+0x67/0x80
RSP: 0018:ffff880244803db0  EFLAGS: 00000286
RAX: ffff8801d3a0ca40 RBX: ffffffff816edc60 RCX: 0000000000000002
RDX: 0000000000004730 RSI: ffff8801d3a0d1c0 RDI: ffff8801d3a0ca40
RBP: ffff880244803dc0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff880244803d28
R13: ffffffff816f676f R14: ffff880244803dc0 R15: ffff88023d1307f8
FS:  00007f00e7ab6740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002f18000 CR3: 000000022d31b000 CR4: 00000000001407f0
DR0: 0000000000ad9000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffff88023e21a680 0000000000000000 ffff880244803df0 ffffffff812da4c1
 ffff88023e21a680 0000000000000000 0000000000000000 0000000000000000
 ffff880244803e00 ffffffff812da4e0 ffff880244803e60 ffffffff8149ba13
Call Trace:
 <IRQ> 

 [<ffffffff812da4c1>] blk_end_bidi_request+0x51/0x60
 [<ffffffff812da4e0>] blk_end_request+0x10/0x20
 [<ffffffff8149ba13>] scsi_io_completion+0xf3/0x6e0
 [<ffffffff81491a60>] scsi_finish_command+0xb0/0x110
 [<ffffffff8149b81f>] scsi_softirq_done+0x12f/0x160
 [<ffffffff812e1e08>] blk_done_softirq+0x88/0xa0
 [<ffffffff8105424f>] __do_softirq+0xff/0x440
 [<ffffffff8105474d>] irq_exit+0xcd/0xe0
 [<ffffffff816f760b>] smp_apic_timer_interrupt+0x6b/0x9b
 [<ffffffff816f676f>] apic_timer_interrupt+0x6f/0x80
 <EOI> 

 [<ffffffff816edc60>] ? retint_restore_args+0xe/0xe
 [<ffffffff812ff465>] ? idr_find_slowpath+0x115/0x150
 [<ffffffff812ff475>] ? idr_find_slowpath+0x125/0x150
 [<ffffffff8108ceb0>] ? scheduler_tick_max_deferment+0x60/0x60
 [<ffffffff816f1765>] ? add_preempt_count+0xa5/0xf0
 [<ffffffff810fc8ea>] rcu_lockdep_current_cpu_online+0x3a/0xa0
 [<ffffffff812ff475>] idr_find_slowpath+0x125/0x150
 [<ffffffff812a3879>] ipcget+0x89/0x380
 [<ffffffff810b76e5>] ? trace_hardirqs_on_caller+0x115/0x1e0
 [<ffffffff812a4f76>] SyS_msgget+0x56/0x60
 [<ffffffff812a4560>] ? rcu_read_lock+0x80/0x80
 [<ffffffff812a43a0>] ? sysvipc_msg_proc_show+0xd0/0xd0
 [<ffffffff816f5d14>] tracesys+0xdd/0xe2
 [<ffffffffa00000b4>] ? libcrc32c_mod_fini+0x48/0xf94 [libcrc32c]
Code: 00 e8 9e 47 00 00 65 48 8b 04 25 f0 b9 00 00 48 8b 80 38 e0 ff ff a8 08 75 13 5b 41 5c 5d c3 0f 1f 44 00 00 e8 7b a7 9c ff 53 9d <eb> cf 0f 1f 80 00 00 00 00 e8 bb ea ff ff eb df 66 0f 1f 84 00 


My read of this is that the block layer was taking a *long* time to do
something in softirq context, and that prevented the msgget from progressing
within the watchdog cutoff time.

Plausible?

	Dave
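
One way to sanity-check that reading is to split the call trace at the
<IRQ> and <EOI> markers, so the frames that ran in interrupt context
are separated from the frames of the interrupted syscall. The following is
only a minimal sketch in Python: split_trace() and FRAME are hypothetical
names made up for illustration, not part of any kernel tooling, and TRACE is
a subset of the frames from the trace above.

```python
import re

# A subset of the frames from the trace in this message.
TRACE = """\
 <IRQ>
 [<ffffffff812da4c1>] blk_end_bidi_request+0x51/0x60
 [<ffffffff812da4e0>] blk_end_request+0x10/0x20
 [<ffffffff8149ba13>] scsi_io_completion+0xf3/0x6e0
 [<ffffffff8105424f>] __do_softirq+0xff/0x440
 [<ffffffff816f676f>] apic_timer_interrupt+0x6f/0x80
 <EOI>
 [<ffffffff816edc60>] ? retint_restore_args+0xe/0xe
 [<ffffffff812ff475>] idr_find_slowpath+0x125/0x150
 [<ffffffff812a3879>] ipcget+0x89/0x380
 [<ffffffff812a4f76>] SyS_msgget+0x56/0x60
"""

# Matches "[<addr>] symbol+0xoff/0xsize", with an optional "? " marker.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def split_trace(text):
    """Return (irq_frames, task_frames) as lists of symbol names."""
    irq, task = [], []
    in_irq = False
    for line in text.splitlines():
        s = line.strip()
        if s == "<IRQ>":
            in_irq = True
            continue
        if s == "<EOI>":
            in_irq = False
            continue
        m = FRAME.search(s)
        if not m:
            continue
        if m.group(1):
            # "?" marks an address found on the stack that the unwinder
            # could not verify, so it is skipped here.
            continue
        (irq if in_irq else task).append(m.group(2))
    return irq, task


irq_frames, task_frames = split_trace(TRACE)
print("interrupt context:", irq_frames)
print("process context:  ", task_frames)
```

With the frames split this way, the block and SCSI completion functions all
land on the interrupt side, while ipcget and SyS_msgget are on the
interrupted-task side, which fits the reading above. It does not by itself
show where the time was actually spent.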
