* xfs deadlock in stable kernel 3.0.4
  2011-09-10 12:23 UTC, From: Stefan Priebe, To: xfs, Cc: xfs-masters

Hello list,

on some of our heavily loaded servers using XFS we're seeing a deadlock
where reading from and writing to the XFS filesystem suddenly stops
working.

Here are the sysrq-w-triggered log messages of the locked processes:

http://pastebin.com/JWjrbrh4

Please help! Thanks! Please cc me; I'm not subscribed.

Stefan
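Such a dump can be produced on demand; a minimal sketch, assuming a
kernel built with CONFIG_MAGIC_SYSRQ=y:

    # log a stack trace for every task stuck in uninterruptible (D) state
    echo w > /proc/sysrq-trigger
    # the traces land in the kernel ring buffer
    dmesg | tail -n 300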
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-12 15:21 UTC, From: Christoph Hellwig, To: Stefan Priebe, Cc: xfs-masters, xfs

On Sat, Sep 10, 2011 at 02:23:12PM +0200, Stefan Priebe wrote:
> Hello list,
>
> on some of our heavily loaded servers using XFS we're seeing a deadlock
> where reading from and writing to the XFS filesystem suddenly stops
> working.
>
> Here are the sysrq-w-triggered log messages of the locked processes:
>
> http://pastebin.com/JWjrbrh4

What kind of workload are you running? Also, did the workload run fine
with an older kernel, and if yes, which one?
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-12 16:46 UTC, From: Stefan Priebe, To: Christoph Hellwig, Cc: xfs-masters, xfs

Hi,

>> on some of our heavily loaded servers using XFS we're seeing a deadlock
>> where reading from and writing to the XFS filesystem suddenly stops
>> working.
>>
>> Here are the sysrq-w-triggered log messages of the locked processes:
>>
>> http://pastebin.com/JWjrbrh4
>
> What kind of workload are you running? Also, did the workload run fine
> with an older kernel, and if yes, which one?

MySQL, web, mail, FTP ;-) Yes, it ran fine with 2.6.32; I upgraded from
that version.

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-12 20:05 UTC, From: Christoph Hellwig, To: Stefan Priebe, Cc: xfs-masters, xfs

On Mon, Sep 12, 2011 at 06:46:26PM +0200, Stefan Priebe wrote:
>> What kind of workload are you running? Also, did the workload run fine
>> with an older kernel, and if yes, which one?
>
> MySQL, web, mail, FTP ;-) Yes, it ran fine with 2.6.32; I upgraded from
> that version.

Just curious, is this the same system that also shows the freezes
reported to the scsi list? If I/Os don't get completed by the lower
layers, I can see how we get everything in XFS waiting on log
reservations, given that we never get the log tail pushed.
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-13 6:04 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, xfs

Hi,

> Just curious, is this the same system that also shows the freezes
> reported to the scsi list? If I/Os don't get completed by the lower
> layers, I can see how we get everything in XFS waiting on log
> reservations, given that we never get the log tail pushed.

I initially reported it to the scsi list as I didn't know where the
problem was, but then some people told me it must be an XFS problem.

Some more information:
1.) It runs fine with 2.6.32 and 2.6.38.
2.) I can still write to another ext2 partition on the same disk array
    (aacraid driver) while XFS is stuck - so I think it must be an XFS
    problem.
3.) I've also tried running 3.1-rc5, but then I'm seeing this error:

BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
IP: [] inode_dio_done+0x4/0x25
PGD 293724067 PUD 292930067 PMD 0
Oops: 0002 [#1] SMP
CPU 5
Modules linked in: ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables coretemp k8temp
Pid: 4775, comm: mysqld Not tainted 3.1-rc5 #1 Supermicro X8DT3/X8DT3
RIP: 0010:[] [] inode_dio_done+0x4/0x25
RSP: 0018:ffff880292b5fad8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8806ab4927e0 RCX: 0000000000007524
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880292b5fad8 R08: ffff880292b5e000 R09: 0000000000000000
R10: ffff88047f85e040 R11: ffff88042ddb5d88 R12: ffff88002b7f8800
R13: ffff88002b7f8800 R14: 0000000000000000 R15: ffff88042d896040
FS: 0000000045c79950(0063) GS:ffff88083fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000012c CR3: 0000000293408000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mysqld (pid: 4775, threadinfo ffff880292b5e000, task ffff88042d896040)
Stack:
 ffff880292b5faf8 ffffffff811938cd 0000000192b5fb18 0000000000004000
 ffff880292b5fb18 ffffffff810feba2 0000000000000000 ffff88002b7f8920
 ffff880292b5fbf8 ffffffff810ff4fb ffff880292b5fb78 ffff880292b5e000
Call Trace:
 [] xfs_end_io_direct_write+0x6a/0x6e
 [] dio_complete+0x90/0xbb
 [] __blockdev_direct_IO+0x92e/0x964
 [] ? mempool_alloc_slab+0x11/0x13
 [] xfs_vm_direct_IO+0x90/0x101
 [] ? __xfs_get_blocks+0x395/0x395
 [] ? xfs_finish_ioend_sync+0x1a/0x1a
 [] generic_file_direct_write+0xd7/0x147
 [] xfs_file_dio_aio_write+0x1b9/0x1d1
 [] ? wake_up_state+0xb/0xd
 [] xfs_file_aio_write+0x16a/0x21d
 [] ? do_futex+0xc0/0x988
 [] do_sync_write+0xc7/0x10d
 [] vfs_write+0xab/0x103
 [] sys_pwrite64+0x5c/0x7d
 [] system_call_fastpath+0x16/0x1b
Code: 00 48 8d 34 30 89 d9 4c 89 e7 e8 3a fe ff ff 85 c0 75 0b 44 89 e8 49 01 84 24 90 00 00 00 41 5a 5b 41 5c 41 5d c9 c3 55 48 89 e5 ff 8f 2c 01 00 00 0f 94 c0 84 c0 74 11 48 81 c7 90 00 00 00
RIP [] inode_dio_done+0x4/0x25
 RSP
CR2: 000000000000012c
---[ end trace 79ce33ac2f7c10bd ]---

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-13 19:31 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, xfs

Am 13.09.2011 08:04, schrieb Stefan Priebe - Profihost AG:
> I initially reported it to the scsi list as I didn't know where the
> problem was, but then some people told me it must be an XFS problem.
>
> Some more information:
> 1.) It runs fine with 2.6.32 and 2.6.38.
> 2.) I can still write to another ext2 partition on the same disk array
>     (aacraid driver) while XFS is stuck - so I think it must be an XFS
>     problem.
> 3.) I've also tried running 3.1-rc5, but then I'm seeing this error:
>
> ...

Any idea what we could try next, or how to find the problem? At least
this is happening with different devices, and writing to other
partitions keeps working.

Greets
Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-13 20:50 UTC, From: Christoph Hellwig, To: Stefan Priebe - Profihost AG, Cc: xfs-masters, xfs

On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
> I initially reported it to the scsi list as I didn't know where the
> problem was, but then some people told me it must be an XFS problem.
>
> Some more information:
> 1.) It runs fine with 2.6.32 and 2.6.38.
> 2.) I can still write to another ext2 partition on the same disk array
>     (aacraid driver) while XFS is stuck - so I think it must be an XFS
>     problem.

That points a bit more towards XFS, although we've seen storage setups
create issues depending on the exact workload. The prime culprit for
that used to be the md software RAID driver, though.

> 3.) I've also tried running 3.1-rc5, but then I'm seeing this error:
>
> BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
> IP: [] inode_dio_done+0x4/0x25

Oops, that's a bug that I actually introduced myself. Fix below:

Index: linux-2.6/fs/xfs/xfs_aops.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_aops.c	2011-09-13 16:38:47.141089046 -0400
+++ linux-2.6/fs/xfs/xfs_aops.c	2011-09-13 16:39:09.991647077 -0400
@@ -1300,6 +1300,7 @@ xfs_end_io_direct_write(
 	bool			is_async)
 {
 	struct xfs_ioend	*ioend = iocb->private;
+	struct inode		*inode = ioend->io_inode;
 
 	/*
 	 * blockdev_direct_IO can return an error even after the I/O
@@ -1331,7 +1332,7 @@ xfs_end_io_direct_write(
 	}
 
 	/* XXX: probably should move into the real I/O completion handler */
-	inode_dio_done(ioend->io_inode);
+	inode_dio_done(inode);
 }
 
 STATIC ssize_t
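To try the fix, the diff applies from the top of a 3.1-rc kernel tree;
a minimal sketch, assuming it was saved as xfs-dio-fix.patch
(hypothetical filename):

    cd linux-3.1-rc5
    patch -p1 < xfs-dio-fix.patch
    # then rebuild and reinstall the kernel as usual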
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-13 21:52 UTC, From: Alex Elder, To: Christoph Hellwig, Cc: xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, 2011-09-13 at 16:50 -0400, Christoph Hellwig wrote:
> On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
> > I initially reported it to the scsi list as I didn't know where the
> > problem was, but then some people told me it must be an XFS problem.
> >
> > Some more information:
> > 1.) It runs fine with 2.6.32 and 2.6.38.
> > 2.) I can still write to another ext2 partition on the same disk array
> >     (aacraid driver) while XFS is stuck - so I think it must be an XFS
> >     problem.
>
> That points a bit more towards XFS, although we've seen storage setups
> create issues depending on the exact workload. The prime culprit for
> that used to be the md software RAID driver, though.
>
> > 3.) I've also tried running 3.1-rc5, but then I'm seeing this error:
> >
> > BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
> > IP: [] inode_dio_done+0x4/0x25
>
> Oops, that's a bug that I actually introduced myself. Fix below:

Yikes. I'll prepare that one to send to Linus for 3.1.
I'll wait for your formal signoff, though, Christoph.

Reviewed-by: Alex Elder <aelder@sgi.com>

> Index: linux-2.6/fs/xfs/xfs_aops.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_aops.c	2011-09-13 16:38:47.141089046 -0400
> +++ linux-2.6/fs/xfs/xfs_aops.c	2011-09-13 16:39:09.991647077 -0400
> @@ -1300,6 +1300,7 @@ xfs_end_io_direct_write(
>  	bool			is_async)
>  {
>  	struct xfs_ioend	*ioend = iocb->private;
> +	struct inode		*inode = ioend->io_inode;
> 
>  	/*
>  	 * blockdev_direct_IO can return an error even after the I/O
> @@ -1331,7 +1332,7 @@ xfs_end_io_direct_write(
>  	}
> 
>  	/* XXX: probably should move into the real I/O completion handler */
> -	inode_dio_done(ioend->io_inode);
> +	inode_dio_done(inode);
>  }
> 
>  STATIC ssize_t
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-13 21:58 UTC, From: Alex Elder, To: Christoph Hellwig, Cc: xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, 2011-09-13 at 16:52 -0500, Alex Elder wrote:
> On Tue, 2011-09-13 at 16:50 -0400, Christoph Hellwig wrote:
> > Oops, that's a bug that I actually introduced myself. Fix below:
>
> Yikes. I'll prepare that one to send to Linus for 3.1.
> I'll wait for your formal signoff, though, Christoph.
>
> Reviewed-by: Alex Elder <aelder@sgi.com>

Never mind -- the latest code doesn't look quite like that and doesn't
suffer the same problem. Christoph, will you please ensure the fix gets
to the stable folks, though? You have my review for the change.

					-Alex
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-13 22:26 UTC, From: Christoph Hellwig, To: Alex Elder, Cc: xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, Sep 13, 2011 at 04:58:13PM -0500, Alex Elder wrote:
> > Reviewed-by: Alex Elder <aelder@sgi.com>
>
> Never mind -- the latest code doesn't look quite like that and doesn't
> suffer the same problem.

It needs to go into 3.1, where this bug was introduced. In the 3.2
queue it has already been fixed by different means.
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 7:26 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, aelder, xfs

Hi,

Am 13.09.2011 22:50, schrieb Christoph Hellwig:
> That points a bit more towards XFS, although we've seen storage setups
> create issues depending on the exact workload. The prime culprit for
> that used to be the md software RAID driver, though.
>
> Oops, that's a bug that I actually introduced myself. Fix below:

Thanks for the patch.

Now we have the following situation:

1.) Systems run fine with 2.6.32, 2.6.38 and with 3.1-rc6 + patch.
2.) Sadly it does not run with 3.0.4 for more than an hour, and 3.0.x
    will become the next long-term stable kernel, so there will be a
    lot of people using it.
3.) I have seen this deadlock on systems with aacraid and with Intel
    AHCI onboard (that's all we're using).
4.) I can still write to other devices / RAIDs on the same controller
    while the XFS root filesystem hangs.

What can we try next?

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 7:48 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, aelder, xfs

Hi,

>> Oops, that's a bug that I actually introduced myself. Fix below:
>
> Thanks for the patch.
>
> Now we have the following situation:
>
> 1.) Systems run fine with 2.6.32, 2.6.38 and with 3.1-rc6 + patch.
> 2.) Sadly it does not run with 3.0.4 for more than an hour, and 3.0.x
>     will become the next long-term stable kernel, so there will be a
>     lot of people using it.
> 3.) I have seen this deadlock on systems with aacraid and with Intel
>     AHCI onboard (that's all we're using).
> 4.) I can still write to other devices / RAIDs on the same controller
>     while the XFS root filesystem hangs.

Sadly it has now hung with 3.1-rc6 + patch again; sorry, I was too
quick in writing you an email. Hung task detection showed me this with
3.1-rc6:

 [] ? might_fault+0x3b/0x88
 [] do_filp_open+0x38/0x86
 [] ? _raw_spin_unlock+0x26/0x2b
 [] ? alloc_fd+0x11d/0x12e
 [] do_sys_open+0x114/0x1a3
 [] sys_open+0x1b/0x1d
 [] system_call_fastpath+0x16/0x1b
1 lock held by mysqld/17058:
 #0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_last+0x287/0x693

INFO: task qmail-send:4899 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qmail-send    D 0000000000000000     0  4899      1 0x00020000
 ffff88081c4afc38 0000000000000046 ffffffff814a52d5 0000000100000000
 ffff88082cf5be70 ffff88081c4ae010 0000000000004000 ffff88082cf5b5d0
 0000000000011c40 ffff88081c4affd8 ffff88081c4affd8 0000000000011c40
Call Trace:
 [] ? __schedule+0x2e8/0x9fd
 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] schedule+0x57/0x59
 [] xlog_grant_log_space+0x18e/0x4ae
 [] ? try_to_wake_up+0x330/0x330
 [] xfs_log_reserve+0x11a/0x122
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_remove+0x136/0x34e
 [] ? mutex_lock_nested+0x275/0x290
 [] ? mutex_lock_nested+0x281/0x290
 [] ? vfs_unlink+0x51/0xdd
 [] xfs_vn_unlink+0x3c/0x75
 [] vfs_unlink+0x69/0xdd
 [] do_unlinkat+0xde/0x170
 [] ? retint_swapgs+0xe/0x13
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [] ? file_free_rcu+0x35/0x35
 [] sys_unlink+0x11/0x13
 [] ia32_do_call+0x13/0x13
2 locks held by qmail-send/4899:
 #0: (&sb->s_type->i_mutex_key#5/1){+.+.+.}, at: [] do_unlinkat+0x63/0x170
 #1: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] vfs_unlink+0x51/0xdd

INFO: task httpd:6316 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
httpd         D 0000000000000001     0  6316   6270 0x00000000
 ffff880406edfb78 0000000000000046 ffff88041b792c30 0000000100000000
 ffff88041b792c80 ffff880406ede010 0000000000004000 ffff88041b7923e0
 0000000000011c40 ffff880406edffd8 ffff880406edffd8 0000000000011c40
Call Trace:
 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] schedule+0x57/0x59
 [] xlog_grant_log_space+0x18e/0x4ae
 [] ? try_to_wake_up+0x330/0x330
 [] xfs_log_reserve+0x11a/0x122
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_create+0x200/0x53a
 [] ? d_lookup+0x2d/0x42
2 locks held by httpd/6316:
 [] ? __d_lookup+0x16a/0x17c
 [] ? __d_lookup+0x16a/0x17c
1 lock held by imap/11461:

INFO: task flush-8:0:3658 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0     D 000000000000000b     0  3658      2 0x00000000
 ffff88082c389690 0000000000000046 ffff88082c8bac30 0000000100000000
 ffff88082c8bac58 ffff88082c388010 0000000000004000 ffff88082c8ba3e0
 0000000000011c40 ffff88082c389fd8 ffff88082c389fd8 0000000000011c40
Call Trace:
 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] schedule+0x57/0x59
 [] xlog_grant_log_space+0x18e/0x4ae
 [] ? try_to_wake_up+0x330/0x330
 [] xfs_log_reserve+0x11a/0x122
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_iomap_write_allocate+0xcc/0x2cc
 [] ? xfs_ilock_nowait+0x66/0xd5
 [] ? up_read+0x1e/0x37
 [] xfs_map_blocks+0x159/0x1ee
 [] xfs_vm_writepage+0x21e/0x3f9
 [] __writepage+0x15/0x3b
 [] write_cache_pages+0x28c/0x3a8
 [] ? alloc_pages_exact_nid+0x9a/0x9a
 [] generic_writepages+0x46/0x61
 [] xfs_vm_writepages+0x45/0x4e
 [] do_writepages+0x1f/0x28
 [] writeback_single_inode+0x18f/0x387
 [] writeback_sb_inodes+0x196/0x237
 [] ? grab_super_passive+0x52/0x76
 [] __writeback_inodes_wb+0x73/0xb6
 [] wb_writeback+0x163/0x24b
 [] ? trace_hardirqs_on+0xd/0xf
 [] ? local_bh_enable_ip+0xbc/0xc1
 [] wb_do_writeback+0x183/0x210
 [] bdi_writeback_thread+0xc0/0x1e4
 [] ? wb_do_writeback+0x210/0x210
 [] kthread+0x81/0x89
 [] kernel_thread_helper+0x4/0x10
 [] ? finish_task_switch+0x45/0xc3
 [] ? retint_restore_args+0xe/0xe
 [] ? __init_kthread_worker+0x56/0x56
 [] ? gs_change+0xb/0xb
1 lock held by flush-8:0/3658:
 #0: (&type->s_umount_key#31){++++.+}, at: [] grab_super_passive+0x52/0x76

INFO: task syslogd:4459 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syslogd       D 000000000000000c     0  4459      1 0x00000000
 ffff88082b4c3d78 0000000000000046 ffffffff814a52d5 ffff88082c8605d8
 ffff88082b446ba0 ffff88082b4c2010 0000000000004000 ffff88082b446ba0
 0000000000011c40 ffff88082b4c3fd8 ffff88082b4c3fd8 0000000000011c40
Call Trace:
 [] ? __schedule+0x2e8/0x9fd
 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] schedule+0x57/0x59
 [] xlog_grant_log_space+0x18e/0x4ae
 [] ? try_to_wake_up+0x330/0x330
 [] xfs_log_reserve+0x11a/0x122
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_file_fsync+0x15f/0x22d
 [] vfs_fsync_range+0x18/0x21
 [] vfs_fsync+0x17/0x19
 [] do_fsync+0x2e/0x44
 [] sys_fsync+0xb/0xf
 [] system_call_fastpath+0x16/0x1b
no locks held by syslogd/4459.

INFO: task mysqld:4612 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld        D 0000000000000000     0  4612   4567 0x00000000
 ffff880429a31d78 0000000000000046 ffffffff814a52d5 ffff88082c8605d8
 ffff88042cd8d9b0 ffff880429a30010 0000000000004000 ffff88042cd8d9b0
 0000000000011c40 ffff880429a31fd8 ffff880429a31fd8 0000000000011c40
Call Trace:
 [] ? __schedule+0x2e8/0x9fd
 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] schedule+0x57/0x59
 [] xlog_grant_log_space+0x18e/0x4ae
 [] ? try_to_wake_up+0x330/0x330
 [] xfs_log_reserve+0x11a/0x122
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_file_fsync+0x15f/0x22d
 [] vfs_fsync_range+0x18/0x21
 [] vfs_fsync+0x17/0x19
 [] do_fsync+0x2e/0x44
 [] sys_fsync+0xb/0xf
 [] system_call_fastpath+0x16/0x1b
no locks held by mysqld/4612.

INFO: task mysqld:27595 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld        D 0000000000000008     0 27595   4567 0x00000000
 ffff88011dda3ca8 0000000000000046 ffffffff814a52d5 ffff880403cd88a0
 0000000000000246 ffff88011dda2010 0000000000004000 ffff880403cd8000
 0000000000011c40 ffff88011dda3fd8 ffff88011dda3fd8 0000000000011c40
Call Trace:
 [] ? __schedule+0x2e8/0x9fd
 [] ? mark_held_locks+0xc9/0xef
 [] ? mutex_lock_nested+0x16b/0x290
 [] schedule+0x57/0x59
 [] mutex_lock_nested+0x173/0x290
 [] ? do_last+0x287/0x693
 [] do_last+0x287/0x693
 [] path_openat+0xcd/0x342
 [] ? might_fault+0x3b/0x88
 [] do_filp_open+0x38/0x86
 [] ? _raw_spin_unlock+0x26/0x2b
 [] ? alloc_fd+0x11d/0x12e
 [] do_sys_open+0x114/0x1a3
 [] sys_open+0x1b/0x1d
 [] system_call_fastpath+0x16/0x1b
1 lock held by mysqld/27595:
 #0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_last+0x287/0x693

INFO: task mysqld:4873 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld        D 0000000000000000     0  4873   4625 0x00000000
 ffff88081bf61d78 0000000000000046 ffffffff814a52d5 0000000100000000
 ffff88081e82f3f0 ffff88081bf60010 0000000000004000 ffff88081e82eba0
 0000000000011c40 ffff88081bf61fd8 ffff88081bf61fd8 0000000000011c40
Call Trace:
 [] ? __schedule+0x2e8/0x9fd
 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] schedule+0x57/0x59
 [] xlog_grant_log_space+0x18e/0x4ae
 [] ? try_to_wake_up+0x330/0x330
 [] xfs_log_reserve+0x11a/0x122
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_file_fsync+0x15f/0x22d
 [] vfs_fsync_range+0x18/0x21
 [] vfs_fsync+0x17/0x19
 [] do_fsync+0x2e/0x44
 [] sys_fsync+0xb/0xf
 [] system_call_fastpath+0x16/0x1b
no locks held by mysqld/4873.

INFO: task mysqld:17058 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld        D 000000000000000c     0 17058   4625 0x00000000
 ffff88010325fa88 0000000000000046 ffffffff814a52d5 0000000100000000
 ffff88025be8f418 ffff88010325e010 0000000000004000 ffff88025be8eba0
 0000000000011c40 ffff88010325ffd8 ffff88010325ffd8 0000000000011c40
Call Trace:
 [] ? __schedule+0x2e8/0x9fd
 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
 [] ? trace_hardirqs_on_caller+0x11c/0x153
 [] schedule+0x57/0x59
 [] xlog_grant_log_space+0x18e/0x4ae
 [] ? try_to_wake_up+0x330/0x330
 [] xfs_log_reserve+0x11a/0x122
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_create+0x200/0x53a
 [] ? __d_lookup+0xbe/0x17c
 [] ? __d_lookup+0x16a/0x17c
 [] ? d_validate+0x96/0x96
 [] xfs_vn_mknod+0x9a/0xf5
 [] xfs_vn_create+0xb/0xd
 [] vfs_create+0x72/0xa4
 [] do_last+0x323/0x693
 [] path_openat+0xcd/0x342

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 8:49 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, aelder, xfs

Hi,

Am 14.09.2011 09:48, schrieb Stefan Priebe - Profihost AG:
> Sadly it has now hung with 3.1-rc6 + patch again; sorry, I was too
> quick in writing you an email.

So might it be that the problem, at least in 3.1, lies in:

 [] ? mark_held_locks+0xc9/0xef
 [] ? _raw_spin_unlock_irqrestore+0x3f/0x47

and not in XFS?

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 14:30 UTC, From: Christoph Hellwig, To: Stefan Priebe - Profihost AG, Cc: xfs-masters, xfs, aelder

On Wed, Sep 14, 2011 at 10:49:20AM +0200, Stefan Priebe - Profihost AG wrote:
> So might it be that the problem, at least in 3.1, lies in:
>
>  [] ? mark_held_locks+0xc9/0xef
>  [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
>
> and not in XFS?

That's the lockdep code.
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 14:30 UTC, From: Christoph Hellwig, To: Stefan Priebe - Profihost AG, Cc: xfs-masters, xfs, aelder

On Wed, Sep 14, 2011 at 09:48:18AM +0200, Stefan Priebe - Profihost AG wrote:
> #0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_last+0x287/0x693

This means you are running your heavy load with lockdep enabled. I
can't see how it directly causes your issues, but it will slow anything
down to almost a grinding halt on systems with more than, say, two
cores.

Can you run with CONFIG_DEBUG_LOCK_ALLOC and CONFIG_PROVE_LOCKING
disabled? It might be worth checking whether you have other really
heavy debugging options enabled, too.
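A quick way to turn those options off before rebuilding (a sketch,
assuming a stock kernel source tree with the in-tree config helper):

    # disable the heavyweight lock-debugging options, then refresh .config
    scripts/config --disable PROVE_LOCKING \
                   --disable DEBUG_LOCK_ALLOC \
                   --disable DEBUG_SPINLOCK \
                   --disable DEBUG_MUTEXES
    make oldconfig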
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 16:06 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, xfs, aelder

Hi,

Am 14.09.2011 16:30, schrieb Christoph Hellwig:
> This means you are running your heavy load with lockdep enabled. I
> can't see how it directly causes your issues, but it will slow anything
> down to almost a grinding halt on systems with more than, say, two
> cores.
>
> Can you run with CONFIG_DEBUG_LOCK_ALLOC and CONFIG_PROVE_LOCKING
> disabled? It might be worth checking whether you have other really
> heavy debugging options enabled, too.

I only enabled it while trying to find out the cause of my problems.

My current config has:

# grep -i 'DEBUG' .config | egrep -v "^# "
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_SLUB_DEBUG=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_OCFS2_DEBUG_MASKLOG=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_RODATA=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y

My original config had:

# grep -i 'DEBUG' .config_stillnotworking | egrep -v "^# "
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_OCFS2_DEBUG_MASKLOG=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_RODATA=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y

With both configs I'm seeing the SAME symptoms after a while. Which
options should I disable?

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18 9:14 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, xfs, aelder

Hi,

at last I'm now able to reproduce the issue. I hope this will help to
investigate it, and hopefully you can reproduce it as well.

I'm using a vanilla 3.0.4 kernel with XFS as the root filesystem and
have hung task detection set to 120s. You'll then see that the bonnie++
command gets stuck in xlog_grant_log_space while creating or deleting
files. I was using an SSD or a fast RAID 10 (24x SAS disks) - I was not
able to reproduce it on normal SATA disks; even a 20x SATA RAID 10
didn't work.

I used bonnie++ (v1.96) to reproduce it. Mostly the bug is triggered on
the first run - sometimes I needed two runs.

bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /

I hope that helps - I now have a testing machine and can trigger the
bug pretty fast (10-30 min instead of hours). I can also add debug code
if you want or have some.

Stefan
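The 120s hang detection mentioned above is the stock hung-task
watchdog; a minimal sketch of arming it at runtime, assuming
CONFIG_DETECT_HUNG_TASK=y:

    # report tasks stuck in uninterruptible sleep for more than 120 seconds
    echo 120 > /proc/sys/kernel/hung_task_timeout_secs
    # then run the reproducer
    bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /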
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18 20:04 UTC, From: Christoph Hellwig, To: Stefan Priebe - Profihost AG, Cc: xfs-masters, aelder, xfs

On Sun, Sep 18, 2011 at 11:14:08AM +0200, Stefan Priebe - Profihost AG wrote:
> at last I'm now able to reproduce the issue. I hope this will help to
> investigate it, and hopefully you can reproduce it as well.
>
> I'm using a vanilla 3.0.4 kernel with XFS as the root filesystem and
> have hung task detection set to 120s. You'll then see that the bonnie++
> command gets stuck in xlog_grant_log_space while creating or deleting
> files. I was using an SSD or a fast RAID 10 (24x SAS disks) - I was not
> able to reproduce it on normal SATA disks; even a 20x SATA RAID 10
> didn't work.

Thanks a lot for the reproducer!

I've tried it on my laptop SSD and that didn't reproduce it yet. I'll
try it on Monday on a real high-end setup.
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-19 10:54 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, aelder, xfs

Am 18.09.2011 22:04, schrieb Christoph Hellwig:
> Thanks a lot for the reproducer!
>
> I've tried it on my laptop SSD and that didn't reproduce it yet. I'll
> try it on Monday on a real high-end setup.

Sadly my SSD bricked tonight while doing heavy testing ;-(

I was not able to reproduce it on every partition, only on some, and
sadly I was not able to find the common factor that causes this. I now
have to set up a new machine and try to reproduce it again.

What I have so far is that bonnie++ always hangs here:

 [] ? radix_tree_gang_lookup_slot+0x6a/0x8d
 [] ? xfs_bmap_search_extents+0x56/0xb9
 [] ? find_get_pages+0x39/0xd8
 [] xlog_wait+0x58/0x70
 [] ? try_to_wake_up+0x1c6/0x1c6
 [] ? xlog_grant_push_ail+0xb7/0xbf
 [] xlog_grant_log_space+0x162/0x2b1
 [] xfs_log_reserve+0xbb/0xc4
 [] xfs_trans_reserve+0xd6/0x1b1
 [] xfs_free_eofblocks+0x16b/0x1fb
 [] xfs_release+0x1c7/0x202
 [] xfs_file_release+0x10/0x14
 [] fput+0xfd/0x1eb
 [] filp_close+0x6d/0x78
 [] sys_close+0x9a/0xd4
 [] system_call_fastpath+0x16/0x1b

With the traces we had in the past it was difficult to tell which
process was causing the lockup. So it doesn't seem to be
xlog_grant_log_space itself - is it rather xfs_bmap_search_extents or
radix_tree_gang_lookup_slot?

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18 23:02 UTC, From: Dave Chinner, To: Stefan Priebe - Profihost AG, Cc: Christoph Hellwig, xfs-masters, xfs, aelder

On Sun, Sep 18, 2011 at 11:14:08AM +0200, Stefan Priebe - Profihost AG wrote:
> at last I'm now able to reproduce the issue. I hope this will help to
> investigate it, and hopefully you can reproduce it as well.
>
> I used bonnie++ (v1.96) to reproduce it. Mostly the bug is triggered on
> the first run - sometimes I needed two runs.
>
> bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /
>
> I hope that helps - I now have a testing machine and can trigger the
> bug pretty fast (10-30 min instead of hours). I can also add debug code
> if you want or have some.

If it is a log space accounting issue, then the output of
'xfs_info <mtpt>' is really necessary to set the filesystem up the same
way (e.g. same log size, number of AGs, etc.) so that it behaves the
same way on different test machines....

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 0:47 UTC, From: Stefan Priebe, To: Dave Chinner, Cc: Christoph Hellwig, xfs-masters, xfs, aelder

Am 19.09.2011 um 01:02 schrieb Dave Chinner <david@fromorbit.com>:
> If it is a log space accounting issue, then the output of
> 'xfs_info <mtpt>' is really necessary to set the filesystem up the same
> way (e.g. same log size, number of AGs, etc.) so that it behaves the
> same way on different test machines....

I can't pin it down. It just works on some partitions and not on
others, even though xfs_info shows the same output for them. I also
have one partition where it only happens when it is mounted as root
(/); when I mount it as /mnt, it does not happen ;-(

Any idea how to proceed now?

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 1:01 UTC, From: Stefan Priebe, To: Stefan Priebe, Cc: Christoph Hellwig, xfs-masters, xfs, aelder

Am 20.09.2011 um 02:47 schrieb Stefan Priebe <s.priebe@profihost.ag>:
> I can't pin it down. It just works on some partitions and not on
> others.

"Works" here means reproducing it with bonnie++. I can still reproduce
it very fast, but I don't know how to create a test case.

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 10:09 UTC, From: Stefan Priebe - Profihost AG, To: Dave Chinner, Cc: Christoph Hellwig, xfs-masters, xfs, aelder

Hi,

any idea how to dig deeper into this? I've tried using kgdb, but
strangely the error does not occur while kgdb is remotely attached.
When I detach kgdb and restart bonnie++, the error happens again.

So it seems to me a little bit like a timing issue?

Stefan
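For reference, the serial kgdb setup being described is roughly the
following; a sketch, assuming CONFIG_KGDB_SERIAL_CONSOLE=y and a serial
line on ttyS0:

    # on the target: point kgdb at the serial port and drop into the debugger
    echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
    echo g > /proc/sysrq-trigger
    # on the host: attach gdb to the target over the serial line
    gdb ./vmlinux -ex 'target remote /dev/ttyS0'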
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 16:02 UTC, From: Christoph Hellwig, To: Stefan Priebe - Profihost AG, Cc: xfs-masters, xfs, aelder

On Tue, Sep 20, 2011 at 12:09:34PM +0200, Stefan Priebe - Profihost AG wrote:
> any idea how to dig deeper into this? I've tried using kgdb, but
> strangely the error does not occur while kgdb is remotely attached.
> When I detach kgdb and restart bonnie++, the error happens again.
>
> So it seems to me a little bit like a timing issue?

Sounds like it.

Can you summarize all the data that we gathered over this thread into
one summary, e.g.:

 - which kernels does it happen on? Seems like 3.0 and 3.1 hit it
   easily, 2.6.38 sometimes, 2.6.32 is fine. Did you test anything
   between 2.6.32 and 2.6.38?
 - what hardware hits it often/sometimes/never?
 - what is the fs geometry?
 - what is the hardware?
 - is this a 32- or 64-bit kernel, or do you run both?

I'm pretty sure most of it got posted somewhere, but let's get a
summary, as things were a bit confusing at times.

Note that 2.6.38 moved the whole log grant code to a lockless
algorithm, so this might be a likely culprit if you're managing to hit
race windows no one else does, i.e. this really is a timing issue.
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 17:23 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, xfs, aelder

> Can you summarize all the data that we gathered over this thread into
> one summary, e.g.:

Yes - I hope it helps.

> - which kernels does it happen on? Seems like 3.0 and 3.1 hit it
>   easily, 2.6.38 sometimes, 2.6.32 is fine. Did you test anything
>   between 2.6.32 and 2.6.38?

Hits very easily: 3.0.4 and 3.1-rc5
Very rarely: 2.6.38 - as it happened only a few times, I cannot 100%
guarantee that it is really the same issue
No issues at all: 2.6.32

I've not tested anything between, as I cannot reproduce it under 2.6.38
at all - it was seen about once a week across 500 systems.

> - what hardware hits it often/sometimes/never?

I've seen this only on multi-core CPUs with > 2.8 GHz and a fast SAS
RAID 10 or SSD. I cannot say whether it's the CPU or the fast disks, as
our low-cost systems have only small CPUs and the high-end ones have
big CPUs with fast disks.

> - what is the fs geometry?

What exactly do you mean? I've seen this on 1TB and 160GB SSD devices
with totally different disk layouts.

> - what is the hardware?

See above.

> - is this a 32- or 64-bit kernel, or do you run both?

Always 64-bit.

> I'm pretty sure most of it got posted somewhere, but let's get a
> summary, as things were a bit confusing at times.

No problem.

> Note that 2.6.38 moved the whole log grant code to a lockless
> algorithm, so this might be a likely culprit if you're managing to hit
> race windows no one else does, i.e. this really is a timing issue.

I'm willing to do nearly anything to solve this - what can I do to
help? My last hope today was to get some code lines with kgdb; sadly it
does not happen at all while kgdb is attached ;-(

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 17:24 UTC, From: Christoph Hellwig, To: Stefan Priebe - Profihost AG, Cc: xfs-masters, xfs, aelder

On Tue, Sep 20, 2011 at 07:23:00PM +0200, Stefan Priebe - Profihost AG wrote:
> > - what is the fs geometry?
>
> What exactly do you mean? I've seen this on 1TB and 160GB SSD devices
> with totally different disk layouts.

The output of mkfs.xfs (or of xfs_info after the filesystem has been
created).

> > Note that 2.6.38 moved the whole log grant code to a lockless
> > algorithm, so this might be a likely culprit if you're managing to
> > hit race windows no one else does, i.e. this really is a timing
> > issue.
>
> I'm willing to do nearly anything to solve this - what can I do to
> help? My last hope today was to get some code lines with kgdb; sadly
> it does not happen at all while kgdb is attached ;-(

I'll run tests on a system with a PCIe flash device today. Just to make
sure we are on the same page, can you give me your kernel .config in
addition to the mkfs output above?
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 17:35 UTC, From: Stefan Priebe - Profihost AG, To: Christoph Hellwig, Cc: xfs-masters, xfs, aelder

Am 20.09.2011 19:24, schrieb Christoph Hellwig:
> The output of mkfs.xfs (or of xfs_info after the filesystem has been
> created).

ssd:~# xfs_info /dev/sda3
meta-data=/dev/root              isize=256    agcount=4, agsize=9517888 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=38071552, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=18589, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

> I'll run tests on a system with a PCIe flash device today. Just to
> make sure we are on the same page, can you give me your kernel
> .config in addition to the mkfs output above?

OK, I hope you can reproduce it as well. The .config is here:

http://pastebin.com/raw.php?i=m8AAFJ1B

I also found that I was not able to reproduce it on a freshly created
XFS partition. I needed to copy a bunch of files, delete some, create
some new ones, and then start the test. I just duplicated the root
filesystem multiple times, then deleted some files, created some
hardlinks, and so on...

Stefan
* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 22:30 UTC, From: Christoph Hellwig, To: Stefan Priebe - Profihost AG, Cc: xfs-masters, aelder, xfs

On Tue, Sep 20, 2011 at 07:35:57PM +0200, Stefan Priebe - Profihost AG wrote:
> ssd:~# xfs_info /dev/sda3
> meta-data=/dev/root              isize=256    agcount=4, agsize=9517888 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=38071552, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=18589, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0

Nothing special there.

So far I haven't been able to recreate it. How many runs did you
normally need on 3.1-rc? Note that so far I've run my known-working
kernel; I'll test your config plus the drivers I need next.
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 2:11 UTC, From: Dave Chinner, To: Christoph Hellwig, Cc: xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, Sep 20, 2011 at 06:30:47PM -0400, Christoph Hellwig wrote:
> So far I haven't been able to recreate it. How many runs did you
> normally need on 3.1-rc? Note that so far I've run my known-working
> kernel; I'll test your config plus the drivers I need next.

How much memory does your test machine have? The performance will be
vastly different if there is enough RAM to hold the working set of
inodes and page cache (~20GB all up), and that could be one of the
factors contributing to the problems.

The above xfs_info output is from your 160GB SSD - what's the output
from the 1TB device?

Also, what phase do you see it hanging in? The random stat phase is
terribly slow on spinning disks, so if I can avoid that it would be
nice....

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 7:40 UTC, From: Stefan Priebe - Profihost AG, To: Dave Chinner, Cc: Christoph Hellwig, xfs-masters, xfs

Am 21.09.2011 04:11, schrieb Dave Chinner:
> How much memory does your test machine have? The performance will be
> vastly different if there is enough RAM to hold the working set of
> inodes and page cache (~20GB all up), and that could be one of the
> factors contributing to the problems.

The live systems which crash within hours have between 48GB and 64GB of
RAM, but my testing system has only 8GB.

> The above xfs_info output is from your 160GB SSD - what's the output
> from the 1TB device?

The 1TB device is now doing something else and no longer has XFS on it.
But here are the layouts of two live systems:

xfs_info /dev/sda6
meta-data=/dev/root              isize=256    agcount=4, agsize=35767872 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=143071488, imaxpct=25
         =                       sunit=64     swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=69888, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

xfs_info /dev/sda6
meta-data=/dev/root              isize=256    agcount=4, agsize=35768000 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=143071774, imaxpct=25
         =                       sunit=64     swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

> Also, what phase do you see it hanging in? The random stat phase is
> terribly slow on spinning disks, so if I can avoid that it would be
> nice....

Creating or deleting files, never in the stat phase.

Stefan
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 7:40 ` Stefan Priebe - Profihost AG @ 2011-09-21 11:42 ` Dave Chinner 2011-09-21 11:55 ` Stefan Priebe - Profihost AG ` (2 more replies) 0 siblings, 3 replies; 49+ messages in thread From: Dave Chinner @ 2011-09-21 11:42 UTC (permalink / raw) To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost AG wrote:
> Am 21.09.2011 04:11, schrieb Dave Chinner:
> >Also, what phase do you see it hanging in? The random stat phase is
> >terribly slow on spinning disks, so if I can avoid that it would be
> >nice....
> Creating or deleting files, never in the stat phase.

Ok, I got a hang in the random delete phase. Not sure what is wrong yet, but inode reclaim is trying to reclaim inodes but failing, and the AIL is trying to push items but failing. Hence the tail of the log is not being moved forward and new transactions are being blocked until log space becomes available.

The AIL is particularly interesting. The number of pushes being executed is precisely 50/s, and precisely 5000 items/s are being scanned. All those items are pinned, so the "stuck" processing is what is triggering this pattern.

Thing is, all the items are apparently pinned - I see that stat incrementing at 5,000/s. It's here:

	case XFS_ITEM_PINNED:
		XFS_STATS_INC(xs_push_ail_pinned);
		stuck++;
		flush_log = 1;
		break;

so we should have the flush_log variable set. However, this code:

	if (flush_log) {
		/*
		 * If something we need to push out was pinned, then
		 * push out the log so it will become unpinned and
		 * move forward in the AIL.
		 */
		XFS_STATS_INC(xs_push_ail_flush);
		xfs_log_force(mp, 0);
	}

never seems to execute. I don't see the xs_push_ail_flush stat increase, nor the log force counter increase, either. Hence the pinned items are not getting unpinned, and progress is not being made. Background inode reclaim is not making progress, either, because it skips pinned inodes.

The AIL code is clearly cycling - the push counter is increasing, and the run numbers match the stuck code precisely (aborts at 100 stuck items a cycle). The question is now why isn't the log force being triggered.

Given this, just triggering a log force should get everything moving again. Running "echo 2 > /proc/sys/vm/drop_caches" gets inode reclaim running in sync mode, which causes pinned inodes to trigger a log force. And once I've done this, everything starts running again.

So, the log force not triggering in the AIL code looks to be the problem. That, I simply cannot explain right now - it makes no sense but that is what all the stats and trace events point to. I need to do more investigation.

Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
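For readers following Dave's analysis, here is a small self-contained userspace model of the push-loop control flow described above - a sketch only, not the kernel code. The item states, the pinned-item accounting, and the 100-stuck-items abort come from his description of the 3.0-era push worker; everything else is invented scaffolding:

	#include <stdio.h>

	enum item_state { ITEM_SUCCESS, ITEM_PINNED, ITEM_LOCKED };

	/* one simplified scan of the AIL, as done by the push worker */
	static int push_pass(const enum item_state *items, int n)
	{
		int stuck = 0, flush_log = 0;

		for (int i = 0; i < n; i++) {
			if (items[i] == ITEM_PINNED) {
				/* XFS_STATS_INC(xs_push_ail_pinned) in the kernel */
				stuck++;
				flush_log = 1;
			}
			if (stuck > 100)
				break;	/* abort the scan, back off and requeue */
		}

		/*
		 * A scan that saw pinned items must end with a log force;
		 * otherwise nothing ever unpins them and every later scan
		 * sees the same stuck items - the livelock observed here.
		 */
		return flush_log;	/* 1 == xfs_log_force(mp, 0) would run */
	}

	int main(void)
	{
		enum item_state items[200];

		for (int i = 0; i < 200; i++)
			items[i] = ITEM_PINNED;

		/* if the abort path skipped the flush_log check, this would print 0 */
		printf("log force issued: %d\n", push_pass(items, 200));
		return 0;
	}

As Dave works out further down the thread, a development patch in his tree did exactly that: it took the early abort without issuing the log force, which matches the observed steady 50 pushes/s rescanning the same pinned items forever.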
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 11:42 ` Dave Chinner @ 2011-09-21 11:55 ` Stefan Priebe - Profihost AG 2011-09-21 12:26 ` Christoph Hellwig 2011-09-22 0:53 ` Dave Chinner 2 siblings, 0 replies; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-21 11:55 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs Am 21.09.2011 13:42, schrieb Dave Chinner:
> Ok, I got a hang in the random delete phase. Not sure what is wrong
> yet, but inode reclaim is trying to reclaim inodes but failing, and
> the AIL is trying to push items but failing. Hence the tail of the
> log is not being moved forward and new transactions are being
> blocked until log space becomes available.

OK, that matches my findings. It was also mostly in the random delete phase. But I've also seen it on creates.

> Given this, just triggering a log force should get everything
> moving again. Running "echo 2 > /proc/sys/vm/drop_caches" gets inode
> reclaim running in sync mode, which causes pinned inodes to trigger
> a log force. And once I've done this, everything starts running
> again.

Oh man, I was thinking about trying this. But then I forgot that idea ;-(

> So, the log force not triggering in the AIL code looks to be the
> problem. That, I simply cannot explain right now - it makes no sense
> but that is what all the stats and trace events point to. I need to
> do more investigation.

Thanks Dave, and great that you were able to repeat it. What helps is to build bonnie++ yourself and just remove the stat tests. I've done this too - so bonnie++ runs a lot faster.

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
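For anyone trying to reproduce this, the relevant commands scattered through the thread, gathered in one place (the bonnie++ invocation appears verbatim in a later message; the target directory is system-specific):

	# metadata-heavy reproducer: data tests off, many small files
	bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /mnt

	# workaround once the fs wedges: force synchronous inode reclaim,
	# which issues the log force the AIL worker failed to trigger
	echo 2 > /proc/sys/vm/drop_caches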
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 11:42 ` Dave Chinner 2011-09-21 11:55 ` Stefan Priebe - Profihost AG @ 2011-09-21 12:26 ` Christoph Hellwig 2011-09-21 13:42 ` Stefan Priebe ` (3 more replies) 2 siblings, 4 replies; 49+ messages in thread From: Christoph Hellwig @ 2011-09-21 12:26 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs, Stefan Priebe - Profihost AG On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> So, the log force not triggering in the AIL code looks to be the
> problem. That, I simply cannot explain right now - it makes no sense
> but that is what all the stats and trace events point to. I need to
> do more investigation.

Could it be that we have a huge number of instances of xfs_ail_worker running at the same time? xfs_syncd_wq is marked as WQ_CPU_INTENSIVE, so running/runnable workers are not counted towards the concurrency limit. From my look at the workqueue code this means we'll spawn new instances fairly quickly if the others are stuck. This means more and more of them hammering the pinned items, and we'll rarely reach the limit where we'd need to do a log force.

What is also strange is that we allocate an xfs_ail_wq, but don't actually use it, although it would have served the same purpose. Stefan, can you try the following patch? This moves the ail work to its explicit queue, and makes sure we never have the same work item (= same fs to be pushed) running concurrently.

Note that before Linux 3.1-rc you'll need to edit fs/xfs/xfs_super.c to be fs/xfs/linux-2.6/xfs_super.c in the patch manually.

Index: linux-2.6/fs/xfs/xfs_super.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_super.c	2011-09-21 08:00:01.864768359 -0400
+++ linux-2.6/fs/xfs/xfs_super.c	2011-09-21 08:04:01.335266079 -0400
@@ -1654,7 +1654,7 @@ xfs_init_workqueues(void)
 	if (!xfs_syncd_wq)
 		goto out;
 
-	xfs_ail_wq = alloc_workqueue("xfsail", WQ_CPU_INTENSIVE, 8);
+	xfs_ail_wq = alloc_workqueue("xfsail", WQ_NON_REENTRANT, 8);
 	if (!xfs_ail_wq)
 		goto out_destroy_syncd;
 
Index: linux-2.6/fs/xfs/xfs_trans_ail.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:28.172765827 -0400
+++ linux-2.6/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:46.843266108 -0400
@@ -538,7 +538,7 @@ out_done:
 	}
 
 	/* There is more to do, requeue us. */
-	queue_delayed_work(xfs_syncd_wq, &ailp->xa_work,
+	queue_delayed_work(xfs_ail_wq, &ailp->xa_work,
 			msecs_to_jiffies(tout));
 }
 
@@ -575,7 +575,7 @@ xfs_ail_push(
 	smp_wmb();
 	xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &threshold_lsn);
 	if (!test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags))
-		queue_delayed_work(xfs_syncd_wq, &ailp->xa_work, 0);
+		queue_delayed_work(xfs_ail_wq, &ailp->xa_work, 0);
 }
 
 /*

_______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 12:26 ` Christoph Hellwig @ 2011-09-21 13:42 ` Stefan Priebe 2011-09-21 16:48 ` Stefan Priebe - Profihost AG ` (2 subsequent siblings) 3 siblings, 0 replies; 49+ messages in thread From: Stefan Priebe @ 2011-09-21 13:42 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs Am 21.09.2011 um 14:26 schrieb Christoph Hellwig <hch@infradead.org>:
> What is also strange is that we allocate an xfs_ail_wq, but don't
> actually use it, although it would have served the same purpose. Stefan,
> can you try the following patch? This moves the ail work to its
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) running concurrently.

I will have the chance to test again in a few hours. Perhaps you can test too?

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 12:26 ` Christoph Hellwig 2011-09-21 13:42 ` Stefan Priebe @ 2011-09-21 16:48 ` Stefan Priebe - Profihost AG 2011-09-21 17:26 ` Stefan Priebe - Profihost AG 2011-09-21 19:01 ` Stefan Priebe - Profihost AG 2011-09-21 23:07 ` Dave Chinner 3 siblings, 1 reply; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-21 16:48 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs Hi, Am 21.09.2011 14:26, schrieb Christoph Hellwig:
> What is also strange is that we allocate an xfs_ail_wq, but don't
> actually use it, although it would have served the same purpose. Stefan,
> can you try the following patch? This moves the ail work to its
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) running concurrently.

Sorry, but with your patch everything is awfully slow. Even the sequential file creation takes extremely long on an SSD. I interrupted the test.

iotop from an SSD:
Total DISK READ: 0 B/s | Total DISK WRITE: 9.88 M/s
 PID USER  DISK READ  DISK WRITE  SWAPIN      IO>  COMMAND
1377 root      0 B/s       0 B/s  0.00 %  99.99 %  [xfsbufd/sda3]
2219 root      0 B/s       0 B/s  0.00 %  99.99 %  [flush-8:0]
2746 root      0 B/s    9.88 M/s  0.00 %   0.00 %  bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /mnt

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 16:48 ` Stefan Priebe - Profihost AG @ 2011-09-21 17:26 ` Stefan Priebe - Profihost AG 0 siblings, 0 replies; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-21 17:26 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs
> Hi,
>
> Am 21.09.2011 14:26, schrieb Christoph Hellwig:
>> What is also strange is that we allocate an xfs_ail_wq, but don't
>> actually use it, although it would have served the same purpose. Stefan,
>> can you try the following patch? This moves the ail work to its
>> explicit queue, and makes sure we never have the same work item
>> (= same fs to be pushed) running concurrently.
>
> Sorry, but with your patch everything is awfully slow. Even the
> sequential file creation takes extremely long on an SSD. I
> interrupted the test.
>
> iotop from an SSD:
> Total DISK READ: 0 B/s | Total DISK WRITE: 9.88 M/s
>  PID USER  DISK READ  DISK WRITE  SWAPIN      IO>  COMMAND
> 1377 root      0 B/s       0 B/s  0.00 %  99.99 %  [xfsbufd/sda3]
> 2219 root      0 B/s       0 B/s  0.00 %  99.99 %  [flush-8:0]
> 2746 root      0 B/s    9.88 M/s  0.00 %   0.00 %  bonnie++ -u root -s 0 -n
> 1024:32768:0:1024:4096 -d /mnt

Please ignore this mail. Used the wrong disk. *gr* slow SATA

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 12:26 ` Christoph Hellwig 2011-09-21 13:42 ` Stefan Priebe 2011-09-21 16:48 ` Stefan Priebe - Profihost AG @ 2011-09-21 19:01 ` Stefan Priebe - Profihost AG 2011-09-21 23:07 ` Dave Chinner 3 siblings, 0 replies; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-21 19:01 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs Am 21.09.2011 14:26, schrieb Christoph Hellwig:
> What is also strange is that we allocate an xfs_ail_wq, but don't
> actually use it, although it would have served the same purpose. Stefan,
> can you try the following patch? This moves the ail work to its
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) running concurrently.

No luck - the problem still occurs ;-(

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 12:26 ` Christoph Hellwig ` (2 preceding siblings ...) 2011-09-21 19:01 ` Stefan Priebe - Profihost AG @ 2011-09-21 23:07 ` Dave Chinner 2011-09-22 14:14 ` Christoph Hellwig 3 siblings, 1 reply; 49+ messages in thread From: Dave Chinner @ 2011-09-21 23:07 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs, Stefan Priebe - Profihost AG On Wed, Sep 21, 2011 at 08:26:49AM -0400, Christoph Hellwig wrote:
> On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> > So, the log force not triggering in the AIL code looks to be the
> > problem. That, I simply cannot explain right now - it makes no sense
> > but that is what all the stats and trace events point to. I need to
> > do more investigation.
>
> Could it be that we have a huge number of instances of xfs_ail_worker
> running at the same time? xfs_syncd_wq is marked as WQ_CPU_INTENSIVE,
> so running/runnable workers are not counted towards the concurrency
> limit. From my look at the workqueue code this means we'll spawn new
> instances fairly quickly if the others are stuck. This means more
> and more of them hammering the pinned items, and we'll rarely reach
> the limit where we'd need to do a log force.

No, that's not possible. The XFS_AIL_PUSHING_BIT ensures that there is only one instance of AIL pushing per struct xfs_ail running at once. It's also backed up by the fact that I couldn't find a single worker thread blocked running AIL pushing - it ran the 100 item scan, got stuck, requeued itself to run again 20ms later....

FYI, what we want the concurrency for in the AIL wq is for multiple filesystems to be able to run AIL pushing at the same time, which is why it was set up this way. If one filesystem's AIL push blocks, then an unblocked one will simply run.

> What is also strange is that we allocate an xfs_ail_wq, but don't
> actually use it, although it would have served the same purpose. Stefan,
> can you try the following patch? This moves the ail work to its
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) running concurrently.

Oh, that's a bug. My bad. That definitely needs fixing.

> Note that before Linux 3.1-rc you'll need to edit fs/xfs/xfs_super.c
> to be fs/xfs/linux-2.6/xfs_super.c in the patch manually.
>
> Index: linux-2.6/fs/xfs/xfs_super.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_super.c	2011-09-21 08:00:01.864768359 -0400
> +++ linux-2.6/fs/xfs/xfs_super.c	2011-09-21 08:04:01.335266079 -0400
> @@ -1654,7 +1654,7 @@ xfs_init_workqueues(void)
> 	if (!xfs_syncd_wq)
> 		goto out;
>
> -	xfs_ail_wq = alloc_workqueue("xfsail", WQ_CPU_INTENSIVE, 8);
> +	xfs_ail_wq = alloc_workqueue("xfsail", WQ_NON_REENTRANT, 8);
> 	if (!xfs_ail_wq)
> 		goto out_destroy_syncd;

Drop this hunk....

> Index: linux-2.6/fs/xfs/xfs_trans_ail.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:28.172765827 -0400
> +++ linux-2.6/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:46.843266108 -0400
> @@ -538,7 +538,7 @@ out_done:
> 	}
>
> 	/* There is more to do, requeue us. */
> -	queue_delayed_work(xfs_syncd_wq, &ailp->xa_work,
> +	queue_delayed_work(xfs_ail_wq, &ailp->xa_work,
> 			msecs_to_jiffies(tout));
> }
>
> @@ -575,7 +575,7 @@ xfs_ail_push(
> 	smp_wmb();
> 	xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &threshold_lsn);
> 	if (!test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags))
> -		queue_delayed_work(xfs_syncd_wq, &ailp->xa_work, 0);
> +		queue_delayed_work(xfs_ail_wq, &ailp->xa_work, 0);
> }

Just keep these. Can you repost with a sign-off?

Cheers, Dave -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 23:07 ` Dave Chinner @ 2011-09-22 14:14 ` Christoph Hellwig 2011-09-22 21:49 ` Dave Chinner 0 siblings, 1 reply; 49+ messages in thread From: Christoph Hellwig @ 2011-09-22 14:14 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs, Stefan Priebe - Profihost AG On Thu, Sep 22, 2011 at 09:07:18AM +1000, Dave Chinner wrote:
> No, that's not possible. The XFS_AIL_PUSHING_BIT ensures that there
> is only one instance of AIL pushing per struct xfs_ail running at
> once. It's also backed up by the fact that I couldn't find a single
> worker thread blocked running AIL pushing - it ran the 100 item
> scan, got stuck, requeued itself to run again 20ms later....

True, it should prevent that - this was just my only theory based on the (incorrect) assumption that we'd never get to the log force.

> FYI, what we want the concurrency for in the AIL wq is for multiple
> filesystems to be able to run AIL pushing at the same time, which
> is why it was set up this way. If one filesystem's AIL push blocks,
> then an unblocked one will simply run.

A WQ_NON_REENTRANT workqueue will still provide that. From the documentation:

	By default, a wq guarantees non-reentrance only on the same
	CPU. A work item may not be executed concurrently on the same
	CPU by multiple workers but is allowed to be executed
	concurrently on multiple CPUs. This flag makes sure
	non-reentrance is enforced across all CPUs. Work items queued
	to a non-reentrant wq are guaranteed to be executed by at most
	one worker system-wide at any given time.

So this still seems to be preferable for the ail workqueue, and should be able to replace the XFS_AIL_PUSHING_BIT protections.

I also suspect that we should mark the ail workqueue as WQ_MEM_RECLAIM - a lot of memory reclaim really requires moving the AIL forward. Currently we have other ways to reclaim inodes, but e.g. for buffers we rely entirely on AIL pushing, and with the proposed metadata writeback changes we're going to rely even more on the ail. Even if we still keep emergency synchronous writeback around, it's going to be a lot less efficient than real ail pushing under actual OOM conditions.

_______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
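In code, what Christoph is proposing here would look roughly like the following - an illustrative sketch against the 3.x-era alloc_workqueue() interface, reusing the allocation already shown in his patch above; it is not a tested patch:

	/* globally non-reentrant AIL push queue, with a rescuer thread
	 * so pushing can still make progress under memory pressure */
	xfs_ail_wq = alloc_workqueue("xfsail",
				     WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 8);
	if (!xfs_ail_wq)
		goto out_destroy_syncd;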
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-22 14:14 ` Christoph Hellwig @ 2011-09-22 21:49 ` Dave Chinner 2011-09-22 22:01 ` Christoph Hellwig 0 siblings, 1 reply; 49+ messages in thread From: Dave Chinner @ 2011-09-22 21:49 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs, Stefan Priebe - Profihost AG On Thu, Sep 22, 2011 at 10:14:57AM -0400, Christoph Hellwig wrote:
> On Thu, Sep 22, 2011 at 09:07:18AM +1000, Dave Chinner wrote:
> > No, that's not possible. The XFS_AIL_PUSHING_BIT ensures that there
> > is only one instance of AIL pushing per struct xfs_ail running at
> > once. It's also backed up by the fact that I couldn't find a single
> > worker thread blocked running AIL pushing - it ran the 100 item
> > scan, got stuck, requeued itself to run again 20ms later....
>
> True, it should prevent that - this was just my only theory based
> on the (incorrect) assumption that we'd never get to the log force.
>
> > FYI, what we want the concurrency for in the AIL wq is for multiple
> > filesystems to be able to run AIL pushing at the same time, which
> > is why it was set up this way. If one filesystem's AIL push blocks,
> > then an unblocked one will simply run.
>
> A WQ_NON_REENTRANT workqueue will still provide that. From the
> documentation:
>
> 	By default, a wq guarantees non-reentrance only on the same
> 	CPU. A work item may not be executed concurrently on the same
> 	CPU by multiple workers but is allowed to be executed
> 	concurrently on multiple CPUs. This flag makes sure
> 	non-reentrance is enforced across all CPUs. Work items queued
> 	to a non-reentrant wq are guaranteed to be executed by at most
> 	one worker system-wide at any given time.
>
> So this still seems to be preferable for the ail workqueue, and should be
> able to replace the XFS_AIL_PUSHING_BIT protections.

No, we can't. WQ_NON_REENTRANT only protects against concurrency on the same CPU, not across all CPUs - it still allows concurrent per-CPU work processing on the same work item.

However, we want only a *single* AIL worker instance executing per filesystem, not per-cpu per filesystem. Concurrent per-filesystem workers will simply bash on the AIL lock trying to walk the AIL at the same time, and this is precisely the issue the single AIL worker setup is avoiding. The XFS_AIL_PUSHING_BIT is what enforces the single per-filesystem push worker running at any time.

> I also suspect that we should mark the ail workqueue as WQ_MEM_RECLAIM -
> a lot of memory reclaim really requires moving the AIL forward.

Possibly, but I'm not sure it is necessary.

> Currently we have other ways to reclaim inodes, but e.g. for buffers
> we rely entirely on AIL pushing,

We have the xfs_buf shrinker that walks the LRU and frees clean buffers.

> and with the proposed metadata
> writeback changes we're going to rely even more on the ail, even if
> we still keep emergency synchronous writeback around, it's going to be a lot
> less efficient than real ail pushing under actual OOM conditions.

The inode shrinker kicks the AIL pushing - if we cannot get memory to queue the work, then the very next iteration of the shrinker will try again. Hence I'm not sure that it is absolutely necessary, though it probably won't hurt...

Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
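The single-instance guarantee Dave is describing is the test_and_set_bit() pattern already visible in the patch hunks above; in outline it works as follows (a sketch of the mechanism, not the verbatim source - in particular, the clear_bit() placement is paraphrased from his description):

	/* queueing side: only start a push worker if none is active */
	if (!test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags))
		queue_delayed_work(xfs_ail_wq, &ailp->xa_work, 0);

	/* worker side: drop the bit once there is nothing left to push,
	 * so at most one push worker per struct xfs_ail runs at a time */
	clear_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags);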
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-22 21:49 ` Dave Chinner @ 2011-09-22 22:01 ` Christoph Hellwig 2011-09-23 5:28 ` Stefan Priebe - Profihost AG 0 siblings, 1 reply; 49+ messages in thread From: Christoph Hellwig @ 2011-09-22 22:01 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs, Stefan Priebe - Profihost AG On Fri, Sep 23, 2011 at 07:49:56AM +1000, Dave Chinner wrote:
> On Thu, Sep 22, 2011 at 10:14:57AM -0400, Christoph Hellwig wrote:
> > 	By default, a wq guarantees non-reentrance only on the same
> > 	CPU. A work item may not be executed concurrently on the same
> > 	CPU by multiple workers but is allowed to be executed
> > 	concurrently on multiple CPUs. This flag makes sure
> > 	non-reentrance is enforced across all CPUs. Work items queued
> > 	to a non-reentrant wq are guaranteed to be executed by at most
> > 	one worker system-wide at any given time.
> >
> > So this still seems to be preferable for the ail workqueue, and should be
> > able to replace the XFS_AIL_PUSHING_BIT protections.
>
> No, we can't. WQ_NON_REENTRANT only protects against concurrency on
> the same CPU, not across all CPUs - it still allows concurrent
> per-CPU work processing on the same work item.

Non-concurrent execution for a given work_struct on the same CPU is the default; WQ_NON_REENTRANT extends that to not being executed concurrently at all. Check the documentation above again, or the code - just look for the only occurrence of WQ_NON_REENTRANT in kernel/workqueue.c and the surrounding code (e.g. find_worker_executing_work and the current_work field in struct worker).

> However, we want only a *single* AIL worker instance executing per
> filesystem, not per-cpu per filesystem. Concurrent per-filesystem
> workers will simply bash on the AIL lock trying to walk the AIL at
> the same time, and this is precisely the issue the single AIL worker
> setup is avoiding. The XFS_AIL_PUSHING_BIT is what enforces the
> single per-filesystem push worker running at any time.

I think that's exactly what WQ_NON_REENTRANT is intended for.

_______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-22 22:01 ` Christoph Hellwig @ 2011-09-23 5:28 ` Stefan Priebe - Profihost AG 0 siblings, 0 replies; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-23 5:28 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs Hi,

you can reproduce the issue faster if you set elevator=noop when booting. It then always happens on the first run of deleting random files.

@Dave: Were you able to reproduce it too?

Greets, Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
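For completeness, the two usual ways to select the noop elevator Stefan mentions - standard kernel interfaces of this era, shown for /dev/sda as an example:

	# at boot, on the kernel command line
	elevator=noop

	# or at runtime, per block device
	echo noop > /sys/block/sda/queue/scheduler
	cat /sys/block/sda/queue/scheduler	# verify: [noop] deadline cfq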
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-21 11:42 ` Dave Chinner 2011-09-21 11:55 ` Stefan Priebe - Profihost AG 2011-09-21 12:26 ` Christoph Hellwig @ 2011-09-22 0:53 ` Dave Chinner 2011-09-22 5:27 ` Stefan Priebe - Profihost AG 2 siblings, 1 reply; 49+ messages in thread From: Dave Chinner @ 2011-09-22 0:53 UTC (permalink / raw) To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost AG wrote:
> > Am 21.09.2011 04:11, schrieb Dave Chinner:
> > >Also, what phase do you see it hanging in? The random stat phase is
> > >terribly slow on spinning disks, so if I can avoid that it would be
> > >nice....
> > Creating or deleting files, never in the stat phase.
>
> Ok, I got a hang in the random delete phase. Not sure what is wrong
> yet, but inode reclaim is trying to reclaim inodes but failing, and
> the AIL is trying to push items but failing. Hence the tail of the
> log is not being moved forward and new transactions are being
> blocked until log space becomes available.
>
> The AIL is particularly interesting. The number of pushes being
> executed is precisely 50/s, and precisely 5000 items/s are being
> scanned. All those items are pinned, so the "stuck" processing is
> what is triggering this pattern.
>
> Thing is, all the items are apparently pinned - I see that stat
> incrementing at 5,000/s. It's here:
>
> 	case XFS_ITEM_PINNED:
> 		XFS_STATS_INC(xs_push_ail_pinned);
> 		stuck++;
> 		flush_log = 1;
> 		break;
>
> so we should have the flush_log variable set. However, this code:
>
> 	if (flush_log) {
> 		/*
> 		 * If something we need to push out was pinned, then
> 		 * push out the log so it will become unpinned and
> 		 * move forward in the AIL.
> 		 */
> 		XFS_STATS_INC(xs_push_ail_flush);
> 		xfs_log_force(mp, 0);
> 	}
>
> never seems to execute. I don't see the xs_push_ail_flush stat
> increase, nor the log force counter increase, either. Hence the
> pinned items are not getting unpinned, and progress is not being
> made. Background inode reclaim is not making progress, either,
> because it skips pinned inodes.
>
> The AIL code is clearly cycling - the push counter is increasing,
> and the run numbers match the stuck code precisely (aborts at 100
> stuck items a cycle). The question is now why isn't the log force
> being triggered.
>
> Given this, just triggering a log force should get everything
> moving again. Running "echo 2 > /proc/sys/vm/drop_caches" gets inode
> reclaim running in sync mode, which causes pinned inodes to trigger
> a log force. And once I've done this, everything starts running
> again.
>
> So, the log force not triggering in the AIL code looks to be the
> problem. That, I simply cannot explain right now - it makes no sense
> but that is what all the stats and trace events point to. I need to
> do more investigation.

Ok, it makes sense now. The kernel I was running (from before I went on holidays) had this patch in it:

http://oss.sgi.com/archives/xfs/2011-08/msg00472.html

I found this out by disassembling the kernel code. That code has a bug in it when the stuck case is hit - it fails to issue the log force in that case, and that's why I've been seeing this kernel get stuck. False alarm - will now try to reproduce without any dev patches in the kernel.

Cheers, Dave.
-- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-22 0:53 ` Dave Chinner @ 2011-09-22 5:27 ` Stefan Priebe - Profihost AG 2011-09-22 7:52 ` Stefan Priebe - Profihost AG 0 siblings, 1 reply; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-22 5:27 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs Am 22.09.2011 02:53, schrieb Dave Chinner:
> On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
>> On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost AG wrote:
> I found this out by disassembling the kernel code. That code has a
> bug in it when the stuck case is hit - it fails to issue the log
> force in that case, and that's why I've been seeing this kernel get
> stuck. False alarm - will now try to reproduce without any dev
> patches in the kernel.

Sad to hear that ;-( I'm now trying to prepare a 160GB dd image for you where it is reproducible.

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4 2011-09-22 5:27 ` Stefan Priebe - Profihost AG @ 2011-09-22 7:52 ` Stefan Priebe - Profihost AG 0 siblings, 0 replies; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-22 7:52 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs Am 22.09.2011 07:27, schrieb Stefan Priebe - Profihost AG:
> Am 22.09.2011 02:53, schrieb Dave Chinner:
>> On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
>>> On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost
>>> AG wrote:
>
>> I found this out by disassembling the kernel code. That code has a
>> bug in it when the stuck case is hit - it fails to issue the log
>> force in that case, and that's why I've been seeing this kernel get
>> stuck. False alarm - will now try to reproduce without any dev
>> patches in the kernel.
> Sad to hear that ;-( I'm now trying to prepare a 160GB dd image for you
> where it is reproducible.

The test image is ready. I'm just packing and uploading it. I'll send you all details via private mail. Hope that helps.

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
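A plausible recipe for producing and using such a test image - hedged, since Stefan doesn't say exactly how he packed his; the device name, compressor and paths are assumptions:

	# on the source machine: capture the partition while unmounted,
	# then compress for transfer
	dd if=/dev/sda3 of=xfs-test.img bs=4M
	bzip2 -9 xfs-test.img

	# on the test machine: unpack and loop-mount (or dd back to a
	# real partition of at least the same size)
	bunzip2 xfs-test.img.bz2
	mount -o loop xfs-test.img /mnt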
* Re: xfs deadlock in stable kernel 3.0.4 2011-09-20 22:30 ` Christoph Hellwig 2011-09-21 2:11 ` [xfs-masters] " Dave Chinner @ 2011-09-21 7:36 ` Stefan Priebe - Profihost AG 2011-09-21 11:39 ` Christoph Hellwig 1 sibling, 1 reply; 49+ messages in thread From: Stefan Priebe - Profihost AG @ 2011-09-21 7:36 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs-masters, aelder, xfs Am 21.09.2011 00:30, schrieb Christoph Hellwig:
> On Tue, Sep 20, 2011 at 07:35:57PM +0200, Stefan Priebe - Profihost AG wrote:
>> Am 20.09.2011 19:24, schrieb Christoph Hellwig:
>>> On Tue, Sep 20, 2011 at 07:23:00PM +0200, Stefan Priebe - Profihost AG wrote:
>>>>> - what is the fs geometry?
>>>> What exactly do you mean? I've seen this on 1TB and 160GB SSD
>>>> devices with totally different disk layout.
>>>
>>> The output of mkfs.xfs (or xfs_info after it's been created)
>>
>> ssd:~# xfs_info /dev/sda3
>> meta-data=/dev/root isize=256 agcount=4, agsize=9517888 blks
>>          =          sectsz=512 attr=2
>> data     =          bsize=4096 blocks=38071552, imaxpct=25
>>          =          sunit=0 swidth=0 blks
>> naming   =version 2 bsize=4096 ascii-ci=0
>> log      =internal  bsize=4096 blocks=18589, version=2
>>          =          sectsz=512 sunit=0 blks, lazy-count=1
>> realtime =none      extsz=4096 blocks=0, rtextents=0
>
> Nothing special there.
>
> So far I haven't been able to recreate it. How many runs did you
> normally need on 3.1-rc? Note that so far I've run my known working
> kernel, I'll test your config plus the drivers I need next.

I had only used 3.0.4 with bonnie++ to reproduce it. 3.1-rc was running on a prod. system.

Sadly I'm also not able to reproduce it reliably on every partition. Sometimes it works, sometimes not. Just retrying does not help. I had to copy and delete random files from the partition and then start bonnie++ on it. Perhaps I can give you a dd dump of the partition. But I had to recreate one. My Intel SSD is now massively slower than when I started the tests. No idea why.

Stefan _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: xfs deadlock in stable kernel 3.0.4 2011-09-21 7:36 ` Stefan Priebe - Profihost AG @ 2011-09-21 11:39 ` Christoph Hellwig 2011-09-21 13:39 ` Stefan Priebe 0 siblings, 1 reply; 49+ messages in thread From: Christoph Hellwig @ 2011-09-21 11:39 UTC (permalink / raw) To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder On Wed, Sep 21, 2011 at 09:36:42AM +0200, Stefan Priebe - Profihost AG wrote:
> >So far I haven't been able to recreate it. How many runs did you
> >normally need on 3.1-rc? Note that so far I've run my known working
> >kernel, I'll test your config plus the drivers I need next.
>
> I had only used 3.0.4 with bonnie++ to reproduce it. 3.1-rc was running
> on a prod. system.
>
> Sadly I'm also not able to reproduce it reliably on every partition.
> Sometimes it works, sometimes not. Just retrying does not help. I had
> to copy and delete random files from the partition and then start
> bonnie++ on it. Perhaps I can give you a dd dump of the partition.
> But I had to recreate one. My Intel SSD is now massively slower than
> when I started the tests. No idea why.

So far it runs fine on 3.1-rc both with my default config and yours; the latter had been running all night. This is on an 8-core Nehalem with 8GB of memory and a fast PCI-e flash device.

One thing I noticed is that your config seems to run many fs tasks a lot slower than mine, but I'm not entirely sure why.

The only interesting things I noticed in your config were that you use slub instead of slab, which does a lot of high order allocations and has caused lots of trouble in the past, and that you enable CONFIG_CC_OPTIMIZE_FOR_SIZE, which has caused mis-compilation of complicated code in the past. I don't want to blame it directly, but I could see how that causes problems with some of the atomic64_t games XFS plays since 2.6.38.

_______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: xfs deadlock in stable kernel 3.0.4 2011-09-21 11:39 ` Christoph Hellwig @ 2011-09-21 13:39 ` Stefan Priebe 2011-09-21 14:17 ` Christoph Hellwig 0 siblings, 1 reply; 49+ messages in thread From: Stefan Priebe @ 2011-09-21 13:39 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs-masters, xfs, aelder
> One thing I noticed is that your config seems to run many fs tasks
> a lot slower than mine, but I'm not entirely sure why.

Strange. Would you post your config too?

> The only interesting things I noticed in your config were that you
> use slub instead of slab, which does a lot of high order allocations
> and has caused lots of trouble in the past, and that you enable
> CONFIG_CC_OPTIMIZE_FOR_SIZE, which has caused mis-compilation
> of complicated code in the past. I don't want to blame it directly,
> but I could see how that causes problems with some of the atomic64_t
> games XFS plays since 2.6.38.

Will remove it. At least Dave was able to reproduce it, so he can probably help too.

Thanks! Stefan

_______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: xfs deadlock in stable kernel 3.0.4 2011-09-21 13:39 ` Stefan Priebe @ 2011-09-21 14:17 ` Christoph Hellwig 0 siblings, 0 replies; 49+ messages in thread From: Christoph Hellwig @ 2011-09-21 14:17 UTC (permalink / raw) To: Stefan Priebe; +Cc: aelder, xfs [-- Attachment #1: Type: text/plain, Size: 249 bytes --] On Wed, Sep 21, 2011 at 03:39:16PM +0200, Stefan Priebe wrote:
> > One thing I noticed is that your config seems to run many fs tasks
> > a lot slower than mine, but I'm not entirely sure why.
> Strange. Would you post your config too?

Attached.

[-- Attachment #2: config.2.6.40.bz2 --] [-- Type: application/x-bzip2, Size: 26558 bytes --] [-- Attachment #3: Type: text/plain, Size: 121 bytes --] _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 49+ messages in thread