* xfs deadlock in stable kernel 3.0.4
@ 2011-09-10 12:23 Stefan Priebe
  2011-09-12 15:21 ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe @ 2011-09-10 12:23 UTC (permalink / raw)
  To: xfs; +Cc: xfs-masters

Hello List,

On some of our heavily loaded servers using XFS we're seeing a deadlock where reading/writing to the XFS filesystem suddenly stops working.

Here you can find the sysrq-w triggered log messages of the locked processes.

http://pastebin.com/JWjrbrh4
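
(For reference: sysrq-w dumps all tasks in uninterruptible sleep; with
CONFIG_MAGIC_SYSRQ enabled it can be triggered with e.g.

  echo w > /proc/sysrq-trigger

and the resulting messages show up in dmesg / the kernel log.)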

Please help! Thanks!

Please CC me; I'm not subscribed.

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-10 12:23 xfs deadlock in stable kernel 3.0.4 Stefan Priebe
@ 2011-09-12 15:21 ` Christoph Hellwig
  2011-09-12 16:46   ` Stefan Priebe
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-12 15:21 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: xfs-masters, xfs

On Sat, Sep 10, 2011 at 02:23:12PM +0200, Stefan Priebe wrote:
> Hello List,
> 
> On some of our heavily loaded servers using XFS we're seeing a deadlock where reading/writing to the XFS filesystem suddenly stops working.
> 
> Here you can find the sysrq-w triggered log messages of the locked processes.
> 
> http://pastebin.com/JWjrbrh4

What kind of workload are you running?  Also did the workload run fine
with an older kernel, and if yes which one?


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-12 15:21 ` Christoph Hellwig
@ 2011-09-12 16:46   ` Stefan Priebe
  2011-09-12 20:05     ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe @ 2011-09-12 16:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs

Hi,

>> Hello List,
>> 
>> On some of our heavily loaded servers using XFS we're seeing a deadlock where reading/writing to the XFS filesystem suddenly stops working.
>> 
>> Here you can find the sysrq-w triggered log messages of the locked processes.
>> 
>> http://pastebin.com/JWjrbrh4
> 
> What kind of workload are you running?  Also did the workload run fine
> with an older kernel, and if yes which one?

MySQL, web, mail, FTP ;-) Yes, it ran fine with 2.6.32; I upgraded from that version.

Stefan

* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-12 16:46   ` Stefan Priebe
@ 2011-09-12 20:05     ` Christoph Hellwig
  2011-09-13  6:04       ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-12 20:05 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: Christoph Hellwig, xfs-masters, xfs

On Mon, Sep 12, 2011 at 06:46:26PM +0200, Stefan Priebe wrote:
> > What kind of workload are you running?  Also did the workload run fine
> > with an older kernel, and if yes which one?
> 
> MySQL, web, mail, FTP ;-) Yes, it ran fine with 2.6.32; I upgraded from that version.


Just curious, is this the same system that also shows the freezes
reported to the scsi list?  If I/Os don't get completed by lower layers
I can see how we get everything in XFS waiting on the log reservations,
given that we never get the log tail pushed.
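
(One thing worth checking the next time it hangs: whether the block device
still has requests outstanding.  Assuming the filesystem sits on sdX, the
ninth field of /sys/block/sdX/stat is the number of I/Os currently in
flight, e.g.

  cat /sys/block/sdX/stat

If that stays non-zero and never drains, the problem is below XFS.)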


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-12 20:05     ` Christoph Hellwig
@ 2011-09-13  6:04       ` Stefan Priebe - Profihost AG
  2011-09-13 19:31         ` Stefan Priebe - Profihost AG
  2011-09-13 20:50         ` Christoph Hellwig
  0 siblings, 2 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-13  6:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs

Hi,

> On Mon, Sep 12, 2011 at 06:46:26PM +0200, Stefan Priebe wrote:
>>> What kind of workload are you running?  Also did the workload run fine
>>> with an older kernel, and if yes which one?
>>
>> MySQL, web, mail, FTP ;-) Yes, it ran fine with 2.6.32; I upgraded from that version.
>
> Just curious, is this the same system that also shows the freezes
> reported to the scsi list?  If I/Os don't get completed by lower layers
> I can see how we get everything in XFS waiting on the log reservations,
> given that we never get the log tail pushed.

I just reported it to the scsi list as I didn't know where the problem 
was. But then some people told me it must be an XFS problem.

Some more information:
1.) It's running fine with 2.6.32 and 2.6.38
2.) I can still write to another ext2 partition on the same disk array (aacraid 
driver) while XFS is stuck - so I think it must be an XFS problem
3.) I've also tried running 3.1-rc5, but then I'm seeing this error:

BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
IP: [] inode_dio_done+0x4/0x25
PGD 293724067 PUD 292930067 PMD 0
Oops: 0002 [#1] SMP
CPU 5
Modules linked in: ipt_REJECT xt_tcpudp iptable_filter ip_tables 
x_tables coretemp k8temp

Pid: 4775, comm: mysqld Not tainted 3.1-rc5 #1 Supermicro X8DT3/X8DT3
RIP: 0010:[] [] inode_dio_done+0x4/0x25
RSP: 0018:ffff880292b5fad8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8806ab4927e0 RCX: 0000000000007524
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880292b5fad8 R08: ffff880292b5e000 R09: 0000000000000000
R10: ffff88047f85e040 R11: ffff88042ddb5d88 R12: ffff88002b7f8800
R13: ffff88002b7f8800 R14: 0000000000000000 R15: ffff88042d896040
FS: 0000000045c79950(0063) GS:ffff88083fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000012c CR3: 0000000293408000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mysqld (pid: 4775, threadinfo ffff880292b5e000, task 
ffff88042d896040)
Stack:
ffff880292b5faf8 ffffffff811938cd 0000000192b5fb18 0000000000004000
ffff880292b5fb18 ffffffff810feba2 0000000000000000 ffff88002b7f8920
ffff880292b5fbf8 ffffffff810ff4fb ffff880292b5fb78 ffff880292b5e000
Call Trace:
[] xfs_end_io_direct_write+0x6a/0x6e
[] dio_complete+0x90/0xbb
[] __blockdev_direct_IO+0x92e/0x964
[] ? mempool_alloc_slab+0x11/0x13
[] xfs_vm_direct_IO+0x90/0x101
[] ? __xfs_get_blocks+0x395/0x395
[] ? xfs_finish_ioend_sync+0x1a/0x1a
[] generic_file_direct_write+0xd7/0x147
[] xfs_file_dio_aio_write+0x1b9/0x1d1
[] ? wake_up_state+0xb/0xd
[] xfs_file_aio_write+0x16a/0x21d
[] ? do_futex+0xc0/0x988
[] do_sync_write+0xc7/0x10d
[] vfs_write+0xab/0x103
[] sys_pwrite64+0x5c/0x7d
[] system_call_fastpath+0x16/0x1b
Code: 00 48 8d 34 30 89 d9 4c 89 e7 e8 3a fe ff ff 85 c0 75 0b 44 89 e8 
49 01 84 24 90 00 00 00 41 5a 5b 41 5c 41 5d c9 c3 55 48 89 e5 ff 8f 2c 
01 00 00 0f 94 c0 84 c0 74 11 48 81 c7 90 00 00 00
RIP [] inode_dio_done+0x4/0x25
RSP
CR2: 000000000000012c
---[ end trace 79ce33ac2f7c10bd ]---


Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-13  6:04       ` Stefan Priebe - Profihost AG
@ 2011-09-13 19:31         ` Stefan Priebe - Profihost AG
  2011-09-13 20:50         ` Christoph Hellwig
  1 sibling, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-13 19:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs


On 13.09.2011 08:04, Stefan Priebe - Profihost AG wrote:
> Hi,
>
>> On Mon, Sep 12, 2011 at 06:46:26PM +0200, Stefan Priebe wrote:
>>>> What kind of workload are you running? Also did the workload run fine
>>>> with an older kernel, and if yes which one?
>>>
>>> MySQL, web, mail, FTP ;-) Yes, it ran fine with 2.6.32; I upgraded from
>>> that version.
>>
>> Just curious, is this the same system that also shows the freezes
>> reported to the scsi list? If I/Os don't get completed by lower layers
>> I can see how we get everything in XFS waiting on the log reservations,
>> given that we never get the log tail pushed.
>
> I just reported it to the scsi list as I didn't know where the problem
> was. But then some people told me it must be an XFS problem.
>
> Some more information:
> 1.) It's running fine with 2.6.32 and 2.6.38
> 2.) I can still write to another ext2 partition on the same disk array (aacraid
> driver) while XFS is stuck - so I think it must be an XFS problem
> 3.) I've also tried running 3.1-rc5, but then I'm seeing this error:
>
 > ...
>

Any idea what we could try next or how to find the problem? At least 
this is happening with different devices and writing to other partitions 
is still working.

Greets
Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-13  6:04       ` Stefan Priebe - Profihost AG
  2011-09-13 19:31         ` Stefan Priebe - Profihost AG
@ 2011-09-13 20:50         ` Christoph Hellwig
  2011-09-13 21:52           ` [xfs-masters] " Alex Elder
  2011-09-14  7:26           ` Stefan Priebe - Profihost AG
  1 sibling, 2 replies; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-13 20:50 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs

On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
> I just reported it to the scsi list as I didn't know where the
> problem was. But then some people told me it must be an XFS problem.
> 
> Some more information:
> 1.) It's running fine with 2.6.32 and 2.6.38
> 2.) I can still write to another ext2 partition on the same disk
> array (aacraid driver) while XFS is stuck - so I think it must be an
> XFS problem

That points a bit more towards XFS, although we've seen storage setups
create issues depending on the exact workload.  The prime culprit for
that used to be the md software RAID driver, though.

> 3.) I've also tried running 3.1-rc5 but then i'm seeing this error:
> 
> BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
> IP: [] inode_dio_done+0x4/0x25

Oops, that's a bug that I actually introduced myself.  Fix below:


Index: linux-2.6/fs/xfs/xfs_aops.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_aops.c	2011-09-13 16:38:47.141089046 -0400
+++ linux-2.6/fs/xfs/xfs_aops.c	2011-09-13 16:39:09.991647077 -0400
@@ -1300,6 +1300,7 @@ xfs_end_io_direct_write(
 	bool			is_async)
 {
 	struct xfs_ioend	*ioend = iocb->private;
+	struct inode		*inode = ioend->io_inode;
 
 	/*
 	 * blockdev_direct_IO can return an error even after the I/O
@@ -1331,7 +1332,7 @@ xfs_end_io_direct_write(
 	}
 
 	/* XXX: probably should move into the real I/O completion handler */
-	inode_dio_done(ioend->io_inode);
+	inode_dio_done(inode);
 }
 
 STATIC ssize_t
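
(To spell out what the one-liner fixes: by the time we get to the final
inode_dio_done() call the ioend may already have been handed off to the
completion path and freed, so the inode pointer has to be taken out of it
up front.  Reduced to a sketch - illustration only, not the actual XFS
code, and finish_ioend() just stands in for the real completion calls:

	struct inode	*inode = ioend->io_inode;	/* grab it while ioend is valid */

	finish_ioend(ioend);				/* may queue/free the ioend */

	inode_dio_done(inode);				/* must not touch ioend here */
)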


* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-13 20:50         ` Christoph Hellwig
@ 2011-09-13 21:52           ` Alex Elder
  2011-09-13 21:58             ` Alex Elder
  2011-09-14  7:26           ` Stefan Priebe - Profihost AG
  1 sibling, 1 reply; 50+ messages in thread
From: Alex Elder @ 2011-09-13 21:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, 2011-09-13 at 16:50 -0400, Christoph Hellwig wrote:
> On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
> > I just reported it to the scsi list as I didn't know where the
> > problem was. But then some people told me it must be an XFS problem.
> > 
> > Some more information:
> > 1.) It's running fine with 2.6.32 and 2.6.38
> > 2.) I can still write to another ext2 partition on the same disk
> > array (aacraid driver) while XFS is stuck - so I think it must be an
> > XFS problem
> 
> That points a bit more towards XFS, although we've seen storage setups
> create issues depending on the exact workload.  The prime culprit for
> that used to be the md software RAID driver, though.
> 
> > 3.) I've also tried running 3.1-rc5 but then i'm seeing this error:
> > 
> > BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
> > IP: [] inode_dio_done+0x4/0x25
> 
> Oops, that's a bug that I actually introduced myself.  Fix below:

Yikes.  I'll prepare that one to send to Linus for 3.1.
I'll wait for your formal signoff, though, Christoph.

Reviewed-by: Alex Elder <aelder@sgi.com>

> 
> Index: linux-2.6/fs/xfs/xfs_aops.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_aops.c	2011-09-13 16:38:47.141089046 -0400
> +++ linux-2.6/fs/xfs/xfs_aops.c	2011-09-13 16:39:09.991647077 -0400
> @@ -1300,6 +1300,7 @@ xfs_end_io_direct_write(
>  	bool			is_async)
>  {
>  	struct xfs_ioend	*ioend = iocb->private;
> +	struct inode		*inode = ioend->io_inode;
>  
>  	/*
>  	 * blockdev_direct_IO can return an error even after the I/O
> @@ -1331,7 +1332,7 @@ xfs_end_io_direct_write(
>  	}
>  
>  	/* XXX: probably should move into the real I/O completion handler */
> -	inode_dio_done(ioend->io_inode);
> +	inode_dio_done(inode);
>  }
>  
>  STATIC ssize_t
> 

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-13 21:52           ` [xfs-masters] " Alex Elder
@ 2011-09-13 21:58             ` Alex Elder
  2011-09-13 22:26               ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Alex Elder @ 2011-09-13 21:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, 2011-09-13 at 16:52 -0500, Alex Elder wrote:
> On Tue, 2011-09-13 at 16:50 -0400, Christoph Hellwig wrote:
> > On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
> > > I just reported it to the scsi list as I didn't know where the
> > > problem was. But then some people told me it must be an XFS problem.
> > > 
> > > Some more information:
> > > 1.) It's running fine with 2.6.32 and 2.6.38
> > > 2.) I can still write to another ext2 partition on the same disk
> > > array (aacraid driver) while XFS is stuck - so I think it must be an
> > > XFS problem
> > 
> > That points a bit more towards XFS, although we've seen storage setups
> > create issues depending on the exact workload.  The prime culprit for
> > that used to be the md software RAID driver, though.
> > 
> > > 3.) I've also tried running 3.1-rc5 but then i'm seeing this error:
> > > 
> > > BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
> > > IP: [] inode_dio_done+0x4/0x25
> > 
> > Oops, that's a bug that I actually introduced myself.  Fix below:
> 
> Yikes.  I'll prepare that one to send to Linus for 3.1.
> I'll wait for your formal signoff, though, Christoph.
> 
> Reviewed-by: Alex Elder <aelder@sgi.com>

Nevermind--the latest code doesn't look quite
like that and doesn't suffer the same problem.

Christoph, will you please ensure the fix gets
to the stable folks though?  You have my review
for the change.

					-Alex



* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-13 21:58             ` Alex Elder
@ 2011-09-13 22:26               ` Christoph Hellwig
  0 siblings, 0 replies; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-13 22:26 UTC (permalink / raw)
  To: Alex Elder
  Cc: Christoph Hellwig, xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, Sep 13, 2011 at 04:58:13PM -0500, Alex Elder wrote:
> > Reviewed-by: Alex Elder <aelder@sgi.com>
> 
> Nevermind--the latest code doesn't look quite
> like that and doesn't suffer the same problem.

It needs to go into 3.1, where this bug was introduced.  In the
3.2 queue it has already been fixed by different means.


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-13 20:50         ` Christoph Hellwig
  2011-09-13 21:52           ` [xfs-masters] " Alex Elder
@ 2011-09-14  7:26           ` Stefan Priebe - Profihost AG
  2011-09-14  7:48             ` Stefan Priebe - Profihost AG
  1 sibling, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-14  7:26 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, aelder, xfs

Hi,

On 13.09.2011 22:50, Christoph Hellwig wrote:
> On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
>> I just reported it to the scsi list as I didn't know where the
>> problem was. But then some people told me it must be an XFS problem.
>>
>> Some more information:
>> 1.) It's running fine with 2.6.32 and 2.6.38
>> 2.) I can still write to another ext2 partition on the same disk
>> array (aacraid driver) while XFS is stuck - so I think it must be an
>> XFS problem
>
> That points a bit more towards XFS, although we've seen storage setups
> create issues depending on the exact workload.  The prime culprit for
> that used to be the md software RAID driver, though.
>
>> 3.) I've also tried running 3.1-rc5 but then i'm seeing this error:
>>
>> BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
>> IP: [] inode_dio_done+0x4/0x25
>
> Oops, that's a bug that I actually introduced myself.  Fix below:

Thanks for the patch.

Now we have the following situation:

1.) Systems running fine with 2.6.32, 2.6.38 and with 3.1 rc-6 + patch
2.) Sadly it does not run with 3.0.4 for more than 1 hour. And 3.0.X 
will become the next long term stable. So there will be a lot of people 
using it.
3.) I have seen this deadlock on systems with aacraid and with intel 
ahci onboard. (that's all we're using)
4.) I still write to other devices / raids on the same controller while 
the XFS root filesystem hangs.

What can we do / try now / next?

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14  7:26           ` Stefan Priebe - Profihost AG
@ 2011-09-14  7:48             ` Stefan Priebe - Profihost AG
  2011-09-14  8:49               ` Stefan Priebe - Profihost AG
  2011-09-14 14:30               ` Christoph Hellwig
  0 siblings, 2 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-14  7:48 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, aelder, xfs

Hi,

>> Oops, that's a bug that I actually introduced myself. Fix below:
>
> Thanks for the patch.
>
> Now we have the following situation:
>
> 1.) Systems running fine with 2.6.32, 2.6.38 and with 3.1 rc-6 + patch
> 2.) Sadly it does not run with 3.0.4 for more than 1 hour. And 3.0.X
> will become the next long term stable. So there will be a lot of people
> using it.
> 3.) I have seen this deadlock on systems with aacraid and with intel
> ahci onboard. (that's all we're using)
> 4.) I still write to other devices / raids on the same controller while
> the XFS root filesystem hangs.

Sadly it has now crashed with 3.1 rc-6 + patch again. Sorry, I was too 
quick to write you an email.

Hung Task detection showed me this with 3.1 rc-6:

[] ? might_fault+0x3b/0x88
[] do_filp_open+0x38/0x86
[] ? _raw_spin_unlock+0x26/0x2b
[] ? alloc_fd+0x11d/0x12e
[] do_sys_open+0x114/0x1a3
[] sys_open+0x1b/0x1d
[] system_call_fastpath+0x16/0x1b
1 lock held by mysqld/17058:
#0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_last+0x287/0x693
INFO: task qmail-send:4899 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qmail-send D 0000000000000000 0 4899 1 0x00020000
ffff88081c4afc38 0000000000000046 ffffffff814a52d5 0000000100000000
ffff88082cf5be70 ffff88081c4ae010 0000000000004000 ffff88082cf5b5d0
0000000000011c40 ffff88081c4affd8 ffff88081c4affd8 0000000000011c40
Call Trace:
[] ? __schedule+0x2e8/0x9fd
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] schedule+0x57/0x59
[] xlog_grant_log_space+0x18e/0x4ae
[] ? try_to_wake_up+0x330/0x330
[] xfs_log_reserve+0x11a/0x122
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_remove+0x136/0x34e
[] ? mutex_lock_nested+0x275/0x290
[] ? mutex_lock_nested+0x281/0x290
[] ? vfs_unlink+0x51/0xdd
[] xfs_vn_unlink+0x3c/0x75
[] vfs_unlink+0x69/0xdd
[] do_unlinkat+0xde/0x170
[] ? retint_swapgs+0xe/0x13
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] ? trace_hardirqs_on_thunk+0x3a/0x3f
[] ? file_free_rcu+0x35/0x35
[] sys_unlink+0x11/0x13
[] ia32_do_call+0x13/0x13
2 locks held by qmail-send/4899:
#0: (&sb->s_type->i_mutex_key#5/1){+.+.+.}, at: [] do_unlinkat+0x63/0x170
#1: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] vfs_unlink+0x51/0xdd
INFO: task httpd:6316 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
httpd D 0000000000000001 0 6316 6270 0x00000000
ffff880406edfb78 0000000000000046 ffff88041b792c30 0000000100000000
ffff88041b792c80 ffff880406ede010 0000000000004000 ffff88041b7923e0
0000000000011c40 ffff880406edffd8 ffff880406edffd8 0000000000011c40
Call Trace:
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] schedule+0x57/0x59
[] xlog_grant_log_space+0x18e/0x4ae
[] ? try_to_wake_up+0x330/0x330
[] xfs_log_reserve+0x11a/0x122
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_create+0x200/0x53a
[] ? d_lookup+0x2d/0x42
2 locks held by httpd/6316:
[] ? __d_lookup+0x16a/0x17c
[] ? __d_lookup+0x16a/0x17c
1 lock held by imap/11461:
INFO: task flush-8:0:3658 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0 D 000000000000000b 0 3658 2 0x00000000
ffff88082c389690 0000000000000046 ffff88082c8bac30 0000000100000000
ffff88082c8bac58 ffff88082c388010 0000000000004000 ffff88082c8ba3e0
0000000000011c40 ffff88082c389fd8 ffff88082c389fd8 0000000000011c40
Call Trace:
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] schedule+0x57/0x59
[] xlog_grant_log_space+0x18e/0x4ae
[] ? try_to_wake_up+0x330/0x330
[] xfs_log_reserve+0x11a/0x122
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_iomap_write_allocate+0xcc/0x2cc
[] ? xfs_ilock_nowait+0x66/0xd5
[] ? up_read+0x1e/0x37
[] xfs_map_blocks+0x159/0x1ee
[] xfs_vm_writepage+0x21e/0x3f9
[] __writepage+0x15/0x3b
[] write_cache_pages+0x28c/0x3a8
[] ? alloc_pages_exact_nid+0x9a/0x9a
[] generic_writepages+0x46/0x61
[] xfs_vm_writepages+0x45/0x4e
[] do_writepages+0x1f/0x28
[] writeback_single_inode+0x18f/0x387
[] writeback_sb_inodes+0x196/0x237
[] ? grab_super_passive+0x52/0x76
[] __writeback_inodes_wb+0x73/0xb6
[] wb_writeback+0x163/0x24b
[] ? trace_hardirqs_on+0xd/0xf
[] ? local_bh_enable_ip+0xbc/0xc1
[] wb_do_writeback+0x183/0x210
[] bdi_writeback_thread+0xc0/0x1e4
[] ? wb_do_writeback+0x210/0x210
[] kthread+0x81/0x89
[] kernel_thread_helper+0x4/0x10
[] ? finish_task_switch+0x45/0xc3
[] ? retint_restore_args+0xe/0xe
[] ? __init_kthread_worker+0x56/0x56
[] ? gs_change+0xb/0xb
1 lock held by flush-8:0/3658:
#0: (&type->s_umount_key#31){++++.+}, at: [] grab_super_passive+0x52/0x76
INFO: task syslogd:4459 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syslogd D 000000000000000c 0 4459 1 0x00000000
ffff88082b4c3d78 0000000000000046 ffffffff814a52d5 ffff88082c8605d8
ffff88082b446ba0 ffff88082b4c2010 0000000000004000 ffff88082b446ba0
0000000000011c40 ffff88082b4c3fd8 ffff88082b4c3fd8 0000000000011c40
Call Trace:
[] ? __schedule+0x2e8/0x9fd
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] schedule+0x57/0x59
[] xlog_grant_log_space+0x18e/0x4ae
[] ? try_to_wake_up+0x330/0x330
[] xfs_log_reserve+0x11a/0x122
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_file_fsync+0x15f/0x22d
[] vfs_fsync_range+0x18/0x21
[] vfs_fsync+0x17/0x19
[] do_fsync+0x2e/0x44
[] sys_fsync+0xb/0xf
[] system_call_fastpath+0x16/0x1b
no locks held by syslogd/4459.
INFO: task mysqld:4612 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld D 0000000000000000 0 4612 4567 0x00000000
ffff880429a31d78 0000000000000046 ffffffff814a52d5 ffff88082c8605d8
ffff88042cd8d9b0 ffff880429a30010 0000000000004000 ffff88042cd8d9b0
0000000000011c40 ffff880429a31fd8 ffff880429a31fd8 0000000000011c40
Call Trace:
[] ? __schedule+0x2e8/0x9fd
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] schedule+0x57/0x59
[] xlog_grant_log_space+0x18e/0x4ae
[] ? try_to_wake_up+0x330/0x330
[] xfs_log_reserve+0x11a/0x122
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_file_fsync+0x15f/0x22d
[] vfs_fsync_range+0x18/0x21
[] vfs_fsync+0x17/0x19
[] do_fsync+0x2e/0x44
[] sys_fsync+0xb/0xf
[] system_call_fastpath+0x16/0x1b
no locks held by mysqld/4612.
INFO: task mysqld:27595 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld D 0000000000000008 0 27595 4567 0x00000000
ffff88011dda3ca8 0000000000000046 ffffffff814a52d5 ffff880403cd88a0
0000000000000246 ffff88011dda2010 0000000000004000 ffff880403cd8000
0000000000011c40 ffff88011dda3fd8 ffff88011dda3fd8 0000000000011c40
Call Trace:
[] ? __schedule+0x2e8/0x9fd
[] ? mark_held_locks+0xc9/0xef
[] ? mutex_lock_nested+0x16b/0x290
[] schedule+0x57/0x59
[] mutex_lock_nested+0x173/0x290
[] ? do_last+0x287/0x693
[] do_last+0x287/0x693
[] path_openat+0xcd/0x342
[] ? might_fault+0x3b/0x88
[] do_filp_open+0x38/0x86
[] ? _raw_spin_unlock+0x26/0x2b
[] ? alloc_fd+0x11d/0x12e
[] do_sys_open+0x114/0x1a3
[] sys_open+0x1b/0x1d
[] system_call_fastpath+0x16/0x1b
1 lock held by mysqld/27595:
#0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_last+0x287/0x693
INFO: task mysqld:4873 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld D 0000000000000000 0 4873 4625 0x00000000
ffff88081bf61d78 0000000000000046 ffffffff814a52d5 0000000100000000
ffff88081e82f3f0 ffff88081bf60010 0000000000004000 ffff88081e82eba0
0000000000011c40 ffff88081bf61fd8 ffff88081bf61fd8 0000000000011c40
Call Trace:
[] ? __schedule+0x2e8/0x9fd
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] schedule+0x57/0x59
[] xlog_grant_log_space+0x18e/0x4ae
[] ? try_to_wake_up+0x330/0x330
[] xfs_log_reserve+0x11a/0x122
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_file_fsync+0x15f/0x22d
[] vfs_fsync_range+0x18/0x21
[] vfs_fsync+0x17/0x19
[] do_fsync+0x2e/0x44
[] sys_fsync+0xb/0xf
[] system_call_fastpath+0x16/0x1b
no locks held by mysqld/4873.
INFO: task mysqld:17058 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mysqld D 000000000000000c 0 17058 4625 0x00000000
ffff88010325fa88 0000000000000046 ffffffff814a52d5 0000000100000000
ffff88025be8f418 ffff88010325e010 0000000000004000 ffff88025be8eba0
0000000000011c40 ffff88010325ffd8 ffff88010325ffd8 0000000000011c40
Call Trace:
[] ? __schedule+0x2e8/0x9fd
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47
[] ? trace_hardirqs_on_caller+0x11c/0x153
[] schedule+0x57/0x59
[] xlog_grant_log_space+0x18e/0x4ae
[] ? try_to_wake_up+0x330/0x330
[] xfs_log_reserve+0x11a/0x122
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_create+0x200/0x53a
[] ? __d_lookup+0xbe/0x17c
[] ? __d_lookup+0x16a/0x17c
[] ? d_validate+0x96/0x96
[] xfs_vn_mknod+0x9a/0xf5
[] xfs_vn_create+0xb/0xd
[] vfs_create+0x72/0xa4
[] do_last+0x323/0x693
[] path_openat+0xcd/0x342


Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14  7:48             ` Stefan Priebe - Profihost AG
@ 2011-09-14  8:49               ` Stefan Priebe - Profihost AG
  2011-09-14 14:30                 ` Christoph Hellwig
  2011-09-14 14:30               ` Christoph Hellwig
  1 sibling, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-14  8:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, aelder, xfs

Hi,

On 14.09.2011 09:48, Stefan Priebe - Profihost AG wrote:
> Hi,
>
>>> Oops, that's a bug that I actually introduced myself. Fix below:
>>
>> Thanks for the patch.
>>
>> Now we have the following situation:
>>
>> 1.) Systems running fine with 2.6.32, 2.6.38 and with 3.1 rc-6 + patch
>> 2.) Sadly it does not run with 3.0.4 for more than 1 hour. And 3.0.X
>> will become the next long term stable. So there will be a lot of people
>> using it.
>> 3.) I have seen this deadlock on systems with aacraid and with intel
>> ahci onboard. (that's all we're using)
>> 4.) I still write to other devices / raids on the same controller while
>> the XFS root filesystem hangs.
>
> Sadly it has now crashed with 3.1 rc-6 + patch again. Sorry, I was too
> quick to write you an email.

So might it be that the problem at least in 3.1 lies in:
[] ? mark_held_locks+0xc9/0xef
[] ? _raw_spin_unlock_irqrestore+0x3f/0x47

and not in XFS?

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14  7:48             ` Stefan Priebe - Profihost AG
  2011-09-14  8:49               ` Stefan Priebe - Profihost AG
@ 2011-09-14 14:30               ` Christoph Hellwig
  2011-09-14 16:06                 ` Stefan Priebe - Profihost AG
  2011-09-18  9:14                 ` Stefan Priebe - Profihost AG
  1 sibling, 2 replies; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-14 14:30 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder

On Wed, Sep 14, 2011 at 09:48:18AM +0200, Stefan Priebe - Profihost AG wrote:
> #0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_last+0x287/0x693

This means you are running your heavy load with lockdep enabled.  I
can't see how it directly causes your issues, but it will slow anything
down to almost a grinding halt on systems with more than say two cores.

Can you run with CONFIG_DEBUG_LOCK_ALLOC and CONFIG_PROVE_LOCKING
disabled?  It might also be worth checking whether you have other really
heavy debugging options enabled.
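
(Something like

  grep -E 'PROVE_LOCKING|DEBUG_LOCK_ALLOC|LOCKDEP|DEBUG_SPINLOCK|DEBUG_MUTEXES' .config

will show how those options are currently set in your build.)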


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14  8:49               ` Stefan Priebe - Profihost AG
@ 2011-09-14 14:30                 ` Christoph Hellwig
  0 siblings, 0 replies; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-14 14:30 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder

On Wed, Sep 14, 2011 at 10:49:20AM +0200, Stefan Priebe - Profihost AG wrote:
> >Sadly it has now crashed with 3.1 rc-6 + patch again. Sorry, I was too
> >quick to write you an email.
> 
> So might it be that the problem at least in 3.1 lies in:
> [] ? mark_held_locks+0xc9/0xef
> [] ? _raw_spin_unlock_irqrestore+0x3f/0x47
> 
> and not in XFS?

That's the lockdep code.


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 14:30               ` Christoph Hellwig
@ 2011-09-14 16:06                 ` Stefan Priebe - Profihost AG
  2011-09-18  9:14                 ` Stefan Priebe - Profihost AG
  1 sibling, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-14 16:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, aelder

Hi,

On 14.09.2011 16:30, Christoph Hellwig wrote:
> On Wed, Sep 14, 2011 at 09:48:18AM +0200, Stefan Priebe - Profihost AG wrote:
>> #0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_last+0x287/0x693
>
> This means you are running your heavy load with lockdep enabled.  I
> can't see how it directly causes your issues, but it will slow anything
> down to almost a grinding halt on systems with more than say two cores.
>
> Can you run with CONFIG_DEBUG_LOCK_ALLOC and CONFIG_PROVE_LOCKING
> disabled?  It might also be worth checking whether you have other really
> heavy debugging options enabled.

I just enabled it while trying to find the cause of my problems.

My actual config has:
# grep -i 'DEBUG' .config|egrep -v "^# "
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_SLUB_DEBUG=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_OCFS2_DEBUG_MASKLOG=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_RODATA=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y


my original config had:
# grep -i 'DEBUG' .config_stillnotworking|egrep -v "^# "
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_OCFS2_DEBUG_MASKLOG=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_RODATA=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y

With both configs I'm seeing the SAME symptoms after a while.

Which options should I disable?

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-14 14:30               ` Christoph Hellwig
  2011-09-14 16:06                 ` Stefan Priebe - Profihost AG
@ 2011-09-18  9:14                 ` Stefan Priebe - Profihost AG
  2011-09-18 20:04                   ` Christoph Hellwig
  2011-09-18 23:02                   ` Dave Chinner
  1 sibling, 2 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-18  9:14 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, aelder

Hi,

At least I'm now able to reproduce the issue. I hope this will help to 
investigate the issue and hopefully you can reproduce it as well.

I'm using a vanilla 3.0.4 kernel + XFS as root filesystem and have hung 
task detection set to 120s. You'll then see that the bonnie++ command 
gets stuck in xlog_grant_log_space while creating or deleting files. I 
was using an SSD or a fast RAID 10 (24x SAS disks) - I was not able to 
reproduce it on normal SATA disks; even a 20x SATA RAID 10 didn't work.
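
(Hung task detection here means CONFIG_DETECT_HUNG_TASK with 
/proc/sys/kernel/hung_task_timeout_secs set to 120.)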

I used bonnie++ (V 1.96) to reproduce it. Mostly in the 1st run the bug 
is triggered - sometimes I needed two runs.

bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /
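
(To save anyone a trip to the man page, my reading of those options: -u root 
runs it as root, -s 0 skips the throughput tests so only the file 
creation/stat/delete phases run, -n 1024:32768:0:1024:4096 means 1024*1024 
files of 0 to 32768 bytes spread over 1024 directories with a 4096 byte 
chunk size, and -d / points it at the XFS root filesystem.)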

I hope that helps - as I now have a testing machine and can trigger the 
bug pretty fast (10-30min instead of hours). I can also add debug code 
if you want or have some.

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18  9:14                 ` Stefan Priebe - Profihost AG
@ 2011-09-18 20:04                   ` Christoph Hellwig
  2011-09-19 10:54                     ` Stefan Priebe - Profihost AG
  2011-09-18 23:02                   ` Dave Chinner
  1 sibling, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-18 20:04 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, aelder, xfs

On Sun, Sep 18, 2011 at 11:14:08AM +0200, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> At least I'm now able to reproduce the issue. I hope this will help
> to investigate the issue and hopefully you can reproduce it as well.
> 
> I'm using a vanilla 3.0.4 kernel + XFS as root filesystem and have hung
> task detection set to 120s. You'll then see that the bonnie++ command
> gets stuck in xlog_grant_log_space while creating or deleting files. I
> was using an SSD or a fast RAID 10 (24x SAS disks) - I was not able to
> reproduce it on normal SATA disks; even a 20x SATA RAID 10 didn't work.

Thanks a lot for the reproducer!

I've tried it on my laptop SSD and that didn't reproduce it yet.  I'll
try it on Monday on a real high-end setup.


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18  9:14                 ` Stefan Priebe - Profihost AG
  2011-09-18 20:04                   ` Christoph Hellwig
@ 2011-09-18 23:02                   ` Dave Chinner
  2011-09-20  0:47                     ` Stefan Priebe
  2011-09-20 10:09                     ` Stefan Priebe - Profihost AG
  1 sibling, 2 replies; 50+ messages in thread
From: Dave Chinner @ 2011-09-18 23:02 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder

On Sun, Sep 18, 2011 at 11:14:08AM +0200, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> At least I'm now able to reproduce the issue. I hope this will help
> to investigate the issue and hopefully you can reproduce it as well.
> 
> I'm using a vanilla 3.0.4 kernel + XFS as root filesystem and have hung
> task detection set to 120s. You'll then see that the bonnie++ command
> gets stuck in xlog_grant_log_space while creating or deleting files. I
> was using an SSD or a fast RAID 10 (24x SAS disks) - I was not able to
> reproduce it on normal SATA disks; even a 20x SATA RAID 10 didn't work.
> 
> I used bonnie++ (V 1.96) to reproduce it. Mostly in the 1st run the
> bug is triggered - sometimes I needed two runs.
> 
> bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /
> 
> I hope that helps - as I now have a testing machine and can trigger
> the bug pretty fast (10-30min instead of hours). I can also add debug
> code if you want or have some.

If it is a log space accounting issue, then the output of 'xfs_info
<mtpt>' is really necessary to set the filesystem up the same way
(e.g. same log size, number of AGs, etc) so that it behaves the same
way on different test machines....
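
(With that in hand the same layout can be recreated on a scratch device
with something along the lines of

  mkfs.xfs -d agcount=<N> -l size=<blocks>b <device>

which takes the filesystem geometry out of the equation.)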

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18 20:04                   ` Christoph Hellwig
@ 2011-09-19 10:54                     ` Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-19 10:54 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, aelder, xfs

On 18.09.2011 22:04, Christoph Hellwig wrote:
> On Sun, Sep 18, 2011 at 11:14:08AM +0200, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> At least I'm now able to reproduce the issue. I hope this will help
>> to investigate the issue and hopefully you can reproduce it as well.
>>
>> I'm using a vanilla 3.0.4 kernel + XFS as root filesystem and have hung
>> task detection set to 120s. You'll then see that the bonnie++ command
>> gets stuck in xlog_grant_log_space while creating or deleting files. I
>> was using an SSD or a fast RAID 10 (24x SAS disks) - I was not able to
>> reproduce it on normal SATA disks; even a 20x SATA RAID 10 didn't work.
>
> Thanks a lot for the reproducer!
>
> I've tried it on my laptop SSD and that didn't reproduce it yet.  I'll
> try it on Monday on a real high-end setup.

Sadly my SSD bricked tonight while doing heavy testing ;-( I was not able 
to reproduce it on every partition, only on some. Sadly I was not able 
to find the common factor which causes this.

I now have to set up a new machine and try to reproduce it again.

What I've got so far is that bonnie++ is always hanging here:

[] ? radix_tree_gang_lookup_slot+0x6a/0x8d
[] ? xfs_bmap_search_extents+0x56/0xb9
[] ? find_get_pages+0x39/0xd8
[] xlog_wait+0x58/0x70
[] ? try_to_wake_up+0x1c6/0x1c6
[] ? xlog_grant_push_ail+0xb7/0xbf
[] xlog_grant_log_space+0x162/0x2b1
[] xfs_log_reserve+0xbb/0xc4
[] xfs_trans_reserve+0xd6/0x1b1
[] xfs_free_eofblocks+0x16b/0x1fb
[] xfs_release+0x1c7/0x202
[] xfs_file_release+0x10/0x14
[] fput+0xfd/0x1eb
[] filp_close+0x6d/0x78
[] sys_close+0x9a/0xd4
[] system_call_fastpath+0x16/0x1b

With the traces we had in the past it was difficult to check which process 
was causing the lockup. So it doesn't seem to be xlog_grant_log_space 
itself; it seems that it is more xfs_bmap_search_extents or 
radix_tree_gang_lookup_slot?

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18 23:02                   ` Dave Chinner
@ 2011-09-20  0:47                     ` Stefan Priebe
  2011-09-20  1:01                       ` Stefan Priebe
  2011-09-20 10:09                     ` Stefan Priebe - Profihost AG
  1 sibling, 1 reply; 50+ messages in thread
From: Stefan Priebe @ 2011-09-20  0:47 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder


On 19.09.2011 at 01:02, Dave Chinner <david@fromorbit.com> wrote:

> On Sun, Sep 18, 2011 at 11:14:08AM +0200, Stefan Priebe - Profihost AG wrote:
>> Hi,
>> 
>> At least I'm now able to reproduce the issue. I hope this will help
>> to investigate the issue and hopefully you can reproduce it as well.
>>
>> I'm using a vanilla 3.0.4 kernel + XFS as root filesystem and have hung
>> task detection set to 120s. You'll then see that the bonnie++ command
>> gets stuck in xlog_grant_log_space while creating or deleting files. I
>> was using an SSD or a fast RAID 10 (24x SAS disks) - I was not able to
>> reproduce it on normal SATA disks; even a 20x SATA RAID 10 didn't work.
>>
>> I used bonnie++ (V 1.96) to reproduce it. Mostly in the 1st run the
>> bug is triggered - sometimes I needed two runs.
>>
>> bonnie++ -u root -s 0 -n 1024:32768:0:1024:4096 -d /
>>
>> I hope that helps - as I now have a testing machine and can trigger
>> the bug pretty fast (10-30min instead of hours). I can also add debug
>> code if you want or have some.
> 
> If it is a log space accounting issue, then the output of 'xfs_info
> <mtpt>' is really necessary to set the filesystem up the same way
> (e.g. same log size, number of AGs, etc) so that it behaves the same
> way on different 

I can't get it. It just works on some partitions and not on the others. Even 
xfs_info shows the same for them. Also I have one partition where it only 
happens when that one is root (/). When I mount that one as /mnt it does 
not happen ;-(

Any idea on how to proceed now?

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20  0:47                     ` Stefan Priebe
@ 2011-09-20  1:01                       ` Stefan Priebe
  0 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe @ 2011-09-20  1:01 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder



On 20.09.2011 at 02:47, Stefan Priebe <s.priebe@profihost.ag> wrote:

> I can't get it. It just works on some partitions and not on the others.

So "works" here means reproducing it with bonnie++.

So I can still reproduce it very fast, but I don't know how to create a test case.

Stefan

* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-18 23:02                   ` Dave Chinner
  2011-09-20  0:47                     ` Stefan Priebe
@ 2011-09-20 10:09                     ` Stefan Priebe - Profihost AG
  2011-09-20 16:02                       ` Christoph Hellwig
  1 sibling, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-20 10:09 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder

Hi,

Any idea how to dig deeper into this? I've tried using kgdb, but 
strangely the error does not occur while kgdb is attached remotely. When I 
detach kgdb and restart bonnie++, the error happens again.

So it seems to me a little bit like a timing issue?

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 10:09                     ` Stefan Priebe - Profihost AG
@ 2011-09-20 16:02                       ` Christoph Hellwig
  2011-09-20 17:23                         ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-20 16:02 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder

On Tue, Sep 20, 2011 at 12:09:34PM +0200, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> Any idea how to dig deeper into this? I've tried using kgdb, but
> strangely the error does not occur while kgdb is attached remotely.
> When I detach kgdb and restart bonnie++, the error happens again.
> 
> So it seems to me a little bit like a timing issue?

Sounds like it.

Can you summarize all the data that we've gathered over this thread into one
summary, e.g.

 - what kernel does it happen on?  Seems like 3.0 and 3.1 hit it easily,
   2.6.38 sometimes, 2.6.32 is fine.  Did you test anything between
   2.6.32 and 2.6.38?
 - what hardware hits it often/sometimes/never?
 - what is the fs geometry?
 - what is the hardware?
 - is this a 32 or 64-bit kernel, or do you run both?

I'm pretty sure most got posted somewhere, but let's get a summary
as things were a bit confusing sometimes.

Note that 2.6.38 moved the whole log grant code to a lockless algorithm,
so this might be a likely culprit if you're managing to hit race windows
no one else does, i.e. this really is a timing issue.
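
Very roughly: before 2.6.38 the grant heads were updated under the log's
grant lock, while now they are packed into a single 64-bit value and
advanced with a compare-and-swap loop, something like this (simplified
illustration only, not the actual XFS code, helper names made up):

	static void grant_add_space(atomic64_t *head, int bytes)
	{
		long long	old, new;

		do {
			old = atomic64_read(head);
			/* unpack cycle/space, add, wrap into the next cycle on overflow */
			new = pack_grant_head(cycle_of(old), space_of(old) + bytes);
		} while (atomic64_cmpxchg(head, old, new) != old);
	}

That is the sort of change where a subtle race would only show up under
exactly the right timing.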


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 16:02                       ` Christoph Hellwig
@ 2011-09-20 17:23                         ` Stefan Priebe - Profihost AG
  2011-09-20 17:24                           ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-20 17:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, aelder

> Can you summarize all the data that we've gathered over this thread into one
> summary, e.g.
Yes - hope it helps.

>   - what kernel does it happen on?  Seems like 3.0 and 3.1 hit it easily,
>     2.6.38 sometimes, 2.6.32 is fine.  Did you test anything between
>     2.6.32 and 2.6.38?
Hits very easily: 3.0.4 and 3.1-rc5
Very rare: 2.6.38 - as it happened only a few times I cannot 100% 
guarantee that it is really the same issue
No issues at all: 2.6.32

I've not tested anything between 2.6.32 and 2.6.38, as I cannot reproduce it 
under 2.6.38 at all - I've seen it maybe once a week across 500 machines.

>   - what hardware hits it often/sometimes/never?
I've seen this only on multi-core CPUs with > 2.8GHz and a fast SAS RAID 
10 or SSD. I cannot say if it's the CPU or the fast disks - as our low-cost 
systems have only small CPUs and the high-end ones have big CPUs 
with fast disks.

>   - what is the fs geometry?
What do you exactly mean? I've seen this on 1TB and 160GB SSD devices 
with totally different disk layout.

>   - what is the hardware?
see above

>   - is this a 32 or 64-bit kernel, or do you run both?
always 64bit

> I'm pretty sure most got posted somewhere, but let's get a summary
> as things were a bit confusing sometimes.
no problem

> Note that 2.6.38 moved the whole log grant code to a lockless algorithm,
> so this might be a likely culprit if you're managing to hit race windows
> no one else does, i.e. this really is a timing issue.
I'm willing to do nearly anything to solve this. What can I do to help? 
My last hope today was to get some code lines with kgdb - sadly it 
does not happen at all when kgdb is attached ;-(

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 17:23                         ` Stefan Priebe - Profihost AG
@ 2011-09-20 17:24                           ` Christoph Hellwig
  2011-09-20 17:35                             ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-20 17:24 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder

On Tue, Sep 20, 2011 at 07:23:00PM +0200, Stefan Priebe - Profihost AG wrote:
> >  - what is the fs geometry?
> What do you exactly mean? I've seen this on 1TB and 160GB SSD
> devices with totally different disk layout.

The output of mkfs.xfs (or xfs_info after it's been created)

> >Note that 2.6.38 moved the whole log grant code to a lockless algorithm,
> >so this might be a likely culprit if you're managing to hit race windows
> >no one else does, i.e. this really is a timing issue.
> I'm willing to do nearly anything to solve this. What can I do to
> help? My last hope today was to get some code lines with kgdb -
> sadly it does not happen at all when kgdb is attached ;-(

I'll run tests on a system with a pci-e flash device today.  Just to
make sure we are on the same page, can you give me your kernel .config
in addition to the mkfs output above?


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 17:24                           ` Christoph Hellwig
@ 2011-09-20 17:35                             ` Stefan Priebe - Profihost AG
  2011-09-20 22:30                               ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-20 17:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, aelder

On 20.09.2011 19:24, Christoph Hellwig wrote:
> On Tue, Sep 20, 2011 at 07:23:00PM +0200, Stefan Priebe - Profihost AG wrote:
>>>   - what is the fs geometry?
>> What do you exactly mean? I've seen this on 1TB and 160GB SSD
>> devices with totally different disk layout.
>
> The output of mkfs.xfs (or xfs_info after it's been created)

ssd:~# xfs_info /dev/sda3
meta-data=/dev/root              isize=256    agcount=4, agsize=9517888 blks
          =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=38071552, imaxpct=25
          =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=18589, version=2
          =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


> I'll run tests on a system with a pci-e flash device today.  Just to
> make sure we are on the same page, can you give me your kernel .config
> in addition to the mkfs output above?

OK, I hope you can reproduce it as well.

.config
http://pastebin.com/raw.php?i=m8AAFJ1B

I also found out that I was not able to reproduce it on a freshly 
created XFS partition. I needed to copy a bunch of files, delete some, create 
some new ones and then start the test. I just duplicated the root filesystem 
multiple times and then deleted some files, created some hardlinks, whatever...

Stefan


* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 17:35                             ` Stefan Priebe - Profihost AG
@ 2011-09-20 22:30                               ` Christoph Hellwig
  2011-09-21  2:11                                 ` [xfs-masters] " Dave Chinner
  2011-09-21  7:36                                 ` Stefan Priebe - Profihost AG
  0 siblings, 2 replies; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-20 22:30 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, aelder, xfs

On Tue, Sep 20, 2011 at 07:35:57PM +0200, Stefan Priebe - Profihost AG wrote:
> On 20.09.2011 19:24, Christoph Hellwig wrote:
> >On Tue, Sep 20, 2011 at 07:23:00PM +0200, Stefan Priebe - Profihost AG wrote:
> >>>  - what is the fs geometry?
> >>What do you exactly mean? I've seen this on 1TB and 160GB SSD
> >>devices with totally different disk layout.
> >
> >The output of mkfs.xfs (or xfs_info after it's been created)
> 
> ssd:~# xfs_info /dev/sda3
> meta-data=/dev/root              isize=256    agcount=4, agsize=9517888 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=38071552, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=18589, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0

Nothing special there.

So far I haven't been able to recreate it.  How many runs did you
normally need on 3.1-rc?  Note that so far I've run my known working
kernel, I'll test your config plus the drivers I need next.


* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-20 22:30                               ` Christoph Hellwig
@ 2011-09-21  2:11                                 ` Dave Chinner
  2011-09-21  7:40                                   ` Stefan Priebe - Profihost AG
  2011-09-21  7:36                                 ` Stefan Priebe - Profihost AG
  1 sibling, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2011-09-21  2:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, Stefan Priebe - Profihost AG

On Tue, Sep 20, 2011 at 06:30:47PM -0400, Christoph Hellwig wrote:
> On Tue, Sep 20, 2011 at 07:35:57PM +0200, Stefan Priebe - Profihost AG wrote:
> > On 20.09.2011 19:24, Christoph Hellwig wrote:
> > >On Tue, Sep 20, 2011 at 07:23:00PM +0200, Stefan Priebe - Profihost AG wrote:
> > >>>  - what is the fs geometry?
> > >>What do you exactly mean? I've seen this on 1TB and 160GB SSD
> > >>devices with totally different disk layout.
> > >
> > >The output of mkfs.xfs (of xfs_info after it's been created)
> > 
> > ssd:~# xfs_info /dev/sda3
> > meta-data=/dev/root              isize=256    agcount=4, agsize=9517888 blks
> >          =                       sectsz=512   attr=2
> > data     =                       bsize=4096   blocks=38071552, imaxpct=25
> >          =                       sunit=0      swidth=0 blks
> > naming   =version 2              bsize=4096   ascii-ci=0
> > log      =internal               bsize=4096   blocks=18589, version=2
> >          =                       sectsz=512   sunit=0 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> Nothing special there.
> 
> So far I haven't been able to recreate it.  How many runs did you
> normally need on 3.1-rc?  Note that so far I've run my known working
> kernel, I'll test your config plus the drivers I need next.

How much memory does your test machine have? The performance will be
vastly different if there is enough RAM to hold the working set of
inodes and page cache (~20GB all up), and that could be one of the
factors contributing to the problems.

The above xfs_info output is from your 160GB SSD - what's the output
from the 1TB device?

Also, what phase do you see it hanging in? the random stat phase is
terribly slow on spinning disks, so if I can avoid that it would be
nice....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-20 22:30                               ` Christoph Hellwig
  2011-09-21  2:11                                 ` [xfs-masters] " Dave Chinner
@ 2011-09-21  7:36                                 ` Stefan Priebe - Profihost AG
  2011-09-21 11:39                                   ` Christoph Hellwig
  1 sibling, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-21  7:36 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, aelder, xfs

On 21.09.2011 00:30, Christoph Hellwig wrote:
> On Tue, Sep 20, 2011 at 07:35:57PM +0200, Stefan Priebe - Profihost AG wrote:
>> Am 20.09.2011 19:24, schrieb Christoph Hellwig:
>>> On Tue, Sep 20, 2011 at 07:23:00PM +0200, Stefan Priebe - Profihost AG wrote:
>>>>>   - what is the fs geometry?
>>>> What do you exactly mean? I've seen this on 1TB and 160GB SSD
>>>> devices with totally different disk layout.
>>>
>>> The output of mkfs.xfs (of xfs_info after it's been created)
>>
>> ssd:~# xfs_info /dev/sda3
>> meta-data=/dev/root              isize=256    agcount=4, agsize=9517888 blks
>>           =                       sectsz=512   attr=2
>> data     =                       bsize=4096   blocks=38071552, imaxpct=25
>>           =                       sunit=0      swidth=0 blks
>> naming   =version 2              bsize=4096   ascii-ci=0
>> log      =internal               bsize=4096   blocks=18589, version=2
>>           =                       sectsz=512   sunit=0 blks, lazy-count=1
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
> Nothing special there.
>
> So far I haven't been able to recreate it.  How many runs did you
> normally need on 3.1-rc?  Note that so far I've run my known working
> kernel, I'll test your config plus the drivers I need next.

I had only used 3.0.4 with bonnie++ to reproduce it. 3.1-rc was running
on a production system.

Sadly I'm also not able to reproduce it reliably on every partition.
Sometimes it works, sometimes not, and just retrying does not help. I had
to copy and delete random files from the partition and then start
bonnie++ on it. Perhaps I can give you a dd dump of the partition, but I
had to recreate one. My Intel SSD is now massively slower than when I
started the tests. No idea why.
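
For completeness, a dump like that can be taken with plain dd and
compressed on the fly - the device and output names below are only
placeholders, and the filesystem should be unmounted while imaging:

  dd if=/dev/sda3 bs=4M conv=noerror | gzip -1 > /srv/sda3-image.dd.gz
  # restore onto a partition of at least the same size
  gunzip -c /srv/sda3-image.dd.gz | dd of=/dev/sdX3 bs=4M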

Stefan


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21  2:11                                 ` [xfs-masters] " Dave Chinner
@ 2011-09-21  7:40                                   ` Stefan Priebe - Profihost AG
  2011-09-21 11:42                                     ` Dave Chinner
  0 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-21  7:40 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs

On 21.09.2011 04:11, Dave Chinner wrote:

> How much memory does your test machine have? The performance will be
> vastly different if there is enough RAM to hold the working set of
> inodes and page cache (~20GB all up), and that could be one of the
> factors contributing to the problems.
The live systems which crash within hours have between 48GB and 64GB of
RAM, but my testing system has only 8GB.

> The above xfs_info output is from your 160GB SSD - what's the output
> from the 1TB device?

The 1TB device is now doing something else and does not have XFS on it
anymore. But here are the layouts of two live systems.

xfs_info /dev/sda6
meta-data=/dev/root              isize=256    agcount=4, agsize=35767872 blks
          =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=143071488, imaxpct=25
          =                       sunit=64     swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=69888, version=2
          =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

xfs_info /dev/sda6
meta-data=/dev/root              isize=256    agcount=4, agsize=35768000 blks
          =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=143071774, imaxpct=25
          =                       sunit=64     swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
          =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


> Also, what phase do you see it hanging in? the random stat phase is
> terribly slow on spinning disks, so if I can avoid that it would be
> nice....
Creating or deleting files, never in the stat phase.

Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-21  7:36                                 ` Stefan Priebe - Profihost AG
@ 2011-09-21 11:39                                   ` Christoph Hellwig
  2011-09-21 13:39                                     ` Stefan Priebe
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-21 11:39 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs, aelder

On Wed, Sep 21, 2011 at 09:36:42AM +0200, Stefan Priebe - Profihost AG wrote:
> >So far I haven't been able to recreate it.  How many runs did you
> >normally need on 3.1-rc?  Note that so far I've run my known working
> >kernel, I'll test your config plus the drivers I need next.
> 
> I had only used 3.0.4 with bonnie++ to reproduce. 3.1-rc was running
> on a prod. system.
> 
> Sadly i'm also not able to reproduce it reliable on every partition.
> Sometimes it works sometimes not. Just retrying does not help. I had
> to copy and delete random files from the part. and then start
> bonnie++ on it. Perhaps i can give you a dd dump of the partition.
> But i had to recreate one. My Intel SSD is now massivly slower than
> when i started the tests. No idea why.

So far it runs fine on 3.1-rc both with my default config and yours;
the latter had been running all night.  This is on an 8-core Nehalem
with 8GB of memory and a fast PCI-e flash device.

One thing I noticed is that your config seems to run many fs tasks
a lot slower than mine, but I'm not entirely sure why.

The only interesting things I noticed in your config were that you
use slub instead of slab, which does a lot of high-order allocations
and has caused lots of trouble in the past, and that you enable
CONFIG_CC_OPTIMIZE_FOR_SIZE, which has caused mis-compilation
of complicated code in the past.  I don't want to blame it directly,
but I could see how that causes problems with some of the atomic64_t
games XFS plays since 2.6.38.
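
If you want to double-check those two options in a given .config,
something like the following is enough (the path is just whatever tree
the kernel was built from):

  grep -E 'CONFIG_SLUB=|CONFIG_SLAB=|CONFIG_CC_OPTIMIZE_FOR_SIZE' .config
  # a slab/-O2 build would show CONFIG_SLAB=y and
  # "# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set"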

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21  7:40                                   ` Stefan Priebe - Profihost AG
@ 2011-09-21 11:42                                     ` Dave Chinner
  2011-09-21 11:55                                       ` Stefan Priebe - Profihost AG
                                                         ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Dave Chinner @ 2011-09-21 11:42 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs

On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost AG wrote:
> Am 21.09.2011 04:11, schrieb Dave Chinner:
> >Also, what phase do you see it hanging in? the random stat phase is
> >terribly slow on spinning disks, so if I can avoid that it would be
> >nice....
> Creating or deleting files. never in the stat phase.

Ok, I got a hang in the random delete phase. Not sure what is wrong
yet, but inode reclaim is trying to reclaim inodes but failing, and
the AIL is trying to push items but failing. Hence the tail of the
log is not being moved forward and new transactions are being
blocked until log space becomes available.

The AIL is particularly interesting. The number of pushes being
executed is precisely 50/s, and precisely 5000 items/s are being
scanned. All those items are pinned, so the "stuck" processing is
what is triggering this pattern.

Thing is, all the items are apparently pinned - I see that stat
incrementing at 5,000/s. It's here:

                case XFS_ITEM_PINNED:
                        XFS_STATS_INC(xs_push_ail_pinned);
                        stuck++;
                        flush_log = 1;
                        break;

so we should have the flush_log variable set. However, this code:

        if (flush_log) {
                /*
                 * If something we need to push out was pinned, then
                 * push out the log so it will become unpinned and
                 * move forward in the AIL.
                 */
                XFS_STATS_INC(xs_push_ail_flush);
                xfs_log_force(mp, 0);
        }

never seems to execute. I don't see the xs_push_ail_flush stat
increase, nor the log force counter increase, either. Hence the
pinned items are not getting unpinned, and progress is not being
made. Background inode reclaim is not making progress, either,
because it skips pinned inodes.

The AIL code is clearly cycling - the push counter is increasing,
and the run numbers match the stuck code precisely (aborts at 100
stuck items a cycle). The question now is why the log force isn't
being triggered.

Given this, just triggering a log force should get everything
moving again. Running "echo 2 > /proc/sys/vm/drop_caches" gets inode
reclaim running in sync mode, which causes pinned inodes to trigger
a log force. And once I've done this, everything starts running
again.

So, the log force not triggering in the AIL code looks to be the
problem. That, I simply cannot explain right now - it makes no sense
but that is what all the stats and trace events point to. I need to
do more investigation.
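
For anyone who wants to watch this happen, the counters and the
workaround above can be driven from userspace roughly like this (the
exact field layout of the push_ail line varies between kernel versions,
and the tracing bits assume debugfs is mounted at /sys/kernel/debug):

  # watch the AIL push statistics (xs_push_ail_* counters) tick over
  while sleep 1; do grep push_ail /proc/fs/xfs/stat; done
  # optionally capture the XFS trace events while the hang is live
  echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
  cat /sys/kernel/debug/tracing/trace_pipe > /tmp/xfs-trace.txt &
  # the workaround: force synchronous inode reclaim, which issues the
  # log force and gets the pinned items moving again
  echo 2 > /proc/sys/vm/drop_caches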

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 11:42                                     ` Dave Chinner
@ 2011-09-21 11:55                                       ` Stefan Priebe - Profihost AG
  2011-09-21 12:26                                       ` Christoph Hellwig
  2011-09-22  0:53                                       ` Dave Chinner
  2 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-21 11:55 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs

On 21.09.2011 13:42, Dave Chinner wrote:
> Ok, I got a hang in the random delete phase. Not sure what is wrong
> yet, but inode reclaim is trying to reclaim inodes but failing, and
> the AIL is trying to push items but failing. Hence the tail of the
> log is not being moved forward and new transactions are being
> blocked until log space becomes available.
OK, that matches my findings. It was also mostly in the random delete 
phase, but I've also seen it on creates.

> Given this, just triggering a log force should get everything
> moving again. Running "echo 2 > /proc/sys/vm/drop_caches" gets inode
> reclaim running in sync mode, which causes pinned inodes to trigger
> a log force. And once I've done this, everything starts running
> again.
Oh man, I was thinking about trying this, but then I forgot that idea ;-(

> So, the log force not triggering in the AIL code looks to be the
> problem. That, I simply cannot explain right now - it makes no sense
> but that is what all the stats and trace events point to. I need to
> do more investigation.
Thanks, Dave, and great that you were able to reproduce it.

What helps is to build bonnie++ yourself and just remove the stat tests. 
I've done this too - so bonnie++ runs a lot faster.

Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 11:42                                     ` Dave Chinner
  2011-09-21 11:55                                       ` Stefan Priebe - Profihost AG
@ 2011-09-21 12:26                                       ` Christoph Hellwig
  2011-09-21 13:42                                         ` Stefan Priebe
                                                           ` (3 more replies)
  2011-09-22  0:53                                       ` Dave Chinner
  2 siblings, 4 replies; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-21 12:26 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Stefan Priebe - Profihost AG

On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> So, the log force not triggering in the AIL code looks to be the
> problem. That, I simply cannot explain right now - it makes no sense
> but that is what all the stats and trace events point to. I need to
> do more investigation.

Could it be that we have a huge number of instances of xfs_ail_worker
running at the same time?  xfs_sync_wq is marked as WQ_CPU_INTENSIVE,
so running/runnable workers are not counted towards the concurrency
limit.  From my look at the workqueue code this means we'll spawn new
instances fairly quickly if the others are stuck.  This means more
and more of them hammering the pinned items, and we'll rarely reach
the limit where we'd need to do a log force.

What is also strange is that we allocate an xfs_ail_wq, but don't
actually use it, although it would have the same idea.  Stefan,
can you try the following patch?  This moves the ail work to its
explicit queue, and makes sure we never have the same work item
(= same fs to be pushed) concurrently.

Note that before Linux 3.1-rc you'll need to edit fs/xfs/xfs_super.c
to be fs/xfs/linux-2.6/xfs_super.c in the patch manually.


Index: linux-2.6/fs/xfs/xfs_super.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_super.c	2011-09-21 08:00:01.864768359 -0400
+++ linux-2.6/fs/xfs/xfs_super.c	2011-09-21 08:04:01.335266079 -0400
@@ -1654,7 +1654,7 @@ xfs_init_workqueues(void)
 	if (!xfs_syncd_wq)
 		goto out;
 
-	xfs_ail_wq = alloc_workqueue("xfsail", WQ_CPU_INTENSIVE, 8);
+	xfs_ail_wq = alloc_workqueue("xfsail", WQ_NON_REENTRANT, 8);
 	if (!xfs_ail_wq)
 		goto out_destroy_syncd;
 
Index: linux-2.6/fs/xfs/xfs_trans_ail.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:28.172765827 -0400
+++ linux-2.6/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:46.843266108 -0400
@@ -538,7 +538,7 @@ out_done:
 	}
 
 	/* There is more to do, requeue us.  */
-	queue_delayed_work(xfs_syncd_wq, &ailp->xa_work,
+	queue_delayed_work(xfs_ail_wq, &ailp->xa_work,
 					msecs_to_jiffies(tout));
 }
 
@@ -575,7 +575,7 @@ xfs_ail_push(
 	smp_wmb();
 	xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &threshold_lsn);
 	if (!test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags))
-		queue_delayed_work(xfs_syncd_wq, &ailp->xa_work, 0);
+		queue_delayed_work(xfs_ail_wq, &ailp->xa_work, 0);
 }
 
 /*

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-21 11:39                                   ` Christoph Hellwig
@ 2011-09-21 13:39                                     ` Stefan Priebe
  2011-09-21 14:17                                       ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe @ 2011-09-21 13:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs-masters, xfs, aelder


> One thing I noticed is that your config seems to run many fs tasks
> a lot slower than mine, but I'm not entirely sure why.
Strange - would you post your config too?

> The only interesting things I noticed in your config where that you
> use slub instead of slab, which does a lot of high order allocations
> and has caused lots of trouble in the past, and that you enable
> CONFIG_CC_OPTIMIZE_FOR_SIZE, which has caused mis-compilation
> of complicated code in the past.  I don't want to blame it directly,
> but I could see how that causes problems with some of the atomic64_t
> games XFS plays since 2.6.38.
Will remove it.

At least Dave was able to reproduce it, so he can probably help too.

Thanks!

Stefan
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 12:26                                       ` Christoph Hellwig
@ 2011-09-21 13:42                                         ` Stefan Priebe
  2011-09-21 16:48                                         ` Stefan Priebe - Profihost AG
                                                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe @ 2011-09-21 13:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On 21.09.2011 at 14:26, Christoph Hellwig <hch@infradead.org> wrote:
> What is also strange is that we allocate a xfs_ail_wq, but don't
> actually use it, although it would have the same idea.  Stefan,
> can you try the following patch?  This moves the ail work to it's
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) concurrently.
I will have the chance to test again in a few hours. Perhaps Dave can test too?

Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs deadlock in stable kernel 3.0.4
  2011-09-21 13:39                                     ` Stefan Priebe
@ 2011-09-21 14:17                                       ` Christoph Hellwig
  0 siblings, 0 replies; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-21 14:17 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: aelder, xfs

[-- Attachment #1: Type: text/plain, Size: 249 bytes --]

On Wed, Sep 21, 2011 at 03:39:16PM +0200, Stefan Priebe wrote:
> 
> > One thing I noticed is that your config seems to run many fs tasks
> > a lot slower than mine, but I'm not entirely sure why.
> Strange would you post your config too?

Attached.

[-- Attachment #2: config.2.6.40.bz2 --]
[-- Type: application/x-bzip2, Size: 26558 bytes --]

[-- Attachment #3: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 12:26                                       ` Christoph Hellwig
  2011-09-21 13:42                                         ` Stefan Priebe
@ 2011-09-21 16:48                                         ` Stefan Priebe - Profihost AG
  2011-09-21 17:26                                           ` Stefan Priebe - Profihost AG
  2011-09-21 19:01                                         ` Stefan Priebe - Profihost AG
  2011-09-21 23:07                                         ` Dave Chinner
  3 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-21 16:48 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Hi,

On 21.09.2011 14:26, Christoph Hellwig wrote:
> What is also strange is that we allocate a xfs_ail_wq, but don't
> actually use it, although it would have the same idea.  Stefan,
> can you try the following patch?  This moves the ail work to it's
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) concurrently.

Sorry, but with your patch everything is awfully slow. Just the
sequential file creation takes extremely long on an SSD. I interrupted
the test.

i/o top from an SSD:
Total DISK READ: 0 B/s | Total DISK WRITE: 9.88 M/s
   PID USER      DISK READ  DISK WRITE   SWAPIN    IO>    COMMAND 

  1377 root           0 B/s       0 B/s  0.00 % 99.99 % [xfsbufd/sda3]
  2219 root           0 B/s       0 B/s  0.00 % 99.99 % [flush-8:0]
  2746 root           0 B/s    9.88 M/s  0.00 %  0.00 % bonnie++ -u root 
-s 0 -n 1024:32768:0:1024:4096 -d /mnt
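
A snapshot like the one above can be grabbed non-interactively with
iotop in batch mode (flags may differ slightly between iotop versions):

  iotop -b -o -n 1    # batch mode, one iteration, active tasks only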

Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 16:48                                         ` Stefan Priebe - Profihost AG
@ 2011-09-21 17:26                                           ` Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-21 17:26 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

> Hi,
>
> Am 21.09.2011 14:26, schrieb Christoph Hellwig:
>> What is also strange is that we allocate a xfs_ail_wq, but don't
>> actually use it, although it would have the same idea. Stefan,
>> can you try the following patch? This moves the ail work to it's
>> explicit queue, and makes sure we never have the same work item
>> (= same fs to be pushed) concurrently.
>
> Sorry, but with your patch everything is awfully slow. Just the sequ.
> file creation takes on an SSD extremely long. I interrupted the test.
>
> i/o top from an SSD:
> Total DISK READ: 0 B/s | Total DISK WRITE: 9.88 M/s
> PID USER DISK READ DISK WRITE SWAPIN IO> COMMAND
> 1377 root 0 B/s 0 B/s 0.00 % 99.99 % [xfsbufd/sda3]
> 2219 root 0 B/s 0 B/s 0.00 % 99.99 % [flush-8:0]
> 2746 root 0 B/s 9.88 M/s 0.00 % 0.00 % bonnie++ -u root -s 0 -n
> 1024:32768:0:1024:4096 -d /mnt

Please ignore this mail. I used the wrong disk. *gr* slow SATA

Stefan


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 12:26                                       ` Christoph Hellwig
  2011-09-21 13:42                                         ` Stefan Priebe
  2011-09-21 16:48                                         ` Stefan Priebe - Profihost AG
@ 2011-09-21 19:01                                         ` Stefan Priebe - Profihost AG
  2011-09-21 23:07                                         ` Dave Chinner
  3 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-21 19:01 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On 21.09.2011 14:26, Christoph Hellwig wrote:
> What is also strange is that we allocate a xfs_ail_wq, but don't
> actually use it, although it would have the same idea.  Stefan,
> can you try the following patch?  This moves the ail work to it's
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) concurrently.

no luck - problem still occurs ;-(

Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 12:26                                       ` Christoph Hellwig
                                                           ` (2 preceding siblings ...)
  2011-09-21 19:01                                         ` Stefan Priebe - Profihost AG
@ 2011-09-21 23:07                                         ` Dave Chinner
  2011-09-22 14:14                                           ` Christoph Hellwig
  3 siblings, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2011-09-21 23:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs, Stefan Priebe - Profihost AG

On Wed, Sep 21, 2011 at 08:26:49AM -0400, Christoph Hellwig wrote:
> On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> > So, the log force not triggering in the AIL code looks to be the
> > problem. That, I simply cannot explain right now - it makes no sense
> > but that is what all the stats and trace events point to. I need to
> > do more investigation.
> 
> Could it be that we have a huge amount of instances of xfs_ail_worker
> running at the same time?  xfs_sync_wq is marked as WQ_CPU_INTENSIVE,
> so running/runnable workers are not counted towards the concurrency
> limit.  From my look at the workqueue code this means we'll spawn new
> instances fairly quickly if the others are stuck.  This means more
> and more of them hammering the pinned items, and we'll rarely reach
> the limit where we'd need to do a log force.

No, that's not possible. The XFS_AIL_PUSHING_BIT ensures that there
is only one instance of AIL pushing per struct xfs_ail running at
once. It's also backed up by the fact that I couldn't find a single
worker thread blocked running AIL pushing - it ran the 100 item
scan, got stuck, requeued itself to run again 20ms later....

FYI, what we want the concurrency for in the AIL wq is for multiple
filesystems to be able to run AIL pushing at the same time, which
is why it was set up this way. If one filesystem AIL push blocks,
then an unblocked one will simply run.

> What is also strange is that we allocate a xfs_ail_wq, but don't
> actually use it, although it would have the same idea.  Stefan,
> can you try the following patch?  This moves the ail work to it's
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) concurrently.

Oh, that's a bug. My bad. That definitely needs fixing.

> Note that before Linux 3.1-rc you'll need to edit fs/xfs/xfs_super.c
> to be fs/xfs/linux-2.6/xfs_super.c in the patch manually.
> 
> 
> Index: linux-2.6/fs/xfs/xfs_super.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_super.c	2011-09-21 08:00:01.864768359 -0400
> +++ linux-2.6/fs/xfs/xfs_super.c	2011-09-21 08:04:01.335266079 -0400
> @@ -1654,7 +1654,7 @@ xfs_init_workqueues(void)
>  	if (!xfs_syncd_wq)
>  		goto out;
>  
> -	xfs_ail_wq = alloc_workqueue("xfsail", WQ_CPU_INTENSIVE, 8);
> +	xfs_ail_wq = alloc_workqueue("xfsail", WQ_NON_REENTRANT, 8);
>  	if (!xfs_ail_wq)
>  		goto out_destroy_syncd;

Drop this hunk....

>  
> Index: linux-2.6/fs/xfs/xfs_trans_ail.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:28.172765827 -0400
> +++ linux-2.6/fs/xfs/xfs_trans_ail.c	2011-09-21 08:02:46.843266108 -0400
> @@ -538,7 +538,7 @@ out_done:
>  	}
>  
>  	/* There is more to do, requeue us.  */
> -	queue_delayed_work(xfs_syncd_wq, &ailp->xa_work,
> +	queue_delayed_work(xfs_ail_wq, &ailp->xa_work,
>  					msecs_to_jiffies(tout));
>  }
>  
> @@ -575,7 +575,7 @@ xfs_ail_push(
>  	smp_wmb();
>  	xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &threshold_lsn);
>  	if (!test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags))
> -		queue_delayed_work(xfs_syncd_wq, &ailp->xa_work, 0);
> +		queue_delayed_work(xfs_ail_wq, &ailp->xa_work, 0);
>  }

just keep these. Can you repost with a sign-off?

Cheers,

Dave
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 11:42                                     ` Dave Chinner
  2011-09-21 11:55                                       ` Stefan Priebe - Profihost AG
  2011-09-21 12:26                                       ` Christoph Hellwig
@ 2011-09-22  0:53                                       ` Dave Chinner
  2011-09-22  5:27                                         ` Stefan Priebe - Profihost AG
  2 siblings, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2011-09-22  0:53 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Christoph Hellwig, xfs-masters, xfs

On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost AG wrote:
> > Am 21.09.2011 04:11, schrieb Dave Chinner:
> > >Also, what phase do you see it hanging in? the random stat phase is
> > >terribly slow on spinning disks, so if I can avoid that it would be
> > >nice....
> > Creating or deleting files. never in the stat phase.
> 
> Ok, I got a hang in the random delete phase. Not sure what is wrong
> yet, but inode reclaim is trying to reclaim inodes but failing, and
> the AIL is trying to push items but failing. Hence the tail of the
> log is not being moved forward and new transactions are being
> blocked until log space becomes available.
> 
> The AIL is particularly interesting. The number of pushes being
> executed is precisely 50/s, and precisely 5000 items/s are being
> scanned. All those items are pinned, so the "stuck" processing is
> what is triggering this pattern.
> 
> Thing is, all the items are apparently pinned - I see that stat
> incrementing at 5,000/s. It's here:
> 
>                 case XFS_ITEM_PINNED:
>                         XFS_STATS_INC(xs_push_ail_pinned);
>                         stuck++;
>                         flush_log = 1;
>                         break;
> 
> so we should have the flush_log variable set. However, this code:
> 
>         if (flush_log) {
>                 /*
>                  * If something we need to push out was pinned, then
>                  * push out the log so it will become unpinned and
>                  * move forward in the AIL.
>                  */
>                 XFS_STATS_INC(xs_push_ail_flush);
>                 xfs_log_force(mp, 0);
>         }
> 
> never seems to execute. I don't see the xs_push_ail_flush stat
> increase, nor the log force counter increase, either. Hence the
> pinned items are not getting unpinned, and progress is not being
> made. Background inode reclaim is not making progress, either,
> because it skips pinned inodes.
> 
> The AIL code is clearly cycling - the push counter is increasing,
> and the run numbers match the stuck code precisely (aborts at 100
> stuck items a cycle). The question is now why isn't the log force
> being triggered.
> 
> Given this, just triggering a log force should get everything
> moving again. Running "echo 2 > /proc/sys/vm/drop_caches" gets inode
> reclaim running in sync mode, which causes pinned inodes to trigger
> a log force. And once I've done this, everything starts running
> again.
> 
> So, the log force not triggering in the AIL code looks to be the
> problem. That, I simply cannot explain right now - it makes no sense
> but that is what all the stats and trace events point to. I need to
> do more investigation.

Ok, it makes sense now. The kernel I was running (from before I went
on holidays) had this patch in it:

http://oss.sgi.com/archives/xfs/2011-08/msg00472.html

I found this out by disassembling the kernel code. That code has a
bug in it when the stuck case is hit - it fails to issue the log
force in that case, and that's why I've been seeing this kernel get
stuck. False alarm - will now try to reproduce without any dev
patches in the kernel.
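
For the record, the disassembly check is straightforward if the kernel
was built with debug info - the vmlinux path here is just an example,
and for a modular build you can point objdump at the xfs module instead:

  # look at the stuck-case handling and the xfs_log_force call that
  # should follow it
  gdb -batch -ex 'disassemble xfs_ail_worker' ./vmlinux
  objdump -d fs/xfs/xfs.ko | less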

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-22  0:53                                       ` Dave Chinner
@ 2011-09-22  5:27                                         ` Stefan Priebe - Profihost AG
  2011-09-22  7:52                                           ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-22  5:27 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs

On 22.09.2011 02:53, Dave Chinner wrote:
> On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
>> On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost AG wrote:

> I found this out by disassembling the kernel code. That code has a
> bug in it when the stuck case is hit - it fails to issue the log
> force in that case, and that's why I've been seeing this kernel get
> stuck. False alarm - will now try to reproduce without any dev
> patches in the kernel.
Sad to hear that ;-( I'm now trying to prepare a 160GB dd image for you 
where it is reproducible.

Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-22  5:27                                         ` Stefan Priebe - Profihost AG
@ 2011-09-22  7:52                                           ` Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-22  7:52 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs-masters, xfs

On 22.09.2011 07:27, Stefan Priebe - Profihost AG wrote:
> Am 22.09.2011 02:53, schrieb Dave Chinner:
>> On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
>>> On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost
>>> AG wrote:
>
>> I found this out by disassembling the kernel code. That code has a
>> bug in it when the stuck case is hit - it fails to issue the log
>> force in that case, and that's why I've been seeing this kernel get
>> stuck. False alarm - will now try to reproduce without any dev
>> patches in the kernel.
> Sad to hear that ;-( I'm now trying to prepare a 160GB dd image for you
> where it is reproducible.
>

The test image is ready. I'm just packing it and uploading it. I'll send 
you all the details via private mail.

Hope that helps.

Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-21 23:07                                         ` Dave Chinner
@ 2011-09-22 14:14                                           ` Christoph Hellwig
  2011-09-22 21:49                                             ` Dave Chinner
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-22 14:14 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs, Stefan Priebe - Profihost AG

On Thu, Sep 22, 2011 at 09:07:18AM +1000, Dave Chinner wrote:
> No, that's not possible. The XFS_AIL_PUSHING_BIT ensures that there
> is only one instance of AIL pushing per struct xfs_ail running at
> once. It's also backed up by the fact that I couldn't find a single
> worker thread blocked running AIL pushing - it ran the 100 item
> scan, got stuck, requeued itself to run again 20ms later....

True, it should prevent that - this was just my only theory based
on the (incorrect) assumption that we'd never get to the log force.

> FYI, what we want the concurrency for in the AIL wq is for multiple
> filesystems to be able to run AIL pushing at the same time, which
> is why it was set up this way. If one filesystem AIL push blocks,
> then an unblocked one will simply run.

A WQ_NON_REENTRANT workqueue will still provide that.  From the
documentation:

        By default, a wq guarantees non-reentrance only on the same
	CPU.  A work item may not be executed concurrently on the same
	CPU by multiple workers but is allowed to be executed
	concurrently on multiple CPUs.  This flag makes sure
	non-reentrance is enforced across all CPUs.  Work items queued
	to a non-reentrant wq are guaranteed to be executed by at most
	one worker system-wide at any given time.

So this still seems preferable for the ail workqueue, and should be
able to replace the XFS_AIL_PUSHING_BIT protections.

I also suspect that we should mark the ail workqueue as WQ_MEM_RECLAIM -
a lot of memory reclaim really requires moving the AIL forward.
Currently we have other ways to reclaim inodes, but e.g. for buffers
we rely entirely on AIL pushing, and with the proposed metadata
writeback changes we're going to rely even more on the AIL.  Even if
we still keep an emergency synchronous fallback around, it's going to be
a lot less efficient than real AIL pushing under actual OOM conditions.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-22 14:14                                           ` Christoph Hellwig
@ 2011-09-22 21:49                                             ` Dave Chinner
  2011-09-22 22:01                                               ` Christoph Hellwig
  0 siblings, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2011-09-22 21:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs, Stefan Priebe - Profihost AG

On Thu, Sep 22, 2011 at 10:14:57AM -0400, Christoph Hellwig wrote:
> On Thu, Sep 22, 2011 at 09:07:18AM +1000, Dave Chinner wrote:
> > No, that's not possible. The XFS_AIL_PUSHING_BIT ensures that there
> > is only one instance of AIL pushing per struct xfs_ail running at
> > once. It's also backed up by the fact that I couldn't find a single
> > worker thread blocked running AIL pushing - it ran the 100 item
> > scan, got stuck, requeued itself to run again 20ms later....
> 
> True, it should prevent that - this was just my only theory based
> on the (incorrect) assumption that we'd never get to the log force.
> 
> > FYI, what we want the concurrency for in the AIL wq is for multiple
> > filesystems to be able to run AIL pushing at the same time, which
> > is why it was set up this way. If one filesystem AIL push blocks,
> > then an unblocked one will simply run.
> 
> A WQ_NON_REENTRANT workqueue will still provide that.  From the
> documentation:
> 
>         By default, a wq guarantees non-reentrance only on the same
> 	CPU.  A work item may not be executed concurrently on the same
> 	CPU by multiple workers but is allowed to be executed
> 	concurrently on multiple CPUs.  This flag makes sure
> 	non-reentrance is enforced across all CPUs.  Work items queued
> 	to a non-reentrant wq are guaranteed to be executed by at most
> 	one worker system-wide at any given time.
> 
> So this still seems preferable for the ail workqueue, and should be
> able to replace the XFS_AIL_PUSHING_BIT protections.

No, we can't. WQ_NON_REENTRANT only protects against concurrency on
the same CPU, not across all CPUs - it still allows concurrent
per-CPU work processing on the same work item.

However, we want only a *single* AIL worker instance executing per
filesystem, not per-cpu per filesystem. Concurrent per-filesystem
workers will simply bash on the AIL lock trying to walk the AIL at
the same time, and this is precisely the issue the single AIL worker
setup is avoiding. The XFS_AIL_PUSHING_BIT is what enforces the
single per-filesystem push worker running at any time.

> I also suspect that we should mark the ail workqueue as WQ_MEM_RECLAIM -
> a lot of memory reclaim really requires moving the AIL forward.

Possibly, but I'm not sure it is necessary.

> Currently we have other ways to reclaim inodes, but e.g. for buffers
> we rely entirely on AIL pushing,

We have the xfs_buf shrinker that walks the LRU and frees clean
buffers.

> and with the proposed metadata
> writeback changes we're going to rely even more on the ail, even if
> we still keep emergency synchronous around it's going to be a lot
> less efficient than real ail pushing under actual OOM conditions.

The inode shrinker kicks the AIL pushing - if we cannot get memory
to queue the work, then the very next iteration of the shrinker will
try again. Hence I'm not sure that it is absolutely necessary,
though it probably won't hurt...

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-22 21:49                                             ` Dave Chinner
@ 2011-09-22 22:01                                               ` Christoph Hellwig
  2011-09-23  5:28                                                 ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Hellwig @ 2011-09-22 22:01 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs, Stefan Priebe - Profihost AG

On Fri, Sep 23, 2011 at 07:49:56AM +1000, Dave Chinner wrote:
> On Thu, Sep 22, 2011 at 10:14:57AM -0400, Christoph Hellwig wrote:
> >         By default, a wq guarantees non-reentrance only on the same
> > 	CPU.  A work item may not be executed concurrently on the same
> > 	CPU by multiple workers but is allowed to be executed
> > 	concurrently on multiple CPUs.  This flag makes sure
> > 	non-reentrance is enforced across all CPUs.  Work items queued
> > 	to a non-reentrant wq are guaranteed to be executed by at most
> > 	one worker system-wide at any given time.
> > 
> > So this still seems to preferable for the ail workqueue, and should be
> > able to replace the XFS_AIL_PUSHING_BIT protections.
> 
> No, we can't. WQ_NON_REENTRANT only protects against concurrency on
> the same CPU, not across all CPUs - it still allows concurrent
> per-CPU work processing on the same work item.

Not being executed concurrently for a given work_struct on the same CPU
is the default; WQ_NON_REENTRANT extends that to not being executed
concurrently at all.  Check the documentation above again, or the code -
just look for the only occurrence of WQ_NON_REENTRANT in
kernel/workqueue.c and the surrounding code (e.g.
find_worker_executing_work and the current_work field in struct worker)
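
i.e. from the top of a kernel tree, something like:

  grep -n 'WQ_NON_REENTRANT\|find_worker_executing_work' kernel/workqueue.c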

> However, we want only a *single* AIL worker instance executing per
> filesystem, not per-cpu per filesystem. Concurrent per-filesystem
> workers will simply bash on the AIL lock trying to walk the AIL at
> the same time, and this is precisely the issue the single AIL worker
> setup is avoiding. The XFS_AIL_PUSHING_BIT is what enforces the
> single per-filesystem push worker running at any time.

I think that's exactly what WQ_NON_REENTRANT is intended for.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
  2011-09-22 22:01                                               ` Christoph Hellwig
@ 2011-09-23  5:28                                                 ` Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-23  5:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Hi,

you can reproduce the issue faster if you set elevator=noop when 
booting. It then always happens on the first run of deleting random files.
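
The I/O scheduler can also be switched per device at runtime, so a
reboot is not strictly needed to test this - the device name below is
just an example:

  cat /sys/block/sda/queue/scheduler      # e.g. "noop deadline [cfq]"
  echo noop > /sys/block/sda/queue/scheduler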

@Dave:
Were you able to reproduce it too?

Greets
Stefan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* xfs deadlock in stable kernel 3.0.4
@ 2011-09-11 13:12 Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 50+ messages in thread
From: Stefan Priebe - Profihost AG @ 2011-09-11 13:12 UTC (permalink / raw)
  To: linux-fsdevel

Hello List,

On some of our heavily loaded servers using xfs we're seeing a deadlock 
where reading/writing to the xfs filesystem suddenly stops working. They 
seem to be running out of log space and then xfs deadlocks.

Here you can find sysrq w triggered log messages of the locked processes.

http://pastebin.com/JWjrbrh4

Please help! Thanks!

Please cc me i'm not subscribed.

Stefan

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2011-09-23  5:28 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-10 12:23 xfs deadlock in stable kernel 3.0.4 Stefan Priebe
2011-09-12 15:21 ` Christoph Hellwig
2011-09-12 16:46   ` Stefan Priebe
2011-09-12 20:05     ` Christoph Hellwig
2011-09-13  6:04       ` Stefan Priebe - Profihost AG
2011-09-13 19:31         ` Stefan Priebe - Profihost AG
2011-09-13 20:50         ` Christoph Hellwig
2011-09-13 21:52           ` [xfs-masters] " Alex Elder
2011-09-13 21:58             ` Alex Elder
2011-09-13 22:26               ` Christoph Hellwig
2011-09-14  7:26           ` Stefan Priebe - Profihost AG
2011-09-14  7:48             ` Stefan Priebe - Profihost AG
2011-09-14  8:49               ` Stefan Priebe - Profihost AG
2011-09-14 14:30                 ` Christoph Hellwig
2011-09-14 14:30               ` Christoph Hellwig
2011-09-14 16:06                 ` Stefan Priebe - Profihost AG
2011-09-18  9:14                 ` Stefan Priebe - Profihost AG
2011-09-18 20:04                   ` Christoph Hellwig
2011-09-19 10:54                     ` Stefan Priebe - Profihost AG
2011-09-18 23:02                   ` Dave Chinner
2011-09-20  0:47                     ` Stefan Priebe
2011-09-20  1:01                       ` Stefan Priebe
2011-09-20 10:09                     ` Stefan Priebe - Profihost AG
2011-09-20 16:02                       ` Christoph Hellwig
2011-09-20 17:23                         ` Stefan Priebe - Profihost AG
2011-09-20 17:24                           ` Christoph Hellwig
2011-09-20 17:35                             ` Stefan Priebe - Profihost AG
2011-09-20 22:30                               ` Christoph Hellwig
2011-09-21  2:11                                 ` [xfs-masters] " Dave Chinner
2011-09-21  7:40                                   ` Stefan Priebe - Profihost AG
2011-09-21 11:42                                     ` Dave Chinner
2011-09-21 11:55                                       ` Stefan Priebe - Profihost AG
2011-09-21 12:26                                       ` Christoph Hellwig
2011-09-21 13:42                                         ` Stefan Priebe
2011-09-21 16:48                                         ` Stefan Priebe - Profihost AG
2011-09-21 17:26                                           ` Stefan Priebe - Profihost AG
2011-09-21 19:01                                         ` Stefan Priebe - Profihost AG
2011-09-21 23:07                                         ` Dave Chinner
2011-09-22 14:14                                           ` Christoph Hellwig
2011-09-22 21:49                                             ` Dave Chinner
2011-09-22 22:01                                               ` Christoph Hellwig
2011-09-23  5:28                                                 ` Stefan Priebe - Profihost AG
2011-09-22  0:53                                       ` Dave Chinner
2011-09-22  5:27                                         ` Stefan Priebe - Profihost AG
2011-09-22  7:52                                           ` Stefan Priebe - Profihost AG
2011-09-21  7:36                                 ` Stefan Priebe - Profihost AG
2011-09-21 11:39                                   ` Christoph Hellwig
2011-09-21 13:39                                     ` Stefan Priebe
2011-09-21 14:17                                       ` Christoph Hellwig
2011-09-11 13:12 Stefan Priebe - Profihost AG
