All of lore.kernel.org
* 2.6.31 xfs_fs_destroy_inode: cannot reclaim
@ 2009-09-16 10:27 Tommy van Leeuwen
  2009-09-17 18:59 ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Tommy van Leeuwen @ 2009-09-16 10:27 UTC (permalink / raw)
  To: xfs

Hello All,

We reported this error on the list about 1 or 2 months ago, and a lot
of fixes were applied during that time. However, we still experience
the problem with the recent 2.6.31 tree. We've also added an extra log
message to aid in debugging:

printk("XFS: inode_init_always failed to re-initialize inode\n");

However, we never saw this message logged!

Here's a screenshot of our latest crash:
http://www.news-service.com/tmp/sb06-20090916.jpg

Here's the config used just in case:
http://www.news-service.com/tmp/config-2.6.31.txt

For now we've downgraded to 2.6.28 again. Please let me know if we can
do something to better troubleshoot this. We have a set of 8 servers
which can easily reproduce this. It mostly happens within a few days
after a clean reboot.

Kind Regards,
Tommy van Leeuwen

-- 
**Warning** New Address from May 25th, 2009!
News-Service.com - European Usenet Provider
Pobox 12026, 1100 AA Amsterdam, Netherlands
http://www.news-service.com - +3120-3981111

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-09-16 10:27 2.6.31 xfs_fs_destroy_inode: cannot reclaim Tommy van Leeuwen
@ 2009-09-17 18:59 ` Christoph Hellwig
  2009-09-29 10:15   ` Patrick Schreurs
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-09-17 18:59 UTC (permalink / raw)
  To: Tommy van Leeuwen; +Cc: xfs

On Wed, Sep 16, 2009 at 12:27:21PM +0200, Tommy van Leeuwen wrote:
> Hello All,
> 
> We reported this error on the list about 1 or 2 months ago, and a lot
> of fixes were applied during that time. However, we still experience
> the problem with the recent 2.6.31 tree. We've also added an extra log
> message to aid in debugging:
> 
> printk("XFS: inode_init_always failed to re-initialize inode\n");
> 
> However, we never saw this message logged!

Can you try the patch below?  It does two things:

 - removes all the reclaimable flagging when we reclaim the inode
   directly.  This removes any possibility of racing with the reclaim
   thread.
 - adds asserts that fire if one of the reclaim-related flags is
   already set.


Index: xfs/fs/xfs/xfs_vnodeops.c
===================================================================
--- xfs.orig/fs/xfs/xfs_vnodeops.c	2009-09-17 14:39:37.799003843 -0300
+++ xfs/fs/xfs/xfs_vnodeops.c	2009-09-17 14:50:14.987005862 -0300
@@ -2460,39 +2460,35 @@ int
 xfs_reclaim(
 	xfs_inode_t	*ip)
 {
-
 	xfs_itrace_entry(ip);
 
 	ASSERT(!VN_MAPPED(VFS_I(ip)));
 
 	/* bad inode, get out here ASAP */
-	if (is_bad_inode(VFS_I(ip))) {
-		xfs_ireclaim(ip);
-		return 0;
-	}
+	if (is_bad_inode(VFS_I(ip)))
+		goto out_reclaim;
 
 	xfs_ioend_wait(ip);
 
 	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
 
 	/*
+	 * We should never get here with one of the reclaim flags already set.
+	 */
+	BUG_ON(xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	BUG_ON(xfs_iflags_test(ip, XFS_IRECLAIM));
+
+	/*
 	 * If we have nothing to flush with this inode then complete the
-	 * teardown now, otherwise break the link between the xfs inode and the
-	 * linux inode and clean up the xfs inode later. This avoids flushing
-	 * the inode to disk during the delete operation itself.
-	 *
-	 * When breaking the link, we need to set the XFS_IRECLAIMABLE flag
-	 * first to ensure that xfs_iunpin() will never see an xfs inode
-	 * that has a linux inode being reclaimed. Synchronisation is provided
-	 * by the i_flags_lock.
+	 * teardown now, otherwise delay the flush operation.
 	 */
-	if (!ip->i_update_core && (ip->i_itemp == NULL)) {
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_iflock(ip);
-		xfs_iflags_set(ip, XFS_IRECLAIMABLE);
-		return xfs_reclaim_inode(ip, 1, XFS_IFLUSH_DELWRI_ELSE_SYNC);
+	if (ip->i_update_core || ip->i_itemp) {
+		xfs_inode_set_reclaim_tag(ip);
+		return 0;
 	}
-	xfs_inode_set_reclaim_tag(ip);
+
+out_reclaim:
+	xfs_ireclaim(ip);
 	return 0;
 }
 


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-09-17 18:59 ` Christoph Hellwig
@ 2009-09-29 10:15   ` Patrick Schreurs
  2009-09-29 12:57     ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2009-09-29 10:15 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, xfs

Christoph Hellwig wrote:
> On Wed, Sep 16, 2009 at 12:27:21PM +0200, Tommy van Leeuwen wrote:
>> Hello All,
>>
>> We reported this error on the list about 1 or 2 months ago, and a lot
>> of fixes were applied during that time. However, we still experience
>> the problem with the recent 2.6.31 tree. We've also added an extra log
>> message to aid in debugging:
>>
>> printk("XFS: inode_init_always failed to re-initialize inode\n");
>>
>> However, we never saw this message logged!
> 
> Can you try the patch below?  It does two things:
> 
>  - removes all the reclaimable flagging when we reclaim the inode
>    directly.  This removes any possibility of racing with the reclaim
>    thread.
>  - adds asserts that fire if one of the reclaim-related flags is
>    already set.

Update: We've applied this patch on 2 servers, and they haven't crashed 
so far. Today we've applied the patch on 6 other servers.

We'll keep you posted.

-Patrick


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-09-29 10:15   ` Patrick Schreurs
@ 2009-09-29 12:57     ` Christoph Hellwig
  2009-09-30 10:48       ` Patrick Schreurs
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-09-29 12:57 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

On Tue, Sep 29, 2009 at 12:15:42PM +0200, Patrick Schreurs wrote:
> Update: We've applied this patch on 2 servers, and they haven't
> crashed so far. Today we've applied the patch on 6 other servers.

Thanks.  I'll prepare a patch for upstream, as the patch is extremely
useful by itself.  If other issues show up I'll fix them on top of it.


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-09-29 12:57     ` Christoph Hellwig
@ 2009-09-30 10:48       ` Patrick Schreurs
  2009-09-30 12:41         ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2009-09-30 10:48 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, xfs

[-- Attachment #1: Type: text/plain, Size: 544 bytes --]

Christoph Hellwig wrote:
> On Tue, Sep 29, 2009 at 12:15:42PM +0200, Patrick Schreurs wrote:
>> Update: We've applied this patch on 2 servers, and they haven't
>> crashed so far. Today we've applied the patch on 6 other servers.
> 
> Thanks.  I'll prepare a patch for upstream, as the patch is extremely
> useful by itself.  If other issues show up I'll fix them on top of it.

Unfortunately we had a server crash last night. Please see the 
attachment; I hope it helps. Please advise if there is anything we 
could do to assist you.

Thanks,

-Patrick

[-- Attachment #2: sb03-090930.jpg --]
[-- Type: image/jpeg, Size: 78176 bytes --]

[-- Attachment #3: Type: text/plain, Size: 121 bytes --]


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-09-30 10:48       ` Patrick Schreurs
@ 2009-09-30 12:41         ` Christoph Hellwig
  2009-10-02 14:24           ` Bas Couwenberg
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-09-30 12:41 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

On Wed, Sep 30, 2009 at 12:48:55PM +0200, Patrick Schreurs wrote:
> Christoph Hellwig wrote:
>> On Tue, Sep 29, 2009 at 12:15:42PM +0200, Patrick Schreurs wrote:
>>> Update: We've applied this patch on 2 servers, and they haven't
>>> crashed so far. Today we've applied the patch on 6 other servers.
>>
>> Thanks.  I'll prepare a patch for upstream, as the patch is extremely
>> useful by itself.  If other issues show up I'll fix them on top of it.
>
> Unfortunately we had a server crash last night. Please see the
> attachment; I hope it helps. Please advise if there is anything we
> could do to assist you.

Can't really see much there except some common code.  Can you boot
the machine with a larger console resolution (vga= kernel parameter)
so a full backtrace can be captured?
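(For reference, a larger console is usually obtained by selecting a VESA framebuffer mode on the kernel command line. The entry below is an illustrative sketch for GRUB legacy; the kernel image name, root device, and mode number are assumptions, not taken from this thread.)

```
# vga=791 selects a 1024x768, 16 bpp console; "vga=ask" prompts at boot
# with the modes the card actually supports.
title Linux 2.6.31 (large console)
root (hd0,0)
kernel /boot/vmlinuz-2.6.31 root=/dev/sda1 ro vga=791
```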


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-09-30 12:41         ` Christoph Hellwig
@ 2009-10-02 14:24           ` Bas Couwenberg
  2009-10-05 21:43             ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Bas Couwenberg @ 2009-10-02 14:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Patrick Schreurs, Tommy van Leeuwen, XFS List

[-- Attachment #1: Type: text/plain, Size: 526 bytes --]

Dear Christoph,

Yesterday two of our servers (2.6.31.1 + your patch) crashed again. This 
time we have a bigger console, but unfortunately not the full backtrace.

I did manage to get some more call trace info from the logs, which I 
have attached together with screenshots of the crash screens.

I hope this info helps you.

Kind Regards,

Bas Couwenberg

-- 
News-Service.com - European Usenet Provider
Luttenbergweg 4, 1101 EC Amsterdam
P.O BOX: 12026 1100 AA, Netherlands
http://www.news-service.com  +31(0)20 398 1111

[-- Attachment #2: sb05-20091001.jpg --]
[-- Type: image/jpeg, Size: 98378 bytes --]

[-- Attachment #3: sb06-20091001.jpg --]
[-- Type: image/jpeg, Size: 99958 bytes --]

[-- Attachment #4: sb06-kernel.log --]
[-- Type: text/x-log, Size: 20516 bytes --]

Oct  1 22:44:01 sb06 kernel:
Oct  1 22:44:01 sb06 kernel: Call Trace:
Oct  1 22:44:01 sb06 kernel: [<ffffffff810e69e4>] ? xfs_bmap_read_extents+0x274/0x30c
Oct  1 22:44:01 sb06 kernel: [<ffffffff810e7f44>] ? xfs_bmapi+0x25d/0xea8
Oct  1 22:44:01 sb06 kernel: [<ffffffff8113de7c>] ? swiotlb_map_page+0x73/0xe1
Oct  1 22:44:01 sb06 kernel: [<ffffffff81055e8c>] ? find_get_page+0x1a/0x77
Oct  1 22:44:01 sb06 kernel: [<ffffffff8105689d>] ? find_or_create_page+0x2d/0x88
Oct  1 22:44:01 sb06 kernel: [<ffffffff81103e59>] ? xfs_iomap+0x145/0x284
Oct  1 22:44:01 sb06 kernel: [<ffffffff811174cb>] ? __xfs_get_blocks+0x6c/0x15c
Oct  1 22:44:01 sb06 kernel: [<ffffffff811175cc>] ? xfs_get_blocks+0x0/0xe
Oct  1 22:44:01 sb06 kernel: [<ffffffff811175cc>] ? xfs_get_blocks+0x0/0xe
Oct  1 22:44:01 sb06 kernel: [<ffffffff810a0fd6>] ? mpage_readpages+0xbd/0xff
Oct  1 22:44:01 sb06 kernel: [<ffffffff81102664>] ? xfs_iread+0x152/0x166
Oct  1 22:44:01 sb06 kernel: [<ffffffff8103e723>] ? bit_waitqueue+0x10/0x8b
Oct  1 22:44:01 sb06 kernel: [<ffffffff8105cb1c>] ? __do_page_cache_readahead+0x125/0x1b1
Oct  1 22:44:01 sb06 kernel: [<ffffffff8105cdb6>] ? ondemand_readahead+0x11f/0x1a7
Oct  1 22:44:01 sb06 kernel: [<ffffffff810a0b84>] ? do_mpage_readpage+0x163/0x486
Oct  1 22:44:01 sb06 kernel: [<ffffffff81136dd2>] ? radix_tree_insert+0xd7/0x19f
Oct  1 22:44:01 sb06 kernel: [<ffffffff8105626b>] ? add_to_page_cache_locked+0x72/0x98
Oct  1 22:44:01 sb06 kernel: [<ffffffff811175cc>] ? xfs_get_blocks+0x0/0xe
Oct  1 22:44:01 sb06 kernel: [<ffffffff8111e0c8>] ? xfs_read+0x16e/0x1de
Oct  1 22:44:01 sb06 kernel: [<ffffffff8107c889>] ? do_sync_read+0xce/0x113
Oct  1 22:44:01 sb06 kernel: [<ffffffff8107d3ec>] ? sys_read+0x45/0x6e
Oct  1 22:44:01 sb06 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
Oct  1 22:44:01 sb06 kernel: IP: [<ffffffff810f9516>] xfs_dir2_sf_lookup+0xe3/0x219
Oct  1 22:44:01 sb06 kernel: Oops: 0000 [#1] SMP 
Oct  1 22:44:01 sb06 kernel: CPU 2 
Oct  1 22:44:01 sb06 kernel: Pid: 6804, comm: diablo Not tainted 2.6.31.1xfspatch #4 PowerEdge 1950
Oct  1 22:44:01 sb06 kernel: RSP: 0018:ffff88017ce8db68  EFLAGS: 00010202
Oct  1 22:44:01 sb06 kernel: RAX: 0000000000000006 RBX: 0000000000000000 RCX: 00000000e62cdb77
Oct  1 22:44:01 sb06 kernel: RDX: 00000000e62cc212 RSI: 0000000000000002 RDI: ffff88017ce8dbb8
Oct  1 22:44:01 sb06 kernel: [<ffffffff811175cc>] ? xfs_get_blocks+0x0/0xe
Oct  1 22:44:01 sb06 kernel: [<ffffffff8105ad0c>] ? __alloc_pages_nodemask+0xf8/0x524
Oct  1 22:44:01 sb06 kernel: FS:  0000000001369860(0063) GS:ffff880028066000(0000) knlGS:0000000000000000
Oct  1 22:44:01 sb06 kernel: [<ffffffff810574e4>] ? generic_file_aio_read+0x1ff/0x548
Oct  1 22:44:01 sb06 kernel: [<ffffffff8103e7ed>] ? autoremove_wake_function+0x0/0x2e
Oct  1 22:44:01 sb06 kernel: [<ffffffff8107d294>] ? vfs_read+0xaa/0x146
Oct  1 22:44:01 sb06 kernel: [<ffffffff8100adab>] ? system_call_fastpath+0x16/0x1b
Oct  1 22:44:01 sb06 kernel: PGD 17ce81067 PUD 17ce82067 PMD 0 
Oct  1 22:44:01 sb06 kernel: last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Oct  1 22:44:01 sb06 kernel: Modules linked in: acpi_cpufreq cpufreq_ondemand ipmi_si ipmi_devintf ipmi_msghandler bonding serio_raw mptspi rng_core scsi_transport_spi bnx2 processor thermal 8250_pnp 8250 serial_core thermal_sys
Oct  1 22:44:01 sb06 kernel: RIP: 0010:[<ffffffff810f9516>]  [<ffffffff810f9516>] xfs_dir2_sf_lookup+0xe3/0x219
Oct  1 22:44:01 sb06 kernel: CR2: 0000000000000001 CR3: 000000017ce80000 CR4: 00000000000006a0
Oct  1 22:44:01 sb06 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct  1 22:44:01 sb06 kernel: RBP: 0000000000000000 R08: ffff880005cc3c00 R09: ffff88022d867080
Oct  1 22:44:01 sb06 kernel: R10: ffffffff813457b0 R11: ffff88017f661cd0 R12: ffff88017ce8dbb8
Oct  1 22:44:01 sb06 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff88017ce8dc98
Oct  1 22:44:01 sb06 kernel: Process diablo (pid: 6804, threadinfo ffff88017ce8c000, task ffff88022d867080)
Oct  1 22:44:01 sb06 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct  1 22:44:01 sb06 kernel: <0> 0000000000000000 0000000000000000 ffff88017ce8dc98 ffffffff810f2754
Oct  1 22:44:01 sb06 kernel: <0> ffff8800a540b900 ffff88022d8672f8 ffff880154034e20 0000000000000006
Oct  1 22:44:01 sb06 kernel: [<ffffffff810f2754>] ? xfs_dir_lookup+0xa5/0x147
Oct  1 22:44:01 sb06 kernel: [<ffffffff81083b3d>] ? do_lookup+0xd5/0x1b3
Oct  1 22:44:01 sb06 kernel: [<ffffffff810858f0>] ? __link_path_walk+0x966/0xe0d
Oct  1 22:44:01 sb06 kernel: [<ffffffff8107db53>] ? get_empty_filp+0x70/0x119
Oct  1 22:44:01 sb06 kernel: [<ffffffff81085fc5>] ? path_walk+0x66/0xca
Oct  1 22:44:01 sb06 kernel: [<ffffffff8108ee1c>] ? alloc_fd+0x67/0x10b
Oct  1 22:44:01 sb06 kernel: [<ffffffff8100adab>] ? system_call_fastpath+0x16/0x1b
Oct  1 22:44:01 sb06 kernel: RIP  [<ffffffff810f9516>] xfs_dir2_sf_lookup+0xe3/0x219
Oct  1 22:44:01 sb06 kernel: CR2: 0000000000000001
Oct  1 22:44:01 sb06 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct  1 22:44:01 sb06 kernel: Stack:
Oct  1 22:44:01 sb06 kernel: 00000000000107c0 ffff880005cc3c00 0000000000000000 ffff88017ce8dbb8
Oct  1 22:44:01 sb06 kernel: Call Trace:
Oct  1 22:44:01 sb06 kernel: [<ffffffff81114cc0>] ? xfs_lookup+0x47/0xa3
Oct  1 22:44:01 sb06 kernel: [<ffffffff8111c885>] ? xfs_vn_lookup+0x3c/0x7b
Oct  1 22:44:01 sb06 kernel: [<ffffffff81083acb>] ? do_lookup+0x63/0x1b3
Oct  1 22:44:01 sb06 kernel: [<ffffffff8108ad14>] ? dput+0x23/0x13d
Oct  1 22:44:01 sb06 kernel: [<ffffffff810860f7>] ? do_path_lookup+0x20/0x41
Oct  1 22:44:01 sb06 kernel: [<ffffffff81086c68>] ? do_filp_open+0xe3/0x92a
Oct  1 22:44:01 sb06 kernel: [<ffffffff8107b24d>] ? do_sys_open+0x55/0x103
Oct  1 22:44:01 sb06 kernel: Code: 18 09 c2 0f b6 43 07 c1 e0 10 09 c2 0f b6 43 08 c1 e0 08 09 c2 48 09 d1 49 89 4c 24 28 41 c7 44 24 7c 01 00 00 00 e9 d2 00 00 00 <80> 7b 01 01 19 c0 45 31 ff 83 e0 fc 45 31 ed 83 c0 0a 48 98 48 
Oct  1 22:44:01 sb06 kernel: RSP <ffff88017ce8db68>
Oct  1 22:44:01 sb06 kernel: ---[ end trace 6e14835b29b5648a ]---
Oct  1 22:44:01 sb06 kernel: Filesystem "sdt": XFS internal error xfs_bmap_read_extents(1) at line 4648 of file fs/xfs/xfs_bmap.c.  Caller 0xffffffff81101202
Oct  1 22:44:01 sb06 kernel: Pid: 6771, comm: diablo Not tainted 2.6.31.1xfspatch #4
Oct  1 22:44:01 sb06 kernel: [<ffffffff81101202>] ? xfs_iread_extents+0xac/0xc8
Oct  1 22:44:01 sb06 kernel: [<ffffffff81101202>] ? xfs_iread_extents+0xac/0xc8
Oct  1 22:44:01 sb06 kernel: [<ffffffff810fe917>] ? xfs_iext_bno_to_ext+0xba/0x140
Oct  1 22:44:01 sb06 kernel: [<ffffffffa0042973>] ? bnx2_start_xmit+0x19a/0x3db [bnx2]
Oct  1 22:44:01 sb06 kernel: [<ffffffff81056104>] ? find_lock_page+0x15/0x50
Oct  1 22:44:01 sb06 kernel: [<ffffffff812409c8>] ? __down_write_nested+0x15/0x9d
Oct  1 22:44:01 sb06 kernel: [<ffffffff81116bda>] ? kmem_zone_alloc+0x5e/0xa4
Oct  1 22:44:01 sb06 kernel: [<ffffffff8105689d>] ? find_or_create_page+0x2d/0x88
Oct  1 22:44:01 sb06 kernel: Filesystem "sdt": corrupt dinode 1208050920, (btree extents).  Unmount and run xfs_repair.
Oct  1 22:45:04 sb06 kernel: ------------[ cut here ]------------
Oct  1 22:45:04 sb06 kernel: invalid opcode: 0000 [#2] SMP 
Oct  1 22:45:04 sb06 kernel: CPU 2 
Oct  1 22:45:04 sb06 kernel: kernel BUG at fs/xfs/xfs_iget.c:334!
Oct  1 22:45:04 sb06 kernel: last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Oct  1 22:45:04 sb06 kernel: Modules linked in: acpi_cpufreq cpufreq_ondemand ipmi_si ipmi_devintf ipmi_msghandler bonding serio_raw mptspi rng_core scsi_transport_spi bnx2 processor thermal 8250_pnp 8250 serial_core thermal_sys
Oct  1 22:45:04 sb06 kernel: RIP: 0010:[<ffffffff810fe33f>]  [<ffffffff810fe33f>] xfs_iget+0x2e3/0x424
Oct  1 22:45:04 sb06 kernel: RDX: ffff880119c19080 RSI: 0000000000000296 RDI: ffff880005cc3c8c
Oct  1 22:45:04 sb06 kernel: R10: 0000000000000002 R11: 0001400100014004 R12: ffff88022d0c783c
Oct  1 22:45:04 sb06 kernel: FS:  0000000001369860(0063) GS:ffff880028066000(0000) knlGS:0000000000000000
Oct  1 22:45:04 sb06 kernel: CR2: 00007faaff8f2000 CR3: 00000001f54b3000 CR4: 00000000000006a0
Oct  1 22:45:04 sb06 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct  1 22:45:04 sb06 kernel: Stack:
Oct  1 22:45:04 sb06 kernel: <0> 000000000000dd70 00000000000001bb ffff8800642bdb70 0000000100000004
Oct  1 22:45:04 sb06 kernel: Pid: 17264, comm: diablo Tainted: G      D    2.6.31.1xfspatch #4 PowerEdge 1950
Oct  1 22:45:04 sb06 kernel: RSP: 0018:ffff8800642bdab8  EFLAGS: 00010246
Oct  1 22:45:04 sb06 kernel: RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffffffff81102664
Oct  1 22:45:04 sb06 kernel: RBP: ffff880005cc3c00 R08: 0000000000000001 R09: ffff88022c415400
Oct  1 22:45:04 sb06 kernel: R13: ffff88022d0c7800 R14: 000000000000001b R15: 0000000000000001
Oct  1 22:45:04 sb06 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct  1 22:45:04 sb06 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct  1 22:45:04 sb06 kernel: Process diablo (pid: 17264, threadinfo ffff8800642bc000, task ffff88010002ad00)
Oct  1 22:45:04 sb06 kernel: ffff8800022623c0 000000000000dd70 00000001015315f8 000000000000dd70
Oct  1 22:45:04 sb06 kernel: <0> 00000000000001bb ffff880001e692c0 ffff88022c415400 000001bb2d3d5400
Oct  1 22:45:04 sb06 kernel: Call Trace:
Oct  1 22:45:04 sb06 kernel: [<ffffffff811125f6>] ? xfs_trans_iget+0xa5/0xd3
Oct  1 22:45:04 sb06 kernel: [<ffffffff81100c9a>] ? xfs_ialloc+0xac/0x568
Oct  1 22:45:04 sb06 kernel: [<ffffffff81112eba>] ? xfs_dir_ialloc+0x84/0x2a2
Oct  1 22:45:04 sb06 kernel: [<ffffffff811111a4>] ? xfs_trans_reserve+0xda/0x1af
Oct  1 22:45:04 sb06 kernel: [<ffffffff812409c8>] ? __down_write_nested+0x15/0x9d
Oct  1 22:45:04 sb06 kernel: [<ffffffff81114aaf>] ? xfs_create+0x27e/0x448
Oct  1 22:45:04 sb06 kernel: [<ffffffff81114ccc>] ? xfs_lookup+0x53/0xa3
Oct  1 22:45:04 sb06 kernel: [<ffffffff8111ca06>] ? xfs_vn_mknod+0x9c/0xf2
Oct  1 22:45:04 sb06 kernel: [<ffffffff810844a3>] ? vfs_create+0x6e/0xb7
Oct  1 22:45:04 sb06 kernel: [<ffffffff81086e53>] ? do_filp_open+0x2ce/0x92a
Oct  1 22:45:04 sb06 kernel: [<ffffffff8107b24d>] ? do_sys_open+0x55/0x103
Oct  1 22:45:04 sb06 kernel: [<ffffffff8100adab>] ? system_call_fastpath+0x16/0x1b
Oct  1 22:45:04 sb06 kernel: Code: 00 00 bf d0 00 00 00 e8 7a 8b 03 00 85 c0 0f 85 cd 00 00 00 83 7c 24 38 00 74 14 8b 74 24 38 48 89 ef e8 ff f8 ff ff 85 c0 75 04 <0f> 0b eb fe 4c 89 e7 e8 92 28 14 00 44 88 f1 8b 74 24 5c b8 01 
Oct  1 22:45:04 sb06 kernel: RIP  [<ffffffff810fe33f>] xfs_iget+0x2e3/0x424
Oct  1 22:45:04 sb06 kernel: RSP <ffff8800642bdab8>
Oct  1 22:45:04 sb06 kernel: ---[ end trace 6e14835b29b5648b ]---
Oct  1 22:45:04 sb06 kernel: ------------[ cut here ]------------
Oct  1 22:45:04 sb06 kernel: kernel BUG at fs/xfs/xfs_iget.c:334!
Oct  1 22:45:04 sb06 kernel: invalid opcode: 0000 [#3] SMP 
Oct  1 22:45:04 sb06 kernel: last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Oct  1 22:45:04 sb06 kernel: CPU 2 
Oct  1 22:45:04 sb06 kernel: Modules linked in: acpi_cpufreq cpufreq_ondemand ipmi_si ipmi_devintf ipmi_msghandler bonding serio_raw mptspi rng_core scsi_transport_spi bnx2 processor thermal 8250_pnp 8250 serial_core thermal_sys
Oct  1 22:45:04 sb06 kernel: Pid: 17326, comm: diablo Tainted: G      D    2.6.31.1xfspatch #4 PowerEdge 1950
Oct  1 22:45:04 sb06 kernel: RIP: 0010:[<ffffffff810fe33f>]  [<ffffffff810fe33f>] xfs_iget+0x2e3/0x424
Oct  1 22:45:04 sb06 kernel: RSP: 0018:ffff88000fa79ab8  EFLAGS: 00010246
Oct  1 22:45:04 sb06 kernel: RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffffffff81102664
Oct  1 22:45:04 sb06 kernel: RDX: ffff880119c18780 RSI: 0000000000000296 RDI: ffff880005cc3c8c
Oct  1 22:45:04 sb06 kernel: RBP: ffff880005cc3c00 R08: 0000000000000001 R09: ffff88022f21dc00
Oct  1 22:45:04 sb06 kernel: R10: 0000000000000002 R11: 0001400100014004 R12: ffff88022ebc383c
Oct  1 22:45:04 sb06 kernel: R13: ffff88022ebc3800 R14: 000000000000001b R15: 0000000000000001
Oct  1 22:45:04 sb06 kernel: FS:  0000000001369860(0063) GS:ffff880028066000(0000) knlGS:0000000000000000
Oct  1 22:45:04 sb06 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct  1 22:45:04 sb06 kernel: CR2: 00007fffd3a2ce18 CR3: 0000000135a9d000 CR4: 00000000000006a0
Oct  1 22:45:04 sb06 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct  1 22:45:04 sb06 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct  1 22:45:04 sb06 kernel: Process diablo (pid: 17326, threadinfo ffff88000fa78000, task ffff8801000ccec0)
Oct  1 22:45:04 sb06 kernel: Stack:
Oct  1 22:45:04 sb06 kernel: ffff880101276180 000000000000dd70 0000000107e5b000 000000000000dd70
Oct  1 22:45:04 sb06 kernel: <0> 000000000000dd70 00000000000001a6 ffff88000fa79b70 0000000100000004
Oct  1 22:45:04 sb06 kernel: <0> 00000000000001a6 ffff8800c27d55e0 ffff88022f21dc00 000001a62fa9b400
Oct  1 22:45:04 sb06 kernel: Call Trace:
Oct  1 22:45:04 sb06 kernel: [<ffffffff811125f6>] ? xfs_trans_iget+0xa5/0xd3
Oct  1 22:45:04 sb06 kernel: [<ffffffff81100c9a>] ? xfs_ialloc+0xac/0x568
Oct  1 22:45:04 sb06 kernel: [<ffffffff81112eba>] ? xfs_dir_ialloc+0x84/0x2a2
Oct  1 22:45:04 sb06 kernel: [<ffffffff811111a4>] ? xfs_trans_reserve+0xda/0x1af
Oct  1 22:45:04 sb06 kernel: [<ffffffff812409c8>] ? __down_write_nested+0x15/0x9d
Oct  1 22:45:04 sb06 kernel: [<ffffffff81114aaf>] ? xfs_create+0x27e/0x448
Oct  1 22:45:04 sb06 kernel: [<ffffffff81114ccc>] ? xfs_lookup+0x53/0xa3
Oct  1 22:45:04 sb06 kernel: [<ffffffff8111ca06>] ? xfs_vn_mknod+0x9c/0xf2
Oct  1 22:45:04 sb06 kernel: [<ffffffff810844a3>] ? vfs_create+0x6e/0xb7
Oct  1 22:45:04 sb06 kernel: [<ffffffff81086e53>] ? do_filp_open+0x2ce/0x92a
Oct  1 22:45:04 sb06 kernel: [<ffffffff8107b24d>] ? do_sys_open+0x55/0x103
Oct  1 22:45:04 sb06 kernel: [<ffffffff8100adab>] ? system_call_fastpath+0x16/0x1b
Oct  1 22:45:04 sb06 kernel: Code: 00 00 bf d0 00 00 00 e8 7a 8b 03 00 85 c0 0f 85 cd 00 00 00 83 7c 24 38 00 74 14 8b 74 24 38 48 89 ef e8 ff f8 ff ff 85 c0 75 04 <0f> 0b eb fe 4c 89 e7 e8 92 28 14 00 44 88 f1 8b 74 24 5c b8 01 
Oct  1 22:45:04 sb06 kernel: RIP  [<ffffffff810fe33f>] xfs_iget+0x2e3/0x424
Oct  1 22:45:04 sb06 kernel: RSP <ffff88000fa79ab8>
Oct  1 22:45:04 sb06 kernel: ---[ end trace 6e14835b29b5648c ]---

[-- Attachment #5: Type: text/plain, Size: 121 bytes --]


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-02 14:24           ` Bas Couwenberg
@ 2009-10-05 21:43             ` Christoph Hellwig
  2009-10-06  9:04               ` Patrick Schreurs
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-05 21:43 UTC (permalink / raw)
  To: Bas Couwenberg
  Cc: Christoph Hellwig, Patrick Schreurs, Tommy van Leeuwen, XFS List

On Fri, Oct 02, 2009 at 04:24:39PM +0200, Bas Couwenberg wrote:
> Dear Christoph,
>
> Yesterday two of our servers (2.6.31.1 + your patch) crashed again. This
> time we have a bigger console, but unfortunately not the full backtrace.
>
> I did manage to get some more call trace info from the logs, which I
> have attached together with screenshots of the crash screens.
>
> I hope this info helps you.

It helps a bit, but not so much.  I suspect it could be a double free
of an inode, and I have identified a possible race window that could
explain it.  But all the traces are really weird and I think only show
later symptoms of something that happened earlier.  I'll come up with
a patch for the race window ASAP, but could you in the meantime turn on
CONFIG_XFS_DEBUG for the test kernel to see if it triggers somewhere
and additionally apply the tiny patch below for additional debugging?


Subject: xfs: check for not fully initialized inodes in xfs_ireclaim
From: Christoph Hellwig <hch@lst.de>

Add an assert for inodes not added to the inode cache in xfs_ireclaim, to make
sure we're not going to introduce something like the famous nfsd inode cache
bug again.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: linux-2.6/fs/xfs/xfs_iget.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_iget.c	2009-08-10 11:30:55.729724742 -0300
+++ linux-2.6/fs/xfs/xfs_iget.c	2009-08-10 11:40:15.271748324 -0300
@@ -535,17 +535,21 @@ xfs_ireclaim(
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_perag	*pag;
+	xfs_agino_t		agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
 
 	XFS_STATS_INC(xs_ig_reclaims);
 
 	/*
-	 * Remove the inode from the per-AG radix tree.  It doesn't matter
-	 * if it was never added to it because radix_tree_delete can deal
-	 * with that case just fine.
+	 * Remove the inode from the per-AG radix tree.
+	 *
+	 * Because radix_tree_delete won't complain even if the item was never
+	 * added to the tree assert that it's been there before to catch
+	 * problems with the inode life time early on.
 	 */
 	pag = xfs_get_perag(mp, ip->i_ino);
 	write_lock(&pag->pag_ici_lock);
-	radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino));
+	ASSERT(radix_tree_lookup(&pag->pag_ici_root, agino));
+	radix_tree_delete(&pag->pag_ici_root, agino);
 	write_unlock(&pag->pag_ici_lock);
 	xfs_put_perag(mp, pag);
 


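The idea behind the patch above — a deletion API that silently tolerates missing entries can hide inode lifetime bugs, so look the item up first and assert it is present — can be sketched with an ordinary array-backed table. This is a hypothetical userspace model (names `table_lookup`, `table_delete`, `table_reclaim` are invented for illustration), not the kernel radix tree code:

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Trivial keyed table; a slot is "present" when non-NULL. */
struct table {
    void *slots[TABLE_SIZE];
};

static void *table_lookup(struct table *t, size_t key)
{
    return t->slots[key % TABLE_SIZE];
}

/* Like radix_tree_delete, this returns the old entry (or NULL) and
 * never complains about a missing key. */
static void *table_delete(struct table *t, size_t key)
{
    void *old = t->slots[key % TABLE_SIZE];

    t->slots[key % TABLE_SIZE] = NULL;
    return old;
}

/* Teardown helper in the spirit of the patch: assert the entry is
 * actually present, so a double delete trips here, close to the bug,
 * instead of corrupting state later. */
static void table_reclaim(struct table *t, size_t key)
{
    assert(table_lookup(t, key) != NULL);
    table_delete(t, key);
}
```

With the assert in place, a second reclaim of the same key aborts immediately rather than succeeding silently.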

* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-05 21:43             ` Christoph Hellwig
@ 2009-10-06  9:04               ` Patrick Schreurs
  2009-10-07  1:19                 ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2009-10-06  9:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, Bas Couwenberg, XFS List

Christoph Hellwig wrote:
> It helps a bit, but not so much.  I suspect it could be a double free
> of an inode, and I have identified a possible race window that could
> explain it.  But all the traces are really weird and I think only show
> later symptoms of something that happened earlier.  I'll come up with
> a patch for the race window ASAP, but could you in the meantime turn on
> CONFIG_XFS_DEBUG for the test kernel to see if it triggers somewhere
> and additionally apply the tiny patch below for additional debugging?

Will try this.

Could this by any chance be related (from 2.6.32.2)?

commit 2f0ffb7ef75a9ad6140899f6d4df45e8a73a013e
Author: Jan Kara <jack@suse.cz>
Date:   Mon Sep 21 17:01:06 2009 -0700

   fs: make sure data stored into inode is properly seen before unlocking new inode

     commit 580be0837a7a59b207c3d5c661d044d8dd0a6a30 upstream.

     In theory it could happen that on one CPU we initialize a new inode
     but clearing of I_NEW | I_LOCK gets reordered before some of the
     initialization.  Thus on another CPU we return not fully uptodate inode
     from iget_locked().

     This seems to fix a corruption issue on ext3 mounted over NFS.
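
The reordering hazard that commit message describes can be modeled in userspace with C11 atomics. This is a hypothetical sketch of the release/acquire pairing, not the kernel fix itself; `struct fake_inode`, `initializer`, `reader`, and `run_once` are invented names:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an inode being set up. */
struct fake_inode {
    int mode;
    int uid;
    atomic_int ready;   /* plays the role of clearing I_NEW | I_LOCK */
};

static struct fake_inode ino;

static void *initializer(void *arg)
{
    (void)arg;
    ino.mode = 0644;
    ino.uid = 1000;
    /* Release store: every write above must be visible before ready == 1.
     * A relaxed store here could be reordered ahead of the field writes,
     * which is exactly the window the upstream commit closes. */
    atomic_store_explicit(&ino.ready, 1, memory_order_release);
    return NULL;
}

static void *reader(void *arg)
{
    (void)arg;
    /* Acquire load pairs with the release store above. */
    while (!atomic_load_explicit(&ino.ready, memory_order_acquire))
        ;
    /* Guaranteed to observe a fully initialized inode. */
    assert(ino.mode == 0644 && ino.uid == 1000);
    return NULL;
}

static int run_once(void)
{
    pthread_t r, w;

    if (pthread_create(&r, NULL, reader, NULL) ||
        pthread_create(&w, NULL, initializer, NULL))
        return -1;
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return 0;
}
```

In the kernel the same pairing is provided by the memory barrier added before clearing I_NEW | I_LOCK.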

Thanks,

Patrick Schreurs


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-06  9:04               ` Patrick Schreurs
@ 2009-10-07  1:19                 ` Christoph Hellwig
  2009-10-08  8:45                   ` Patrick Schreurs
  2009-10-11  7:43                   ` Patrick Schreurs
  0 siblings, 2 replies; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-07  1:19 UTC (permalink / raw)
  To: Patrick Schreurs
  Cc: Christoph Hellwig, Tommy van Leeuwen, Bas Couwenberg, XFS List

On Tue, Oct 06, 2009 at 11:04:13AM +0200, Patrick Schreurs wrote:
> Christoph Hellwig wrote:
>> It helps a bit, but not so much.  I suspect it could be a double free
>> of an inode, and I have identified a possible race window that could
>> explain it.  But all the traces are really weird and I think only show
>> later symptoms of something that happened earlier.  I'll come up with
>> a patch for the race window ASAP, but could you in the meantime turn on
>> CONFIG_XFS_DEBUG for the test kernel to see if it triggers somewhere
>> and additionally apply the tiny patch below for additional debugging?
>
> Will try this.
>
> Could this by any chance be related (from 2.6.32.2)?

I doubt it, but it's loosely in the same area.


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-07  1:19                 ` Christoph Hellwig
@ 2009-10-08  8:45                   ` Patrick Schreurs
  2009-10-11  7:43                   ` Patrick Schreurs
  1 sibling, 0 replies; 42+ messages in thread
From: Patrick Schreurs @ 2009-10-08  8:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, Bas Couwenberg, XFS List

Attached is a screendump from 2.6.32.2 with your patches (including last 
one) applied, but without XFS_DEBUG.

We will turn on XFS_DEBUG and see if that helps.

Patrick Schreurs
News-Service.com

[-- Attachment #2: sb06-20091008.jpg --]
[-- Type: image/jpeg, Size: 98256 bytes --]


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-07  1:19                 ` Christoph Hellwig
  2009-10-08  8:45                   ` Patrick Schreurs
@ 2009-10-11  7:43                   ` Patrick Schreurs
  2009-10-11 12:24                     ` Christoph Hellwig
  2009-10-12 23:38                     ` Christoph Hellwig
  1 sibling, 2 replies; 42+ messages in thread
From: Patrick Schreurs @ 2009-10-11  7:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, Bas Couwenberg, XFS List

Hello Christoph,

Attached you'll find a screenshot from a 2.6.31.3 server, which includes 
your patches and has XFS_DEBUG turned on. I truly hope this is useful to 
you.

Thanks again,

-Patrick

[-- Attachment #2: sb04-20091011.jpg --]
[-- Type: image/jpeg, Size: 93960 bytes --]


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-11  7:43                   ` Patrick Schreurs
@ 2009-10-11 12:24                     ` Christoph Hellwig
  2009-10-12 23:38                     ` Christoph Hellwig
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-11 12:24 UTC (permalink / raw)
  To: Patrick Schreurs
  Cc: Christoph Hellwig, Tommy van Leeuwen, Bas Couwenberg, XFS List

On Sun, Oct 11, 2009 at 09:43:09AM +0200, Patrick Schreurs wrote:
> Hello Christoph,
>
> Attached you'll find a screenshot from a 2.6.31.3 server, which includes  
> your patches and has XFS_DEBUG turned on. I truly hope this is useful to  
> you.

This is very helpful, as the assertion I put in gets hit.  Thanks a
lot Patrick, I'll have another patch for you real soon.


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-11  7:43                   ` Patrick Schreurs
  2009-10-11 12:24                     ` Christoph Hellwig
@ 2009-10-12 23:38                     ` Christoph Hellwig
  2009-10-15 15:06                       ` Tommy van Leeuwen
  2009-10-19  1:16                       ` Dave Chinner
  1 sibling, 2 replies; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-12 23:38 UTC (permalink / raw)
  To: Patrick Schreurs
  Cc: Christoph Hellwig, Tommy van Leeuwen, Bas Couwenberg, XFS List

On Sun, Oct 11, 2009 at 09:43:09AM +0200, Patrick Schreurs wrote:
> Hello Christoph,
>
> Attached you'll find a screenshot from a 2.6.31.3 server, which includes  
> your patches and has XFS_DEBUG turned on. I truly hope this is useful to  
> you.

Thanks.  The patch below should fix the inode reclaim race that could
lead to the double free you're seeing.  To be applied on top of all
the other patches I sent you.

Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2009-10-11 19:09:43.828254119 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2009-10-12 13:48:14.886006087 +0200
@@ -670,22 +670,22 @@ xfs_reclaim_inode(
 {
 	xfs_perag_t	*pag = xfs_get_perag(ip->i_mount, ip->i_ino);
 
-	/* The hash lock here protects a thread in xfs_iget_core from
-	 * racing with us on linking the inode back with a vnode.
-	 * Once we have the XFS_IRECLAIM flag set it will not touch
-	 * us.
+	/*
+	 * The hash lock here protects a thread in xfs_iget from racing with
+	 * us on recycling the inode.  Once we have the XFS_IRECLAIM flag set
+	 * it will not touch it.
 	 */
-	write_lock(&pag->pag_ici_lock);
 	spin_lock(&ip->i_flags_lock);
-	if (__xfs_iflags_test(ip, XFS_IRECLAIM) ||
-	    !__xfs_iflags_test(ip, XFS_IRECLAIMABLE)) {
+	ASSERT_ALWAYS(__xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	if (__xfs_iflags_test(ip, XFS_IRECLAIM)) {
 		spin_unlock(&ip->i_flags_lock);
 		write_unlock(&pag->pag_ici_lock);
-		return -EAGAIN;
+		return 0;
 	}
 	__xfs_iflags_set(ip, XFS_IRECLAIM);
 	spin_unlock(&ip->i_flags_lock);
 	write_unlock(&pag->pag_ici_lock);
+
 	xfs_put_perag(ip->i_mount, pag);
 
 	/*
@@ -758,27 +758,107 @@ __xfs_inode_clear_reclaim_tag(
 			XFS_INO_TO_AGINO(mp, ip->i_ino), XFS_ICI_RECLAIM_TAG);
 }
 
-STATIC int
-xfs_reclaim_inode_now(
-	struct xfs_inode	*ip,
+STATIC xfs_inode_t *
+xfs_reclaim_ag_lookup(
+	struct xfs_mount	*mp,
 	struct xfs_perag	*pag,
+	uint32_t		*first_index)
+{
+	int			nr_found;
+	struct xfs_inode	*ip;
+
+	/*
+	 * use a gang lookup to find the next inode in the tree
+	 * as the tree is sparse and a gang lookup walks to find
+	 * the number of objects requested.
+	 */
+	write_lock(&pag->pag_ici_lock);
+	nr_found = radix_tree_gang_lookup_tag(&pag->pag_ici_root,
+			(void **)&ip, *first_index, 1, XFS_ICI_RECLAIM_TAG);
+	if (!nr_found)
+		goto unlock;
+
+	/*
+	 * Update the index for the next lookup. Catch overflows
+	 * into the next AG range which can occur if we have inodes
+	 * in the last block of the AG and we are currently
+	 * pointing to the last inode.
+	 */
+	*first_index = XFS_INO_TO_AGINO(mp, ip->i_ino + 1);
+	if (*first_index < XFS_INO_TO_AGINO(mp, ip->i_ino))
+		goto unlock;
+
+	return ip;
+
+unlock:
+	write_unlock(&pag->pag_ici_lock);
+	return NULL;
+}
+
+STATIC int
+xfs_reclaim_ag_walk(
+	struct xfs_mount	*mp,
+	xfs_agnumber_t		ag,
 	int			flags)
 {
-	/* ignore if already under reclaim */
-	if (xfs_iflags_test(ip, XFS_IRECLAIM)) {
-		read_unlock(&pag->pag_ici_lock);
-		return 0;
+	struct xfs_perag	*pag = &mp->m_perag[ag];
+	uint32_t		first_index;
+	int			last_error = 0;
+	int			skipped;
+
+restart:
+	skipped = 0;
+	first_index = 0;
+	do {
+		int		error = 0;
+		xfs_inode_t	*ip;
+
+		ip = xfs_reclaim_ag_lookup(mp, pag, &first_index);
+		if (!ip)
+			break;
+
+		error = xfs_reclaim_inode(ip, flags);
+		if (error == EAGAIN) {
+			skipped++;
+			continue;
+		}
+		if (error)
+			last_error = error;
+		/*
+		 * bail out if the filesystem is corrupted.
+		 */
+		if (error == EFSCORRUPTED)
+			break;
+
+	} while (1);
+
+	if (skipped) {
+		delay(1);
+		goto restart;
 	}
-	read_unlock(&pag->pag_ici_lock);
 
-	return xfs_reclaim_inode(ip, flags);
+	xfs_put_perag(mp, pag);
+	return last_error;
 }
 
 int
 xfs_reclaim_inodes(
-	xfs_mount_t	*mp,
-	int		mode)
+	xfs_mount_t		*mp,
+	int			mode)
 {
-	return xfs_inode_ag_iterator(mp, xfs_reclaim_inode_now, mode,
-					XFS_ICI_RECLAIM_TAG);
+	int			error = 0;
+	int			last_error = 0;
+	xfs_agnumber_t		ag;
+
+	for (ag = 0; ag < mp->m_sb.sb_agcount; ag++) {
+		if (!mp->m_perag[ag].pag_ici_init)
+			continue;
+		error = xfs_reclaim_ag_walk(mp, ag, mode);
+		if (error) {
+			last_error = error;
+			if (error == EFSCORRUPTED)
+				break;
+		}
+	}
+	return XFS_ERROR(last_error);
 }
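
The restart logic in xfs_reclaim_ag_walk above — skip busy items, remember that something was skipped, back off briefly, and rescan until a pass completes cleanly — can be sketched in userspace C. This is a hypothetical model (`walk_once`, `walk_all`, and `struct item` are invented), not the kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NITEMS 4

struct item {
    bool busy;        /* reclaim would return EAGAIN for this one */
    bool reclaimed;
};

/* One pass over the set: reclaim what we can, count what we skipped. */
static int walk_once(struct item *items, size_t n)
{
    int skipped = 0;

    for (size_t i = 0; i < n; i++) {
        if (items[i].reclaimed)
            continue;
        if (items[i].busy) {          /* analogous to EAGAIN */
            skipped++;
            continue;
        }
        items[i].reclaimed = true;
    }
    return skipped;
}

/* Keep rescanning until a pass completes with no skips, mirroring the
 * restart: / skipped / delay(1) pattern in the patch. */
static void walk_all(struct item *items, size_t n)
{
    while (walk_once(items, n) > 0) {
        /* In the kernel this is delay(1); in the demo, busy items
         * simply become free so the loop terminates. */
        for (size_t i = 0; i < n; i++)
            items[i].busy = false;
    }
}
```

The key property is that a busy item is never dropped on the floor: it is retried on a later pass rather than being reclaimed while in use.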


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-12 23:38                     ` Christoph Hellwig
@ 2009-10-15 15:06                       ` Tommy van Leeuwen
  2009-10-18 23:59                         ` Christoph Hellwig
  2009-10-19  1:16                       ` Dave Chinner
  1 sibling, 1 reply; 42+ messages in thread
From: Tommy van Leeuwen @ 2009-10-15 15:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Patrick Schreurs, Bas Couwenberg, XFS List

On Tue, Oct 13, 2009 at 1:38 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Sun, Oct 11, 2009 at 09:43:09AM +0200, Patrick Schreurs wrote:
>> Hello Christoph,
>>
>> Attached you'll find a screenshot from a 2.6.31.3 server, which includes
>> your patches and has XFS_DEBUG turned on. I truly hope this is useful to
>> you.
>
> Thanks.  The patch below should fix the inode reclaim race that could
> lead to the double free you're seeing.  To be applied on top of all
> the other patches I sent you.

Hi Christoph,

Here are 2 more crashes with this patch applied, both having xfs_debug
on and showing different traces (not inode reclaim related?). Hope
it's useful.

Cheers,
Tommy

[-- Attachment #2: sb08-20091014.jpg --]
[-- Type: image/jpeg, Size: 93003 bytes --]

[-- Attachment #3: sb07-20091015.jpg --]
[-- Type: image/jpeg, Size: 89456 bytes --]


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-15 15:06                       ` Tommy van Leeuwen
@ 2009-10-18 23:59                         ` Christoph Hellwig
  2009-10-19  1:17                           ` Dave Chinner
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-18 23:59 UTC (permalink / raw)
  To: Tommy van Leeuwen
  Cc: Christoph Hellwig, Patrick Schreurs, Bas Couwenberg, XFS List

On Thu, Oct 15, 2009 at 05:06:57PM +0200, Tommy van Leeuwen wrote:
> > Thanks.  The patch below should fix the inode reclaim race that could
> > lead to the double free you're seeing.  To be applied on top of all
> > the other patches I sent you.
> 
> Hi Christoph,
> 
> Here are 2 more crashes with this patch applied, both having xfs_debug
> on and showing different traces (not inode reclaim related?). Hope
> it's useful.

Can't make too much sense of it, but the dir2 is something you reported
earlier already.  We must be stomping over inodes somewhere, but I'm
not too sure where exactly.  Can you try throwing the patch below on top
of your stack?  It fixes an area where we could theoretically corrupt
inode state.

Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2009-10-16 22:54:41.513254291 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2009-10-16 22:57:10.451256293 +0200
@@ -180,6 +180,11 @@ xfs_sync_inode_valid(
 		return EFSCORRUPTED;
 	}
 
+	if (xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM)) {
+		read_unlock(&pag->pag_ici_lock);
+		return ENOENT;
+	}
+
 	/*
 	 * If we can't get a reference on the inode, it must be in reclaim.
 	 * Leave it for the reclaim code to flush. Also avoid inodes that
@@ -191,7 +196,7 @@ xfs_sync_inode_valid(
 	}
 	read_unlock(&pag->pag_ici_lock);
 
-	if (is_bad_inode(inode) || xfs_iflags_test(ip, XFS_INEW)) {
+	if (is_bad_inode(inode)) {
 		IRELE(ip);
 		return ENOENT;
 	}


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-12 23:38                     ` Christoph Hellwig
  2009-10-15 15:06                       ` Tommy van Leeuwen
@ 2009-10-19  1:16                       ` Dave Chinner
  2009-10-19  3:54                         ` Christoph Hellwig
  1 sibling, 1 reply; 42+ messages in thread
From: Dave Chinner @ 2009-10-19  1:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Patrick Schreurs, Tommy van Leeuwen, Bas Couwenberg, XFS List

On Mon, Oct 12, 2009 at 07:38:54PM -0400, Christoph Hellwig wrote:
> On Sun, Oct 11, 2009 at 09:43:09AM +0200, Patrick Schreurs wrote:
> > Hello Christoph,
> >
> > Attached you'll find a screenshot from a 2.6.31.3 server, which includes  
> > your patches and has XFS_DEBUG turned on. I truly hope this is useful to  
> > you.
> 
> Thanks.  The patch below should fix the inode reclaim race that could
> lead to the double free you're seeing.  To be applied on top of all
> the other patches I sent you.
> 
> Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
> ===================================================================
> --- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2009-10-11 19:09:43.828254119 +0200
> +++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2009-10-12 13:48:14.886006087 +0200
> @@ -670,22 +670,22 @@ xfs_reclaim_inode(
>  {
>  	xfs_perag_t	*pag = xfs_get_perag(ip->i_mount, ip->i_ino);
>  
> -	/* The hash lock here protects a thread in xfs_iget_core from
> -	 * racing with us on linking the inode back with a vnode.
> -	 * Once we have the XFS_IRECLAIM flag set it will not touch
> -	 * us.
> +	/*
> +	 * The hash lock here protects a thread in xfs_iget from racing with
> +	 * us on recycling the inode.  Once we have the XFS_IRECLAIM flag set
> +	 * it will not touch it.
>  	 */
> -	write_lock(&pag->pag_ici_lock);

Did you mean to remove this write_lock? The patch does not remove
the unlocks....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-18 23:59                         ` Christoph Hellwig
@ 2009-10-19  1:17                           ` Dave Chinner
  2009-10-19  3:53                             ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Dave Chinner @ 2009-10-19  1:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Patrick Schreurs, Tommy van Leeuwen, Bas Couwenberg, XFS List

On Sun, Oct 18, 2009 at 07:59:10PM -0400, Christoph Hellwig wrote:
> On Thu, Oct 15, 2009 at 05:06:57PM +0200, Tommy van Leeuwen wrote:
> > > Thanks.  The patch below should fix the inode reclaim race that could
> > > lead to the double free you're seeing.  To be applied on top of all
> > > the other patches I sent you.
> > 
> > Hi Christoph,
> > 
> > Here are 2 more crashes with this patch applied, both having xfs_debug
> > on and showing different traces (not inode reclaim related?). Hope
> > it's useful.
> 
> Can't make too much sense of it, but the dir2 is something you reported
> earlier already.  We must be stomping over inodes somewhere, but I'm
> not too sure where exactly.  Can you try throwing the patch below on top
> of your stack?  It fixes an area where we could theoretically corrupt
> inode state.
> 
> Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
> ===================================================================
> --- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2009-10-16 22:54:41.513254291 +0200
> +++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2009-10-16 22:57:10.451256293 +0200
> @@ -180,6 +180,11 @@ xfs_sync_inode_valid(
>  		return EFSCORRUPTED;
>  	}
>  
> +	if (xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM)) {
> +		read_unlock(&pag->pag_ici_lock);
> +		return ENOENT;
> +	}

This needs an IRELE(ip) here, doesn't it?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-19  1:17                           ` Dave Chinner
@ 2009-10-19  3:53                             ` Christoph Hellwig
  0 siblings, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-19  3:53 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Patrick Schreurs, Tommy van Leeuwen,
	Bas Couwenberg, XFS List

On Mon, Oct 19, 2009 at 12:17:10PM +1100, Dave Chinner wrote:
> >  
> > +	if (xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM)) {
> > +		read_unlock(&pag->pag_ici_lock);
> > +		return ENOENT;
> > +	}
> 
> This needs an IRELE(ip) here, doesn't it?

No, the check is before the igrab now.  That was kinda the point as I
suspect that the igrab might be corrupting state of a reclaimable or
in reclaim inode.
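
Christoph's point — test the lifecycle flags before taking a reference, because grabbing an object that is being reclaimed can corrupt its state — can be illustrated with a small userspace sketch. Everything here (`struct obj`, `try_grab`, the flag names) is hypothetical and only models the check-before-grab ordering, not the actual XFS code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lifecycle flags, modeled on XFS_INEW / XFS_IRECLAIMABLE /
 * XFS_IRECLAIM. */
enum {
    F_NEW         = 1,
    F_RECLAIMABLE = 2,
    F_RECLAIM     = 4,
};

struct obj {
    unsigned flags;
    int refcount;
};

/* Return the object with an extra reference, or NULL if it is in a
 * lifecycle state where taking a reference would be unsafe.  The flag
 * test happens BEFORE the refcount bump, mirroring how the patch moves
 * the flags check ahead of igrab(). */
static struct obj *try_grab(struct obj *o)
{
    if (o->flags & (F_NEW | F_RECLAIMABLE | F_RECLAIM))
        return NULL;        /* analogous to returning ENOENT */
    o->refcount++;          /* analogous to igrab() */
    return o;
}
```

Because the bump never happens for a flagged object, there is no reference to drop on the failure path, which is why no IRELE is needed there.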


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-19  1:16                       ` Dave Chinner
@ 2009-10-19  3:54                         ` Christoph Hellwig
  2009-10-20  3:40                           ` Dave Chinner
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-19  3:54 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Patrick Schreurs, Tommy van Leeuwen,
	Bas Couwenberg, XFS List

On Mon, Oct 19, 2009 at 12:16:00PM +1100, Dave Chinner wrote:
> > +	 * The hash lock here protects a thread in xfs_iget from racing with
> > +	 * us on recycling the inode.  Once we have the XFS_IRECLAIM flag set
> > +	 * it will not touch it.
> >  	 */
> > -	write_lock(&pag->pag_ici_lock);
> 
> Did you mean to remove this write_lock? The patch does not remove
> the unlocks....

It's taken by the caller.


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-19  3:54                         ` Christoph Hellwig
@ 2009-10-20  3:40                           ` Dave Chinner
  2009-10-21  9:45                             ` Tommy van Leeuwen
  0 siblings, 1 reply; 42+ messages in thread
From: Dave Chinner @ 2009-10-20  3:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Patrick Schreurs, Tommy van Leeuwen, Bas Couwenberg, XFS List

On Sun, Oct 18, 2009 at 11:54:26PM -0400, Christoph Hellwig wrote:
> On Mon, Oct 19, 2009 at 12:16:00PM +1100, Dave Chinner wrote:
> > > +	 * The hash lock here protects a thread in xfs_iget from racing with
> > > +	 * us on recycling the inode.  Once we have the XFS_IRECLAIM flag set
> > > +	 * it will not touch it.
> > >  	 */
> > > -	write_lock(&pag->pag_ici_lock);
> > 
> > Did you mean to remove this write_lock? The patch does not remove
> > the unlocks....
> 
> It's taken by the caller.

Ah, I guess I need to see the whole patch series, then.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-20  3:40                           ` Dave Chinner
@ 2009-10-21  9:45                             ` Tommy van Leeuwen
  2009-10-22  8:59                               ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Tommy van Leeuwen @ 2009-10-21  9:45 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Patrick Schreurs, Bas Couwenberg, XFS List

On Tue, Oct 20, 2009 at 5:40 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Sun, Oct 18, 2009 at 11:54:26PM -0400, Christoph Hellwig wrote:
>> On Mon, Oct 19, 2009 at 12:16:00PM +1100, Dave Chinner wrote:
>> > > +  * The hash lock here protects a thread in xfs_iget from racing with
>> > > +  * us on recycling the inode.  Once we have the XFS_IRECLAIM flag set
>> > > +  * it will not touch it.
>> > >    */
>> > > - write_lock(&pag->pag_ici_lock);
>> >
>> > Did you mean to remove this write_lock? The patch does not remove
>> > the unlocks....
>>
>> It's taken by the caller.
>
> Ah, I guess I need to see the whole patch series, then.

This is the full patch we're using now on 2.6.31.4. (Just running btw
so no results yet).

diff -ru linux-2.6.31.4/fs/xfs/linux-2.6/xfs_sync.c linux-2.6.31.4-xfspatch/fs/xfs/linux-2.6/xfs_sync.c
--- linux-2.6.31.4/fs/xfs/linux-2.6/xfs_sync.c  2009-09-10 00:13:59.000000000 +0200
+++ linux-2.6.31.4-xfspatch/fs/xfs/linux-2.6/xfs_sync.c 2009-10-21 11:24:56.000000000 +0200
@@ -180,6 +180,11 @@
                return EFSCORRUPTED;
        }

+       if (xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM)) {
+               read_unlock(&pag->pag_ici_lock);
+               return ENOENT;
+       }
+
        /*
         * If we can't get a reference on the inode, it must be in reclaim.
         * Leave it for the reclaim code to flush. Also avoid inodes that
@@ -191,7 +196,7 @@
        }
        read_unlock(&pag->pag_ici_lock);

-       if (is_bad_inode(inode) || xfs_iflags_test(ip, XFS_INEW)) {
+       if (is_bad_inode(inode)) {
                IRELE(ip);
                return ENOENT;
        }
@@ -655,22 +660,21 @@
 {
        xfs_perag_t     *pag = xfs_get_perag(ip->i_mount, ip->i_ino);

-       /* The hash lock here protects a thread in xfs_iget_core from
-        * racing with us on linking the inode back with a vnode.
-        * Once we have the XFS_IRECLAIM flag set it will not touch
-        * us.
+       /*
+        * The hash lock here protects a thread in xfs_iget from racing with
+        * us on recycling the inode.  Once we have the XFS_IRECLAIM flag set
+        * it will not touch it.
         */
-       write_lock(&pag->pag_ici_lock);
        spin_lock(&ip->i_flags_lock);
-       if (__xfs_iflags_test(ip, XFS_IRECLAIM) ||
-           !__xfs_iflags_test(ip, XFS_IRECLAIMABLE)) {
+       ASSERT_ALWAYS(__xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+       if (__xfs_iflags_test(ip, XFS_IRECLAIM)) {
                spin_unlock(&ip->i_flags_lock);
                write_unlock(&pag->pag_ici_lock);
                if (locked) {
                        xfs_ifunlock(ip);
                        xfs_iunlock(ip, XFS_ILOCK_EXCL);
                }
-               return -EAGAIN;
+               return 0;
        }
        __xfs_iflags_set(ip, XFS_IRECLAIM);
        spin_unlock(&ip->i_flags_lock);
@@ -764,6 +768,88 @@
        xfs_put_perag(mp, pag);
 }

+STATIC xfs_inode_t *
+xfs_reclaim_ag_lookup(
+       struct xfs_mount        *mp,
+       struct xfs_perag        *pag,
+       uint32_t                *first_index)
+{
+       int                     nr_found;
+       struct xfs_inode        *ip;
+
+       /*
+        * use a gang lookup to find the next inode in the tree
+        * as the tree is sparse and a gang lookup walks to find
+        * the number of objects requested.
+        */
+       write_lock(&pag->pag_ici_lock);
+       nr_found = radix_tree_gang_lookup_tag(&pag->pag_ici_root,
+                       (void **)&ip, *first_index, 1, XFS_ICI_RECLAIM_TAG);
+       if (!nr_found)
+               goto unlock;
+
+       /*
+        * Update the index for the next lookup. Catch overflows
+        * into the next AG range which can occur if we have inodes
+        * in the last block of the AG and we are currently
+        * pointing to the last inode.
+        */
+       *first_index = XFS_INO_TO_AGINO(mp, ip->i_ino + 1);
+       if (*first_index < XFS_INO_TO_AGINO(mp, ip->i_ino))
+               goto unlock;
+
+       return ip;
+
+unlock:
+       write_unlock(&pag->pag_ici_lock);
+       return NULL;
+}
+
+STATIC int
+xfs_reclaim_ag_walk(
+       struct xfs_mount        *mp,
+       xfs_agnumber_t          ag,
+       int                     flags)
+{
+       struct xfs_perag        *pag = &mp->m_perag[ag];
+       uint32_t                first_index;
+       int                     last_error = 0;
+       int                     skipped;
+
+restart:
+       skipped = 0;
+       first_index = 0;
+       do {
+               int             error = 0;
+               xfs_inode_t     *ip;
+
+               ip = xfs_reclaim_ag_lookup(mp, pag, &first_index);
+               if (!ip)
+                       break;
+
+               error = xfs_reclaim_inode(ip, 0, flags);
+               if (error == EAGAIN) {
+                       skipped++;
+                       continue;
+               }
+               if (error)
+                       last_error = error;
+               /*
+                * bail out if the filesystem is corrupted.
+                */
+               if (error == EFSCORRUPTED)
+                       break;
+
+       } while (1);
+
+       if (skipped) {
+               delay(1);
+               goto restart;
+       }
+       xfs_put_perag(mp, pag);
+       return last_error;
+}
+
 STATIC int
 xfs_reclaim_inode_now(
        struct xfs_inode        *ip,
@@ -785,6 +871,19 @@
        xfs_mount_t     *mp,
        int             mode)
 {
-       return xfs_inode_ag_iterator(mp, xfs_reclaim_inode_now, mode,
-                                       XFS_ICI_RECLAIM_TAG);
+       int                     error = 0;
+       int                     last_error = 0;
+       xfs_agnumber_t          ag;
+
+       for (ag = 0; ag < mp->m_sb.sb_agcount; ag++) {
+               if (!mp->m_perag[ag].pag_ici_init)
+                       continue;
+               error = xfs_reclaim_ag_walk(mp, ag, mode);
+               if (error) {
+                       last_error = error;
+                       if (error == EFSCORRUPTED)
+                               break;
+               }
+       }
+       return XFS_ERROR(last_error);
 }
diff -ru linux-2.6.31.4/fs/xfs/xfs_iget.c linux-2.6.31.4-xfspatch/fs/xfs/xfs_iget.c
--- linux-2.6.31.4/fs/xfs/xfs_iget.c    2009-09-10 00:13:59.000000000 +0200
+++ linux-2.6.31.4-xfspatch/fs/xfs/xfs_iget.c   2009-10-14 13:56:33.000000000 +0200
@@ -242,6 +242,8 @@

                error = -inode_init_always(mp->m_super, inode);
                if (error) {
+                       printk("XFS: inode_init_always failed to re-initialize inode\n");
+
                        /*
                         * Re-initializing the inode failed, and we are in deep
                         * trouble.  Try to re-add it to the reclaim list.
@@ -538,17 +540,21 @@
 {
        struct xfs_mount        *mp = ip->i_mount;
        struct xfs_perag        *pag;
+       xfs_agino_t             agino = XFS_INO_TO_AGINO(mp, ip->i_ino);

        XFS_STATS_INC(xs_ig_reclaims);

        /*
-        * Remove the inode from the per-AG radix tree.  It doesn't matter
-        * if it was never added to it because radix_tree_delete can deal
-        * with that case just fine.
+        * Remove the inode from the per-AG radix tree.
+        *
+        * Because radix_tree_delete won't complain even if the item was never
+        * added to the tree assert that it's been there before to catch
+        * problems with the inode life time early on.
         */
        pag = xfs_get_perag(mp, ip->i_ino);
        write_lock(&pag->pag_ici_lock);
-       radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino));
+       ASSERT(radix_tree_lookup(&pag->pag_ici_root, agino));
+       radix_tree_delete(&pag->pag_ici_root, agino);
        write_unlock(&pag->pag_ici_lock);
        xfs_put_perag(mp, pag);

diff -ru linux-2.6.31.4/fs/xfs/xfs_vnodeops.c linux-2.6.31.4-xfspatch/fs/xfs/xfs_vnodeops.c
--- linux-2.6.31.4/fs/xfs/xfs_vnodeops.c        2009-09-10 00:13:59.000000000 +0200
+++ linux-2.6.31.4-xfspatch/fs/xfs/xfs_vnodeops.c       2009-10-14 13:56:33.000000000 +0200
@@ -2465,45 +2465,36 @@
 xfs_reclaim(
        xfs_inode_t     *ip)
 {
-
        xfs_itrace_entry(ip);

        ASSERT(!VN_MAPPED(VFS_I(ip)));

        /* bad inode, get out here ASAP */
-       if (is_bad_inode(VFS_I(ip))) {
-               xfs_ireclaim(ip);
-               return 0;
-       }
+       if (is_bad_inode(VFS_I(ip)))
+               goto out_reclaim;

        xfs_ioend_wait(ip);

        ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);

        /*
-        * Make sure the atime in the XFS inode is correct before freeing the
-        * Linux inode.
+        * We should never get here with one of the reclaim flags already set.
         */
-       xfs_synchronize_atime(ip);
+       BUG_ON(xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+       BUG_ON(xfs_iflags_test(ip, XFS_IRECLAIM));

        /*
         * If we have nothing to flush with this inode then complete the
-        * teardown now, otherwise break the link between the xfs inode and the
-        * linux inode and clean up the xfs inode later. This avoids flushing
-        * the inode to disk during the delete operation itself.
-        *
-        * When breaking the link, we need to set the XFS_IRECLAIMABLE flag
-        * first to ensure that xfs_iunpin() will never see an xfs inode
-        * that has a linux inode being reclaimed. Synchronisation is provided
-        * by the i_flags_lock.
+        * teardown now, otherwise delay the flush operation.
         */
-       if (!ip->i_update_core && (ip->i_itemp == NULL)) {
-               xfs_ilock(ip, XFS_ILOCK_EXCL);
-               xfs_iflock(ip);
-               xfs_iflags_set(ip, XFS_IRECLAIMABLE);
-               return xfs_reclaim_inode(ip, 1, XFS_IFLUSH_DELWRI_ELSE_SYNC);
+       if (ip->i_update_core || ip->i_itemp) {
+               xfs_inode_set_reclaim_tag(ip);
+               return 0;
        }
        xfs_inode_set_reclaim_tag(ip);
+
+out_reclaim:
+       xfs_ireclaim(ip);
        return 0;
 }

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-21  9:45                             ` Tommy van Leeuwen
@ 2009-10-22  8:59                               ` Christoph Hellwig
  2009-10-27 10:41                                 ` Tommy van Leeuwen
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-22  8:59 UTC (permalink / raw)
  To: Tommy van Leeuwen
  Cc: Christoph Hellwig, Patrick Schreurs, Bas Couwenberg, XFS List

> This is the full patch we're using now on 2.6.31.4. (Just running btw
> so no results yet).

Yes, that looks like what you should be testing with..  Thanks a lot
again!

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-22  8:59                               ` Christoph Hellwig
@ 2009-10-27 10:41                                 ` Tommy van Leeuwen
       [not found]                                   ` <89c4f90c0910280519k759230c1r7b1586932ac792f7@mail.gmail.com>
  0 siblings, 1 reply; 42+ messages in thread
From: Tommy van Leeuwen @ 2009-10-27 10:41 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Patrick Schreurs, Bas Couwenberg, XFS List

[-- Attachment #1: Type: text/plain, Size: 549 bytes --]

On Thu, Oct 22, 2009 at 9:59 AM, Christoph Hellwig <hch@infradead.org> wrote:
>> This is the full patch we're using now on 2.6.31.4. (Just running btw
>> so no results yet).
>
> Yes, that looks like what you should be testing with..  Thanks a lot
> again!

Hello Christoph,

We had two more crashes during the weekend. The interesting part is
that it no longer seems to be crashing directly in XFS. And both
crashes happen at address 0200200 (just like one of the previous crashes).

Thanks again for looking into this.

Cheers,
Tommy

[-- Attachment #2: sb08.jpg --]
[-- Type: image/jpeg, Size: 92498 bytes --]

[-- Attachment #3: sb07.jpg --]
[-- Type: image/jpeg, Size: 98600 bytes --]

[-- Attachment #4: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
       [not found]                                   ` <89c4f90c0910280519k759230c1r7b1586932ac792f7@mail.gmail.com>
@ 2009-10-30 10:16                                     ` Christoph Hellwig
  2009-11-03 14:46                                       ` Patrick Schreurs
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-10-30 10:16 UTC (permalink / raw)
  To: Tommy van Leeuwen
  Cc: Christoph Hellwig, Patrick Schreurs, Bas Couwenberg, XFS List

On Wed, Oct 28, 2009 at 01:19:44PM +0100, Tommy van Leeuwen wrote:
> Another one just seconds ago. Might be more useful than the last 2.

Is this one also with CONFIG_XFS_DEBUG enabled?  My assert should have
hit the condition we're seeing in that one earlier.

Anyway, it still looks like we can get inodes that are partially
or fully torn down before entering ->destroy_inode.  I'll try to figure
out why.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-10-30 10:16                                     ` Christoph Hellwig
@ 2009-11-03 14:46                                       ` Patrick Schreurs
  2009-11-14 16:21                                         ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2009-11-03 14:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, Bas Couwenberg, XFS List

Christoph Hellwig wrote:
> On Wed, Oct 28, 2009 at 01:19:44PM +0100, Tommy van Leeuwen wrote:
>> Another one just seconds ago. Might be more useful than the last 2.
> 
> Is this one also with CONFIG_XFS_DEBUG enabled?  My assert should have
> hit the condition we're seeing in that one earlier.

Sorry for the delay. Yes, XFS_DEBUG was enabled on these servers.

> Anyway, it still looks like we can get inodes that are partially
> or fully torn down before entering ->destroy_inode.  I'll try to figure
> out why.

We're back to 2.6.28 at the moment. Please advise if we can do anything 
to assist.

Any clue why we seem to be the only ones hitting this problem? It might 
have something to do with the short-term data retention on these 
particular servers. All partitions are always 100% full and data is only 
kept for a couple of days.

Thanks for looking into this.

-Patrick

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
  2009-11-03 14:46                                       ` Patrick Schreurs
@ 2009-11-14 16:21                                         ` Christoph Hellwig
       [not found]                                           ` <4B0A8075.8080008@news-service.com>
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2009-11-14 16:21 UTC (permalink / raw)
  To: Patrick Schreurs
  Cc: Christoph Hellwig, Tommy van Leeuwen, Bas Couwenberg, XFS List

On Tue, Nov 03, 2009 at 03:46:05PM +0100, Patrick Schreurs wrote:
> We're back to 2.6.28 at the moment. Please advise if we can do anything  
> to assist.
>
> Any clue why we seem to be the only ones hitting this problem? It might  
> have something to do with the short-term data retention on these  
> particular servers. All partitions are always 100% full and data is only  
> kept for a couple of days.

Sorry for the lack of updates; I've been travelling and working a lot and
didn't have much time to look at your screen dumps.  I really wish I knew
where the magic offset for the NULL pointer dereference comes from.  The
100% full might be a pretty good hint that it has to do with ENOSPC
handling somewhere.  I don't know of anything you can help me with for
now, but I will come back as soon as I have something more.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
       [not found]                                                   ` <4B45CFAC.4000607@news-service.com>
@ 2010-01-08 11:31                                                     ` Dave Chinner
  2010-01-11 20:22                                                       ` Patrick Schreurs
  2010-01-15 11:01                                                       ` Patrick Schreurs
  0 siblings, 2 replies; 42+ messages in thread
From: Dave Chinner @ 2010-01-08 11:31 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

[-- Attachment #1: Type: text/plain, Size: 693 bytes --]

Hi Patrick,

I've attached two compendium patches that will hopefully fix
the inode reclaim problems you've been seeing - one is for 2.6.31,
the other is for 2.6.32. I've cc'd this to the XFS list so that
anyone else who has been seeing crashes, assert failures and
general nastiness around inode reclaim can test them as well.

These are not final patches - there's a few changes that Christoph
has picked up on during review - so there'll be another round of
patches before checkins and -stable backports can be requested.

I'm hoping that these patches fix your problem, because with them
I can't make my machines fall over anymore....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

[-- Attachment #2: xfs-inode-reclaim-2.6.31 --]
[-- Type: text/plain, Size: 12712 bytes --]

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index a220d36..793f5d0 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -926,13 +926,37 @@ xfs_fs_alloc_inode(
  */
 STATIC void
 xfs_fs_destroy_inode(
-	struct inode	*inode)
+	struct inode		*inode)
 {
-	xfs_inode_t		*ip = XFS_I(inode);
+	struct xfs_inode	*ip = XFS_I(inode);
+
+	xfs_itrace_entry(ip);
 
 	XFS_STATS_INC(vn_reclaim);
-	if (xfs_reclaim(ip))
-		panic("%s: cannot reclaim 0x%p\n", __func__, inode);
+
+	/* bad inode, get out here ASAP */
+	if (is_bad_inode(inode))
+		goto out_reclaim;
+
+	xfs_ioend_wait(ip);
+
+	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
+
+	/*
+	 * We should never get here with one of the reclaim flags already set.
+	 */
+	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));
+
+	/*
+	 * we always use background reclaim here because even if the
+	 * inode is clean, it still may be under IO and hence we have
+	 * to take the flush lock. The background reclaim path handles
+	 * this more efficiently than we can here, so simply let background
+	 * reclaim tear down all inodes.
+	 */
+out_reclaim:
+	xfs_inode_set_reclaim_tag(ip);
 }
 
 /*
diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index 98ef624..cfc2e70 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -54,7 +54,8 @@ xfs_inode_ag_lookup(
 	struct xfs_mount	*mp,
 	struct xfs_perag	*pag,
 	uint32_t		*first_index,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	int			nr_found;
 	struct xfs_inode	*ip;
@@ -64,7 +65,10 @@ xfs_inode_ag_lookup(
 	 * as the tree is sparse and a gang lookup walks to find
 	 * the number of objects requested.
 	 */
-	read_lock(&pag->pag_ici_lock);
+	if (write_lock)
+		write_lock(&pag->pag_ici_lock);
+	else
+		read_lock(&pag->pag_ici_lock);
 	if (tag == XFS_ICI_NO_TAG) {
 		nr_found = radix_tree_gang_lookup(&pag->pag_ici_root,
 				(void **)&ip, *first_index, 1);
@@ -88,7 +92,10 @@ xfs_inode_ag_lookup(
 	return ip;
 
 unlock:
-	read_unlock(&pag->pag_ici_lock);
+	if (write_lock)
+		write_unlock(&pag->pag_ici_lock);
+	else
+		read_unlock(&pag->pag_ici_lock);
 	return NULL;
 }
 
@@ -99,7 +106,8 @@ xfs_inode_ag_walk(
 	int			(*execute)(struct xfs_inode *ip,
 					   struct xfs_perag *pag, int flags),
 	int			flags,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	struct xfs_perag	*pag = &mp->m_perag[ag];
 	uint32_t		first_index;
@@ -113,7 +121,8 @@ restart:
 		int		error = 0;
 		xfs_inode_t	*ip;
 
-		ip = xfs_inode_ag_lookup(mp, pag, &first_index, tag);
+		ip = xfs_inode_ag_lookup(mp, pag, &first_index, tag,
+						write_lock);
 		if (!ip)
 			break;
 
@@ -147,7 +156,8 @@ xfs_inode_ag_iterator(
 	int			(*execute)(struct xfs_inode *ip,
 					   struct xfs_perag *pag, int flags),
 	int			flags,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	int			error = 0;
 	int			last_error = 0;
@@ -156,7 +166,8 @@ xfs_inode_ag_iterator(
 	for (ag = 0; ag < mp->m_sb.sb_agcount; ag++) {
 		if (!mp->m_perag[ag].pag_ici_init)
 			continue;
-		error = xfs_inode_ag_walk(mp, ag, execute, flags, tag);
+		error = xfs_inode_ag_walk(mp, ag, execute, flags, tag,
+					write_lock);
 		if (error) {
 			last_error = error;
 			if (error == EFSCORRUPTED)
@@ -180,18 +191,20 @@ xfs_sync_inode_valid(
 		return EFSCORRUPTED;
 	}
 
-	/*
-	 * If we can't get a reference on the inode, it must be in reclaim.
-	 * Leave it for the reclaim code to flush. Also avoid inodes that
-	 * haven't been fully initialised.
-	 */
+	/* avoid new or reclaimable inodes. Leave for reclaim code to flush */
+	if (xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM)) {
+		read_unlock(&pag->pag_ici_lock);
+		return ENOENT;
+	}
+
+	/* If we can't get a reference on the inode, it must be in reclaim. */
 	if (!igrab(inode)) {
 		read_unlock(&pag->pag_ici_lock);
 		return ENOENT;
 	}
 	read_unlock(&pag->pag_ici_lock);
 
-	if (is_bad_inode(inode) || xfs_iflags_test(ip, XFS_INEW)) {
+	if (is_bad_inode(inode)) {
 		IRELE(ip);
 		return ENOENT;
 	}
@@ -281,7 +294,7 @@ xfs_sync_data(
 	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
 
 	error = xfs_inode_ag_iterator(mp, xfs_sync_inode_data, flags,
-				      XFS_ICI_NO_TAG);
+				      XFS_ICI_NO_TAG, 0);
 	if (error)
 		return XFS_ERROR(error);
 
@@ -303,7 +316,7 @@ xfs_sync_attr(
 	ASSERT((flags & ~SYNC_WAIT) == 0);
 
 	return xfs_inode_ag_iterator(mp, xfs_sync_inode_attr, flags,
-				     XFS_ICI_NO_TAG);
+				     XFS_ICI_NO_TAG, 0);
 }
 
 STATIC int
@@ -647,36 +660,11 @@ xfs_syncd_stop(
 	kthread_stop(mp->m_sync_task);
 }
 
-int
+STATIC int
 xfs_reclaim_inode(
 	xfs_inode_t	*ip,
-	int		locked,
 	int		sync_mode)
 {
-	xfs_perag_t	*pag = xfs_get_perag(ip->i_mount, ip->i_ino);
-
-	/* The hash lock here protects a thread in xfs_iget_core from
-	 * racing with us on linking the inode back with a vnode.
-	 * Once we have the XFS_IRECLAIM flag set it will not touch
-	 * us.
-	 */
-	write_lock(&pag->pag_ici_lock);
-	spin_lock(&ip->i_flags_lock);
-	if (__xfs_iflags_test(ip, XFS_IRECLAIM) ||
-	    !__xfs_iflags_test(ip, XFS_IRECLAIMABLE)) {
-		spin_unlock(&ip->i_flags_lock);
-		write_unlock(&pag->pag_ici_lock);
-		if (locked) {
-			xfs_ifunlock(ip);
-			xfs_iunlock(ip, XFS_ILOCK_EXCL);
-		}
-		return -EAGAIN;
-	}
-	__xfs_iflags_set(ip, XFS_IRECLAIM);
-	spin_unlock(&ip->i_flags_lock);
-	write_unlock(&pag->pag_ici_lock);
-	xfs_put_perag(ip->i_mount, pag);
-
 	/*
 	 * If the inode is still dirty, then flush it out.  If the inode
 	 * is not in the AIL, then it will be OK to flush it delwri as
@@ -688,14 +676,14 @@ xfs_reclaim_inode(
 	 * We get the flush lock regardless, though, just to make sure
 	 * we don't free it while it is being flushed.
 	 */
-	if (!locked) {
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_iflock(ip);
-	}
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_iflock(ip);
 
 	/*
 	 * In the case of a forced shutdown we rely on xfs_iflush() to
 	 * wait for the inode to be unpinned before returning an error.
+	 * Because we hold the flush lock, we know that the inode cannot
+	 * be under IO, so if it reports clean it can be reclaimed.
 	 */
 	if (!is_bad_inode(VFS_I(ip)) && xfs_iflush(ip, sync_mode) == 0) {
 		/* synchronize with xfs_iflush_done */
@@ -770,14 +758,24 @@ xfs_reclaim_inode_now(
 	struct xfs_perag	*pag,
 	int			flags)
 {
-	/* ignore if already under reclaim */
-	if (xfs_iflags_test(ip, XFS_IRECLAIM)) {
-		read_unlock(&pag->pag_ici_lock);
+	/*
+	 * The radix tree lock here protects a thread in xfs_iget from racing
+	 * with us starting reclaim on the inode.  Once we have the
+	 * XFS_IRECLAIM flag set it will not touch us.
+	 */
+	spin_lock(&ip->i_flags_lock);
+	ASSERT_ALWAYS(__xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	if (__xfs_iflags_test(ip, XFS_IRECLAIM)) {
+		/* ignore as it is already under reclaim */
+		spin_unlock(&ip->i_flags_lock);
+		write_unlock(&pag->pag_ici_lock);
 		return 0;
 	}
-	read_unlock(&pag->pag_ici_lock);
+	__xfs_iflags_set(ip, XFS_IRECLAIM);
+	spin_unlock(&ip->i_flags_lock);
+	write_unlock(&pag->pag_ici_lock);
 
-	return xfs_reclaim_inode(ip, 0, flags);
+	return xfs_reclaim_inode(ip, flags);
 }
 
 int
@@ -786,5 +784,5 @@ xfs_reclaim_inodes(
 	int		mode)
 {
 	return xfs_inode_ag_iterator(mp, xfs_reclaim_inode_now, mode,
-					XFS_ICI_RECLAIM_TAG);
+					XFS_ICI_RECLAIM_TAG, 1);
 }
diff --git a/fs/xfs/linux-2.6/xfs_sync.h b/fs/xfs/linux-2.6/xfs_sync.h
index 5912060..0cef0b8 100644
--- a/fs/xfs/linux-2.6/xfs_sync.h
+++ b/fs/xfs/linux-2.6/xfs_sync.h
@@ -44,7 +44,6 @@ void xfs_quiesce_attr(struct xfs_mount *mp);
 
 void xfs_flush_inodes(struct xfs_inode *ip);
 
-int xfs_reclaim_inode(struct xfs_inode *ip, int locked, int sync_mode);
 int xfs_reclaim_inodes(struct xfs_mount *mp, int mode);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
@@ -56,6 +55,6 @@ void __xfs_inode_clear_reclaim_tag(struct xfs_mount *mp, struct xfs_perag *pag,
 int xfs_sync_inode_valid(struct xfs_inode *ip, struct xfs_perag *pag);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
 	int (*execute)(struct xfs_inode *ip, struct xfs_perag *pag, int flags),
-	int flags, int tag);
+	int flags, int tag, int write_lock);
 
 #endif
diff --git a/fs/xfs/quota/xfs_qm_syscalls.c b/fs/xfs/quota/xfs_qm_syscalls.c
index 4e4276b..1024e4f 100644
--- a/fs/xfs/quota/xfs_qm_syscalls.c
+++ b/fs/xfs/quota/xfs_qm_syscalls.c
@@ -894,7 +894,7 @@ xfs_qm_dqrele_all_inodes(
 	uint		 flags)
 {
 	ASSERT(mp->m_quotainfo);
-	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags, XFS_ICI_NO_TAG);
+	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags, XFS_ICI_NO_TAG, 0);
 }
 
 /*------------------------------------------------------------------------*/
diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index ecbf8b4..883bca9 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -538,17 +538,21 @@ xfs_ireclaim(
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_perag	*pag;
+	xfs_agino_t		agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
 
 	XFS_STATS_INC(xs_ig_reclaims);
 
 	/*
-	 * Remove the inode from the per-AG radix tree.  It doesn't matter
-	 * if it was never added to it because radix_tree_delete can deal
-	 * with that case just fine.
+	 * Remove the inode from the per-AG radix tree.
+	 *
+	 * Because radix_tree_delete won't complain even if the item was never
+	 * added to the tree assert that it's been there before to catch
+	 * problems with the inode life time early on.
 	 */
 	pag = xfs_get_perag(mp, ip->i_ino);
 	write_lock(&pag->pag_ici_lock);
-	radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino));
+	if (!radix_tree_delete(&pag->pag_ici_root, agino))
+		ASSERT(0);
 	write_unlock(&pag->pag_ici_lock);
 	xfs_put_perag(mp, pag);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index da428b3..64373bc 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2877,10 +2877,14 @@ xfs_iflush(
 	mp = ip->i_mount;
 
 	/*
-	 * If the inode isn't dirty, then just release the inode
-	 * flush lock and do nothing.
+	 * If the inode isn't dirty, then just release the inode flush lock and
+	 * do nothing. Treat stale inodes the same; we cannot rely on the
+	 * backing buffer remaining stale in cache for the remaining life of
+	 * the stale inode and so xfs_itobp() below may give us a buffer that
+	 * no longer contains inodes below. Doing this stale check here also
+	 * avoids forcing the log on pinned, stale inodes.
 	 */
-	if (xfs_inode_clean(ip)) {
+	if (xfs_inode_clean(ip) || xfs_iflags_test(ip, XFS_ISTALE)) {
 		xfs_ifunlock(ip);
 		return 0;
 	}
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 492d75b..15875e5 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -2461,52 +2461,6 @@ xfs_set_dmattrs(
 	return error;
 }
 
-int
-xfs_reclaim(
-	xfs_inode_t	*ip)
-{
-
-	xfs_itrace_entry(ip);
-
-	ASSERT(!VN_MAPPED(VFS_I(ip)));
-
-	/* bad inode, get out here ASAP */
-	if (is_bad_inode(VFS_I(ip))) {
-		xfs_ireclaim(ip);
-		return 0;
-	}
-
-	xfs_ioend_wait(ip);
-
-	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
-
-	/*
-	 * Make sure the atime in the XFS inode is correct before freeing the
-	 * Linux inode.
-	 */
-	xfs_synchronize_atime(ip);
-
-	/*
-	 * If we have nothing to flush with this inode then complete the
-	 * teardown now, otherwise break the link between the xfs inode and the
-	 * linux inode and clean up the xfs inode later. This avoids flushing
-	 * the inode to disk during the delete operation itself.
-	 *
-	 * When breaking the link, we need to set the XFS_IRECLAIMABLE flag
-	 * first to ensure that xfs_iunpin() will never see an xfs inode
-	 * that has a linux inode being reclaimed. Synchronisation is provided
-	 * by the i_flags_lock.
-	 */
-	if (!ip->i_update_core && (ip->i_itemp == NULL)) {
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_iflock(ip);
-		xfs_iflags_set(ip, XFS_IRECLAIMABLE);
-		return xfs_reclaim_inode(ip, 1, XFS_IFLUSH_DELWRI_ELSE_SYNC);
-	}
-	xfs_inode_set_reclaim_tag(ip);
-	return 0;
-}
-
 /*
  * xfs_alloc_file_space()
  *      This routine allocates disk space for the given file.
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index a9e102d..167a467 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -38,7 +38,6 @@ int xfs_symlink(struct xfs_inode *dp, struct xfs_name *link_name,
 		const char *target_path, mode_t mode, struct xfs_inode **ipp,
 		cred_t *credp);
 int xfs_set_dmattrs(struct xfs_inode *ip, u_int evmask, u_int16_t state);
-int xfs_reclaim(struct xfs_inode *ip);
 int xfs_change_file_space(struct xfs_inode *ip, int cmd,
 		xfs_flock64_t *bf, xfs_off_t offset, int attr_flags);
 int xfs_rename(struct xfs_inode *src_dp, struct xfs_name *src_name,

[-- Attachment #3: xfs-inode-reclaim-2.6.32 --]
[-- Type: text/plain, Size: 12580 bytes --]

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 18a4b8e..f3c622a 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -930,13 +930,37 @@ xfs_fs_alloc_inode(
  */
 STATIC void
 xfs_fs_destroy_inode(
-	struct inode	*inode)
+	struct inode		*inode)
 {
-	xfs_inode_t		*ip = XFS_I(inode);
+	struct xfs_inode	*ip = XFS_I(inode);
+
+	xfs_itrace_entry(ip);
 
 	XFS_STATS_INC(vn_reclaim);
-	if (xfs_reclaim(ip))
-		panic("%s: cannot reclaim 0x%p\n", __func__, inode);
+
+	/* bad inode, get out here ASAP */
+	if (is_bad_inode(inode))
+		goto out_reclaim;
+
+	xfs_ioend_wait(ip);
+
+	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
+
+	/*
+	 * We should never get here with one of the reclaim flags already set.
+	 */
+	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));
+
+	/*
+	 * we always use background reclaim here because even if the
+	 * inode is clean, it still may be under IO and hence we have
+	 * to take the flush lock. The background reclaim path handles
+	 * this more efficiently than we can here, so simply let background
+	 * reclaim tear down all inodes.
+	 */
+out_reclaim:
+	xfs_inode_set_reclaim_tag(ip);
 }
 
 /*
diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index 961df0a..897fcb5 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -54,7 +54,8 @@ xfs_inode_ag_lookup(
 	struct xfs_mount	*mp,
 	struct xfs_perag	*pag,
 	uint32_t		*first_index,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	int			nr_found;
 	struct xfs_inode	*ip;
@@ -64,7 +65,10 @@ xfs_inode_ag_lookup(
 	 * as the tree is sparse and a gang lookup walks to find
 	 * the number of objects requested.
 	 */
-	read_lock(&pag->pag_ici_lock);
+	if (write_lock)
+		write_lock(&pag->pag_ici_lock);
+	else
+		read_lock(&pag->pag_ici_lock);
 	if (tag == XFS_ICI_NO_TAG) {
 		nr_found = radix_tree_gang_lookup(&pag->pag_ici_root,
 				(void **)&ip, *first_index, 1);
@@ -88,7 +92,10 @@ xfs_inode_ag_lookup(
 	return ip;
 
 unlock:
-	read_unlock(&pag->pag_ici_lock);
+	if (write_lock)
+		write_unlock(&pag->pag_ici_lock);
+	else
+		read_unlock(&pag->pag_ici_lock);
 	return NULL;
 }
 
@@ -99,7 +106,8 @@ xfs_inode_ag_walk(
 	int			(*execute)(struct xfs_inode *ip,
 					   struct xfs_perag *pag, int flags),
 	int			flags,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	struct xfs_perag	*pag = &mp->m_perag[ag];
 	uint32_t		first_index;
@@ -113,7 +121,8 @@ restart:
 		int		error = 0;
 		xfs_inode_t	*ip;
 
-		ip = xfs_inode_ag_lookup(mp, pag, &first_index, tag);
+		ip = xfs_inode_ag_lookup(mp, pag, &first_index, tag,
+						write_lock);
 		if (!ip)
 			break;
 
@@ -147,7 +156,8 @@ xfs_inode_ag_iterator(
 	int			(*execute)(struct xfs_inode *ip,
 					   struct xfs_perag *pag, int flags),
 	int			flags,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	int			error = 0;
 	int			last_error = 0;
@@ -156,7 +166,8 @@ xfs_inode_ag_iterator(
 	for (ag = 0; ag < mp->m_sb.sb_agcount; ag++) {
 		if (!mp->m_perag[ag].pag_ici_init)
 			continue;
-		error = xfs_inode_ag_walk(mp, ag, execute, flags, tag);
+		error = xfs_inode_ag_walk(mp, ag, execute, flags, tag,
+					write_lock);
 		if (error) {
 			last_error = error;
 			if (error == EFSCORRUPTED)
@@ -180,18 +191,20 @@ xfs_sync_inode_valid(
 		return EFSCORRUPTED;
 	}
 
-	/*
-	 * If we can't get a reference on the inode, it must be in reclaim.
-	 * Leave it for the reclaim code to flush. Also avoid inodes that
-	 * haven't been fully initialised.
-	 */
+	/* avoid new or reclaimable inodes. Leave for reclaim code to flush */
+	if (xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM)) {
+		read_unlock(&pag->pag_ici_lock);
+		return ENOENT;
+	}
+
+	/* If we can't get a reference on the inode, it must be in reclaim. */
 	if (!igrab(inode)) {
 		read_unlock(&pag->pag_ici_lock);
 		return ENOENT;
 	}
 	read_unlock(&pag->pag_ici_lock);
 
-	if (is_bad_inode(inode) || xfs_iflags_test(ip, XFS_INEW)) {
+	if (is_bad_inode(inode)) {
 		IRELE(ip);
 		return ENOENT;
 	}
@@ -281,7 +294,7 @@ xfs_sync_data(
 	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
 
 	error = xfs_inode_ag_iterator(mp, xfs_sync_inode_data, flags,
-				      XFS_ICI_NO_TAG);
+				      XFS_ICI_NO_TAG, 0);
 	if (error)
 		return XFS_ERROR(error);
 
@@ -303,7 +316,7 @@ xfs_sync_attr(
 	ASSERT((flags & ~SYNC_WAIT) == 0);
 
 	return xfs_inode_ag_iterator(mp, xfs_sync_inode_attr, flags,
-				     XFS_ICI_NO_TAG);
+				     XFS_ICI_NO_TAG, 0);
 }
 
 STATIC int
@@ -663,36 +676,11 @@ xfs_syncd_stop(
 	kthread_stop(mp->m_sync_task);
 }
 
-int
+STATIC int
 xfs_reclaim_inode(
 	xfs_inode_t	*ip,
-	int		locked,
 	int		sync_mode)
 {
-	xfs_perag_t	*pag = xfs_get_perag(ip->i_mount, ip->i_ino);
-
-	/* The hash lock here protects a thread in xfs_iget_core from
-	 * racing with us on linking the inode back with a vnode.
-	 * Once we have the XFS_IRECLAIM flag set it will not touch
-	 * us.
-	 */
-	write_lock(&pag->pag_ici_lock);
-	spin_lock(&ip->i_flags_lock);
-	if (__xfs_iflags_test(ip, XFS_IRECLAIM) ||
-	    !__xfs_iflags_test(ip, XFS_IRECLAIMABLE)) {
-		spin_unlock(&ip->i_flags_lock);
-		write_unlock(&pag->pag_ici_lock);
-		if (locked) {
-			xfs_ifunlock(ip);
-			xfs_iunlock(ip, XFS_ILOCK_EXCL);
-		}
-		return -EAGAIN;
-	}
-	__xfs_iflags_set(ip, XFS_IRECLAIM);
-	spin_unlock(&ip->i_flags_lock);
-	write_unlock(&pag->pag_ici_lock);
-	xfs_put_perag(ip->i_mount, pag);
-
 	/*
 	 * If the inode is still dirty, then flush it out.  If the inode
 	 * is not in the AIL, then it will be OK to flush it delwri as
@@ -704,14 +692,14 @@ xfs_reclaim_inode(
 	 * We get the flush lock regardless, though, just to make sure
 	 * we don't free it while it is being flushed.
 	 */
-	if (!locked) {
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_iflock(ip);
-	}
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_iflock(ip);
 
 	/*
 	 * In the case of a forced shutdown we rely on xfs_iflush() to
 	 * wait for the inode to be unpinned before returning an error.
+	 * Because we hold the flush lock, we know that the inode cannot
+	 * be under IO, so if it reports clean it can be reclaimed.
 	 */
 	if (!is_bad_inode(VFS_I(ip)) && xfs_iflush(ip, sync_mode) == 0) {
 		/* synchronize with xfs_iflush_done */
@@ -771,14 +759,24 @@ xfs_reclaim_inode_now(
 	struct xfs_perag	*pag,
 	int			flags)
 {
-	/* ignore if already under reclaim */
-	if (xfs_iflags_test(ip, XFS_IRECLAIM)) {
-		read_unlock(&pag->pag_ici_lock);
+	/*
+	 * The radix tree lock here protects a thread in xfs_iget from racing
+	 * with us starting reclaim on the inode.  Once we have the
+	 * XFS_IRECLAIM flag set it will not touch us.
+	 */
+	spin_lock(&ip->i_flags_lock);
+	ASSERT_ALWAYS(__xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	if (__xfs_iflags_test(ip, XFS_IRECLAIM)) {
+		/* ignore as it is already under reclaim */
+		spin_unlock(&ip->i_flags_lock);
+		write_unlock(&pag->pag_ici_lock);
 		return 0;
 	}
-	read_unlock(&pag->pag_ici_lock);
+	__xfs_iflags_set(ip, XFS_IRECLAIM);
+	spin_unlock(&ip->i_flags_lock);
+	write_unlock(&pag->pag_ici_lock);
 
-	return xfs_reclaim_inode(ip, 0, flags);
+	return xfs_reclaim_inode(ip, flags);
 }
 
 int
@@ -787,5 +785,5 @@ xfs_reclaim_inodes(
 	int		mode)
 {
 	return xfs_inode_ag_iterator(mp, xfs_reclaim_inode_now, mode,
-					XFS_ICI_RECLAIM_TAG);
+					XFS_ICI_RECLAIM_TAG, 1);
 }
diff --git a/fs/xfs/linux-2.6/xfs_sync.h b/fs/xfs/linux-2.6/xfs_sync.h
index 27920eb..ea932b4 100644
--- a/fs/xfs/linux-2.6/xfs_sync.h
+++ b/fs/xfs/linux-2.6/xfs_sync.h
@@ -44,7 +44,6 @@ void xfs_quiesce_attr(struct xfs_mount *mp);
 
 void xfs_flush_inodes(struct xfs_inode *ip);
 
-int xfs_reclaim_inode(struct xfs_inode *ip, int locked, int sync_mode);
 int xfs_reclaim_inodes(struct xfs_mount *mp, int mode);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
@@ -55,6 +54,6 @@ void __xfs_inode_clear_reclaim_tag(struct xfs_mount *mp, struct xfs_perag *pag,
 int xfs_sync_inode_valid(struct xfs_inode *ip, struct xfs_perag *pag);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
 	int (*execute)(struct xfs_inode *ip, struct xfs_perag *pag, int flags),
-	int flags, int tag);
+	int flags, int tag, int write_lock);
 
 #endif
diff --git a/fs/xfs/quota/xfs_qm_syscalls.c b/fs/xfs/quota/xfs_qm_syscalls.c
index 5d1a3b9..f99cfa4 100644
--- a/fs/xfs/quota/xfs_qm_syscalls.c
+++ b/fs/xfs/quota/xfs_qm_syscalls.c
@@ -893,7 +893,7 @@ xfs_qm_dqrele_all_inodes(
 	uint		 flags)
 {
 	ASSERT(mp->m_quotainfo);
-	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags, XFS_ICI_NO_TAG);
+	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags, XFS_ICI_NO_TAG, 0);
 }
 
 /*------------------------------------------------------------------------*/
diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index 80e5264..40e8775 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -511,17 +511,21 @@ xfs_ireclaim(
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_perag	*pag;
+	xfs_agino_t		agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
 
 	XFS_STATS_INC(xs_ig_reclaims);
 
 	/*
-	 * Remove the inode from the per-AG radix tree.  It doesn't matter
-	 * if it was never added to it because radix_tree_delete can deal
-	 * with that case just fine.
+	 * Remove the inode from the per-AG radix tree.
+	 *
+	 * Because radix_tree_delete won't complain even if the item was never
+	 * added to the tree assert that it's been there before to catch
+	 * problems with the inode life time early on.
 	 */
 	pag = xfs_get_perag(mp, ip->i_ino);
 	write_lock(&pag->pag_ici_lock);
-	radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino));
+	if (!radix_tree_delete(&pag->pag_ici_root, agino))
+		ASSERT(0);
 	write_unlock(&pag->pag_ici_lock);
 	xfs_put_perag(mp, pag);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index b92a4fa..13d7d21 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2877,10 +2877,14 @@ xfs_iflush(
 	mp = ip->i_mount;
 
 	/*
-	 * If the inode isn't dirty, then just release the inode
-	 * flush lock and do nothing.
+	 * If the inode isn't dirty, then just release the inode flush lock and
+	 * do nothing. Treat stale inodes the same; we cannot rely on the
+	 * backing buffer remaining stale in cache for the remaining life of
+	 * the stale inode and so xfs_itobp() below may give us a buffer that
+	 * no longer contains inodes below. Doing this stale check here also
+	 * avoids forcing the log on pinned, stale inodes.
 	 */
-	if (xfs_inode_clean(ip)) {
+	if (xfs_inode_clean(ip) || xfs_iflags_test(ip, XFS_ISTALE)) {
 		xfs_ifunlock(ip);
 		return 0;
 	}
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index b572f7e..3fac146 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -2456,46 +2456,6 @@ xfs_set_dmattrs(
 	return error;
 }
 
-int
-xfs_reclaim(
-	xfs_inode_t	*ip)
-{
-
-	xfs_itrace_entry(ip);
-
-	ASSERT(!VN_MAPPED(VFS_I(ip)));
-
-	/* bad inode, get out here ASAP */
-	if (is_bad_inode(VFS_I(ip))) {
-		xfs_ireclaim(ip);
-		return 0;
-	}
-
-	xfs_ioend_wait(ip);
-
-	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
-
-	/*
-	 * If we have nothing to flush with this inode then complete the
-	 * teardown now, otherwise break the link between the xfs inode and the
-	 * linux inode and clean up the xfs inode later. This avoids flushing
-	 * the inode to disk during the delete operation itself.
-	 *
-	 * When breaking the link, we need to set the XFS_IRECLAIMABLE flag
-	 * first to ensure that xfs_iunpin() will never see an xfs inode
-	 * that has a linux inode being reclaimed. Synchronisation is provided
-	 * by the i_flags_lock.
-	 */
-	if (!ip->i_update_core && (ip->i_itemp == NULL)) {
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_iflock(ip);
-		xfs_iflags_set(ip, XFS_IRECLAIMABLE);
-		return xfs_reclaim_inode(ip, 1, XFS_IFLUSH_DELWRI_ELSE_SYNC);
-	}
-	xfs_inode_set_reclaim_tag(ip);
-	return 0;
-}
-
 /*
  * xfs_alloc_file_space()
  *      This routine allocates disk space for the given file.
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index a9e102d..167a467 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -38,7 +38,6 @@ int xfs_symlink(struct xfs_inode *dp, struct xfs_name *link_name,
 		const char *target_path, mode_t mode, struct xfs_inode **ipp,
 		cred_t *credp);
 int xfs_set_dmattrs(struct xfs_inode *ip, u_int evmask, u_int16_t state);
-int xfs_reclaim(struct xfs_inode *ip);
 int xfs_change_file_space(struct xfs_inode *ip, int cmd,
 		xfs_flock64_t *bf, xfs_off_t offset, int attr_flags);
 int xfs_rename(struct xfs_inode *src_dp, struct xfs_name *src_name,

[-- Attachment #4: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-01-08 11:31                                                     ` [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim) Dave Chinner
@ 2010-01-11 20:22                                                       ` Patrick Schreurs
  2010-01-15 11:01                                                       ` Patrick Schreurs
  1 sibling, 0 replies; 42+ messages in thread
From: Patrick Schreurs @ 2010-01-11 20:22 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

Hi Dave,

Just wanted to let you know we are running a few servers with your 
patches. So far so good, but it's a bit too early to conclude anything.

Thanks for the patches. I'll keep you posted on our results.

-Patrick

On 8-1-2010 12:31, Dave Chinner wrote:
> Hi Patrick,
>
> I've attached two compendium patches that will hopefully fix
> the inode reclaim problems you've been seeing - one is for 2.6.31,
> the other is for 2.6.32. I've cc'd this to the XFS list so that
> anyone else who has been seeing crashes, assert failures and
> general nastiness around inode reclaim can test them as well.
>
> These are not final patches - there's a few changes that Christoph
> has picked up on during review - so there'll be another round of
> patches before checkins and -stable backports can be requested.
>
> I'm hoping that these patches fix your problem, because with them
> I can't make my machines fall over anymore....
>
> Cheers,
>
> Dave.
>
>
>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-01-08 11:31                                                     ` [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim) Dave Chinner
  2010-01-11 20:22                                                       ` Patrick Schreurs
@ 2010-01-15 11:01                                                       ` Patrick Schreurs
  2010-02-01 16:52                                                         ` Patrick Schreurs
  1 sibling, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2010-01-15 11:01 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

Hi Dave,

I think it's safe to consider this issue fixed. We currently have 9 
servers operational with these patches and they have been stable so far. 
To be 100% certain we'll have to test/wait a little bit longer, but 
considering the frequency of the crashes we saw earlier I think it's 
safe to come to a conclusion.

I hope these patches will be included in 2.6.33 and will be backported 
to at least 2.6.32.

Many thanks to Dave and to Christoph for fixing this apparently rare and 
seldom triggered condition.

-Patrick

Dave Chinner wrote:
> Hi Patrick,
> 
> I've attached two compendium patches that will hopefully fix
> the inode reclaim problems you've been seeing - one is for 2.6.31,
> the other is for 2.6.32. I've cc'd this to the XFS list so that
> anyone else who has been seeing crashes, assert failures and
> general nastiness around inode reclaim can test them as well.
> 
> These are not final patches - there's a few changes that Christoph
> has picked up on during review - so there'll be another round of
> patches before checkins and -stable backports can be requested.
> 
> I'm hoping that these patches fix your problem, because with them
> I can't make my machines fall over anymore....
> 
> Cheers,
> 
> Dave.
> 
> 
> ------------------------------------------------------------------------
> 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-01-15 11:01                                                       ` Patrick Schreurs
@ 2010-02-01 16:52                                                         ` Patrick Schreurs
  2010-02-08 10:16                                                           ` Patrick Schreurs
  2010-02-08 19:42                                                           ` Christoph Hellwig
  0 siblings, 2 replies; 42+ messages in thread
From: Patrick Schreurs @ 2010-02-01 16:52 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

[-- Attachment #1: Type: text/plain, Size: 1938 bytes --]

Hello Dave,

I'm afraid we had another crash a few days ago. Have a look at the 
screenshot. Can you make anything out of it? This server is running 
2.6.32.3 with your inode-reclaim patch applied and XFS_DEBUG still enabled.

Thanks for looking into this.

-Patrick

On 15-1-2010 12:01, Patrick Schreurs wrote:
> Hi Dave,
>
> I think it's safe to consider this issue fixed. We currently have 9
> servers operational with these patches and they have been stable so far.
> To be 100% certain we'll have to test/wait a little bit longer, but
> considering the frequency of the crashes we saw earlier I think it's
> safe to come to a conclusion.
>
> I hope these patches will be included in 2.6.33 and will be backported
> to at least 2.6.32.
>
> Many thanks to Dave and to Christoph for fixing this apparently rare and
> seldom triggered condition.
>
> -Patrick
>
> Dave Chinner wrote:
>> Hi Patrick,
>>
>> I've attached two compendium patches that will hopefully fix
>> the inode reclaim problems you've been seeing - one is for 2.6.31,
>> the other is for 2.6.32. I've cc'd this to the XFS list so that
>> anyone else who has been seeing crashes, assert failures and
>> general nastiness around inode reclaim can test them as well.
>>
>> These are not final patches - there's a few changes that Christoph
>> has picked up on during review - so there'll be another round of
>> patches before checkins and -stable backports can be requested.
>>
>> I'm hoping that these patches fix your problem, because with them
>> I can't make my machines fall over anymore....
>>
>> Cheers,
>>
>> Dave.
>>
>>
>> ------------------------------------------------------------------------
>>


[-- Attachment #2: sb06-20100128.jpg --]
[-- Type: image/jpeg, Size: 90397 bytes --]


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-01 16:52                                                         ` Patrick Schreurs
@ 2010-02-08 10:16                                                           ` Patrick Schreurs
  2010-02-08 19:42                                                           ` Christoph Hellwig
  1 sibling, 0 replies; 42+ messages in thread
From: Patrick Schreurs @ 2010-02-08 10:16 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

[-- Attachment #1: Type: text/plain, Size: 2297 bytes --]

We had another crash on the same server last weekend. Looks the same to me.

Thanks for looking into this.

-Patrick

On 1-2-2010 17:52, Patrick Schreurs wrote:
> Hello Dave,
>
> I'm afraid we had another crash a few days ago. Have a look at the
> screenshot. Can you make anything out of it? This server is running
> 2.6.32.3 with your inode-reclaim patch applied and XFS_DEBUG still enabled.
>
> Thanks for looking into this.
>
> -Patrick
>
> On 15-1-2010 12:01, Patrick Schreurs wrote:
>> Hi Dave,
>>
>> I think it's safe to consider this issue fixed. We currently have 9
>> servers operational with these patches and they have been stable so far.
>> To be 100% certain we'll have to test/wait a little bit longer, but
>> considering the frequency of the crashes we saw earlier I think it's
>> safe to come to a conclusion.
>>
>> I hope these patches will be included in 2.6.33 and will be backported
>> to at least 2.6.32.
>>
>> Many thanks to Dave and to Christoph for fixing this apparently rare and
>> seldom triggered condition.
>>
>> -Patrick
>>
>> Dave Chinner wrote:
>>> Hi Patrick,
>>>
>>> I've attached two compendium patches that will hopefully fix
>>> the inode reclaim problems you've been seeing - one is for 2.6.31,
>>> the other is for 2.6.32. I've cc'd this to the XFS list so that
>>> anyone else who has been seeing crashes, assert failures and
>>> general nastiness around inode reclaim can test them as well.
>>>
>>> These are not final patches - there's a few changes that Christoph
>>> has picked up on during review - so there'll be another round of
>>> patches before checkins and -stable backports can be requested.
>>>
>>> I'm hoping that these patches fix your problem, because with them
>>> I can't make my machines fall over anymore....
>>>
>>> Cheers,
>>>
>>> Dave.
>>>
>>>
>>> ------------------------------------------------------------------------
>>>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-01 16:52                                                         ` Patrick Schreurs
  2010-02-08 10:16                                                           ` Patrick Schreurs
@ 2010-02-08 19:42                                                           ` Christoph Hellwig
  2010-02-09  8:48                                                             ` Patrick Schreurs
  1 sibling, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2010-02-08 19:42 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

On Mon, Feb 01, 2010 at 05:52:30PM +0100, Patrick Schreurs wrote:
> Hello Dave,
>
> I'm afraid we had another crash a few days ago. Have a look at the  
> screenshot. Can you make anything out of it? This server is running  
> 2.6.32.3 with your inode-reclaim patch applied and XFS_DEBUG still 
> enabled.

Just wondering, which set of patches is this exactly?


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-08 19:42                                                           ` Christoph Hellwig
@ 2010-02-09  8:48                                                             ` Patrick Schreurs
  2010-02-09 10:31                                                               ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2010-02-09  8:48 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, xfs

[-- Attachment #1: Type: text/plain, Size: 551 bytes --]

On 8-2-2010 20:42, Christoph Hellwig wrote:
> On Mon, Feb 01, 2010 at 05:52:30PM +0100, Patrick Schreurs wrote:
>> Hello Dave,
>>
>> I'm afraid we had another crash a few days ago. Have a look at the
>> screenshot. Can you make anything out of it? This server is running
>> 2.6.32.3 with your inode-reclaim patch applied and XFS_DEBUG still
>> enabled.
>
> Just wondering, which set of patches is this exactly?

This is a clean 2.6.32.3 with the xfs-inode-reclaim-2.6.32 patch I 
received from Dave on January 8th (see attachment).

Thanks,

-Patrick

[-- Attachment #2: xfs-inode-reclaim-2.6.32 --]
[-- Type: text/plain, Size: 12580 bytes --]

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 18a4b8e..f3c622a 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -930,13 +930,37 @@ xfs_fs_alloc_inode(
  */
 STATIC void
 xfs_fs_destroy_inode(
-	struct inode	*inode)
+	struct inode		*inode)
 {
-	xfs_inode_t		*ip = XFS_I(inode);
+	struct xfs_inode	*ip = XFS_I(inode);
+
+	xfs_itrace_entry(ip);
 
 	XFS_STATS_INC(vn_reclaim);
-	if (xfs_reclaim(ip))
-		panic("%s: cannot reclaim 0x%p\n", __func__, inode);
+
+	/* bad inode, get out here ASAP */
+	if (is_bad_inode(inode))
+		goto out_reclaim;
+
+	xfs_ioend_wait(ip);
+
+	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
+
+	/*
+	 * We should never get here with one of the reclaim flags already set.
+	 */
+	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));
+
+	/*
+	 * we always use background reclaim here because even if the
+	 * inode is clean, it still may be under IO and hence we have
+	 * to take the flush lock. The background reclaim path handles
+	 * this more efficiently than we can here, so simply let background
+	 * reclaim tear down all inodes.
+	 */
+out_reclaim:
+	xfs_inode_set_reclaim_tag(ip);
 }
 
 /*
diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index 961df0a..897fcb5 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -54,7 +54,8 @@ xfs_inode_ag_lookup(
 	struct xfs_mount	*mp,
 	struct xfs_perag	*pag,
 	uint32_t		*first_index,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	int			nr_found;
 	struct xfs_inode	*ip;
@@ -64,7 +65,10 @@ xfs_inode_ag_lookup(
 	 * as the tree is sparse and a gang lookup walks to find
 	 * the number of objects requested.
 	 */
-	read_lock(&pag->pag_ici_lock);
+	if (write_lock)
+		write_lock(&pag->pag_ici_lock);
+	else
+		read_lock(&pag->pag_ici_lock);
 	if (tag == XFS_ICI_NO_TAG) {
 		nr_found = radix_tree_gang_lookup(&pag->pag_ici_root,
 				(void **)&ip, *first_index, 1);
@@ -88,7 +92,10 @@ xfs_inode_ag_lookup(
 	return ip;
 
 unlock:
-	read_unlock(&pag->pag_ici_lock);
+	if (write_lock)
+		write_unlock(&pag->pag_ici_lock);
+	else
+		read_unlock(&pag->pag_ici_lock);
 	return NULL;
 }
 
@@ -99,7 +106,8 @@ xfs_inode_ag_walk(
 	int			(*execute)(struct xfs_inode *ip,
 					   struct xfs_perag *pag, int flags),
 	int			flags,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	struct xfs_perag	*pag = &mp->m_perag[ag];
 	uint32_t		first_index;
@@ -113,7 +121,8 @@ restart:
 		int		error = 0;
 		xfs_inode_t	*ip;
 
-		ip = xfs_inode_ag_lookup(mp, pag, &first_index, tag);
+		ip = xfs_inode_ag_lookup(mp, pag, &first_index, tag,
+						write_lock);
 		if (!ip)
 			break;
 
@@ -147,7 +156,8 @@ xfs_inode_ag_iterator(
 	int			(*execute)(struct xfs_inode *ip,
 					   struct xfs_perag *pag, int flags),
 	int			flags,
-	int			tag)
+	int			tag,
+	int			write_lock)
 {
 	int			error = 0;
 	int			last_error = 0;
@@ -156,7 +166,8 @@ xfs_inode_ag_iterator(
 	for (ag = 0; ag < mp->m_sb.sb_agcount; ag++) {
 		if (!mp->m_perag[ag].pag_ici_init)
 			continue;
-		error = xfs_inode_ag_walk(mp, ag, execute, flags, tag);
+		error = xfs_inode_ag_walk(mp, ag, execute, flags, tag,
+					write_lock);
 		if (error) {
 			last_error = error;
 			if (error == EFSCORRUPTED)
@@ -180,18 +191,20 @@ xfs_sync_inode_valid(
 		return EFSCORRUPTED;
 	}
 
-	/*
-	 * If we can't get a reference on the inode, it must be in reclaim.
-	 * Leave it for the reclaim code to flush. Also avoid inodes that
-	 * haven't been fully initialised.
-	 */
+	/* avoid new or reclaimable inodes. Leave for reclaim code to flush */
+	if (xfs_iflags_test(ip, XFS_INEW | XFS_IRECLAIMABLE | XFS_IRECLAIM)) {
+		read_unlock(&pag->pag_ici_lock);
+		return ENOENT;
+	}
+
+	/* If we can't get a reference on the inode, it must be in reclaim. */
 	if (!igrab(inode)) {
 		read_unlock(&pag->pag_ici_lock);
 		return ENOENT;
 	}
 	read_unlock(&pag->pag_ici_lock);
 
-	if (is_bad_inode(inode) || xfs_iflags_test(ip, XFS_INEW)) {
+	if (is_bad_inode(inode)) {
 		IRELE(ip);
 		return ENOENT;
 	}
@@ -281,7 +294,7 @@ xfs_sync_data(
 	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
 
 	error = xfs_inode_ag_iterator(mp, xfs_sync_inode_data, flags,
-				      XFS_ICI_NO_TAG);
+				      XFS_ICI_NO_TAG, 0);
 	if (error)
 		return XFS_ERROR(error);
 
@@ -303,7 +316,7 @@ xfs_sync_attr(
 	ASSERT((flags & ~SYNC_WAIT) == 0);
 
 	return xfs_inode_ag_iterator(mp, xfs_sync_inode_attr, flags,
-				     XFS_ICI_NO_TAG);
+				     XFS_ICI_NO_TAG, 0);
 }
 
 STATIC int
@@ -663,36 +676,11 @@ xfs_syncd_stop(
 	kthread_stop(mp->m_sync_task);
 }
 
-int
+STATIC int
 xfs_reclaim_inode(
 	xfs_inode_t	*ip,
-	int		locked,
 	int		sync_mode)
 {
-	xfs_perag_t	*pag = xfs_get_perag(ip->i_mount, ip->i_ino);
-
-	/* The hash lock here protects a thread in xfs_iget_core from
-	 * racing with us on linking the inode back with a vnode.
-	 * Once we have the XFS_IRECLAIM flag set it will not touch
-	 * us.
-	 */
-	write_lock(&pag->pag_ici_lock);
-	spin_lock(&ip->i_flags_lock);
-	if (__xfs_iflags_test(ip, XFS_IRECLAIM) ||
-	    !__xfs_iflags_test(ip, XFS_IRECLAIMABLE)) {
-		spin_unlock(&ip->i_flags_lock);
-		write_unlock(&pag->pag_ici_lock);
-		if (locked) {
-			xfs_ifunlock(ip);
-			xfs_iunlock(ip, XFS_ILOCK_EXCL);
-		}
-		return -EAGAIN;
-	}
-	__xfs_iflags_set(ip, XFS_IRECLAIM);
-	spin_unlock(&ip->i_flags_lock);
-	write_unlock(&pag->pag_ici_lock);
-	xfs_put_perag(ip->i_mount, pag);
-
 	/*
 	 * If the inode is still dirty, then flush it out.  If the inode
 	 * is not in the AIL, then it will be OK to flush it delwri as
@@ -704,14 +692,14 @@ xfs_reclaim_inode(
 	 * We get the flush lock regardless, though, just to make sure
 	 * we don't free it while it is being flushed.
 	 */
-	if (!locked) {
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_iflock(ip);
-	}
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_iflock(ip);
 
 	/*
 	 * In the case of a forced shutdown we rely on xfs_iflush() to
 	 * wait for the inode to be unpinned before returning an error.
+	 * Because we hold the flush lock, we know that the inode cannot
+	 * be under IO, so if it reports clean it can be reclaimed.
 	 */
 	if (!is_bad_inode(VFS_I(ip)) && xfs_iflush(ip, sync_mode) == 0) {
 		/* synchronize with xfs_iflush_done */
@@ -771,14 +759,24 @@ xfs_reclaim_inode_now(
 	struct xfs_perag	*pag,
 	int			flags)
 {
-	/* ignore if already under reclaim */
-	if (xfs_iflags_test(ip, XFS_IRECLAIM)) {
-		read_unlock(&pag->pag_ici_lock);
+	/*
+	 * The radix tree lock here protects a thread in xfs_iget from racing
+	 * with us starting reclaim on the inode.  Once we have the
+	 * XFS_IRECLAIM flag set it will not touch us.
+	 */
+	spin_lock(&ip->i_flags_lock);
+	ASSERT_ALWAYS(__xfs_iflags_test(ip, XFS_IRECLAIMABLE));
+	if (__xfs_iflags_test(ip, XFS_IRECLAIM)) {
+		/* ignore as it is already under reclaim */
+		spin_unlock(&ip->i_flags_lock);
+		write_unlock(&pag->pag_ici_lock);
 		return 0;
 	}
-	read_unlock(&pag->pag_ici_lock);
+	__xfs_iflags_set(ip, XFS_IRECLAIM);
+	spin_unlock(&ip->i_flags_lock);
+	write_unlock(&pag->pag_ici_lock);
 
-	return xfs_reclaim_inode(ip, 0, flags);
+	return xfs_reclaim_inode(ip, flags);
 }
 
 int
@@ -787,5 +785,5 @@ xfs_reclaim_inodes(
 	int		mode)
 {
 	return xfs_inode_ag_iterator(mp, xfs_reclaim_inode_now, mode,
-					XFS_ICI_RECLAIM_TAG);
+					XFS_ICI_RECLAIM_TAG, 1);
 }
diff --git a/fs/xfs/linux-2.6/xfs_sync.h b/fs/xfs/linux-2.6/xfs_sync.h
index 27920eb..ea932b4 100644
--- a/fs/xfs/linux-2.6/xfs_sync.h
+++ b/fs/xfs/linux-2.6/xfs_sync.h
@@ -44,7 +44,6 @@ void xfs_quiesce_attr(struct xfs_mount *mp);
 
 void xfs_flush_inodes(struct xfs_inode *ip);
 
-int xfs_reclaim_inode(struct xfs_inode *ip, int locked, int sync_mode);
 int xfs_reclaim_inodes(struct xfs_mount *mp, int mode);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
@@ -55,6 +54,6 @@ void __xfs_inode_clear_reclaim_tag(struct xfs_mount *mp, struct xfs_perag *pag,
 int xfs_sync_inode_valid(struct xfs_inode *ip, struct xfs_perag *pag);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
 	int (*execute)(struct xfs_inode *ip, struct xfs_perag *pag, int flags),
-	int flags, int tag);
+	int flags, int tag, int write_lock);
 
 #endif
diff --git a/fs/xfs/quota/xfs_qm_syscalls.c b/fs/xfs/quota/xfs_qm_syscalls.c
index 5d1a3b9..f99cfa4 100644
--- a/fs/xfs/quota/xfs_qm_syscalls.c
+++ b/fs/xfs/quota/xfs_qm_syscalls.c
@@ -893,7 +893,7 @@ xfs_qm_dqrele_all_inodes(
 	uint		 flags)
 {
 	ASSERT(mp->m_quotainfo);
-	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags, XFS_ICI_NO_TAG);
+	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags, XFS_ICI_NO_TAG, 0);
 }
 
 /*------------------------------------------------------------------------*/
diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index 80e5264..40e8775 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -511,17 +511,21 @@ xfs_ireclaim(
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_perag	*pag;
+	xfs_agino_t		agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
 
 	XFS_STATS_INC(xs_ig_reclaims);
 
 	/*
-	 * Remove the inode from the per-AG radix tree.  It doesn't matter
-	 * if it was never added to it because radix_tree_delete can deal
-	 * with that case just fine.
+	 * Remove the inode from the per-AG radix tree.
+	 *
+	 * Because radix_tree_delete won't complain even if the item was never
+	 * added to the tree assert that it's been there before to catch
+	 * problems with the inode life time early on.
 	 */
 	pag = xfs_get_perag(mp, ip->i_ino);
 	write_lock(&pag->pag_ici_lock);
-	radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino));
+	if (!radix_tree_delete(&pag->pag_ici_root, agino))
+		ASSERT(0);
 	write_unlock(&pag->pag_ici_lock);
 	xfs_put_perag(mp, pag);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index b92a4fa..13d7d21 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2877,10 +2877,14 @@ xfs_iflush(
 	mp = ip->i_mount;
 
 	/*
-	 * If the inode isn't dirty, then just release the inode
-	 * flush lock and do nothing.
+	 * If the inode isn't dirty, then just release the inode flush lock and
+	 * do nothing. Treat stale inodes the same; we cannot rely on the
+	 * backing buffer remaining stale in cache for the remaining life of
+	 * the stale inode and so xfs_itobp() below may give us a buffer that
+	 * no longer contains inodes below. Doing this stale check here also
+	 * avoids forcing the log on pinned, stale inodes.
 	 */
-	if (xfs_inode_clean(ip)) {
+	if (xfs_inode_clean(ip) || xfs_iflags_test(ip, XFS_ISTALE)) {
 		xfs_ifunlock(ip);
 		return 0;
 	}
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index b572f7e..3fac146 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -2456,46 +2456,6 @@ xfs_set_dmattrs(
 	return error;
 }
 
-int
-xfs_reclaim(
-	xfs_inode_t	*ip)
-{
-
-	xfs_itrace_entry(ip);
-
-	ASSERT(!VN_MAPPED(VFS_I(ip)));
-
-	/* bad inode, get out here ASAP */
-	if (is_bad_inode(VFS_I(ip))) {
-		xfs_ireclaim(ip);
-		return 0;
-	}
-
-	xfs_ioend_wait(ip);
-
-	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
-
-	/*
-	 * If we have nothing to flush with this inode then complete the
-	 * teardown now, otherwise break the link between the xfs inode and the
-	 * linux inode and clean up the xfs inode later. This avoids flushing
-	 * the inode to disk during the delete operation itself.
-	 *
-	 * When breaking the link, we need to set the XFS_IRECLAIMABLE flag
-	 * first to ensure that xfs_iunpin() will never see an xfs inode
-	 * that has a linux inode being reclaimed. Synchronisation is provided
-	 * by the i_flags_lock.
-	 */
-	if (!ip->i_update_core && (ip->i_itemp == NULL)) {
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_iflock(ip);
-		xfs_iflags_set(ip, XFS_IRECLAIMABLE);
-		return xfs_reclaim_inode(ip, 1, XFS_IFLUSH_DELWRI_ELSE_SYNC);
-	}
-	xfs_inode_set_reclaim_tag(ip);
-	return 0;
-}
-
 /*
  * xfs_alloc_file_space()
  *      This routine allocates disk space for the given file.
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index a9e102d..167a467 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -38,7 +38,6 @@ int xfs_symlink(struct xfs_inode *dp, struct xfs_name *link_name,
 		const char *target_path, mode_t mode, struct xfs_inode **ipp,
 		cred_t *credp);
 int xfs_set_dmattrs(struct xfs_inode *ip, u_int evmask, u_int16_t state);
-int xfs_reclaim(struct xfs_inode *ip);
 int xfs_change_file_space(struct xfs_inode *ip, int cmd,
 		xfs_flock64_t *bf, xfs_off_t offset, int attr_flags);
 int xfs_rename(struct xfs_inode *src_dp, struct xfs_name *src_name,


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-09  8:48                                                             ` Patrick Schreurs
@ 2010-02-09 10:31                                                               ` Christoph Hellwig
  2010-02-10 12:42                                                                 ` Patrick Schreurs
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2010-02-09 10:31 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

On Tue, Feb 09, 2010 at 09:48:38AM +0100, Patrick Schreurs wrote:
> This is a clean 2.6.32.3 with the xfs-inode-reclaim-2.6.32 patch i  
> received from Dave on January 8th (see attachment).

I can't find anything interesting regarding I_RECLAIMABLE manipulation
in there.  The only thing I could think of going wrong is i_flags
and i_update_core sitting in the same word and the compiler causing
some read-modify-write cycles for it.  Can you test the patch below?
It fixes the above issue up and, to make sure the assert you hit
isn't as lethal, changes it into a WARN_ON, which will still print the
backtrace but not crash the machine.


Index: linux-2.6/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_super.c	2010-02-09 10:38:51.771004413 +0100
+++ linux-2.6/fs/xfs/linux-2.6/xfs_super.c	2010-02-09 10:42:02.102254796 +0100
@@ -1004,13 +1004,13 @@ xfs_fs_inode_init_once(
  * Dirty the XFS inode when mark_inode_dirty_sync() is called so that
  * we catch unlogged VFS level updates to the inode. Care must be taken
  * here - the transaction code calls mark_inode_dirty_sync() to mark the
- * VFS inode dirty in a transaction and clears the i_update_core field;
+ * VFS inode dirty in a transaction and clears the XFS_IDIRTY_CORE flag;
  * it must clear the field after calling mark_inode_dirty_sync() to
  * correctly indicate that the dirty state has been propagated into the
  * inode log item.
  *
  * We need the barrier() to maintain correct ordering between unlogged
- * updates and the transaction commit code that clears the i_update_core
+ * updates and the transaction commit code that clears the XFS_IDIRTY_CORE
  * field. This requires all updates to be completed before marking the
  * inode dirty.
  */
@@ -1018,8 +1018,7 @@ STATIC void
 xfs_fs_dirty_inode(
 	struct inode	*inode)
 {
-	barrier();
-	XFS_I(inode)->i_update_core = 1;
+	xfs_iflags_set(XFS_I(inode), XFS_IDIRTY_CORE);
 }
 
 /*
Index: linux-2.6/fs/xfs/xfs_iget.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_iget.c	2010-02-09 10:38:41.343004133 +0100
+++ linux-2.6/fs/xfs/xfs_iget.c	2010-02-09 10:38:47.069003783 +0100
@@ -81,7 +81,6 @@ xfs_inode_alloc(
 	ip->i_afp = NULL;
 	memset(&ip->i_df, 0, sizeof(xfs_ifork_t));
 	ip->i_flags = 0;
-	ip->i_update_core = 0;
 	ip->i_delayed_blks = 0;
 	memset(&ip->i_d, 0, sizeof(xfs_icdinode_t));
 	ip->i_size = 0;
Index: linux-2.6/fs/xfs/xfs_inode.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_inode.c	2010-02-09 10:37:28.821254796 +0100
+++ linux-2.6/fs/xfs/xfs_inode.c	2010-02-09 10:38:33.537253468 +0100
@@ -2098,7 +2098,7 @@ xfs_ifree_cluster(
 			iip = ip->i_itemp;
 
 			if (!iip) {
-				ip->i_update_core = 0;
+				xfs_iflags_clear(ip, XFS_IDIRTY_CORE);
 				xfs_ifunlock(ip);
 				xfs_iunlock(ip, XFS_ILOCK_EXCL);
 				continue;
@@ -2913,7 +2913,7 @@ xfs_iflush(
 	 * to disk, because the log record didn't make it to disk!
 	 */
 	if (XFS_FORCED_SHUTDOWN(mp)) {
-		ip->i_update_core = 0;
+		xfs_iflags_clear(ip, XFS_IDIRTY_CORE);
 		if (iip)
 			iip->ili_format.ilf_fields = 0;
 		xfs_ifunlock(ip);
@@ -3057,19 +3057,18 @@ xfs_iflush_int(
 	dip = (xfs_dinode_t *)xfs_buf_offset(bp, ip->i_imap.im_boffset);
 
 	/*
-	 * Clear i_update_core before copying out the data.
+	 * Clear XFS_IDIRTY_CORE before copying out the data.
 	 * This is for coordination with our timestamp updates
 	 * that don't hold the inode lock. They will always
-	 * update the timestamps BEFORE setting i_update_core,
-	 * so if we clear i_update_core after they set it we
+	 * update the timestamps BEFORE setting XFS_IDIRTY_CORE,
+	 * so if we clear XFS_IDIRTY_CORE after they set it we
 	 * are guaranteed to see their updates to the timestamps.
 	 * I believe that this depends on strongly ordered memory
 	 * semantics, but we have that.  We use the SYNCHRONIZE
 	 * macro to make sure that the compiler does not reorder
-	 * the i_update_core access below the data copy below.
+	 * the XFS_IDIRTY_CORE access below the data copy below.
 	 */
-	ip->i_update_core = 0;
-	SYNCHRONIZE();
+	xfs_iflags_clear(ip, XFS_IDIRTY_CORE);
 
 	/*
 	 * Make sure to get the latest timestamps from the Linux inode.
@@ -3235,7 +3234,7 @@ xfs_iflush_int(
 	} else {
 		/*
 		 * We're flushing an inode which is not in the AIL and has
-		 * not been logged but has i_update_core set.  For this
+		 * not been logged but has XFS_IDIRTY_CORE set.  For this
 		 * case we can use a B_DELWRI flush and immediately drop
 		 * the inode flush lock because we can avoid the whole
 		 * AIL state thing.  It's OK to drop the flush lock now,
Index: linux-2.6/fs/xfs/xfs_inode.h
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_inode.h	2010-02-09 10:36:31.621003922 +0100
+++ linux-2.6/fs/xfs/xfs_inode.h	2010-02-09 10:37:23.518004062 +0100
@@ -259,8 +259,7 @@ typedef struct xfs_inode {
 	wait_queue_head_t	i_ipin_wait;	/* inode pinning wait queue */
 	spinlock_t		i_flags_lock;	/* inode i_flags lock */
 	/* Miscellaneous state. */
-	unsigned short		i_flags;	/* see defined flags below */
-	unsigned char		i_update_core;	/* timestamps/size is dirty */
+	unsigned int		i_flags;	/* see defined flags below */
 	unsigned int		i_delayed_blks;	/* count of delay alloc blks */
 
 	xfs_icdinode_t		i_d;		/* most of ondisk inode */
@@ -391,6 +390,7 @@ static inline void xfs_ifunlock(xfs_inod
 #define XFS_INEW	0x0008	/* inode has just been allocated */
 #define XFS_IFILESTREAM	0x0010	/* inode is in a filestream directory */
 #define XFS_ITRUNCATED	0x0020	/* truncated down so flush-on-close */
+#define XFS_IDIRTY_CORE 0x0040	/* non-transaction updates pending */
 
 /*
  * Flags for inode locking.
Index: linux-2.6/fs/xfs/xfs_inode_item.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_inode_item.c	2010-02-09 10:40:38.194253398 +0100
+++ linux-2.6/fs/xfs/xfs_inode_item.c	2010-02-09 10:41:44.883256052 +0100
@@ -233,15 +233,15 @@ xfs_inode_item_format(
 
 	/*
 	 * Make sure the linux inode is dirty. We do this before
-	 * clearing i_update_core as the VFS will call back into
-	 * XFS here and set i_update_core, so we need to dirty the
-	 * inode first so that the ordering of i_update_core and
+	 * clearing XFS_IDIRTY_CORE as the VFS will call back into
+	 * XFS here and set XFS_IDIRTY_CORE, so we need to dirty the
+	 * inode first so that the ordering of XFS_IDIRTY_CORE and
 	 * unlogged modifications still works as described below.
 	 */
 	xfs_mark_inode_dirty_sync(ip);
 
 	/*
-	 * Clear i_update_core if the timestamps (or any other
+	 * Clear XFS_IDIRTY_CORE if the timestamps (or any other
 	 * non-transactional modification) need flushing/logging
 	 * and we're about to log them with the rest of the core.
 	 *
@@ -252,11 +252,11 @@ xfs_inode_item_format(
 	 * for the timestamps if both routines were to grab the
 	 * timestamps or not.  That would be ok.
 	 *
-	 * We clear i_update_core before copying out the data.
+	 * We clear XFS_IDIRTY_CORE before copying out the data.
 	 * This is for coordination with our timestamp updates
 	 * that don't hold the inode lock. They will always
-	 * update the timestamps BEFORE setting i_update_core,
-	 * so if we clear i_update_core after they set it we
+	 * update the timestamps BEFORE setting XFS_IDIRTY_CORE,
+	 * so if we clear XFS_IDIRTY_CORE after they set it we
 	 * are guaranteed to see their updates to the timestamps
 	 * either here.  Likewise, if they set it after we clear it
 	 * here, we'll see it either on the next commit of this
@@ -264,12 +264,9 @@ xfs_inode_item_format(
 	 * xfs_iflush().  This depends on strongly ordered memory
 	 * semantics, but we have that.  We use the SYNCHRONIZE
 	 * macro to make sure that the compiler does not reorder
-	 * the i_update_core access below the data copy below.
+	 * the XFS_IDIRTY_CORE access below the data copy below.
 	 */
-	if (ip->i_update_core)  {
-		ip->i_update_core = 0;
-		SYNCHRONIZE();
-	}
+	xfs_iflags_clear(ip, XFS_IDIRTY_CORE);
 
 	/*
 	 * Make sure to get the latest timestamps from the Linux inode.
Index: linux-2.6/fs/xfs/xfs_inode_item.h
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_inode_item.h	2010-02-09 10:40:24.678024386 +0100
+++ linux-2.6/fs/xfs/xfs_inode_item.h	2010-02-09 10:40:35.674015377 +0100
@@ -162,7 +162,7 @@ static inline int xfs_inode_clean(xfs_in
 {
 	return (!ip->i_itemp ||
 		!(ip->i_itemp->ili_format.ilf_fields & XFS_ILOG_ALL)) &&
-	       !ip->i_update_core;
+		!xfs_iflags_test(ip, XFS_IDIRTY_CORE);
 }
 
 extern void xfs_inode_item_init(struct xfs_inode *, struct xfs_mount *);
Index: linux-2.6/fs/xfs/xfs_vnodeops.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_vnodeops.c	2010-02-09 10:39:37.171274212 +0100
+++ linux-2.6/fs/xfs/xfs_vnodeops.c	2010-02-09 10:40:13.425004481 +0100
@@ -405,7 +405,7 @@ xfs_setattr(
 			inode->i_atime = iattr->ia_atime;
 			ip->i_d.di_atime.t_sec = iattr->ia_atime.tv_sec;
 			ip->i_d.di_atime.t_nsec = iattr->ia_atime.tv_nsec;
-			ip->i_update_core = 1;
+			xfs_iflags_set(ip, XFS_IDIRTY_CORE);
 		}
 		if (mask & ATTR_MTIME) {
 			inode->i_mtime = iattr->ia_mtime;
@@ -427,7 +427,7 @@ xfs_setattr(
 		inode->i_ctime = iattr->ia_ctime;
 		ip->i_d.di_ctime.t_sec = iattr->ia_ctime.tv_sec;
 		ip->i_d.di_ctime.t_nsec = iattr->ia_ctime.tv_nsec;
-		ip->i_update_core = 1;
+		xfs_iflags_set(ip, XFS_IDIRTY_CORE);
 		timeflags &= ~XFS_ICHGTIME_CHG;
 	}
 
@@ -633,7 +633,7 @@ xfs_fsync(
 	 */
 	xfs_ilock(ip, XFS_ILOCK_SHARED);
 
-	if (!ip->i_update_core) {
+	if (!xfs_iflags_test(ip, XFS_IDIRTY_CORE)) {
 		/*
 		 * Timestamps/size haven't changed since last inode flush or
 		 * inode transaction commit.  That means either nothing got
Index: linux-2.6/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_sync.c	2010-02-09 10:43:25.201003853 +0100
+++ linux-2.6/fs/xfs/linux-2.6/xfs_sync.c	2010-02-09 10:44:00.886284060 +0100
@@ -765,8 +765,8 @@ xfs_reclaim_inode_now(
 	 * XFS_IRECLAIM flag set it will not touch us.
 	 */
 	spin_lock(&ip->i_flags_lock);
-	ASSERT_ALWAYS(__xfs_iflags_test(ip, XFS_IRECLAIMABLE));
-	if (__xfs_iflags_test(ip, XFS_IRECLAIM)) {
+	if (WARN_ON(__xfs_iflags_test(ip, XFS_IRECLAIMABLE)) ||
+	    __xfs_iflags_test(ip, XFS_IRECLAIM)) {
 		/* ignore as it is already under reclaim */
 		spin_unlock(&ip->i_flags_lock);
 		write_unlock(&pag->pag_ici_lock);


* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-09 10:31                                                               ` Christoph Hellwig
@ 2010-02-10 12:42                                                                 ` Patrick Schreurs
  2010-02-10 14:55                                                                   ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2010-02-10 12:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, xfs

On 9-2-2010 11:31, Christoph Hellwig wrote:
> On Tue, Feb 09, 2010 at 09:48:38AM +0100, Patrick Schreurs wrote:
> This is a clean 2.6.32.3 with the xfs-inode-reclaim-2.6.32 patch I
>> received from Dave on January 8th (see attachment).
>
> I can't find anything interesting regarding XFS_IRECLAIMABLE manipulation
> in there.  The only thing I could think of going wrong is i_flags
> and i_update_core sitting in the same word and the compiler causing
> some read-modify-write cycles for it.  Can you test the patch below?
> It fixes the above issue and, to make the assert you hit less lethal,
> changes it into a WARN_ON, which will still print the backtrace
> but not crash the machine.

Thanks for the patch. After having this patch applied we saw *a lot* of
warnings. They all look like this:

Feb 10 13:20:38 sb06 kernel: ------------[ cut here ]------------
Feb 10 13:20:38 sb06 kernel: WARNING: at fs/xfs/linux-2.6/xfs_sync.c:768 
xfs_reclaim_inode_now+0x3d/0x84()
Feb 10 13:20:38 sb06 kernel: Hardware name: PowerEdge 1950
Feb 10 13:20:38 sb06 kernel: Modules linked in: acpi_cpufreq 
cpufreq_ondemand ipmi_si ipmi_devintf ipmi_msghandler bonding mptspi 
serio_raw rng_core scsi_transport_spi bnx2 thermal processor thermal_sys
Feb 10 13:20:38 sb06 kernel: Pid: 3145, comm: xfssyncd Not tainted 
2.6.32.3 #2
Feb 10 13:20:38 sb06 kernel: Call Trace:
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a897>] ? 
xfs_reclaim_inode_now+0x3d/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a897>] ? 
xfs_reclaim_inode_now+0x3d/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8102e8c9>] ? 
warn_slowpath_common+0x77/0xa3
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a897>] ? 
xfs_reclaim_inode_now+0x3d/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113b130>] ? 
xfs_inode_ag_walk+0x68/0xa2
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a85a>] ? 
xfs_reclaim_inode_now+0x0/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113b1ba>] ? 
xfs_inode_ag_iterator+0x50/0x7e
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a85a>] ? 
xfs_reclaim_inode_now+0x0/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113b224>] ? 
xfs_sync_worker+0x26/0x52
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113aa79>] ? xfssyncd+0x123/0x180
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a956>] ? xfssyncd+0x0/0x180
Feb 10 13:20:38 sb06 kernel: [<ffffffff8103f5cd>] ? kthread+0x79/0x81
Feb 10 13:20:38 sb06 kernel: [<ffffffff8100bcda>] ? child_rip+0xa/0x20
Feb 10 13:20:38 sb06 kernel: [<ffffffff8103f554>] ? kthread+0x0/0x81
Feb 10 13:20:38 sb06 kernel: [<ffffffff8100bcd0>] ? child_rip+0x0/0x20
Feb 10 13:20:38 sb06 kernel: ---[ end trace 1ae862ca12666a87 ]---

and some look like this:

Feb 10 13:20:38 sb06 kernel: ------------[ cut here ]------------
Feb 10 13:20:38 sb06 kernel: WARNING: at fs/xfs/linux-2.6/xfs_sync.c:768 
xfs_reclaim_inode_now+0x3d/0x84()
Feb 10 13:20:38 sb06 kernel: Hardware name: PowerEdge 1950
Feb 10 13:20:38 sb06 kernel: Modules linked in: acpi_cpufreq 
cpufreq_ondemand ipmi_si ipmi_devintf ipmi_msghandlerspes  13<f3dat? 
ff4ode_now+0x0/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113b1ba>]n2r4a 
2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? ff4ode_n]n2r4a 
2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? xff4ode_n]n2r4a 
2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? xff4ode_n]n2r4a 
2f1>nx85k7[<4x0-[e :7we_ospies  13<f3dat? ff4ode]n2r4a 
2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? xff4ode_now+0x0/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffff]n2r4a 2f1>nx85k7[<4x0-[e 
:7we_ospes  13<ode_]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? 
ff4ode_now+0x0/0]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? 
ff4ode_now+0x0/0x]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? 
ff4ode_no]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? ff4ode_now+0x0/0x84
Feb 10 13:20:38 sb06 kernel: [<fff]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes 
13<f3dat? ff4ode_now+]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? 
ff4>ode_no]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? 
xff4ode_now+]n2r4a 2f1>nx85k7[<4x0-[e :7we_ospes  13<f3dat? 
xff4ode_now+0x0/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113b1ba>] ? 
xfs_inode_ag_iterator+0x50/0x7e
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a85a>] ? 
xfs_reclaim_inode_now+0x0/0x84
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113b224>] ? 
xfs_sync_worker+0x26/0x52
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113aa79>] ? xfssyncd+0x123/0x180
Feb 10 13:20:38 sb06 kernel: [<ffffffff8113a956>] ? xfssyncd+0x0/0x180
Feb 10 13:20:38 sb06 kernel: [<ffffffff8103f5cd>] ? kthread+0x79/0x81
Feb 10 13:20:38 sb06 kernel: [<ffffffff8100bcda>] ? child_rip+0xa/0x20
Feb 10 13:20:38 sb06 kernel: [<ffffffff8103f554>] ? kthread+0x0/0x81
Feb 10 13:20:38 sb06 kernel: [<ffffffff8100bcd0>] ? child_rip+0x0/0x20
Feb 10 13:20:38 sb06 kernel: ---[ end trace 1ae862ca12666b1c ]---

I hope this clarifies things. If you need more info, don't hesitate to 
contact me.

Thanks,

-Patrick


* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-10 12:42                                                                 ` Patrick Schreurs
@ 2010-02-10 14:55                                                                   ` Christoph Hellwig
  2010-02-10 15:42                                                                     ` Patrick Schreurs
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2010-02-10 14:55 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

On Wed, Feb 10, 2010 at 01:42:57PM +0100, Patrick Schreurs wrote:
> Thanks for the patch. After having this patch applied we saw *a lot* of
> warnings. They all look like this:

Ok, looks like that is not an issue, so you can discard that patch.

I went down to the radix tree code to look for races in its tag
handling, but then noticed that we might have an issue with our
usage of the radix-tree API.  Can you try the patch below on top
of Dave's rollup, instead of my previous one?

---

From: Christoph Hellwig <hch@lst.de>
Subject: xfs: fix locking for inode cache radix tree tag updates

The radix-tree code requires its users to serialize tag updates against
other updates to the tree.  While XFS protects tag updates against each
other, it does not serialize them against updates of the tree contents,
which can lead to tag corruption.  Fix the inode cache to always take
pag_ici_lock in exclusive mode when updating radix tree tags.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: linux-2.6/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_sync.c	2010-02-10 14:28:46.648004203 +0100
+++ linux-2.6/fs/xfs/linux-2.6/xfs_sync.c	2010-02-10 14:29:56.657023619 +0100
@@ -734,12 +734,12 @@ xfs_inode_set_reclaim_tag(
 	xfs_mount_t	*mp = ip->i_mount;
 	xfs_perag_t	*pag = xfs_get_perag(mp, ip->i_ino);
 
-	read_lock(&pag->pag_ici_lock);
+	write_lock(&pag->pag_ici_lock);
 	spin_lock(&ip->i_flags_lock);
 	__xfs_inode_set_reclaim_tag(pag, ip);
 	__xfs_iflags_set(ip, XFS_IRECLAIMABLE);
 	spin_unlock(&ip->i_flags_lock);
-	read_unlock(&pag->pag_ici_lock);
+	write_unlock(&pag->pag_ici_lock);
 	xfs_put_perag(mp, pag);
 }
 
Index: linux-2.6/fs/xfs/xfs_iget.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_iget.c	2010-02-10 14:30:01.092254586 +0100
+++ linux-2.6/fs/xfs/xfs_iget.c	2010-02-10 14:34:00.199005529 +0100
@@ -228,13 +228,12 @@ xfs_iget_cache_hit(
 		xfs_itrace_exit_tag(ip, "xfs_iget.alloc");
 
 		/*
-		 * We need to set XFS_INEW atomically with clearing the
-		 * reclaimable tag so that we do have an indicator of the
-		 * inode still being initialized.
+		 * We need to set XFS_IRECLAIM to prevent xfs_reclaim_inode
+		 * from stomping over us while we recycle the inode.  We can't
+		 * clear the radix tree reclaimable tag yet as it requires
+		 * pag_ici_lock to be held exclusively.
 		 */
-		ip->i_flags |= XFS_INEW;
-		ip->i_flags &= ~XFS_IRECLAIMABLE;
-		__xfs_inode_clear_reclaim_tag(mp, pag, ip);
+		ip->i_flags |= XFS_IRECLAIM;
 
 		spin_unlock(&ip->i_flags_lock);
 		read_unlock(&pag->pag_ici_lock);
@@ -253,7 +252,15 @@ xfs_iget_cache_hit(
 			__xfs_inode_set_reclaim_tag(pag, ip);
 			goto out_error;
 		}
-		inode->i_state = I_LOCK|I_NEW;
+
+		write_lock(&pag->pag_ici_lock);
+		spin_lock(&ip->i_flags_lock);
+		ip->i_flags &= ~(XFS_IRECLAIMABLE | XFS_IRECLAIM);
+		ip->i_flags |= XFS_INEW;
+		__xfs_inode_clear_reclaim_tag(mp, pag, ip);
+		inode->i_state = I_LOCK | I_NEW;
+		spin_unlock(&ip->i_flags_lock);
+		write_unlock(&pag->pag_ici_lock);
 	} else {
 		/* If the VFS inode is being torn down, pause and try again. */
 		if (!igrab(inode)) {


* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-10 14:55                                                                   ` Christoph Hellwig
@ 2010-02-10 15:42                                                                     ` Patrick Schreurs
  2010-02-10 15:47                                                                       ` Christoph Hellwig
  2010-02-24 18:30                                                                       ` Patrick Schreurs
  0 siblings, 2 replies; 42+ messages in thread
From: Patrick Schreurs @ 2010-02-10 15:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, xfs

On 10-2-2010 15:55, Christoph Hellwig wrote:
> On Wed, Feb 10, 2010 at 01:42:57PM +0100, Patrick Schreurs wrote:
>> Thanks for the patch. After having this patch applied we saw *a lot* of
>> warnings. They all look like this:
>
> Ok, looks like that is not an issue, so you can discard that patch.
>
> I went down to the radix tree code to look for races in its tag
> handling, but then noticed that we might have an issue with our
> usage of the radix-tree API.  Can you try the patch below on top
> of Dave's rollup, instead of my previous one?

Okay. This patch is currently active. Thanks. I don't have a way to 
trigger it, so we'll have to wait and see what happens.

Obviously we'll keep you posted.

Have any of these patches been sent to the stable team? And have these 
patches been submitted to the upcoming 2.6.33 kernel?

-Patrick


* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-10 15:42                                                                     ` Patrick Schreurs
@ 2010-02-10 15:47                                                                       ` Christoph Hellwig
  2010-02-24 18:30                                                                       ` Patrick Schreurs
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2010-02-10 15:47 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

On Wed, Feb 10, 2010 at 04:42:43PM +0100, Patrick Schreurs wrote:
> Have any of these patches been sent to the stable team? And have these  
> patches been submitted to the upcoming 2.6.33 kernel?

Everything before the current test patches is in 2.6.33-rc. I'm not
sure what has made it into stable yet, but if I remember correctly
Dave was going to send his fixes to -stable.


* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-10 15:42                                                                     ` Patrick Schreurs
  2010-02-10 15:47                                                                       ` Christoph Hellwig
@ 2010-02-24 18:30                                                                       ` Patrick Schreurs
  2010-02-25 23:45                                                                         ` Dave Chinner
  1 sibling, 1 reply; 42+ messages in thread
From: Patrick Schreurs @ 2010-02-24 18:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Tommy van Leeuwen, xfs

On 10-2-2010 16:42, Patrick Schreurs wrote:
> On 10-2-2010 15:55, Christoph Hellwig wrote:
>> On Wed, Feb 10, 2010 at 01:42:57PM +0100, Patrick Schreurs wrote:
>>> Thanks for the patch. After having this patch applied we saw *a lot* of
>>> warnings. They all look like this:
>>
>> Ok, looks like that is not an issue, so you can discard that patch.
>>
>> I went down to the radix tree code to look for races in its tag
>> handling, but then noticed that we might have an issue with our
>> usage of the radix-tree API. Can you try the patch below on top
>> of Dave's rollup, instead of my previous one?
>
> Okay. This patch is currently active. Thanks. I don't have a way to
> trigger it, so we'll have to wait and see what happens.

Servers running with this patch applied are still stable. The 
first server we patched has been running stable for 2 weeks now. Should we 
try to have these patches included in 2.6.33?

-Patrick


* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-24 18:30                                                                       ` Patrick Schreurs
@ 2010-02-25 23:45                                                                         ` Dave Chinner
  2010-03-01  9:51                                                                           ` Christoph Hellwig
  0 siblings, 1 reply; 42+ messages in thread
From: Dave Chinner @ 2010-02-25 23:45 UTC (permalink / raw)
  To: Patrick Schreurs; +Cc: Christoph Hellwig, Tommy van Leeuwen, xfs

On Wed, Feb 24, 2010 at 07:30:18PM +0100, Patrick Schreurs wrote:
> On 10-2-2010 16:42, Patrick Schreurs wrote:
>> On 10-2-2010 15:55, Christoph Hellwig wrote:
>>> On Wed, Feb 10, 2010 at 01:42:57PM +0100, Patrick Schreurs wrote:
>>>> Thanks for the patch. After having this patch applied we saw *a lot* of
>>>> warnings. They all look like this:
>>>
>>> Ok, looks like that is not an issue, so you can discard that patch.
>>>
>>> I went down to the radix tree code to look for races in its tag
>>> handling, but then noticed that we might have an issue with our
>>> usage of the radix-tree API. Can you try the patch below on top
>>> of Dave's rollup, instead of my previous one?
>>
>> Okay. This patch is currently active. Thanks. I don't have a way to
>> trigger it, so we'll have to wait and see what happens.
>
> Servers running with this patch applied are still running stable. The  
> first server we've patched is running stable for 2 weeks now. Should we  
> try to have this patches included for 2.6.33?

Good to hear. The fixes are already in 2.6.33 (just released), so
the question is whether we backport to 2.6.32 or not. Christoph,
Alex, Eric - should we push these fixes back to .32-stable?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim)
  2010-02-25 23:45                                                                         ` Dave Chinner
@ 2010-03-01  9:51                                                                           ` Christoph Hellwig
  0 siblings, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2010-03-01  9:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Patrick Schreurs, Christoph Hellwig, Tommy van Leeuwen, xfs

On Fri, Feb 26, 2010 at 10:45:53AM +1100, Dave Chinner wrote:
> Good to hear. The fixes are already in 2.6.33 (just released), so
> the question is whether we backport to 2.6.32 or not. Christoph,
> Alex, Eric - should we push these fixes back to .32-stable?

My latest patch to fix the locking for tag manipulations isn't in
any tree yet.  We should get it into mainline and 2.6.33-stable,
and, if the previous patches are backported, into .32-stable as well.


end of thread, other threads:[~2010-03-01  9:50 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-16 10:27 2.6.31 xfs_fs_destroy_inode: cannot reclaim Tommy van Leeuwen
2009-09-17 18:59 ` Christoph Hellwig
2009-09-29 10:15   ` Patrick Schreurs
2009-09-29 12:57     ` Christoph Hellwig
2009-09-30 10:48       ` Patrick Schreurs
2009-09-30 12:41         ` Christoph Hellwig
2009-10-02 14:24           ` Bas Couwenberg
2009-10-05 21:43             ` Christoph Hellwig
2009-10-06  9:04               ` Patrick Schreurs
2009-10-07  1:19                 ` Christoph Hellwig
2009-10-08  8:45                   ` Patrick Schreurs
2009-10-11  7:43                   ` Patrick Schreurs
2009-10-11 12:24                     ` Christoph Hellwig
2009-10-12 23:38                     ` Christoph Hellwig
2009-10-15 15:06                       ` Tommy van Leeuwen
2009-10-18 23:59                         ` Christoph Hellwig
2009-10-19  1:17                           ` Dave Chinner
2009-10-19  3:53                             ` Christoph Hellwig
2009-10-19  1:16                       ` Dave Chinner
2009-10-19  3:54                         ` Christoph Hellwig
2009-10-20  3:40                           ` Dave Chinner
2009-10-21  9:45                             ` Tommy van Leeuwen
2009-10-22  8:59                               ` Christoph Hellwig
2009-10-27 10:41                                 ` Tommy van Leeuwen
     [not found]                                   ` <89c4f90c0910280519k759230c1r7b1586932ac792f7@mail.gmail.com>
2009-10-30 10:16                                     ` Christoph Hellwig
2009-11-03 14:46                                       ` Patrick Schreurs
2009-11-14 16:21                                         ` Christoph Hellwig
     [not found]                                           ` <4B0A8075.8080008@news-service.com>
     [not found]                                             ` <20091211115932.GA20632@infradead.org>
     [not found]                                               ` <4B3F9F88.9030307@news-service.com>
     [not found]                                                 ` <20100107110446.GA13802@discord.disaster>
     [not found]                                                   ` <4B45CFAC.4000607@news-service.com>
2010-01-08 11:31                                                     ` [PATCH] Inode reclaim fixes (was Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim) Dave Chinner
2010-01-11 20:22                                                       ` Patrick Schreurs
2010-01-15 11:01                                                       ` Patrick Schreurs
2010-02-01 16:52                                                         ` Patrick Schreurs
2010-02-08 10:16                                                           ` Patrick Schreurs
2010-02-08 19:42                                                           ` Christoph Hellwig
2010-02-09  8:48                                                             ` Patrick Schreurs
2010-02-09 10:31                                                               ` Christoph Hellwig
2010-02-10 12:42                                                                 ` Patrick Schreurs
2010-02-10 14:55                                                                   ` Christoph Hellwig
2010-02-10 15:42                                                                     ` Patrick Schreurs
2010-02-10 15:47                                                                       ` Christoph Hellwig
2010-02-24 18:30                                                                       ` Patrick Schreurs
2010-02-25 23:45                                                                         ` Dave Chinner
2010-03-01  9:51                                                                           ` Christoph Hellwig
