All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: fix warning in iput for bad-inode
@ 2011-08-17 18:56 Konstantin Khlebnikov
  2011-08-29  3:34 ` Sergei Trofimovich
  2011-09-01  2:45 ` David Sterba
  0 siblings, 2 replies; 12+ messages in thread
From: Konstantin Khlebnikov @ 2011-08-17 18:56 UTC (permalink / raw)
  To: linux-btrfs, Chris Mason

iput() shouldn't be called for inodes in I_NEW state,
lets call __destroy_inode() and btrfs_destroy_inode() instead

[    1.871723] WARNING: at fs/inode.c:1309 iput+0x1d9/0x200()
[    1.873722] Modules linked in:
[    1.873722] Pid: 1, comm: swapper Tainted: G        W   3.1.0-rc2-zurg #58
[    1.875722] Call Trace:
[    1.875722]  [<ffffffff8113cb99>] ? iput+0x1d9/0x200
[    1.876722]  [<ffffffff81044c3a>] warn_slowpath_common+0x7a/0xb0
[    1.877722]  [<ffffffff81044c85>] warn_slowpath_null+0x15/0x20
[    1.879722]  [<ffffffff8113cb99>] iput+0x1d9/0x200
[    1.879722]  [<ffffffff81295cf4>] btrfs_iget+0x1c4/0x450
[    1.881721]  [<ffffffff812b7e6b>] ? btrfs_tree_read_unlock_blocking+0x3b/0x60
[    1.882721]  [<ffffffff8111769a>] ? kmem_cache_free+0x2a/0x160
[    1.883721]  [<ffffffff812966f3>] btrfs_lookup_dentry+0x413/0x490
[    1.885721]  [<ffffffff8103b1e1>] ? get_parent_ip+0x11/0x50
[    1.886720]  [<ffffffff81296781>] btrfs_lookup+0x11/0x30
[    1.887720]  [<ffffffff8112de50>] d_alloc_and_lookup+0x40/0x80
[    1.888720]  [<ffffffff8113ac10>] ? d_lookup+0x30/0x50
[    1.889720]  [<ffffffff811301a8>] do_lookup+0x288/0x370
[    1.890720]  [<ffffffff8103b1e1>] ? get_parent_ip+0x11/0x50
[    1.891720]  [<ffffffff81132210>] do_last+0xe0/0x910
[    1.892720]  [<ffffffff81132b4d>] path_openat+0xcd/0x3a0
[    1.893719]  [<ffffffff813bab4b>] ? wait_for_xmitr+0x3b/0xa0
[    1.895719]  [<ffffffff8131c50a>] ? put_dec_full+0x5a/0xb0
[    1.896719]  [<ffffffff813babdb>] ? serial8250_console_putchar+0x2b/0x40
[    1.897719]  [<ffffffff81132e7d>] do_filp_open+0x3d/0xa0
[    1.898719]  [<ffffffff8103b1e1>] ? get_parent_ip+0x11/0x50
[    1.899718]  [<ffffffff8103b1e1>] ? get_parent_ip+0x11/0x50
[    1.900718]  [<ffffffff816e80fd>] ? sub_preempt_count+0x9d/0xd0
[    1.902718]  [<ffffffff8112a09d>] open_exec+0x2d/0xf0
[    1.903718]  [<ffffffff8112aaaf>] do_execve_common.isra.32+0x12f/0x340
[    1.906717]  [<ffffffff8112acd6>] do_execve+0x16/0x20
[    1.907717]  [<ffffffff8100af02>] sys_execve+0x42/0x70
[    1.908717]  [<ffffffff816ed968>] kernel_execve+0x68/0xd0
[    1.909717]  [<ffffffff816d828e>] ? run_init_process+0x1e/0x20
[    1.911717]  [<ffffffff816d831e>] init_post+0x8e/0xc0
[    1.912716]  [<ffffffff81cb8c79>] kernel_init+0x13d/0x13d
[    1.913716]  [<ffffffff816ed8f4>] kernel_thread_helper+0x4/0x10
[    1.914716]  [<ffffffff81cb8b3c>] ? start_kernel+0x33f/0x33f
[    1.915716]  [<ffffffff816ed8f0>] ? gs_change+0xb/0xb

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 fs/btrfs/inode.c |   10 +++-------
 1 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 15fceef..3e949bd 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3952,7 +3952,6 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 			 struct btrfs_root *root, int *new)
 {
 	struct inode *inode;
-	int bad_inode = 0;
 
 	inode = btrfs_iget_locked(s, location->objectid, root);
 	if (!inode)
@@ -3968,15 +3967,12 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 			if (new)
 				*new = 1;
 		} else {
-			bad_inode = 1;
+			__destroy_inode(inode);
+			btrfs_destroy_inode(inode);
+			inode = ERR_PTR(-ESTALE);
 		}
 	}
 
-	if (bad_inode) {
-		iput(inode);
-		inode = ERR_PTR(-ESTALE);
-	}
-
 	return inode;
 }
 


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-17 18:56 [PATCH] btrfs: fix warning in iput for bad-inode Konstantin Khlebnikov
@ 2011-08-29  3:34 ` Sergei Trofimovich
  2011-08-30 16:53   ` Sergei Trofimovich
  2011-09-01  2:45 ` David Sterba
  1 sibling, 1 reply; 12+ messages in thread
From: Sergei Trofimovich @ 2011-08-29  3:34 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-btrfs, Chris Mason

[-- Attachment #1: Type: text/plain, Size: 503 bytes --]

On Wed, 17 Aug 2011 22:56:19 +0400
Konstantin Khlebnikov <khlebnikov@openvz.org> wrote:

> iput() shouldn't be called for inodes in I_NEW state,
> lets call __destroy_inode() and btrfs_destroy_inode() instead

Hi Konstantin!

Running 'sync' program after the load does not finish and eats 100%CPU
busy-waiting for something in kernel.

It's easy to reproduce hang with patch for me. I just run liferea and sync
after it. Without patch I haven't managed to hang btrfs up.

-- 

  Sergei

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-29  3:34 ` Sergei Trofimovich
@ 2011-08-30 16:53   ` Sergei Trofimovich
  2011-08-30 18:02     ` Josef Bacik
  0 siblings, 1 reply; 12+ messages in thread
From: Sergei Trofimovich @ 2011-08-30 16:53 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-btrfs, Chris Mason

[-- Attachment #1: Type: text/plain, Size: 622 bytes --]

> Running 'sync' program after the load does not finish and eats 100%CPU
> busy-waiting for something in kernel.
> 
> It's easy to reproduce hang with patch for me. I just run liferea and sync
> after it. Without patch I haven't managed to hang btrfs up.

And I think it's another btrfs bug. I've managed to reproduce it _without_
your patch and _without_ autodefrag enabled by manually running the
following commands:
    $ btrfs fi defrag file-with-20_000-extents
    $ sync

I think your patch just shuffles things a bit and forces autodefrag races
to pop-up sooner (which is good! :])

-- 

  Sergei

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-30 16:53   ` Sergei Trofimovich
@ 2011-08-30 18:02     ` Josef Bacik
  2011-08-30 19:31       ` Sergei Trofimovich
  0 siblings, 1 reply; 12+ messages in thread
From: Josef Bacik @ 2011-08-30 18:02 UTC (permalink / raw)
  To: Sergei Trofimovich; +Cc: Konstantin Khlebnikov, linux-btrfs, Chris Mason

On 08/30/2011 12:53 PM, Sergei Trofimovich wrote:
>> Running 'sync' program after the load does not finish and eats 
>> 100%CPU busy-waiting for something in kernel.
>> 
>> It's easy to reproduce hang with patch for me. I just run
>> liferea and sync after it. Without patch I haven't managed to
>> hang btrfs up.
> 
> And I think it's another btrfs bug. I've managed to reproduce it 
> _without_ your patch and _without_ autodefrag enabled by manually 
> running the following commands: $ btrfs fi defrag 
> file-with-20_000-extents $ sync
> 
> I think your patch just shuffles things a bit and forces
> autodefrag races to pop-up sooner (which is good! :])
> 

Sergei, can you do sysrq+w when this is happening, and maybe turn on
the softlockup detector so we can see where sync is getting stuck?
Thanks,

Josef

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-30 18:02     ` Josef Bacik
@ 2011-08-30 19:31       ` Sergei Trofimovich
  2011-08-30 19:40         ` Josef Bacik
  0 siblings, 1 reply; 12+ messages in thread
From: Sergei Trofimovich @ 2011-08-30 19:31 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Konstantin Khlebnikov, linux-btrfs, Chris Mason

[-- Attachment #1: Type: text/plain, Size: 11394 bytes --]

On Tue, 30 Aug 2011 14:02:37 -0400
Josef Bacik <josef@redhat.com> wrote:

> On 08/30/2011 12:53 PM, Sergei Trofimovich wrote:
> >> Running 'sync' program after the load does not finish and eats 
> >> 100%CPU busy-waiting for something in kernel.
> >> 
> >> It's easy to reproduce hang with patch for me. I just run
> >> liferea and sync after it. Without patch I haven't managed to
> >> hang btrfs up.
> > 
> > And I think it's another btrfs bug. I've managed to reproduce it 
> > _without_ your patch and _without_ autodefrag enabled by manually 
> > running the following commands: $ btrfs fi defrag 
> > file-with-20_000-extents $ sync
> > 
> > I think your patch just shuffles things a bit and forces
> > autodefrag races to pop-up sooner (which is good! :])
> > 
> 
> Sergei, can you do sysrq+w when this is happening, and maybe turn on
> the softlockup detector so we can see where sync is getting stuck?
> Thanks,

Sure. As I keep telling about 2 cases in IRC I will state both here explicitely:

==The First Issue (aka "The Hung sync()" case) ==

- it's an unpatched linus's v3.1-rc4-80-g0f43dd5
- /dev/root on / type btrfs (rw,noatime,compress=lzo)
- 50% full 30GB filesystem (usual nonmixed mode)

How I hung it:
    $ /usr/sbin/filefrag ~/.bogofilter/wordlist.db 
        /home/st/.bogofilter/wordlist.db: 19070 extents found
      the file is 138MB sqlite database for bayesian SPAM filter,
      it's being read and written every 20 minutes or so.
      Maybe, it was writtent even in defrag/sync time!
    $~/dev/git/btrfs-progs-unstable/btrfs fi defrag ~/.bogofilter/wordlist.db 
    $ sync
    ^C<hung in D-state>

I didn't try to reproduce it yet.
As for lockdep I'll try but I'm afraid I will fail to reproduce, but I'll try tomorrow.
I suspect I'll need to seriously fragment some file first down to such horrible state.

With help of David I've some (hopefully relevant) info:
    #!/bin/sh -x
    
    for i in $(ps aux|grep " D[+ ]\?"|awk '{print $2}'); do
            ps $i
            sudo cat /proc/$i/stack
    done

  PID TTY      STAT   TIME COMMAND
 1291 ?        D      0:00 [btrfs-endio-wri]
[<ffffffff8130055d>] btrfs_tree_read_lock+0x6d/0x120
[<ffffffff812b8e88>] btrfs_search_slot+0x698/0x8b0
[<ffffffff812c9e18>] btrfs_lookup_csum+0x68/0x190
[<ffffffff812ca10f>] __btrfs_lookup_bio_sums+0x1cf/0x3e0
[<ffffffff812ca371>] btrfs_lookup_bio_sums+0x11/0x20
[<ffffffff812d6a50>] btrfs_submit_bio_hook+0x140/0x170
[<ffffffff812ed594>] submit_one_bio+0x64/0xa0
[<ffffffff812f14f5>] extent_readpages+0xe5/0x100
[<ffffffff812d7aaa>] btrfs_readpages+0x1a/0x20
[<ffffffff810a6a02>] __do_page_cache_readahead+0x1d2/0x280
[<ffffffff810a6d8c>] ra_submit+0x1c/0x20
[<ffffffff810a6ebd>] ondemand_readahead+0x12d/0x270
[<ffffffff810a70cc>] page_cache_sync_readahead+0x2c/0x40
[<ffffffff81309987>] __load_free_space_cache+0x1a7/0x5b0
[<ffffffff81309e61>] load_free_space_cache+0xd1/0x190
[<ffffffff812be07b>] cache_block_group+0xab/0x290
[<ffffffff812c3def>] find_free_extent.clone.71+0x39f/0xab0
[<ffffffff812c5160>] btrfs_reserve_extent+0xe0/0x170
[<ffffffff812c56df>] btrfs_alloc_free_block+0xcf/0x330
[<ffffffff812b498d>] __btrfs_cow_block+0x11d/0x4a0
[<ffffffff812b4df8>] btrfs_cow_block+0xe8/0x1a0
[<ffffffff812b8965>] btrfs_search_slot+0x175/0x8b0
[<ffffffff812c9e18>] btrfs_lookup_csum+0x68/0x190
[<ffffffff812caf6e>] btrfs_csum_file_blocks+0xbe/0x670
[<ffffffff812d7d91>] add_pending_csums.clone.39+0x41/0x60
[<ffffffff812da528>] btrfs_finish_ordered_io+0x218/0x310
[<ffffffff812da635>] btrfs_writepage_end_io_hook+0x15/0x20
[<ffffffff8130c71a>] end_compressed_bio_write+0x7a/0xe0
[<ffffffff811146f8>] bio_endio+0x18/0x30
[<ffffffff812cd8fc>] end_workqueue_fn+0xec/0x120
[<ffffffff812fb0ac>] worker_loop+0xac/0x520
[<ffffffff8105d486>] kthread+0x96/0xa0
[<ffffffff815f9214>] kernel_thread_helper+0x4/0x10
[<ffffffffffffffff>] 0xffffffffffffffff
  PID TTY      STAT   TIME COMMAND
 1296 ?        D      0:00 [btrfs-transacti]
[<ffffffff812d3d6e>] btrfs_commit_transaction+0x22e/0x870
[<ffffffff812cd5bb>] transaction_kthread+0x26b/0x280
[<ffffffff8105d486>] kthread+0x96/0xa0
[<ffffffff815f9214>] kernel_thread_helper+0x4/0x10
[<ffffffffffffffff>] 0xffffffffffffffff
  PID TTY      STAT   TIME COMMAND
 2891 pts/0    D+     0:00 sync
[<ffffffff8109d069>] sleep_on_page+0x9/0x10
[<ffffffff8109d28e>] wait_on_page_bit+0x6e/0x80
[<ffffffff8109d9aa>] filemap_fdatawait_range+0xfa/0x190
[<ffffffff8109da62>] filemap_fdatawait+0x22/0x30
[<ffffffff8110ae5a>] sync_inodes_sb+0x17a/0x1e0
[<ffffffff8110eae8>] __sync_filesystem+0x78/0x80
[<ffffffff8110eb07>] sync_one_sb+0x17/0x20
[<ffffffff810e9cc5>] iterate_supers+0x95/0xf0
[<ffffffff8110ea2b>] sync_filesystems+0x1b/0x20
[<ffffffff8110eb8c>] sys_sync+0x1c/0x40
[<ffffffff815f813b>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
  PID TTY      STAT   TIME COMMAND
cat: /proc/4454/stack: Нет такого файла или каталога

disassebbled btrfs_tree_read_lock:

ffffffff813004f0 <btrfs_tree_read_lock> push   %rbp
ffffffff813004f1 <btrfs_tree_read_lock+0x1> mov    %gs:0xb5c0,%rax
ffffffff813004fa <btrfs_tree_read_lock+0xa> mov    %rsp,%rbp
ffffffff813004fd <btrfs_tree_read_lock+0xd> push   %r15
ffffffff813004ff <btrfs_tree_read_lock+0xf> push   %r14
ffffffff81300501 <btrfs_tree_read_lock+0x11> lea    0x6c(%rdi),%r14
ffffffff81300505 <btrfs_tree_read_lock+0x15> push   %r13
ffffffff81300507 <btrfs_tree_read_lock+0x17> lea    0x70(%rdi),%r13
ffffffff8130050b <btrfs_tree_read_lock+0x1b> push   %r12
ffffffff8130050d <btrfs_tree_read_lock+0x1d> lea    -0x60(%rbp),%r12
ffffffff81300511 <btrfs_tree_read_lock+0x21> push   %rbx
ffffffff81300512 <btrfs_tree_read_lock+0x22> lea    0x18(%r12),%r15
ffffffff81300517 <btrfs_tree_read_lock+0x27> lea    0x5c(%rdi),%rbx
ffffffff8130051b <btrfs_tree_read_lock+0x2b> sub    $0x58,%rsp
ffffffff8130051f <btrfs_tree_read_lock+0x2f> mov    %rdi,-0x70(%rbp)
ffffffff81300523 <btrfs_tree_read_lock+0x33> mov    %rax,-0x68(%rbp)
ffffffff81300527 <btrfs_tree_read_lock+0x37> mov    %rax,-0x78(%rbp)
ffffffff8130052b <btrfs_tree_read_lock+0x3b> nopl   0x0(%rax,%rax,1)
ffffffff81300530 <btrfs_tree_read_lock+0x40> mov    (%rbx),%eax
ffffffff81300532 <btrfs_tree_read_lock+0x42> test   %eax,%eax
ffffffff81300534 <btrfs_tree_read_lock+0x44> je     ffffffff8130057e <btrfs_tree_read_lock+0x8e>
ffffffff81300536 <btrfs_tree_read_lock+0x46> mov    -0x68(%rbp),%rax
ffffffff8130053a <btrfs_tree_read_lock+0x4a> movq   $0x0,-0x60(%rbp)
ffffffff81300542 <btrfs_tree_read_lock+0x52> movq   $0xffffffff8105d950,-0x50(%rbp)
ffffffff8130054a <btrfs_tree_read_lock+0x5a> mov    %r15,-0x48(%rbp)
ffffffff8130054e <btrfs_tree_read_lock+0x5e> mov    %r15,-0x40(%rbp)
ffffffff81300552 <btrfs_tree_read_lock+0x62> mov    %rax,-0x58(%rbp)
ffffffff81300556 <btrfs_tree_read_lock+0x66> jmp    ffffffff8130055d <btrfs_tree_read_lock+0x6d>
ffffffff81300558 <btrfs_tree_read_lock+0x68> callq  ffffffff815f4670 <schedule>
ffffffff8130055d <btrfs_tree_read_lock+0x6d> mov    $0x2,%edx
ffffffff81300562 <btrfs_tree_read_lock+0x72> mov    %r12,%rsi
ffffffff81300565 <btrfs_tree_read_lock+0x75> mov    %r13,%rdi
ffffffff81300568 <btrfs_tree_read_lock+0x78> callq  ffffffff8105db90 <prepare_to_wait>
ffffffff8130056d <btrfs_tree_read_lock+0x7d> mov    (%rbx),%eax
ffffffff8130056f <btrfs_tree_read_lock+0x7f> test   %eax,%eax
ffffffff81300571 <btrfs_tree_read_lock+0x81> jne    ffffffff81300558 <btrfs_tree_read_lock+0x68>
ffffffff81300573 <btrfs_tree_read_lock+0x83> mov    %r12,%rsi
ffffffff81300576 <btrfs_tree_read_lock+0x86> mov    %r13,%rdi
ffffffff81300579 <btrfs_tree_read_lock+0x89> callq  ffffffff8105da80 <finish_wait>
ffffffff8130057e <btrfs_tree_read_lock+0x8e> mov    %r14,%rdi
ffffffff81300581 <btrfs_tree_read_lock+0x91> callq  ffffffff815f76d0 <_raw_read_lock>
ffffffff81300586 <btrfs_tree_read_lock+0x96> mov    (%rbx),%eax
ffffffff81300588 <btrfs_tree_read_lock+0x98> test   %eax,%eax
ffffffff8130058a <btrfs_tree_read_lock+0x9a> je     ffffffff813005eb <btrfs_tree_read_lock+0xfb>
ffffffff8130058c <btrfs_tree_read_lock+0x9c> mov    %r14,%rdi
ffffffff8130058f <btrfs_tree_read_lock+0x9f> callq  ffffffff815f73f0 <_raw_read_unlock>
ffffffff81300594 <btrfs_tree_read_lock+0xa4> mov    (%rbx),%eax
ffffffff81300596 <btrfs_tree_read_lock+0xa6> test   %eax,%eax
ffffffff81300598 <btrfs_tree_read_lock+0xa8> je     ffffffff81300530 <btrfs_tree_read_lock+0x40>
ffffffff8130059a <btrfs_tree_read_lock+0xaa> mov    -0x78(%rbp),%rax
ffffffff8130059e <btrfs_tree_read_lock+0xae> movq   $0x0,-0x60(%rbp)
ffffffff813005a6 <btrfs_tree_read_lock+0xb6> movq   $0xffffffff8105d950,-0x50(%rbp)
ffffffff813005ae <btrfs_tree_read_lock+0xbe> mov    %r15,-0x48(%rbp)
ffffffff813005b2 <btrfs_tree_read_lock+0xc2> mov    %r15,-0x40(%rbp)
ffffffff813005b6 <btrfs_tree_read_lock+0xc6> mov    %rax,-0x58(%rbp)
ffffffff813005ba <btrfs_tree_read_lock+0xca> jmp    ffffffff813005c5 <btrfs_tree_read_lock+0xd5>
ffffffff813005bc <btrfs_tree_read_lock+0xcc> nopl   0x0(%rax)
ffffffff813005c0 <btrfs_tree_read_lock+0xd0> callq  ffffffff815f4670 <schedule>
ffffffff813005c5 <btrfs_tree_read_lock+0xd5> mov    $0x2,%edx
ffffffff813005ca <btrfs_tree_read_lock+0xda> mov    %r12,%rsi
ffffffff813005cd <btrfs_tree_read_lock+0xdd> mov    %r13,%rdi
ffffffff813005d0 <btrfs_tree_read_lock+0xe0> callq  ffffffff8105db90 <prepare_to_wait>
ffffffff813005d5 <btrfs_tree_read_lock+0xe5> mov    (%rbx),%eax
ffffffff813005d7 <btrfs_tree_read_lock+0xe7> test   %eax,%eax
ffffffff813005d9 <btrfs_tree_read_lock+0xe9> jne    ffffffff813005c0 <btrfs_tree_read_lock+0xd0>
ffffffff813005db <btrfs_tree_read_lock+0xeb> mov    %r12,%rsi
ffffffff813005de <btrfs_tree_read_lock+0xee> mov    %r13,%rdi
ffffffff813005e1 <btrfs_tree_read_lock+0xf1> callq  ffffffff8105da80 <finish_wait>
ffffffff813005e6 <btrfs_tree_read_lock+0xf6> jmpq   ffffffff81300530 <btrfs_tree_read_lock+0x40>
ffffffff813005eb <btrfs_tree_read_lock+0xfb> mov    -0x70(%rbp),%rax
ffffffff813005ef <btrfs_tree_read_lock+0xff> lock incl 0x58(%rax)
ffffffff813005f3 <btrfs_tree_read_lock+0x103> lock incl 0x64(%rax)
ffffffff813005f7 <btrfs_tree_read_lock+0x107> add    $0x58,%rsp
ffffffff813005fb <btrfs_tree_read_lock+0x10b> pop    %rbx
ffffffff813005fc <btrfs_tree_read_lock+0x10c> pop    %r12
ffffffff813005fe <btrfs_tree_read_lock+0x10e> pop    %r13
ffffffff81300600 <btrfs_tree_read_lock+0x110> pop    %r14
ffffffff81300602 <btrfs_tree_read_lock+0x112> pop    %r15
ffffffff81300604 <btrfs_tree_read_lock+0x114> leaveq 
ffffffff81300605 <btrfs_tree_read_lock+0x115> retq   
ffffffff81300606 <btrfs_tree_read_lock+0x116> nopw   %cs:0x0(%rax,%rax,1)

==The Second Issue (aka "The Busy Looping sync()" case) ==

The box is different from first, so conditions are a bit different.

- it's a _patched_ linus's 3.1.0-rc3-00271-g3230739 (+patch from $SUBJ on top)
- /dev/root on / type btrfs (rw,noatime,autodefrag)
  (note autodefrag!)
- 15% full 594GB filesystem (usual nonmixed mode)

How I hung it:
    $ liferea
    <wait it to calm down. it does a lot of SQLite reads/writes>
    $ sync
    Got CPU is 100% loaded

Perfectly reproducible.
I don't remembed it it exits or hangs. Will send detailed followup today.

-- 

  Sergei

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-30 19:31       ` Sergei Trofimovich
@ 2011-08-30 19:40         ` Josef Bacik
  2011-08-30 19:45           ` Josef Bacik
  0 siblings, 1 reply; 12+ messages in thread
From: Josef Bacik @ 2011-08-30 19:40 UTC (permalink / raw)
  To: Sergei Trofimovich; +Cc: Konstantin Khlebnikov, linux-btrfs, Chris Mason

On 08/30/2011 03:31 PM, Sergei Trofimovich wrote:
> On Tue, 30 Aug 2011 14:02:37 -0400 Josef Bacik <josef@redhat.com>
> wrote:
> 
>> On 08/30/2011 12:53 PM, Sergei Trofimovich wrote:
>>>> Running 'sync' program after the load does not finish and
>>>> eats 100%CPU busy-waiting for something in kernel.
>>>> 
>>>> It's easy to reproduce hang with patch for me. I just run 
>>>> liferea and sync after it. Without patch I haven't managed
>>>> to hang btrfs up.
>>> 
>>> And I think it's another btrfs bug. I've managed to reproduce
>>> it _without_ your patch and _without_ autodefrag enabled by
>>> manually running the following commands: $ btrfs fi defrag 
>>> file-with-20_000-extents $ sync
>>> 
>>> I think your patch just shuffles things a bit and forces 
>>> autodefrag races to pop-up sooner (which is good! :])
>>> 
>> 
>> Sergei, can you do sysrq+w when this is happening, and maybe turn
>> on the softlockup detector so we can see where sync is getting
>> stuck? Thanks,
> 
> Sure. As I keep telling about 2 cases in IRC I will state both here
> explicitely:
> 
> ==The First Issue (aka "The Hung sync()" case) ==
> 
> - it's an unpatched linus's v3.1-rc4-80-g0f43dd5 - /dev/root on /
> type btrfs (rw,noatime,compress=lzo) - 50% full 30GB filesystem
> (usual nonmixed mode)
> 
> How I hung it: $ /usr/sbin/filefrag ~/.bogofilter/wordlist.db 
> /home/st/.bogofilter/wordlist.db: 19070 extents found the file is
> 138MB sqlite database for bayesian SPAM filter, it's being read and
> written every 20 minutes or so. Maybe, it was writtent even in
> defrag/sync time! $~/dev/git/btrfs-progs-unstable/btrfs fi defrag
> ~/.bogofilter/wordlist.db $ sync ^C<hung in D-state>
> 
> I didn't try to reproduce it yet. As for lockdep I'll try but I'm
> afraid I will fail to reproduce, but I'll try tomorrow. I suspect
> I'll need to seriously fragment some file first down to such
> horrible state.
> 
> With help of David I've some (hopefully relevant) info: #!/bin/sh
> -x
> 
> for i in $(ps aux|grep " D[+ ]\?"|awk '{print $2}'); do ps $i sudo
> cat /proc/$i/stack done
> 
> PID TTY      STAT   TIME COMMAND 1291 ?        D      0:00
> [btrfs-endio-wri] [<ffffffff8130055d>]
> btrfs_tree_read_lock+0x6d/0x120 [<ffffffff812b8e88>]
> btrfs_search_slot+0x698/0x8b0 [<ffffffff812c9e18>]
> btrfs_lookup_csum+0x68/0x190 [<ffffffff812ca10f>]
> __btrfs_lookup_bio_sums+0x1cf/0x3e0 [<ffffffff812ca371>]
> btrfs_lookup_bio_sums+0x11/0x20 [<ffffffff812d6a50>]
> btrfs_submit_bio_hook+0x140/0x170 [<ffffffff812ed594>]
> submit_one_bio+0x64/0xa0 [<ffffffff812f14f5>]
> extent_readpages+0xe5/0x100 [<ffffffff812d7aaa>]
> btrfs_readpages+0x1a/0x20 [<ffffffff810a6a02>]
> __do_page_cache_readahead+0x1d2/0x280 [<ffffffff810a6d8c>]
> ra_submit+0x1c/0x20 [<ffffffff810a6ebd>]
> ondemand_readahead+0x12d/0x270 [<ffffffff810a70cc>]
> page_cache_sync_readahead+0x2c/0x40 [<ffffffff81309987>]
> __load_free_space_cache+0x1a7/0x5b0 [<ffffffff81309e61>]
> load_free_space_cache+0xd1/0x190 [<ffffffff812be07b>]
> cache_block_group+0xab/0x290 [<ffffffff812c3def>]
> find_free_extent.clone.71+0x39f/0xab0 [<ffffffff812c5160>]
> btrfs_reserve_extent+0xe0/0x170 [<ffffffff812c56df>]
> btrfs_alloc_free_block+0xcf/0x330 [<ffffffff812b498d>]
> __btrfs_cow_block+0x11d/0x4a0 [<ffffffff812b4df8>]
> btrfs_cow_block+0xe8/0x1a0 [<ffffffff812b8965>]
> btrfs_search_slot+0x175/0x8b0 [<ffffffff812c9e18>]
> btrfs_lookup_csum+0x68/0x190 [<ffffffff812caf6e>]
> btrfs_csum_file_blocks+0xbe/0x670 [<ffffffff812d7d91>]
> add_pending_csums.clone.39+0x41/0x60 [<ffffffff812da528>]
> btrfs_finish_ordered_io+0x218/0x310 [<ffffffff812da635>]
> btrfs_writepage_end_io_hook+0x15/0x20 [<ffffffff8130c71a>]
> end_compressed_bio_write+0x7a/0xe0 [<ffffffff811146f8>]
> bio_endio+0x18/0x30 [<ffffffff812cd8fc>]
> end_workqueue_fn+0xec/0x120 [<ffffffff812fb0ac>]
> worker_loop+0xac/0x520 [<ffffffff8105d486>] kthread+0x96/0xa0 
> [<ffffffff815f9214>] kernel_thread_helper+0x4/0x10 
> [<ffffffffffffffff>] 0xffffffffffffffff

Ok this should have been fixed with

Btrfs: use the commit_root for reading free_space_inode crcs

which is commit # 2cf8572dac62cc2ff7e995173e95b6c694401b3f.  Does your
kernel have this commit?  Because if it does then we did something
wrong.  If not it should be in linus's latest tree, so update and it
should go away (hopefully).  Thanks,

Josef

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-30 19:40         ` Josef Bacik
@ 2011-08-30 19:45           ` Josef Bacik
  2011-08-30 20:46             ` Sergei Trofimovich
  0 siblings, 1 reply; 12+ messages in thread
From: Josef Bacik @ 2011-08-30 19:45 UTC (permalink / raw)
  To: Sergei Trofimovich; +Cc: Konstantin Khlebnikov, linux-btrfs, Chris Mason

On 08/30/2011 03:40 PM, Josef Bacik wrote:
> On 08/30/2011 03:31 PM, Sergei Trofimovich wrote:
>> On Tue, 30 Aug 2011 14:02:37 -0400 Josef Bacik <josef@redhat.com>
>> wrote:
>>
>>> On 08/30/2011 12:53 PM, Sergei Trofimovich wrote:
>>>>> Running 'sync' program after the load does not finish and
>>>>> eats 100%CPU busy-waiting for something in kernel.
>>>>>
>>>>> It's easy to reproduce hang with patch for me. I just run 
>>>>> liferea and sync after it. Without patch I haven't managed
>>>>> to hang btrfs up.
>>>>
>>>> And I think it's another btrfs bug. I've managed to reproduce
>>>> it _without_ your patch and _without_ autodefrag enabled by
>>>> manually running the following commands: $ btrfs fi defrag 
>>>> file-with-20_000-extents $ sync
>>>>
>>>> I think your patch just shuffles things a bit and forces 
>>>> autodefrag races to pop-up sooner (which is good! :])
>>>>
>>>
>>> Sergei, can you do sysrq+w when this is happening, and maybe turn
>>> on the softlockup detector so we can see where sync is getting
>>> stuck? Thanks,
>>
>> Sure. As I keep telling about 2 cases in IRC I will state both here
>> explicitely:
>>
>> ==The First Issue (aka "The Hung sync()" case) ==
>>
>> - it's an unpatched linus's v3.1-rc4-80-g0f43dd5 - /dev/root on /
>> type btrfs (rw,noatime,compress=lzo) - 50% full 30GB filesystem
>> (usual nonmixed mode)
>>
>> How I hung it: $ /usr/sbin/filefrag ~/.bogofilter/wordlist.db 
>> /home/st/.bogofilter/wordlist.db: 19070 extents found the file is
>> 138MB sqlite database for bayesian SPAM filter, it's being read and
>> written every 20 minutes or so. Maybe, it was writtent even in
>> defrag/sync time! $~/dev/git/btrfs-progs-unstable/btrfs fi defrag
>> ~/.bogofilter/wordlist.db $ sync ^C<hung in D-state>
>>
>> I didn't try to reproduce it yet. As for lockdep I'll try but I'm
>> afraid I will fail to reproduce, but I'll try tomorrow. I suspect
>> I'll need to seriously fragment some file first down to such
>> horrible state.
>>
>> With help of David I've some (hopefully relevant) info: #!/bin/sh
>> -x
>>
>> for i in $(ps aux|grep " D[+ ]\?"|awk '{print $2}'); do ps $i sudo
>> cat /proc/$i/stack done
>>
>> PID TTY      STAT   TIME COMMAND 1291 ?        D      0:00
>> [btrfs-endio-wri] [<ffffffff8130055d>]
>> btrfs_tree_read_lock+0x6d/0x120 [<ffffffff812b8e88>]
>> btrfs_search_slot+0x698/0x8b0 [<ffffffff812c9e18>]
>> btrfs_lookup_csum+0x68/0x190 [<ffffffff812ca10f>]
>> __btrfs_lookup_bio_sums+0x1cf/0x3e0 [<ffffffff812ca371>]
>> btrfs_lookup_bio_sums+0x11/0x20 [<ffffffff812d6a50>]
>> btrfs_submit_bio_hook+0x140/0x170 [<ffffffff812ed594>]
>> submit_one_bio+0x64/0xa0 [<ffffffff812f14f5>]
>> extent_readpages+0xe5/0x100 [<ffffffff812d7aaa>]
>> btrfs_readpages+0x1a/0x20 [<ffffffff810a6a02>]
>> __do_page_cache_readahead+0x1d2/0x280 [<ffffffff810a6d8c>]
>> ra_submit+0x1c/0x20 [<ffffffff810a6ebd>]
>> ondemand_readahead+0x12d/0x270 [<ffffffff810a70cc>]
>> page_cache_sync_readahead+0x2c/0x40 [<ffffffff81309987>]
>> __load_free_space_cache+0x1a7/0x5b0 [<ffffffff81309e61>]
>> load_free_space_cache+0xd1/0x190 [<ffffffff812be07b>]
>> cache_block_group+0xab/0x290 [<ffffffff812c3def>]
>> find_free_extent.clone.71+0x39f/0xab0 [<ffffffff812c5160>]
>> btrfs_reserve_extent+0xe0/0x170 [<ffffffff812c56df>]
>> btrfs_alloc_free_block+0xcf/0x330 [<ffffffff812b498d>]
>> __btrfs_cow_block+0x11d/0x4a0 [<ffffffff812b4df8>]
>> btrfs_cow_block+0xe8/0x1a0 [<ffffffff812b8965>]
>> btrfs_search_slot+0x175/0x8b0 [<ffffffff812c9e18>]
>> btrfs_lookup_csum+0x68/0x190 [<ffffffff812caf6e>]
>> btrfs_csum_file_blocks+0xbe/0x670 [<ffffffff812d7d91>]
>> add_pending_csums.clone.39+0x41/0x60 [<ffffffff812da528>]
>> btrfs_finish_ordered_io+0x218/0x310 [<ffffffff812da635>]
>> btrfs_writepage_end_io_hook+0x15/0x20 [<ffffffff8130c71a>]
>> end_compressed_bio_write+0x7a/0xe0 [<ffffffff811146f8>]
>> bio_endio+0x18/0x30 [<ffffffff812cd8fc>]
>> end_workqueue_fn+0xec/0x120 [<ffffffff812fb0ac>]
>> worker_loop+0xac/0x520 [<ffffffff8105d486>] kthread+0x96/0xa0 
>> [<ffffffff815f9214>] kernel_thread_helper+0x4/0x10 
>> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Ok this should have been fixed with
> 
> Btrfs: use the commit_root for reading free_space_inode crcs
> 
> which is commit # 2cf8572dac62cc2ff7e995173e95b6c694401b3f.  Does your
> kernel have this commit?  Because if it does then we did something
> wrong.  If not it should be in linus's latest tree, so update and it
> should go away (hopefully).  Thanks,
> 

Oops looks like that patch won't fix it completely, I just sent another
patch that will fix this problem totally, sorry about that

[PATCH] Btrfs: skip locking if searching the commit root in lookup_csums

Thanks,

Josef

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-30 19:45           ` Josef Bacik
@ 2011-08-30 20:46             ` Sergei Trofimovich
  2011-08-30 21:17               ` Sergei Trofimovich
  0 siblings, 1 reply; 12+ messages in thread
From: Sergei Trofimovich @ 2011-08-30 20:46 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Konstantin Khlebnikov, linux-btrfs, Chris Mason

[-- Attachment #1: Type: text/plain, Size: 5695 bytes --]

> On 08/30/2011 03:40 PM, Josef Bacik wrote:
> > On 08/30/2011 03:31 PM, Sergei Trofimovich wrote:
> >> On Tue, 30 Aug 2011 14:02:37 -0400 Josef Bacik <josef@redhat.com>
> >> wrote:
> >>
> >>> On 08/30/2011 12:53 PM, Sergei Trofimovich wrote:
> >>>>> Running 'sync' program after the load does not finish and
> >>>>> eats 100%CPU busy-waiting for something in kernel.
> >>>>>
> >>>>> It's easy to reproduce hang with patch for me. I just run 
> >>>>> liferea and sync after it. Without patch I haven't managed
> >>>>> to hang btrfs up.
> >>>>
> >>>> And I think it's another btrfs bug. I've managed to reproduce
> >>>> it _without_ your patch and _without_ autodefrag enabled by
> >>>> manually running the following commands: $ btrfs fi defrag 
> >>>> file-with-20_000-extents $ sync
> >>>>
> >>>> I think your patch just shuffles things a bit and forces 
> >>>> autodefrag races to pop-up sooner (which is good! :])
> >>>>
> >>>
> >>> Sergei, can you do sysrq+w when this is happening, and maybe turn
> >>> on the softlockup detector so we can see where sync is getting
> >>> stuck? Thanks,
> >>
> >> Sure. As I keep telling about 2 cases in IRC I will state both here
> >> explicitely:
> >>
> >> ==The First Issue (aka "The Hung sync()" case) ==
> >>
> >> - it's an unpatched linus's v3.1-rc4-80-g0f43dd5 - /dev/root on /
> >> type btrfs (rw,noatime,compress=lzo) - 50% full 30GB filesystem
> >> (usual nonmixed mode)
> >>
> >> How I hung it: $ /usr/sbin/filefrag ~/.bogofilter/wordlist.db 
> >> /home/st/.bogofilter/wordlist.db: 19070 extents found the file is
> >> 138MB sqlite database for bayesian SPAM filter, it's being read and
> >> written every 20 minutes or so. Maybe, it was writtent even in
> >> defrag/sync time! $~/dev/git/btrfs-progs-unstable/btrfs fi defrag
> >> ~/.bogofilter/wordlist.db $ sync ^C<hung in D-state>
> >>
> >> I didn't try to reproduce it yet. As for lockdep I'll try but I'm
> >> afraid I will fail to reproduce, but I'll try tomorrow. I suspect
> >> I'll need to seriously fragment some file first down to such
> >> horrible state.
> >>
> >> With help of David I've some (hopefully relevant) info: #!/bin/sh
> >> -x
> >>
> >> for i in $(ps aux|grep " D[+ ]\?"|awk '{print $2}'); do ps $i sudo
> >> cat /proc/$i/stack done
> >>
> >> PID TTY      STAT   TIME COMMAND 1291 ?        D      0:00
> >> [btrfs-endio-wri] [<ffffffff8130055d>]
> >> btrfs_tree_read_lock+0x6d/0x120 [<ffffffff812b8e88>]
> >> btrfs_search_slot+0x698/0x8b0 [<ffffffff812c9e18>]
> >> btrfs_lookup_csum+0x68/0x190 [<ffffffff812ca10f>]
> >> __btrfs_lookup_bio_sums+0x1cf/0x3e0 [<ffffffff812ca371>]
> >> btrfs_lookup_bio_sums+0x11/0x20 [<ffffffff812d6a50>]
> >> btrfs_submit_bio_hook+0x140/0x170 [<ffffffff812ed594>]
> >> submit_one_bio+0x64/0xa0 [<ffffffff812f14f5>]
> >> extent_readpages+0xe5/0x100 [<ffffffff812d7aaa>]
> >> btrfs_readpages+0x1a/0x20 [<ffffffff810a6a02>]
> >> __do_page_cache_readahead+0x1d2/0x280 [<ffffffff810a6d8c>]
> >> ra_submit+0x1c/0x20 [<ffffffff810a6ebd>]
> >> ondemand_readahead+0x12d/0x270 [<ffffffff810a70cc>]
> >> page_cache_sync_readahead+0x2c/0x40 [<ffffffff81309987>]
> >> __load_free_space_cache+0x1a7/0x5b0 [<ffffffff81309e61>]
> >> load_free_space_cache+0xd1/0x190 [<ffffffff812be07b>]
> >> cache_block_group+0xab/0x290 [<ffffffff812c3def>]
> >> find_free_extent.clone.71+0x39f/0xab0 [<ffffffff812c5160>]
> >> btrfs_reserve_extent+0xe0/0x170 [<ffffffff812c56df>]
> >> btrfs_alloc_free_block+0xcf/0x330 [<ffffffff812b498d>]
> >> __btrfs_cow_block+0x11d/0x4a0 [<ffffffff812b4df8>]
> >> btrfs_cow_block+0xe8/0x1a0 [<ffffffff812b8965>]
> >> btrfs_search_slot+0x175/0x8b0 [<ffffffff812c9e18>]
> >> btrfs_lookup_csum+0x68/0x190 [<ffffffff812caf6e>]
> >> btrfs_csum_file_blocks+0xbe/0x670 [<ffffffff812d7d91>]
> >> add_pending_csums.clone.39+0x41/0x60 [<ffffffff812da528>]
> >> btrfs_finish_ordered_io+0x218/0x310 [<ffffffff812da635>]
> >> btrfs_writepage_end_io_hook+0x15/0x20 [<ffffffff8130c71a>]
> >> end_compressed_bio_write+0x7a/0xe0 [<ffffffff811146f8>]
> >> bio_endio+0x18/0x30 [<ffffffff812cd8fc>]
> >> end_workqueue_fn+0xec/0x120 [<ffffffff812fb0ac>]
> >> worker_loop+0xac/0x520 [<ffffffff8105d486>] kthread+0x96/0xa0 
> >> [<ffffffff815f9214>] kernel_thread_helper+0x4/0x10 
> >> [<ffffffffffffffff>] 0xffffffffffffffff
> > 
> > Ok this should have been fixed with
> > 
> > Btrfs: use the commit_root for reading free_space_inode crcs
> > 
> > which is commit # 2cf8572dac62cc2ff7e995173e95b6c694401b3f.  Does your
> > kernel have this commit?  Because if it does then we did something
> > wrong.  If not it should be in linus's latest tree, so update and it
> > should go away (hopefully).  Thanks,

Yeah, this one was in my local tree when hung.

> Oops looks like that patch won't fix it completely, I just sent another
> patch that will fix this problem totally, sorry about that
> 
> [PATCH] Btrfs: skip locking if searching the commit root in lookup_csums

I'll try to reproduce/test it tomorrow.

About the second one:

> ==The Second Issue (aka "The Busy Looping sync()" case) ==
> The box is different from first, so conditions are a bit different.
> - /dev/root on / type btrfs (rw,noatime,autodefrag)
>   (note autodefrag!)
> - 15% full 594GB filesystem (usual nonmixed mode)
> 
>    $ liferea
>    <wait it to calm down. it does a lot of SQLite reads/writes>
>    $ sync
>    Got CPU is 100% loaded <hung>

Still reproducible with 2 patches above + $SUBJ one. strace says it hangs in
strace() syscall. Stack trace is odd:
# cat /proc/`pidof sync`/stack
[<ffffffffffffffff>] 0xffffffffffffffff


-- 

  Sergei

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-30 20:46             ` Sergei Trofimovich
@ 2011-08-30 21:17               ` Sergei Trofimovich
  2011-09-02 17:01                 ` slyich
  0 siblings, 1 reply; 12+ messages in thread
From: Sergei Trofimovich @ 2011-08-30 21:17 UTC (permalink / raw)
  To: Josef Bacik, Konstantin Khlebnikov; +Cc: linux-btrfs, Chris Mason

[-- Attachment #1: Type: text/plain, Size: 2537 bytes --]

> About the second one:
> 
> > ==The Second Issue (aka "The Busy Looping sync()" case) ==
> > The box is different from first, so conditions are a bit different.
> > - /dev/root on / type btrfs (rw,noatime,autodefrag)
> >   (note autodefrag!)
> > - 15% full 594GB filesystem (usual nonmixed mode)
> > 
> >    $ liferea
> >    <wait it to calm down. it does a lot of SQLite reads/writes>
> >    $ sync
> >    Got CPU is 100% loaded <hung>
> 
> Still reproducible with 2 patches above + $SUBJ one. strace says it hangs in
> strace() syscall. Stack trace is odd:
> # cat /proc/`pidof sync`/stack
> [<ffffffffffffffff>] 0xffffffffffffffff

"got" stack of the looper with such hack:
    $ while :; do sudo echo "-"; cat /proc/2304/stack | fgrep -v '[<ffffffffffffffff>] 0xffffffffffffffff'; done | uniq

[<ffffffff8109d39d>] find_get_pages_tag+0x14d/0x1a0
[<ffffffff81076981>] __lock_acquire+0x5a1/0xbd0
[<ffffffff8109e7bd>] filemap_fdatawait_range+0x9d/0x1b0
[<ffffffff8109e8f2>] filemap_fdatawait+0x22/0x30
[<ffffffff81106a04>] sync_inodes_sb+0x1c4/0x250
[<ffffffff8110b398>] __sync_filesystem+0x78/0x80
[<ffffffff8110b3b7>] sync_one_sb+0x17/0x20
[<ffffffff810e3fbe>] iterate_supers+0x7e/0xe0
[<ffffffff8110b400>] sys_sync+0x40/0x60
[<ffffffff81456afb>] system_call_fastpath+0x16/0x1b
-
[<ffffffff811069f1>] sync_inodes_sb+0x1b1/0x250
[<ffffffff8110b398>] __sync_filesystem+0x78/0x80
[<ffffffff8110b3b7>] sync_one_sb+0x17/0x20
[<ffffffff810e3fbe>] iterate_supers+0x7e/0xe0
[<ffffffff8110b400>] sys_sync+0x40/0x60
[<ffffffff81456afb>] system_call_fastpath+0x16/0x1b
-
[<ffffffff811069f1>] sync_inodes_sb+0x1b1/0x250
[<ffffffff8110b398>] __sync_filesystem+0x78/0x80
[<ffffffff8110b3b7>] sync_one_sb+0x17/0x20
[<ffffffff810e3fbe>] iterate_supers+0x7e/0xe0
[<ffffffff8110b400>] sys_sync+0x40/0x60
[<ffffffff81456afb>] system_call_fastpath+0x16/0x1b
-
[<ffffffff814564b6>] retint_kernel+0x26/0x30
[<ffffffff8109d398>] find_get_pages_tag+0x148/0x1a0
[<ffffffff810a7a60>] pagevec_lookup_tag+0x20/0x30
[<ffffffff8109e7bd>] filemap_fdatawait_range+0x9d/0x1b0
[<ffffffff8109e8f2>] filemap_fdatawait+0x22/0x30
[<ffffffff81106a04>] sync_inodes_sb+0x1c4/0x250
[<ffffffff8110b398>] __sync_filesystem+0x78/0x80
[<ffffffff8110b3b7>] sync_one_sb+0x17/0x20
[<ffffffff810e3fbe>] iterate_supers+0x7e/0xe0
[<ffffffff8110b400>] sys_sync+0x40/0x60
[<ffffffff81456afb>] system_call_fastpath+0x16/0x1b

Looks like dirty inode count (or dirty page count) is affected by
the $SUBJ patch.

-- 

  Sergei

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-17 18:56 [PATCH] btrfs: fix warning in iput for bad-inode Konstantin Khlebnikov
  2011-08-29  3:34 ` Sergei Trofimovich
@ 2011-09-01  2:45 ` David Sterba
  1 sibling, 0 replies; 12+ messages in thread
From: David Sterba @ 2011-09-01  2:45 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-btrfs, Chris Mason, josef

Hi,

I saw something different with your patch. After creating and deleting
lots of subvolumes followed by an umount, there is one unpleasant
interaction with orphans.

After first mount attempt, the mount fails with

# mount /dev/sdb5
mount: Stale NFS file handle

syslog:
[  256.050787] btrfs: could not do orphan cleanup -116
[  256.962781] btrfs: open_ctree failed


the -116 is -ESTALE which your patch adds.

Another mount attempt and success, no relevant message in syslog. Reproducible.


btrfs_orphan_cleanup
  btrfs_iget -> fails

2373                 inode = btrfs_iget(root->fs_info->sb, &found_key, root, NULL);
2374                 if (IS_ERR(inode)) {
2375                         ret = PTR_ERR(inode);
2376                         goto out;
2377                 }
...
2440 out:
2441         if (ret)
2442                 printk(KERN_CRIT "btrfs: could not do orphan cleanup %d\n", ret);
2443         btrfs_free_path(path);
2444         return ret;
2445 }


I'm concerned about leaving some unprocessed orphans "somewhere".


david

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] btrfs: fix warning in iput for bad-inode
  2011-08-30 21:17               ` Sergei Trofimovich
@ 2011-09-02 17:01                 ` slyich
  2011-09-07  9:18                   ` David Sterba
  0 siblings, 1 reply; 12+ messages in thread
From: slyich @ 2011-09-02 17:01 UTC (permalink / raw)
  To: linux-btrfs
  Cc: Chris Mason, Josef Bacik, Sergei Trofimovich, Konstantin Khlebnikov

From: Sergei Trofimovich <slyfox@gentoo.org>

iput() shouldn't be called for inodes in I_NEW state.
We need to mark inode as constructed first.

WARNING: at fs/inode.c:1309 iput+0x20b/0x210()
Call Trace:
 [<ffffffff8103e7ba>] warn_slowpath_common+0x7a/0xb0
 [<ffffffff8103e805>] warn_slowpath_null+0x15/0x20
 [<ffffffff810eaf0b>] iput+0x20b/0x210
 [<ffffffff811b96fb>] btrfs_iget+0x1eb/0x4a0
 [<ffffffff811c3ad6>] btrfs_run_defrag_inodes+0x136/0x210
 [<ffffffff811ad55f>] cleaner_kthread+0x17f/0x1a0
 [<ffffffff81035b7d>] ? sub_preempt_count+0x9d/0xd0
 [<ffffffff811ad3e0>] ? transaction_kthread+0x280/0x280
 [<ffffffff8105af86>] kthread+0x96/0xa0
 [<ffffffff814336d4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8105aef0>] ? kthread_worker_fn+0x190/0x190
 [<ffffffff814336d0>] ? gs_change+0xb/0xb

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
Change since Konstantin's patch:
- avoid using explicit destructor for inode, as it can(?)
  be shared among other threads. Use unlock_new_inode()/iput()
  instead. It fixes 100% CPU load after 'sync()' popped-up in
  Konstantin's patch
 fs/btrfs/inode.c |   10 +++-------
 1 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0ccc743..ec4b550 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3952,7 +3952,6 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 			 struct btrfs_root *root, int *new)
 {
 	struct inode *inode;
-	int bad_inode = 0;
 
 	inode = btrfs_iget_locked(s, location->objectid, root);
 	if (!inode)
@@ -3968,15 +3967,12 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 			if (new)
 				*new = 1;
 		} else {
-			bad_inode = 1;
+			unlock_new_inode(inode);
+			iput(inode);
+			inode = ERR_PTR(-ESTALE);
 		}
 	}
 
-	if (bad_inode) {
-		iput(inode);
-		inode = ERR_PTR(-ESTALE);
-	}
-
 	return inode;
 }
 
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] btrfs: fix warning in iput for bad-inode
  2011-09-02 17:01                 ` slyich
@ 2011-09-07  9:18                   ` David Sterba
  0 siblings, 0 replies; 12+ messages in thread
From: David Sterba @ 2011-09-07  9:18 UTC (permalink / raw)
  To: slyich
  Cc: linux-btrfs, Chris Mason, Josef Bacik, Sergei Trofimovich,
	Konstantin Khlebnikov

On Fri, Sep 02, 2011 at 08:01:21PM +0300, slyich@gmail.com wrote:
> From: Sergei Trofimovich <slyfox@gentoo.org>
> 
> iput() shouldn't be called for inodes in I_NEW state.
> We need to mark inode as constructed first.
> 
> WARNING: at fs/inode.c:1309 iput+0x20b/0x210()
> Call Trace:
>  [<ffffffff8103e7ba>] warn_slowpath_common+0x7a/0xb0
>  [<ffffffff8103e805>] warn_slowpath_null+0x15/0x20
>  [<ffffffff810eaf0b>] iput+0x20b/0x210
>  [<ffffffff811b96fb>] btrfs_iget+0x1eb/0x4a0
>  [<ffffffff811c3ad6>] btrfs_run_defrag_inodes+0x136/0x210
>  [<ffffffff811ad55f>] cleaner_kthread+0x17f/0x1a0
>  [<ffffffff81035b7d>] ? sub_preempt_count+0x9d/0xd0
>  [<ffffffff811ad3e0>] ? transaction_kthread+0x280/0x280
>  [<ffffffff8105af86>] kthread+0x96/0xa0
>  [<ffffffff814336d4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff8105aef0>] ? kthread_worker_fn+0x190/0x190
>  [<ffffffff814336d0>] ? gs_change+0xb/0xb
> 
> Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
> CC: Konstantin Khlebnikov <khlebnikov@openvz.org>

Tested-by: David Sterba <dsterba@suse.cz>

> ---
> Change since Konstantin's patch:
> - avoid using explicit destructor for inode, as it can(?)
>   be shared among other threads. Use unlock_new_inode()/iput()
>   instead. It fixes 100% CPU load after 'sync()' popped-up in
>   Konstantin's patch
>  fs/btrfs/inode.c |   10 +++-------
>  1 files changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 0ccc743..ec4b550 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -3952,7 +3952,6 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
>  			 struct btrfs_root *root, int *new)
>  {
>  	struct inode *inode;
> -	int bad_inode = 0;
>  
>  	inode = btrfs_iget_locked(s, location->objectid, root);
>  	if (!inode)
> @@ -3968,15 +3967,12 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
>  			if (new)
>  				*new = 1;
>  		} else {
> -			bad_inode = 1;
> +			unlock_new_inode(inode);
> +			iput(inode);
> +			inode = ERR_PTR(-ESTALE);
>  		}
>  	}
>  
> -	if (bad_inode) {
> -		iput(inode);
> -		inode = ERR_PTR(-ESTALE);
> -	}
> -
>  	return inode;
>  }
>  
> -- 
> 1.7.3.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-09-07  9:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-17 18:56 [PATCH] btrfs: fix warning in iput for bad-inode Konstantin Khlebnikov
2011-08-29  3:34 ` Sergei Trofimovich
2011-08-30 16:53   ` Sergei Trofimovich
2011-08-30 18:02     ` Josef Bacik
2011-08-30 19:31       ` Sergei Trofimovich
2011-08-30 19:40         ` Josef Bacik
2011-08-30 19:45           ` Josef Bacik
2011-08-30 20:46             ` Sergei Trofimovich
2011-08-30 21:17               ` Sergei Trofimovich
2011-09-02 17:01                 ` slyich
2011-09-07  9:18                   ` David Sterba
2011-09-01  2:45 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.