linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/5] Btrfs: fix hang when loading existing inode cache off disk
@ 2019-07-04 15:24 fdmanana
  2019-07-05  9:09 ` Nikolay Borisov
  0 siblings, 1 reply; 3+ messages in thread
From: fdmanana @ 2019-07-04 15:24 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

If we are able to load an existing inode cache off disk, we set the state
of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting
for the cache to be available. This means that anyone waiting for the
cache to be available, waiting on the condition that either its state is
BTRFS_CACHE_FINISHED or its available free space is greather than zero,
can hang forever.

This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache",
in particular test case generic/161 triggered it very frequently for me,
producing a trace like the following:

  [63795.739712] BTRFS info (device sdc): enabling inode map caching
  [63795.739714] BTRFS info (device sdc): disk space caching is enabled
  [63795.739716] BTRFS info (device sdc): has skinny extents
  [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds.
  [64036.654079]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
  [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [64036.654232] btrfs-transacti D    0  3917      2 0x80004000
  [64036.654239] Call Trace:
  [64036.654258]  ? __schedule+0x3ae/0x7b0
  [64036.654271]  schedule+0x3a/0xb0
  [64036.654325]  btrfs_commit_transaction+0x978/0xae0 [btrfs]
  [64036.654339]  ? remove_wait_queue+0x60/0x60
  [64036.654395]  transaction_kthread+0x146/0x180 [btrfs]
  [64036.654450]  ? btrfs_cleanup_transaction+0x620/0x620 [btrfs]
  [64036.654456]  kthread+0x103/0x140
  [64036.654464]  ? kthread_create_worker_on_cpu+0x70/0x70
  [64036.654476]  ret_from_fork+0x3a/0x50
  [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds.
  [64036.654568]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
  [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [64036.654685] xfs_io          D    0  3919   3633 0x00000000
  [64036.654691] Call Trace:
  [64036.654703]  ? __schedule+0x3ae/0x7b0
  [64036.654716]  schedule+0x3a/0xb0
  [64036.654756]  btrfs_find_free_ino+0xa9/0x120 [btrfs]
  [64036.654764]  ? remove_wait_queue+0x60/0x60
  [64036.654809]  btrfs_create+0x72/0x1f0 [btrfs]
  [64036.654822]  lookup_open+0x6bc/0x790
  [64036.654849]  path_openat+0x3bc/0xc00
  [64036.654854]  ? __lock_acquire+0x331/0x1cb0
  [64036.654869]  do_filp_open+0x99/0x110
  [64036.654884]  ? __alloc_fd+0xee/0x200
  [64036.654895]  ? do_raw_spin_unlock+0x49/0xc0
  [64036.654909]  ? do_sys_open+0x132/0x220
  [64036.654913]  do_sys_open+0x132/0x220
  [64036.654926]  do_syscall_64+0x60/0x1d0
  [64036.654933]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fix this by adding a wake_up() call right after setting the cache state to
BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the
cache from disk.

Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/inode-map.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index ffca2abf13d0..4a5882665f8a 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -145,6 +145,7 @@ static void start_caching(struct btrfs_root *root)
 		spin_lock(&root->ino_cache_lock);
 		root->ino_cache_state = BTRFS_CACHE_FINISHED;
 		spin_unlock(&root->ino_cache_lock);
+		wake_up(&root->ino_cache_wait);
 		return;
 	}
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/5] Btrfs: fix hang when loading existing inode cache off disk
  2019-07-04 15:24 [PATCH 1/5] Btrfs: fix hang when loading existing inode cache off disk fdmanana
@ 2019-07-05  9:09 ` Nikolay Borisov
  2019-07-05 10:44   ` Filipe Manana
  0 siblings, 1 reply; 3+ messages in thread
From: Nikolay Borisov @ 2019-07-05  9:09 UTC (permalink / raw)
  To: fdmanana, linux-btrfs



On 4.07.19 г. 18:24 ч., fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If we are able to load an existing inode cache off disk, we set the state
> of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting
> for the cache to be available. This means that anyone waiting for the
> cache to be available, waiting on the condition that either its state is
> BTRFS_CACHE_FINISHED or its available free space is greather than zero,
> can hang forever.
> 
> This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache",
> in particular test case generic/161 triggered it very frequently for me,
> producing a trace like the following:
> 
>   [63795.739712] BTRFS info (device sdc): enabling inode map caching
>   [63795.739714] BTRFS info (device sdc): disk space caching is enabled
>   [63795.739716] BTRFS info (device sdc): has skinny extents
>   [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds.
>   [64036.654079]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
>   [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>   [64036.654232] btrfs-transacti D    0  3917      2 0x80004000
>   [64036.654239] Call Trace:
>   [64036.654258]  ? __schedule+0x3ae/0x7b0
>   [64036.654271]  schedule+0x3a/0xb0
>   [64036.654325]  btrfs_commit_transaction+0x978/0xae0 [btrfs]
>   [64036.654339]  ? remove_wait_queue+0x60/0x60
>   [64036.654395]  transaction_kthread+0x146/0x180 [btrfs]
>   [64036.654450]  ? btrfs_cleanup_transaction+0x620/0x620 [btrfs]
>   [64036.654456]  kthread+0x103/0x140
>   [64036.654464]  ? kthread_create_worker_on_cpu+0x70/0x70
>   [64036.654476]  ret_from_fork+0x3a/0x50
>   [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds.
>   [64036.654568]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
>   [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>   [64036.654685] xfs_io          D    0  3919   3633 0x00000000
>   [64036.654691] Call Trace:
>   [64036.654703]  ? __schedule+0x3ae/0x7b0
>   [64036.654716]  schedule+0x3a/0xb0
>   [64036.654756]  btrfs_find_free_ino+0xa9/0x120 [btrfs]
>   [64036.654764]  ? remove_wait_queue+0x60/0x60
>   [64036.654809]  btrfs_create+0x72/0x1f0 [btrfs]
>   [64036.654822]  lookup_open+0x6bc/0x790
>   [64036.654849]  path_openat+0x3bc/0xc00
>   [64036.654854]  ? __lock_acquire+0x331/0x1cb0
>   [64036.654869]  do_filp_open+0x99/0x110
>   [64036.654884]  ? __alloc_fd+0xee/0x200
>   [64036.654895]  ? do_raw_spin_unlock+0x49/0xc0
>   [64036.654909]  ? do_sys_open+0x132/0x220
>   [64036.654913]  do_sys_open+0x132/0x220
>   [64036.654926]  do_syscall_64+0x60/0x1d0
>   [64036.654933]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> Fix this by adding a wake_up() call right after setting the cache state to
> BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the
> cache from disk.
> 
> Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
>  fs/btrfs/inode-map.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> index ffca2abf13d0..4a5882665f8a 100644
> --- a/fs/btrfs/inode-map.c
> +++ b/fs/btrfs/inode-map.c
> @@ -145,6 +145,7 @@ static void start_caching(struct btrfs_root *root)
>  		spin_lock(&root->ino_cache_lock);
>  		root->ino_cache_state = BTRFS_CACHE_FINISHED;
>  		spin_unlock(&root->ino_cache_lock);
> +		wake_up(&root->ino_cache_wait);

One of the two callers of start_caching doesn't actually wait for the
cache to load - btrfs_return_ino. Is this expected or is it also a bug?

The presence of such a glaring omission of the wake up means this code
hasn't been tested much.

>  		return;
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/5] Btrfs: fix hang when loading existing inode cache off disk
  2019-07-05  9:09 ` Nikolay Borisov
@ 2019-07-05 10:44   ` Filipe Manana
  0 siblings, 0 replies; 3+ messages in thread
From: Filipe Manana @ 2019-07-05 10:44 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs

On Fri, Jul 5, 2019 at 10:09 AM Nikolay Borisov <nborisov@suse.com> wrote:
>
>
>
> On 4.07.19 г. 18:24 ч., fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > If we are able to load an existing inode cache off disk, we set the state
> > of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting
> > for the cache to be available. This means that anyone waiting for the
> > cache to be available, waiting on the condition that either its state is
> > BTRFS_CACHE_FINISHED or its available free space is greather than zero,
> > can hang forever.
> >
> > This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache",
> > in particular test case generic/161 triggered it very frequently for me,
> > producing a trace like the following:
> >
> >   [63795.739712] BTRFS info (device sdc): enabling inode map caching
> >   [63795.739714] BTRFS info (device sdc): disk space caching is enabled
> >   [63795.739716] BTRFS info (device sdc): has skinny extents
> >   [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds.
> >   [64036.654079]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
> >   [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >   [64036.654232] btrfs-transacti D    0  3917      2 0x80004000
> >   [64036.654239] Call Trace:
> >   [64036.654258]  ? __schedule+0x3ae/0x7b0
> >   [64036.654271]  schedule+0x3a/0xb0
> >   [64036.654325]  btrfs_commit_transaction+0x978/0xae0 [btrfs]
> >   [64036.654339]  ? remove_wait_queue+0x60/0x60
> >   [64036.654395]  transaction_kthread+0x146/0x180 [btrfs]
> >   [64036.654450]  ? btrfs_cleanup_transaction+0x620/0x620 [btrfs]
> >   [64036.654456]  kthread+0x103/0x140
> >   [64036.654464]  ? kthread_create_worker_on_cpu+0x70/0x70
> >   [64036.654476]  ret_from_fork+0x3a/0x50
> >   [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds.
> >   [64036.654568]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
> >   [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >   [64036.654685] xfs_io          D    0  3919   3633 0x00000000
> >   [64036.654691] Call Trace:
> >   [64036.654703]  ? __schedule+0x3ae/0x7b0
> >   [64036.654716]  schedule+0x3a/0xb0
> >   [64036.654756]  btrfs_find_free_ino+0xa9/0x120 [btrfs]
> >   [64036.654764]  ? remove_wait_queue+0x60/0x60
> >   [64036.654809]  btrfs_create+0x72/0x1f0 [btrfs]
> >   [64036.654822]  lookup_open+0x6bc/0x790
> >   [64036.654849]  path_openat+0x3bc/0xc00
> >   [64036.654854]  ? __lock_acquire+0x331/0x1cb0
> >   [64036.654869]  do_filp_open+0x99/0x110
> >   [64036.654884]  ? __alloc_fd+0xee/0x200
> >   [64036.654895]  ? do_raw_spin_unlock+0x49/0xc0
> >   [64036.654909]  ? do_sys_open+0x132/0x220
> >   [64036.654913]  do_sys_open+0x132/0x220
> >   [64036.654926]  do_syscall_64+0x60/0x1d0
> >   [64036.654933]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > Fix this by adding a wake_up() call right after setting the cache state to
> > BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the
> > cache from disk.
> >
> > Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache")
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> > ---
> >  fs/btrfs/inode-map.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> > index ffca2abf13d0..4a5882665f8a 100644
> > --- a/fs/btrfs/inode-map.c
> > +++ b/fs/btrfs/inode-map.c
> > @@ -145,6 +145,7 @@ static void start_caching(struct btrfs_root *root)
> >               spin_lock(&root->ino_cache_lock);
> >               root->ino_cache_state = BTRFS_CACHE_FINISHED;
> >               spin_unlock(&root->ino_cache_lock);
> > +             wake_up(&root->ino_cache_wait);
>
> One of the two callers of start_caching doesn't actually wait for the
> cache to load - btrfs_return_ino. Is this expected or is it also a bug?

It's expected. It doesn't need to allocate an inode, so it doesn't
need to wait for the caching to complete - it just wants to mark an
inode as free in the cache.

>
> The presence of such a glaring omission of the wake up means this code
> hasn't been tested much.

It hasn't. Last time I tried it was more than one year ago, maybe two.

Thanks.

>
> >               return;
> >       }
> >
> >

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-07-05 10:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-04 15:24 [PATCH 1/5] Btrfs: fix hang when loading existing inode cache off disk fdmanana
2019-07-05  9:09 ` Nikolay Borisov
2019-07-05 10:44   ` Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).