[PATCH] Btrfs: fix list_add corruption and soft lockups in fsync

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
@ 2017-11-13 17:57 Liu Bo
  2017-11-20 17:34 ` David Sterba
  2017-11-21 21:35 ` [PATCH v2] " Liu Bo
  0 siblings, 2 replies; 6+ messages in thread
From: Liu Bo @ 2017-11-13 17:57 UTC (permalink / raw)
  To: linux-btrfs

Xfstests btrfs/146 revealed this corruption,

[   58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
[   58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[   58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
[   58.153518] ------------[ cut here ]------------
[   58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
...
[   58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
...
[   58.161956] Call Trace:
[   58.162264]  btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
[   58.163583]  btrfs_log_dentry_safe+0x60/0x80 [btrfs]
[   58.164003]  btrfs_sync_file+0x4c2/0x6f0 [btrfs]
[   58.164393]  vfs_fsync_range+0x5f/0xd0
[   58.164898]  do_fsync+0x5a/0x90
[   58.165170]  SyS_fsync+0x10/0x20
[   58.165395]  entry_SYSCALL_64_fastpath+0x1f/0xbe
...

It turns out that we could record btrfs_log_ctx:io_err in
log_one_extents when IO fails, but make log_one_extents() return '0'
instead of -EIO, so the IO error is not acknowledged by the callers,
i.e.  btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
from list head 'root->log_ctxs'.  Since btrfs_log_ctx is allocated
from stack memory, it'd get freed with a object alive on the
list. then a future list_add will throw the above warning.

This returns the correct error in the above case.

Jeff also reported this while testing against his fsync error
patch set[1].

[1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
"btrfs list corruption and soft lockups while testing writeback error handling"

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/file.c     | 1 +
 fs/btrfs/tree-log.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 063180b..db70eaa 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	if (ctx.io_err) {
 		btrfs_end_transaction(trans);
 		ret = ctx.io_err;
+		ASSERT(list_empty(&ctx.list));
 		goto out;
 	}
 
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index c800d06..d300284 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
 
 	if (ordered_io_err) {
 		ctx->io_err = -EIO;
-		return 0;
+		return ctx->io_err;
 	}
 
 	btrfs_init_map_token(&token);
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
  2017-11-13 17:57 [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync Liu Bo
@ 2017-11-20 17:34 ` David Sterba
  2017-11-20 17:37   ` Liu Bo
  2017-11-21 21:35 ` [PATCH v2] " Liu Bo
  1 sibling, 1 reply; 6+ messages in thread
From: David Sterba @ 2017-11-20 17:34 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Mon, Nov 13, 2017 at 10:57:24AM -0700, Liu Bo wrote:
> Xfstests btrfs/146 revealed this corruption,
> 
> [   58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
> [   58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> [   58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
> [   58.153518] ------------[ cut here ]------------
> [   58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
> ...
> [   58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
> ...
> [   58.161956] Call Trace:
> [   58.162264]  btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
> [   58.163583]  btrfs_log_dentry_safe+0x60/0x80 [btrfs]
> [   58.164003]  btrfs_sync_file+0x4c2/0x6f0 [btrfs]
> [   58.164393]  vfs_fsync_range+0x5f/0xd0
> [   58.164898]  do_fsync+0x5a/0x90
> [   58.165170]  SyS_fsync+0x10/0x20
> [   58.165395]  entry_SYSCALL_64_fastpath+0x1f/0xbe
> ...
> 
> It turns out that we could record btrfs_log_ctx:io_err in
> log_one_extents when IO fails, but make log_one_extents() return '0'
> instead of -EIO, so the IO error is not acknowledged by the callers,
> i.e.  btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
> from list head 'root->log_ctxs'.  Since btrfs_log_ctx is allocated
> from stack memory, it'd get freed with a object alive on the
> list. then a future list_add will throw the above warning.
> 
> This returns the correct error in the above case.
> 
> Jeff also reported this while testing against his fsync error
> patch set[1].
> 
> [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> "btrfs list corruption and soft lockups while testing writeback error handling"
> 

Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")

you can also add the commit author to CC.

> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/file.c     | 1 +
>  fs/btrfs/tree-log.c | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 063180b..db70eaa 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
>  	if (ctx.io_err) {
>  		btrfs_end_transaction(trans);
>  		ret = ctx.io_err;
> +		ASSERT(list_empty(&ctx.list));

Please move that to the label 'out', so all exit paths can catch the
problem.

>  		goto out;
>  	}
>  
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index c800d06..d300284 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
>  
>  	if (ordered_io_err) {
>  		ctx->io_err = -EIO;
> -		return 0;
> +		return ctx->io_err;
>  	}
>  
>  	btrfs_init_map_token(&token);
> -- 
> 2.9.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
  2017-11-20 17:34 ` David Sterba
@ 2017-11-20 17:37   ` Liu Bo
  2017-11-21 14:00     ` David Sterba
  0 siblings, 1 reply; 6+ messages in thread
From: Liu Bo @ 2017-11-20 17:37 UTC (permalink / raw)
  To: dsterba, linux-btrfs

On Mon, Nov 20, 2017 at 06:34:34PM +0100, David Sterba wrote:
> On Mon, Nov 13, 2017 at 10:57:24AM -0700, Liu Bo wrote:
> > Xfstests btrfs/146 revealed this corruption,
> > 
> > [   58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
> > [   58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> > [   58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
> > [   58.153518] ------------[ cut here ]------------
> > [   58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
> > ...
> > [   58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
> > ...
> > [   58.161956] Call Trace:
> > [   58.162264]  btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
> > [   58.163583]  btrfs_log_dentry_safe+0x60/0x80 [btrfs]
> > [   58.164003]  btrfs_sync_file+0x4c2/0x6f0 [btrfs]
> > [   58.164393]  vfs_fsync_range+0x5f/0xd0
> > [   58.164898]  do_fsync+0x5a/0x90
> > [   58.165170]  SyS_fsync+0x10/0x20
> > [   58.165395]  entry_SYSCALL_64_fastpath+0x1f/0xbe
> > ...
> > 
> > It turns out that we could record btrfs_log_ctx:io_err in
> > log_one_extents when IO fails, but make log_one_extents() return '0'
> > instead of -EIO, so the IO error is not acknowledged by the callers,
> > i.e.  btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
> > from list head 'root->log_ctxs'.  Since btrfs_log_ctx is allocated
> > from stack memory, it'd get freed with a object alive on the
> > list. then a future list_add will throw the above warning.
> > 
> > This returns the correct error in the above case.
> > 
> > Jeff also reported this while testing against his fsync error
> > patch set[1].
> > 
> > [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> > "btrfs list corruption and soft lockups while testing writeback error handling"
> > 
> 
> Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
> 
> you can also add the commit author to CC.

Hmm, somehow I thought ctx->list hasn't been added at that time, but
looks like you're right.

> 
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  fs/btrfs/file.c     | 1 +
> >  fs/btrfs/tree-log.c | 2 +-
> >  2 files changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 063180b..db70eaa 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
> >  	if (ctx.io_err) {
> >  		btrfs_end_transaction(trans);
> >  		ret = ctx.io_err;
> > +		ASSERT(list_empty(&ctx.list));
> 
> Please move that to the label 'out', so all exit paths can catch the
> problem.
>

'out' can't be used as we can goto 'out' before ctx is initialized,
I'll add a new label 'out_ctx' in a separate patch (Feel free to fold
them into one if you prefer).

Thanks,

-liubo
> >  		goto out;
> >  	}
> >  
> > diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> > index c800d06..d300284 100644
> > --- a/fs/btrfs/tree-log.c
> > +++ b/fs/btrfs/tree-log.c
> > @@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
> >  
> >  	if (ordered_io_err) {
> >  		ctx->io_err = -EIO;
> > -		return 0;
> > +		return ctx->io_err;
> >  	}
> >  
> >  	btrfs_init_map_token(&token);
> > -- 
> > 2.9.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
  2017-11-20 17:37   ` Liu Bo
@ 2017-11-21 14:00     ` David Sterba
  0 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2017-11-21 14:00 UTC (permalink / raw)
  To: Liu Bo; +Cc: dsterba, linux-btrfs

On Mon, Nov 20, 2017 at 10:37:25AM -0700, Liu Bo wrote:
> > > [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> > > "btrfs list corruption and soft lockups while testing writeback error handling"
> > > 
> > 
> > Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
> > 
> > you can also add the commit author to CC.
> 
> Hmm, somehow I thought ctx->list hasn't been added at that time, but
> looks like you're right.

The commit added

+       if (ordered_io_err) {
+               ctx->io_err = -EIO;
+               return 0;
+       }
+

which you fix.

> > > --- a/fs/btrfs/file.c
> > > +++ b/fs/btrfs/file.c
> > > @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
> > >  	if (ctx.io_err) {
> > >  		btrfs_end_transaction(trans);
> > >  		ret = ctx.io_err;
> > > +		ASSERT(list_empty(&ctx.list));
> > 
> > Please move that to the label 'out', so all exit paths can catch the
> > problem.
> >
> 
> 'out' can't be used as we can goto 'out' before ctx is initialized,
> I'll add a new label 'out_ctx' in a separate patch (Feel free to fold
> them into one if you prefer).

Would be better to initialize ctx->list at the beginning and then always do the
list_empty check, instead of selectively jumping to out or out_ctx.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] Btrfs: fix list_add corruption and soft lockups in fsync
  2017-11-13 17:57 [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync Liu Bo
  2017-11-20 17:34 ` David Sterba
@ 2017-11-21 21:35 ` Liu Bo
  2017-11-27 16:41   ` David Sterba
  1 sibling, 1 reply; 6+ messages in thread
From: Liu Bo @ 2017-11-21 21:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

Xfstests btrfs/146 revealed this corruption,

[   58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
[   58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[   58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
[   58.153518] ------------[ cut here ]------------
[   58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
...
[   58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
...
[   58.161956] Call Trace:
[   58.162264]  btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
[   58.163583]  btrfs_log_dentry_safe+0x60/0x80 [btrfs]
[   58.164003]  btrfs_sync_file+0x4c2/0x6f0 [btrfs]
[   58.164393]  vfs_fsync_range+0x5f/0xd0
[   58.164898]  do_fsync+0x5a/0x90
[   58.165170]  SyS_fsync+0x10/0x20
[   58.165395]  entry_SYSCALL_64_fastpath+0x1f/0xbe
...

It turns out that we could record btrfs_log_ctx:io_err in
log_one_extents when IO fails, but make log_one_extents() return '0'
instead of -EIO, so the IO error is not acknowledged by the callers,
i.e.  btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
from list head 'root->log_ctxs'.  Since btrfs_log_ctx is allocated
from stack memory, it'd get freed with a object alive on the
list. then a future list_add will throw the above warning.

This returns the correct error in the above case.

Jeff also reported this while testing against his fsync error
patch set[1].

[1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
"btrfs list corruption and soft lockups while testing writeback error handling"

Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2:
- Initialize ctx at the very beginning of fsync so that we can check
  if list is empty at out label

 fs/btrfs/file.c     | 5 +++--
 fs/btrfs/tree-log.c | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 504e96d..1e6956b 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2066,6 +2066,8 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	len = (u64)end - (u64)start + 1;
 	trace_btrfs_sync_file(file, datasync);
 
+	btrfs_init_log_ctx(&ctx, inode);
+
 	/*
 	 * We write the dirty pages in the range and wait until they complete
 	 * out of the ->i_mutex. If so, we can flush the dirty pages by
@@ -2212,8 +2214,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	}
 	trans->sync = true;
 
-	btrfs_init_log_ctx(&ctx, inode);
-
 	ret = btrfs_log_dentry_safe(trans, root, dentry, start, end, &ctx);
 	if (ret < 0) {
 		/* Fallthrough and commit/free transaction. */
@@ -2271,6 +2271,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 		ret = btrfs_end_transaction(trans);
 	}
 out:
+	ASSERT(list_empty(&ctx.list));
 	err = file_check_and_advance_wb_err(file);
 	if (!ret)
 		ret = err;
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index c800d06..d300284 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
 
 	if (ordered_io_err) {
 		ctx->io_err = -EIO;
-		return 0;
+		return ctx->io_err;
 	}
 
 	btrfs_init_map_token(&token);
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] Btrfs: fix list_add corruption and soft lockups in fsync
  2017-11-21 21:35 ` [PATCH v2] " Liu Bo
@ 2017-11-27 16:41   ` David Sterba
  0 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2017-11-27 16:41 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs, Filipe Manana

On Tue, Nov 21, 2017 at 02:35:40PM -0700, Liu Bo wrote:
> Xfstests btrfs/146 revealed this corruption,
> 
> [   58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
> [   58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> [   58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
> [   58.153518] ------------[ cut here ]------------
> [   58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
> ...
> [   58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
> ...
> [   58.161956] Call Trace:
> [   58.162264]  btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
> [   58.163583]  btrfs_log_dentry_safe+0x60/0x80 [btrfs]
> [   58.164003]  btrfs_sync_file+0x4c2/0x6f0 [btrfs]
> [   58.164393]  vfs_fsync_range+0x5f/0xd0
> [   58.164898]  do_fsync+0x5a/0x90
> [   58.165170]  SyS_fsync+0x10/0x20
> [   58.165395]  entry_SYSCALL_64_fastpath+0x1f/0xbe
> ...
> 
> It turns out that we could record btrfs_log_ctx:io_err in
> log_one_extents when IO fails, but make log_one_extents() return '0'
> instead of -EIO, so the IO error is not acknowledged by the callers,
> i.e.  btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
> from list head 'root->log_ctxs'.  Since btrfs_log_ctx is allocated
> from stack memory, it'd get freed with a object alive on the
> list. then a future list_add will throw the above warning.
> 
> This returns the correct error in the above case.
> 
> Jeff also reported this while testing against his fsync error
> patch set[1].
> 
> [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> "btrfs list corruption and soft lockups while testing writeback error handling"
> 
> Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: David Sterba <dsterba@suse.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-11-27 16:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-13 17:57 [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync Liu Bo
2017-11-20 17:34 ` David Sterba
2017-11-20 17:37   ` Liu Bo
2017-11-21 14:00     ` David Sterba
2017-11-21 21:35 ` [PATCH v2] " Liu Bo
2017-11-27 16:41   ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.