* [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
@ 2017-11-13 17:57 Liu Bo
2017-11-20 17:34 ` David Sterba
2017-11-21 21:35 ` [PATCH v2] " Liu Bo
0 siblings, 2 replies; 6+ messages in thread
From: Liu Bo @ 2017-11-13 17:57 UTC (permalink / raw)
To: linux-btrfs
Xfstests btrfs/146 revealed this corruption,
[ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
[ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
[ 58.153518] ------------[ cut here ]------------
[ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
...
[ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
...
[ 58.161956] Call Trace:
[ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
[ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
[ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs]
[ 58.164393] vfs_fsync_range+0x5f/0xd0
[ 58.164898] do_fsync+0x5a/0x90
[ 58.165170] SyS_fsync+0x10/0x20
[ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe
...
It turns out that we could record btrfs_log_ctx:io_err in
log_one_extents when IO fails, but make log_one_extents() return '0'
instead of -EIO, so the IO error is not acknowledged by the callers,
i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated
from stack memory, it'd get freed with a object alive on the
list. then a future list_add will throw the above warning.
This returns the correct error in the above case.
Jeff also reported this while testing against his fsync error
patch set[1].
[1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
"btrfs list corruption and soft lockups while testing writeback error handling"
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
fs/btrfs/file.c | 1 +
fs/btrfs/tree-log.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 063180b..db70eaa 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
if (ctx.io_err) {
btrfs_end_transaction(trans);
ret = ctx.io_err;
+ ASSERT(list_empty(&ctx.list));
goto out;
}
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index c800d06..d300284 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
if (ordered_io_err) {
ctx->io_err = -EIO;
- return 0;
+ return ctx->io_err;
}
btrfs_init_map_token(&token);
--
2.9.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
2017-11-13 17:57 [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync Liu Bo
@ 2017-11-20 17:34 ` David Sterba
2017-11-20 17:37 ` Liu Bo
2017-11-21 21:35 ` [PATCH v2] " Liu Bo
1 sibling, 1 reply; 6+ messages in thread
From: David Sterba @ 2017-11-20 17:34 UTC (permalink / raw)
To: Liu Bo; +Cc: linux-btrfs
On Mon, Nov 13, 2017 at 10:57:24AM -0700, Liu Bo wrote:
> Xfstests btrfs/146 revealed this corruption,
>
> [ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
> [ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> [ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
> [ 58.153518] ------------[ cut here ]------------
> [ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
> ...
> [ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
> ...
> [ 58.161956] Call Trace:
> [ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
> [ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
> [ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs]
> [ 58.164393] vfs_fsync_range+0x5f/0xd0
> [ 58.164898] do_fsync+0x5a/0x90
> [ 58.165170] SyS_fsync+0x10/0x20
> [ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe
> ...
>
> It turns out that we could record btrfs_log_ctx:io_err in
> log_one_extents when IO fails, but make log_one_extents() return '0'
> instead of -EIO, so the IO error is not acknowledged by the callers,
> i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
> from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated
> from stack memory, it'd get freed with a object alive on the
> list. then a future list_add will throw the above warning.
>
> This returns the correct error in the above case.
>
> Jeff also reported this while testing against his fsync error
> patch set[1].
>
> [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> "btrfs list corruption and soft lockups while testing writeback error handling"
>
Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
you can also add the commit author to CC.
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
> fs/btrfs/file.c | 1 +
> fs/btrfs/tree-log.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 063180b..db70eaa 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
> if (ctx.io_err) {
> btrfs_end_transaction(trans);
> ret = ctx.io_err;
> + ASSERT(list_empty(&ctx.list));
Please move that to the label 'out', so all exit paths can catch the
problem.
> goto out;
> }
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index c800d06..d300284 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
>
> if (ordered_io_err) {
> ctx->io_err = -EIO;
> - return 0;
> + return ctx->io_err;
> }
>
> btrfs_init_map_token(&token);
> --
> 2.9.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
2017-11-20 17:34 ` David Sterba
@ 2017-11-20 17:37 ` Liu Bo
2017-11-21 14:00 ` David Sterba
0 siblings, 1 reply; 6+ messages in thread
From: Liu Bo @ 2017-11-20 17:37 UTC (permalink / raw)
To: dsterba, linux-btrfs
On Mon, Nov 20, 2017 at 06:34:34PM +0100, David Sterba wrote:
> On Mon, Nov 13, 2017 at 10:57:24AM -0700, Liu Bo wrote:
> > Xfstests btrfs/146 revealed this corruption,
> >
> > [ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
> > [ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> > [ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
> > [ 58.153518] ------------[ cut here ]------------
> > [ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
> > ...
> > [ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
> > ...
> > [ 58.161956] Call Trace:
> > [ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
> > [ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
> > [ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs]
> > [ 58.164393] vfs_fsync_range+0x5f/0xd0
> > [ 58.164898] do_fsync+0x5a/0x90
> > [ 58.165170] SyS_fsync+0x10/0x20
> > [ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe
> > ...
> >
> > It turns out that we could record btrfs_log_ctx:io_err in
> > log_one_extents when IO fails, but make log_one_extents() return '0'
> > instead of -EIO, so the IO error is not acknowledged by the callers,
> > i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
> > from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated
> > from stack memory, it'd get freed with a object alive on the
> > list. then a future list_add will throw the above warning.
> >
> > This returns the correct error in the above case.
> >
> > Jeff also reported this while testing against his fsync error
> > patch set[1].
> >
> > [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> > "btrfs list corruption and soft lockups while testing writeback error handling"
> >
>
> Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
>
> you can also add the commit author to CC.
Hmm, somehow I thought ctx->list hasn't been added at that time, but
looks like you're right.
>
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> > fs/btrfs/file.c | 1 +
> > fs/btrfs/tree-log.c | 2 +-
> > 2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 063180b..db70eaa 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
> > if (ctx.io_err) {
> > btrfs_end_transaction(trans);
> > ret = ctx.io_err;
> > + ASSERT(list_empty(&ctx.list));
>
> Please move that to the label 'out', so all exit paths can catch the
> problem.
>
'out' can't be used as we can goto 'out' before ctx is initialized,
I'll add a new label 'out_ctx' in a separate patch (Feel free to fold
them into one if you prefer).
Thanks,
-liubo
> > goto out;
> > }
> >
> > diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> > index c800d06..d300284 100644
> > --- a/fs/btrfs/tree-log.c
> > +++ b/fs/btrfs/tree-log.c
> > @@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
> >
> > if (ordered_io_err) {
> > ctx->io_err = -EIO;
> > - return 0;
> > + return ctx->io_err;
> > }
> >
> > btrfs_init_map_token(&token);
> > --
> > 2.9.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync
2017-11-20 17:37 ` Liu Bo
@ 2017-11-21 14:00 ` David Sterba
0 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2017-11-21 14:00 UTC (permalink / raw)
To: Liu Bo; +Cc: dsterba, linux-btrfs
On Mon, Nov 20, 2017 at 10:37:25AM -0700, Liu Bo wrote:
> > > [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> > > "btrfs list corruption and soft lockups while testing writeback error handling"
> > >
> >
> > Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
> >
> > you can also add the commit author to CC.
>
> Hmm, somehow I thought ctx->list hasn't been added at that time, but
> looks like you're right.
The commit added
+ if (ordered_io_err) {
+ ctx->io_err = -EIO;
+ return 0;
+ }
+
which you fix.
> > > --- a/fs/btrfs/file.c
> > > +++ b/fs/btrfs/file.c
> > > @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
> > > if (ctx.io_err) {
> > > btrfs_end_transaction(trans);
> > > ret = ctx.io_err;
> > > + ASSERT(list_empty(&ctx.list));
> >
> > Please move that to the label 'out', so all exit paths can catch the
> > problem.
> >
>
> 'out' can't be used as we can goto 'out' before ctx is initialized,
> I'll add a new label 'out_ctx' in a separate patch (Feel free to fold
> them into one if you prefer).
Would be better to initialize ctx->list at the beginning and then always do the
list_empty check, instead of selectively jumping to out or out_ctx.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2] Btrfs: fix list_add corruption and soft lockups in fsync
2017-11-13 17:57 [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync Liu Bo
2017-11-20 17:34 ` David Sterba
@ 2017-11-21 21:35 ` Liu Bo
2017-11-27 16:41 ` David Sterba
1 sibling, 1 reply; 6+ messages in thread
From: Liu Bo @ 2017-11-21 21:35 UTC (permalink / raw)
To: linux-btrfs; +Cc: Filipe Manana
Xfstests btrfs/146 revealed this corruption,
[ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
[ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
[ 58.153518] ------------[ cut here ]------------
[ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
...
[ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
...
[ 58.161956] Call Trace:
[ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
[ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
[ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs]
[ 58.164393] vfs_fsync_range+0x5f/0xd0
[ 58.164898] do_fsync+0x5a/0x90
[ 58.165170] SyS_fsync+0x10/0x20
[ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe
...
It turns out that we could record btrfs_log_ctx:io_err in
log_one_extents when IO fails, but make log_one_extents() return '0'
instead of -EIO, so the IO error is not acknowledged by the callers,
i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated
from stack memory, it'd get freed with a object alive on the
list. then a future list_add will throw the above warning.
This returns the correct error in the above case.
Jeff also reported this while testing against his fsync error
patch set[1].
[1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
"btrfs list corruption and soft lockups while testing writeback error handling"
Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2:
- Initialize ctx at the very beginning of fsync so that we can check
if list is empty at out label
fs/btrfs/file.c | 5 +++--
fs/btrfs/tree-log.c | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 504e96d..1e6956b 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2066,6 +2066,8 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
len = (u64)end - (u64)start + 1;
trace_btrfs_sync_file(file, datasync);
+ btrfs_init_log_ctx(&ctx, inode);
+
/*
* We write the dirty pages in the range and wait until they complete
* out of the ->i_mutex. If so, we can flush the dirty pages by
@@ -2212,8 +2214,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
}
trans->sync = true;
- btrfs_init_log_ctx(&ctx, inode);
-
ret = btrfs_log_dentry_safe(trans, root, dentry, start, end, &ctx);
if (ret < 0) {
/* Fallthrough and commit/free transaction. */
@@ -2271,6 +2271,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
ret = btrfs_end_transaction(trans);
}
out:
+ ASSERT(list_empty(&ctx.list));
err = file_check_and_advance_wb_err(file);
if (!ret)
ret = err;
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index c800d06..d300284 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
if (ordered_io_err) {
ctx->io_err = -EIO;
- return 0;
+ return ctx->io_err;
}
btrfs_init_map_token(&token);
--
2.9.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2] Btrfs: fix list_add corruption and soft lockups in fsync
2017-11-21 21:35 ` [PATCH v2] " Liu Bo
@ 2017-11-27 16:41 ` David Sterba
0 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2017-11-27 16:41 UTC (permalink / raw)
To: Liu Bo; +Cc: linux-btrfs, Filipe Manana
On Tue, Nov 21, 2017 at 02:35:40PM -0700, Liu Bo wrote:
> Xfstests btrfs/146 revealed this corruption,
>
> [ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
> [ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> [ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
> [ 58.153518] ------------[ cut here ]------------
> [ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
> ...
> [ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
> ...
> [ 58.161956] Call Trace:
> [ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
> [ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
> [ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs]
> [ 58.164393] vfs_fsync_range+0x5f/0xd0
> [ 58.164898] do_fsync+0x5a/0x90
> [ 58.165170] SyS_fsync+0x10/0x20
> [ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe
> ...
>
> It turns out that we could record btrfs_log_ctx:io_err in
> log_one_extents when IO fails, but make log_one_extents() return '0'
> instead of -EIO, so the IO error is not acknowledged by the callers,
> i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
> from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated
> from stack memory, it'd get freed with a object alive on the
> list. then a future list_add will throw the above warning.
>
> This returns the correct error in the above case.
>
> Jeff also reported this while testing against his fsync error
> patch set[1].
>
> [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
> "btrfs list corruption and soft lockups while testing writeback error handling"
>
> Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-11-27 16:43 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-13 17:57 [PATCH] Btrfs: fix list_add corruption and soft lockups in fsync Liu Bo
2017-11-20 17:34 ` David Sterba
2017-11-20 17:37 ` Liu Bo
2017-11-21 14:00 ` David Sterba
2017-11-21 21:35 ` [PATCH v2] " Liu Bo
2017-11-27 16:41 ` David Sterba
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.