linux-btrfs.vger.kernel.org archive mirror
* [PATCH v3 0/2] btrfs: zoned: mark relocation as writing
@ 2022-02-18  4:14 Naohiro Aota
  2022-02-18  4:14 ` [PATCH v3 1/2] fs: add asserting functions for sb_start_{write,pagefault,intwrite} Naohiro Aota
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-02-18  4:14 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

There is a hung_task issue with running generic/068 on an SMR
device. The hang occurs while a process is trying to thaw the
filesystem. The process is trying to take sb->s_umount to thaw the
FS. The lock is held by fsstress, which calls btrfs_sync_fs() and is
waiting for an ordered extent to finish. However, as the FS is frozen,
the ordered extent never finishes.

Having an ordered extent while the FS is frozen is the root cause of
the hang. The ordered extent is initiated from btrfs_relocate_chunk()
which is called from btrfs_reclaim_bgs_work().

The first patch is a preparation patch to add asserting functions to
check if sb_start_{write,pagefault,intwrite} is called.

The second patch adds sb_{start,end}_write and the assert function at
proper places.
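
As a rough illustration of the idea behind the second patch, the
following userspace C sketch models freeze protection with a plain
counter. All toy_* names are hypothetical and are not kernel APIs. The
real sb_start_write() sleeps while the filesystem is frozen, whereas
this single-threaded model returns false instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct super_block; not the kernel structure. */
struct toy_sb {
	bool frozen;       /* set once a freeze has completed */
	int writers;       /* write sections currently open */
	bool exclop_busy;  /* models btrfs_exclop_start() failing */
};

/*
 * Models sb_start_write(). The real call sleeps until the filesystem is
 * thawed; this single-threaded toy reports failure instead.
 */
static bool toy_start_write(struct toy_sb *sb)
{
	if (sb->frozen)
		return false;
	sb->writers++;
	return true;
}

/* Models sb_end_write(). */
static void toy_end_write(struct toy_sb *sb)
{
	assert(sb->writers > 0);
	sb->writers--;
}

/*
 * Models freezing. The real freeze waits for writers to drain; the toy
 * refuses while any write section is open.
 */
static bool toy_freeze(struct toy_sb *sb)
{
	if (sb->writers > 0)
		return false;
	sb->frozen = true;
	return true;
}

/*
 * Models the reclaim worker after the second patch: the write section is
 * opened first and closed on every exit path.
 * Returns 1 if relocation ran, 0 if it was skipped.
 */
static int toy_reclaim_work(struct toy_sb *sb)
{
	if (!toy_start_write(sb))
		return 0;              /* frozen: no new I/O is started */
	if (sb->exclop_busy) {
		toy_end_write(sb);     /* early exit still closes the section */
		return 0;
	}
	/* relocation would run here, inside the write section */
	toy_end_write(sb);
	return 1;
}
```

In this model a freeze cannot complete while the worker is inside its
write section, and the worker starts no relocation once the freeze has
completed, so no ordered extent is left pending on a frozen filesystem.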

Changelog:
v3:
  - Return bool instead of asserting and let caller decide what to do
    (suggested by Dave Chinner)
v2:
  - Implement asserting functions not to directly touch the internal
    implementation

Naohiro Aota (2):
  fs: add asserting functions for sb_start_{write,pagefault,intwrite}
  btrfs: zoned: mark relocation as writing

 fs/btrfs/block-group.c |  8 +++++++-
 fs/btrfs/volumes.c     |  6 ++++++
 include/linux/fs.h     | 20 ++++++++++++++++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

-- 
2.35.1



* [PATCH v3 1/2] fs: add asserting functions for sb_start_{write,pagefault,intwrite}
  2022-02-18  4:14 [PATCH v3 0/2] btrfs: zoned: mark relocation as writing Naohiro Aota
@ 2022-02-18  4:14 ` Naohiro Aota
  2022-02-18  4:14 ` [PATCH v3 2/2] btrfs: zoned: mark relocation as writing Naohiro Aota
  2022-02-18 16:54 ` [PATCH v3 0/2] " David Sterba
  2 siblings, 0 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-02-18  4:14 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

Add a function sb_write_started() to return whether sb_start_write() has
been called. It is used in the next commit.

Also add similar functions for sb_start_pagefault() and
sb_start_intwrite().

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 include/linux/fs.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index e2d892b201b0..8c7d01388feb 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1732,6 +1732,11 @@ static inline bool __sb_start_write_trylock(struct super_block *sb, int level)
 #define __sb_writers_release(sb, lev)	\
 	percpu_rwsem_release(&(sb)->s_writers.rw_sem[(lev)-1], 1, _THIS_IP_)
 
+static inline bool __sb_write_started(struct super_block *sb, int level)
+{
+	return lockdep_is_held_type(sb->s_writers.rw_sem + level - 1, 1);
+}
+
 /**
  * sb_end_write - drop write access to a superblock
  * @sb: the super we wrote to
@@ -1797,6 +1802,11 @@ static inline bool sb_start_write_trylock(struct super_block *sb)
 	return __sb_start_write_trylock(sb, SB_FREEZE_WRITE);
 }
 
+static inline bool sb_write_started(struct super_block *sb)
+{
+	return __sb_write_started(sb, SB_FREEZE_WRITE);
+}
+
 /**
  * sb_start_pagefault - get write access to a superblock from a page fault
  * @sb: the super we write to
@@ -1821,6 +1831,11 @@ static inline void sb_start_pagefault(struct super_block *sb)
 	__sb_start_write(sb, SB_FREEZE_PAGEFAULT);
 }
 
+static inline bool sb_pagefault_started(struct super_block *sb)
+{
+	return __sb_write_started(sb, SB_FREEZE_PAGEFAULT);
+}
+
 /**
  * sb_start_intwrite - get write access to a superblock for internal fs purposes
  * @sb: the super we write to
@@ -1844,6 +1859,11 @@ static inline bool sb_start_intwrite_trylock(struct super_block *sb)
 	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
 }
 
+static inline bool sb_intwrite_started(struct super_block *sb)
+{
+	return __sb_write_started(sb, SB_FREEZE_FS);
+}
+
 bool inode_owner_or_capable(struct user_namespace *mnt_userns,
 			    const struct inode *inode);
 
-- 
2.35.1



* [PATCH v3 2/2] btrfs: zoned: mark relocation as writing
  2022-02-18  4:14 [PATCH v3 0/2] btrfs: zoned: mark relocation as writing Naohiro Aota
  2022-02-18  4:14 ` [PATCH v3 1/2] fs: add asserting functions for sb_start_{write,pagefault,intwrite} Naohiro Aota
@ 2022-02-18  4:14 ` Naohiro Aota
  2022-02-18  6:13   ` Johannes Thumshirn
  2022-02-23 10:31   ` David Sterba
  2022-02-18 16:54 ` [PATCH v3 0/2] " David Sterba
  2 siblings, 2 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-02-18  4:14 UTC (permalink / raw)
  To: linux-btrfs; +Cc: johannes.thumshirn, linux-fsdevel, viro, david, Naohiro Aota

There is a hung_task issue with running generic/068 on an SMR
device. The hang occurs while a process is trying to thaw the
filesystem. The process is trying to take sb->s_umount to thaw the
FS. The lock is held by fsstress, which calls btrfs_sync_fs() and is
waiting for an ordered extent to finish. However, as the FS is frozen,
the ordered extent never finishes.

Having an ordered extent while the FS is frozen is the root cause of
the hang. The ordered extent is initiated from btrfs_relocate_chunk()
which is called from btrfs_reclaim_bgs_work().

This commit adds sb_*_write() around the btrfs_relocate_chunk() call
site. For the usual "btrfs balance" command, we already call it with
mnt_want_write_file() in btrfs_ioctl_balance().

Additionally, add an ASSERT in btrfs_relocate_chunk() to check that it
is called with write access held.

Fixes: 18bb8bbf13c1 ("btrfs: zoned: automatically reclaim zones")
Cc: stable@vger.kernel.org # 5.13+
Link: https://github.com/naota/linux/issues/56
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/block-group.c | 8 +++++++-
 fs/btrfs/volumes.c     | 6 ++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 3113f6d7f335..c22d287e020b 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1522,8 +1522,12 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
 	if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
 		return;
 
-	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE))
+	sb_start_write(fs_info->sb);
+
+	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) {
+		sb_end_write(fs_info->sb);
 		return;
+	}
 
 	/*
 	 * Long running balances can keep us blocked here for eternity, so
@@ -1531,6 +1535,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
 	 */
 	if (!mutex_trylock(&fs_info->reclaim_bgs_lock)) {
 		btrfs_exclop_finish(fs_info);
+		sb_end_write(fs_info->sb);
 		return;
 	}
 
@@ -1605,6 +1610,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
 	spin_unlock(&fs_info->unused_bgs_lock);
 	mutex_unlock(&fs_info->reclaim_bgs_lock);
 	btrfs_exclop_finish(fs_info);
+	sb_end_write(fs_info->sb);
 }
 
 void btrfs_reclaim_bgs(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fa7fee09e39b..74c8024d8f96 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3240,6 +3240,9 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
 	u64 length;
 	int ret;
 
+	/* Assert we called sb_start_write(), not to race with FS freezing */
+	ASSERT(sb_write_started(fs_info->sb));
+
 	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
 		btrfs_err(fs_info,
 			  "relocate: not supported on extent tree v2 yet");
@@ -8304,10 +8307,12 @@ static int relocating_repair_kthread(void *data)
 	target = cache->start;
 	btrfs_put_block_group(cache);
 
+	sb_start_write(fs_info->sb);
 	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) {
 		btrfs_info(fs_info,
 			   "zoned: skip relocating block group %llu to repair: EBUSY",
 			   target);
+		sb_end_write(fs_info->sb);
 		return -EBUSY;
 	}
 
@@ -8335,6 +8340,7 @@ static int relocating_repair_kthread(void *data)
 		btrfs_put_block_group(cache);
 	mutex_unlock(&fs_info->reclaim_bgs_lock);
 	btrfs_exclop_finish(fs_info);
+	sb_end_write(fs_info->sb);
 
 	return ret;
 }
-- 
2.35.1



* Re: [PATCH v3 2/2] btrfs: zoned: mark relocation as writing
  2022-02-18  4:14 ` [PATCH v3 2/2] btrfs: zoned: mark relocation as writing Naohiro Aota
@ 2022-02-18  6:13   ` Johannes Thumshirn
  2022-02-23 10:31   ` David Sterba
  1 sibling, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2022-02-18  6:13 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs; +Cc: linux-fsdevel, viro, david

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH v3 0/2] btrfs: zoned: mark relocation as writing
  2022-02-18  4:14 [PATCH v3 0/2] btrfs: zoned: mark relocation as writing Naohiro Aota
  2022-02-18  4:14 ` [PATCH v3 1/2] fs: add asserting functions for sb_start_{write,pagefault,intwrite} Naohiro Aota
  2022-02-18  4:14 ` [PATCH v3 2/2] btrfs: zoned: mark relocation as writing Naohiro Aota
@ 2022-02-18 16:54 ` David Sterba
  2 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2022-02-18 16:54 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, johannes.thumshirn, linux-fsdevel, viro, david

On Fri, Feb 18, 2022 at 01:14:17PM +0900, Naohiro Aota wrote:
> There is a hung_task issue with running generic/068 on an SMR
> device. The hang occurs while a process is trying to thaw the
> filesystem. The process is trying to take sb->s_umount to thaw the
> FS. The lock is held by fsstress, which calls btrfs_sync_fs() and is
> waiting for an ordered extent to finish. However, as the FS is frozen,
> the ordered extent never finishes.
> 
> Having an ordered extent while the FS is frozen is the root cause of
> the hang. The ordered extent is initiated from btrfs_relocate_chunk()
> which is called from btrfs_reclaim_bgs_work().
> 
> The first patch is a preparation patch to add asserting functions to
> check if sb_start_{write,pagefault,intwrite} is called.
> 
> The second patch adds sb_{start,end}_write and the assert function at
> proper places.
> 
> Changelog:
> v3:
>   - Return bool instead of asserting and let caller decide what to do
>     (suggested by Dave Chinner)
> v2:
>   - Implement asserting functions not to directly touch the internal
>     implementation
> 
> Naohiro Aota (2):
>   fs: add asserting functions for sb_start_{write,pagefault,intwrite}
>   btrfs: zoned: mark relocation as writing

Topic branch updated, thanks.


* Re: [PATCH v3 2/2] btrfs: zoned: mark relocation as writing
  2022-02-18  4:14 ` [PATCH v3 2/2] btrfs: zoned: mark relocation as writing Naohiro Aota
  2022-02-18  6:13   ` Johannes Thumshirn
@ 2022-02-23 10:31   ` David Sterba
  2022-02-24  2:15     ` Naohiro Aota
  1 sibling, 1 reply; 9+ messages in thread
From: David Sterba @ 2022-02-23 10:31 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, johannes.thumshirn, linux-fsdevel, viro, david

On Fri, Feb 18, 2022 at 01:14:19PM +0900, Naohiro Aota wrote:
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3240,6 +3240,9 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
>  	u64 length;
>  	int ret;
>  
> +	/* Assert we called sb_start_write(), not to race with FS freezing */
> +	ASSERT(sb_write_started(fs_info->sb));

I see this assertion fail. It does not happen on all testing VMs, but it
has happened a few times already, so it's probably a race:

[ 2927.013859] BTRFS warning (device vdc): devid 1 uuid 4335c7a6-652c-4389-8ea9-270c00fa9880 is missing
[ 2927.017693] BTRFS warning (device vdc): devid 1 uuid 4335c7a6-652c-4389-8ea9-270c00fa9880 is missing
[ 2927.022921] BTRFS info (device vdc): bdev /dev/vdd errs: wr 0, rd 0, flush 0, corrupt 6000, gen 0
[ 2927.031780] BTRFS info (device vdc): checking UUID tree
[ 2927.045348] BTRFS: error (device vdc: state X) in __btrfs_free_extent:3199: errno=-5 IO failure
[ 2927.049729] BTRFS info (device vdc: state EX): forced readonly
[ 2927.051787] BTRFS: error (device vdc: state EX) in btrfs_run_delayed_refs:2159: errno=-5 IO failure
[ 2927.058758] BTRFS info (device vdc: state EX): balance: resume -dusage=90 -musage=90 -susage=90
[ 2927.062457] assertion failed: sb_write_started(fs_info->sb), in fs/btrfs/volumes.c:3244
[ 2927.066121] ------------[ cut here ]------------
[ 2927.067682] kernel BUG at fs/btrfs/ctree.h:3552!
[ 2927.069214] invalid opcode: 0000 [#1] PREEMPT SMP
[ 2927.070926] CPU: 2 PID: 22817 Comm: btrfs-balance Not tainted 5.17.0-rc5-default+ #1632
[ 2927.075299] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[ 2927.080897] RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
[ 2927.092652] RSP: 0018:ffffaed9c610fdc0 EFLAGS: 00010246
[ 2927.095227] RAX: 000000000000004b RBX: ffffa13a873db000 RCX: 0000000000000000
[ 2927.096898] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 00000000ffffffff
[ 2927.100514] RBP: ffffa13a55324000 R08: 0000000000000003 R09: 0000000000000001
[ 2927.102518] R10: 0000000000000000 R11: 0000000000000001 R12: ffffa13a6922f098
[ 2927.104330] R13: 000000008cfa0000 R14: ffffa13a553262a0 R15: ffffa13a873db000
[ 2927.106025] FS:  0000000000000000(0000) GS:ffffa13abda00000(0000) knlGS:0000000000000000
[ 2927.108652] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2927.110568] CR2: 000055fdf2a94fd0 CR3: 000000005d012005 CR4: 0000000000170ea0
[ 2927.112167] Call Trace:
[ 2927.112801]  <TASK>
[ 2927.113212]  btrfs_relocate_chunk.cold+0x42/0x67 [btrfs]
[ 2927.114328]  __btrfs_balance+0x2ea/0x490 [btrfs]
[ 2927.114871] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 131072 csum 0x7e797e3e expected csum 0x8941f998 mirror 2
[ 2927.115469]  btrfs_balance+0x4ed/0x7e0 [btrfs]
[ 2927.118802] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 139264 csum 0x27df6522 expected csum 0x8941f998 mirror 2
[ 2927.119691]  ? btrfs_balance+0x7e0/0x7e0 [btrfs]
[ 2927.123158] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 143360 csum 0x9f144c35 expected csum 0x8941f998 mirror 2
[ 2927.123965]  balance_kthread+0x37/0x50 [btrfs]
[ 2927.127299] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 147456 csum 0x1027ab9a expected csum 0x8941f998 mirror 2
[ 2927.128016]  kthread+0xea/0x110
[ 2927.128023]  ? kthread_complete_and_exit+0x20/0x20
[ 2927.128027]  ret_from_fork+0x1f/0x30
[ 2927.128031]  </TASK>
[ 2927.128032] Modules linked in:
[ 2927.131390] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 155648 csum 0x428b86d5 expected csum 0x8941f998 mirror 2
[ 2927.131400] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 163840 csum 0x8fff7df2 expected csum 0x8941f998 mirror 2
[ 2927.131401] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 159744 csum 0x9893a835 expected csum 0x8941f998 mirror 2
[ 2927.131416] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 180224 csum 0x83d83877 expected csum 0x8941f998 mirror 2
[ 2927.131832] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 524288 csum 0x1a0c8fd4 expected csum 0x8941f998 mirror 2
[ 2927.132128] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 540672 csum 0xcaaf83cc expected csum 0x8941f998 mirror 2
[ 2927.133105]  dm_flakey dm_mod btrfs blake2b_generic libcrc32c crc32c_intel xor lzo_compress lzo_decompress raid6_pq zstd_decompress zstd_compress xxhash loop
[ 2927.144290] ---[ end trace 0000000000000000 ]---
[ 2927.145080] RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
[ 2927.147738] RSP: 0018:ffffaed9c610fdc0 EFLAGS: 00010246
[ 2927.148220] RAX: 000000000000004b RBX: ffffa13a873db000 RCX: 0000000000000000
[ 2927.149126] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 00000000ffffffff
[ 2927.150057] RBP: ffffa13a55324000 R08: 0000000000000003 R09: 0000000000000001
[ 2927.150676] R10: 0000000000000000 R11: 0000000000000001 R12: ffffa13a6922f098
[ 2927.151297] R13: 000000008cfa0000 R14: ffffa13a553262a0 R15: ffffa13a873db000
[ 2927.152529] FS:  0000000000000000(0000) GS:ffffa13abda00000(0000) knlGS:0000000000000000
[ 2927.153646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2927.154280] CR2: 000055fdf2a94fd0 CR3: 000000005d012005 CR4: 0000000000170ea0


* Re: [PATCH v3 2/2] btrfs: zoned: mark relocation as writing
  2022-02-23 10:31   ` David Sterba
@ 2022-02-24  2:15     ` Naohiro Aota
  2022-02-24 19:12       ` David Sterba
  2022-02-28 20:18       ` David Sterba
  0 siblings, 2 replies; 9+ messages in thread
From: Naohiro Aota @ 2022-02-24  2:15 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Johannes Thumshirn, linux-fsdevel, viro, david

On Wed, Feb 23, 2022 at 11:31:07AM +0100, David Sterba wrote:
> On Fri, Feb 18, 2022 at 01:14:19PM +0900, Naohiro Aota wrote:
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -3240,6 +3240,9 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
> >  	u64 length;
> >  	int ret;
> >  
> > +	/* Assert we called sb_start_write(), not to race with FS freezing */
> > +	ASSERT(sb_write_started(fs_info->sb));
> 
> I see this assertion fail. It does not happen on all testing VMs, but it
> has happened a few times already, so it's probably a race:
> 
> [ 2927.013859] BTRFS warning (device vdc): devid 1 uuid 4335c7a6-652c-4389-8ea9-270c00fa9880 is missing
> [ 2927.017693] BTRFS warning (device vdc): devid 1 uuid 4335c7a6-652c-4389-8ea9-270c00fa9880 is missing
> [ 2927.022921] BTRFS info (device vdc): bdev /dev/vdd errs: wr 0, rd 0, flush 0, corrupt 6000, gen 0
> [ 2927.031780] BTRFS info (device vdc): checking UUID tree
> [ 2927.045348] BTRFS: error (device vdc: state X) in __btrfs_free_extent:3199: errno=-5 IO failure
> [ 2927.049729] BTRFS info (device vdc: state EX): forced readonly
> [ 2927.051787] BTRFS: error (device vdc: state EX) in btrfs_run_delayed_refs:2159: errno=-5 IO failure
> [ 2927.058758] BTRFS info (device vdc: state EX): balance: resume -dusage=90 -musage=90 -susage=90
> [ 2927.062457] assertion failed: sb_write_started(fs_info->sb), in fs/btrfs/volumes.c:3244
> [ 2927.066121] ------------[ cut here ]------------
> [ 2927.067682] kernel BUG at fs/btrfs/ctree.h:3552!
> [ 2927.069214] invalid opcode: 0000 [#1] PREEMPT SMP
> [ 2927.070926] CPU: 2 PID: 22817 Comm: btrfs-balance Not tainted 5.17.0-rc5-default+ #1632
> [ 2927.075299] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
> [ 2927.080897] RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
> [ 2927.092652] RSP: 0018:ffffaed9c610fdc0 EFLAGS: 00010246
> [ 2927.095227] RAX: 000000000000004b RBX: ffffa13a873db000 RCX: 0000000000000000
> [ 2927.096898] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 00000000ffffffff
> [ 2927.100514] RBP: ffffa13a55324000 R08: 0000000000000003 R09: 0000000000000001
> [ 2927.102518] R10: 0000000000000000 R11: 0000000000000001 R12: ffffa13a6922f098
> [ 2927.104330] R13: 000000008cfa0000 R14: ffffa13a553262a0 R15: ffffa13a873db000
> [ 2927.106025] FS:  0000000000000000(0000) GS:ffffa13abda00000(0000) knlGS:0000000000000000
> [ 2927.108652] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2927.110568] CR2: 000055fdf2a94fd0 CR3: 000000005d012005 CR4: 0000000000170ea0
> [ 2927.112167] Call Trace:
> [ 2927.112801]  <TASK>
> [ 2927.113212]  btrfs_relocate_chunk.cold+0x42/0x67 [btrfs]
> [ 2927.114328]  __btrfs_balance+0x2ea/0x490 [btrfs]
> [ 2927.114871] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 131072 csum 0x7e797e3e expected csum 0x8941f998 mirror 2
> [ 2927.115469]  btrfs_balance+0x4ed/0x7e0 [btrfs]
> [ 2927.118802] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 139264 csum 0x27df6522 expected csum 0x8941f998 mirror 2
> [ 2927.119691]  ? btrfs_balance+0x7e0/0x7e0 [btrfs]
> [ 2927.123158] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 143360 csum 0x9f144c35 expected csum 0x8941f998 mirror 2
> [ 2927.123965]  balance_kthread+0x37/0x50 [btrfs]

It looks like this occurs when the balance is resumed. We also need
sb_{start,end}_write around btrfs_balance() in balance_kthread().

I guess we can cause a hang if we resume the balance and freeze the FS
at the same time.
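
A minimal userspace sketch of that idea, assuming a simple depth
counter in place of the lockdep state that sb_write_started() consults.
The toy_* names are made up for illustration and this is not the actual
fix:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical write-section depth; the kernel tracks this via lockdep. */
static int toy_write_depth;

static void toy_sb_start_write(void)
{
	toy_write_depth++;
}

static void toy_sb_end_write(void)
{
	assert(toy_write_depth > 0);
	toy_write_depth--;
}

static bool toy_sb_write_started(void)
{
	return toy_write_depth > 0;
}

/*
 * Stand-in for btrfs_balance(): reports whether the check at the top of
 * the relocation path would pass.
 */
static bool toy_balance(void)
{
	return toy_sb_write_started();
}

/* Resume path as in the failing trace: no write section is taken. */
static bool toy_balance_kthread_unprotected(void)
{
	return toy_balance();
}

/* Resume path with the proposed start/end bracket around the balance. */
static bool toy_balance_kthread_protected(void)
{
	bool ok;

	toy_sb_start_write();
	ok = toy_balance();
	toy_sb_end_write();
	return ok;
}
```

The unprotected variant returns false, which corresponds to the ASSERT
firing in the trace, while the bracketed variant returns true and
leaves the depth balanced.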

> [ 2927.127299] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 147456 csum 0x1027ab9a expected csum 0x8941f998 mirror 2
> [ 2927.128016]  kthread+0xea/0x110
> [ 2927.128023]  ? kthread_complete_and_exit+0x20/0x20
> [ 2927.128027]  ret_from_fork+0x1f/0x30
> [ 2927.128031]  </TASK>
> [ 2927.128032] Modules linked in:
> [ 2927.131390] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 155648 csum 0x428b86d5 expected csum 0x8941f998 mirror 2
> [ 2927.131400] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 163840 csum 0x8fff7df2 expected csum 0x8941f998 mirror 2
> [ 2927.131401] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 159744 csum 0x9893a835 expected csum 0x8941f998 mirror 2
> [ 2927.131416] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 180224 csum 0x83d83877 expected csum 0x8941f998 mirror 2
> [ 2927.131832] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 524288 csum 0x1a0c8fd4 expected csum 0x8941f998 mirror 2
> [ 2927.132128] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 540672 csum 0xcaaf83cc expected csum 0x8941f998 mirror 2
> [ 2927.133105]  dm_flakey dm_mod btrfs blake2b_generic libcrc32c crc32c_intel xor lzo_compress lzo_decompress raid6_pq zstd_decompress zstd_compress xxhash loop
> [ 2927.144290] ---[ end trace 0000000000000000 ]---
> [ 2927.145080] RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
> [ 2927.147738] RSP: 0018:ffffaed9c610fdc0 EFLAGS: 00010246
> [ 2927.148220] RAX: 000000000000004b RBX: ffffa13a873db000 RCX: 0000000000000000
> [ 2927.149126] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 00000000ffffffff
> [ 2927.150057] RBP: ffffa13a55324000 R08: 0000000000000003 R09: 0000000000000001
> [ 2927.150676] R10: 0000000000000000 R11: 0000000000000001 R12: ffffa13a6922f098
> [ 2927.151297] R13: 000000008cfa0000 R14: ffffa13a553262a0 R15: ffffa13a873db000
> [ 2927.152529] FS:  0000000000000000(0000) GS:ffffa13abda00000(0000) knlGS:0000000000000000
> [ 2927.153646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2927.154280] CR2: 000055fdf2a94fd0 CR3: 000000005d012005 CR4: 0000000000170ea0


* Re: [PATCH v3 2/2] btrfs: zoned: mark relocation as writing
  2022-02-24  2:15     ` Naohiro Aota
@ 2022-02-24 19:12       ` David Sterba
  2022-02-28 20:18       ` David Sterba
  1 sibling, 0 replies; 9+ messages in thread
From: David Sterba @ 2022-02-24 19:12 UTC (permalink / raw)
  To: Naohiro Aota
  Cc: dsterba, linux-btrfs, Johannes Thumshirn, linux-fsdevel, viro, david

On Thu, Feb 24, 2022 at 02:15:58AM +0000, Naohiro Aota wrote:
> On Wed, Feb 23, 2022 at 11:31:07AM +0100, David Sterba wrote:
> > On Fri, Feb 18, 2022 at 01:14:19PM +0900, Naohiro Aota wrote:
> > [ 2927.114871] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 131072 csum 0x7e797e3e expected csum 0x8941f998 mirror 2
> > [ 2927.115469]  btrfs_balance+0x4ed/0x7e0 [btrfs]
> > [ 2927.118802] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 139264 csum 0x27df6522 expected csum 0x8941f998 mirror 2
> > [ 2927.119691]  ? btrfs_balance+0x7e0/0x7e0 [btrfs]
> > [ 2927.123158] BTRFS warning (device vdc: state EX): csum failed root 5 ino 258 off 143360 csum 0x9f144c35 expected csum 0x8941f998 mirror 2
> > [ 2927.123965]  balance_kthread+0x37/0x50 [btrfs]
> 
> It looks like this occurs when the balance is resumed. We also need
> sb_{start,end}_write around btrfs_balance() in balance_kthread().

Sounds plausible.

> I guess we can cause a hang if we resume the balance and freeze the FS
> at the same time.

The background balance starts only when the filesystem is mounted for
write, so right after the sb_rdonly check in open_ctree, but I think
you're right that freeze during that can lead to a hang.


* Re: [PATCH v3 2/2] btrfs: zoned: mark relocation as writing
  2022-02-24  2:15     ` Naohiro Aota
  2022-02-24 19:12       ` David Sterba
@ 2022-02-28 20:18       ` David Sterba
  1 sibling, 0 replies; 9+ messages in thread
From: David Sterba @ 2022-02-28 20:18 UTC (permalink / raw)
  To: Naohiro Aota
  Cc: dsterba, linux-btrfs, Johannes Thumshirn, linux-fsdevel, viro, david

On Thu, Feb 24, 2022 at 02:15:58AM +0000, Naohiro Aota wrote:
> On Wed, Feb 23, 2022 at 11:31:07AM +0100, David Sterba wrote:
> > On Fri, Feb 18, 2022 at 01:14:19PM +0900, Naohiro Aota wrote:
> It looks like this occurs when the balance is resumed. We also need
> sb_{start,end}_write around btrfs_balance() in balance_kthread().
> 
> I guess we can cause a hang if we resume the balance and freeze the FS
> at the same time.

We need to fix the missing write protection before the asserts can be
added, so I'll delete them from this patch and submit the helpers
patch once we have fixed them all.
