linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode
@ 2022-02-12 14:20 Jaegeuk Kim
  2022-02-12 14:20 ` [PATCH 2/2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes Jaegeuk Kim
  2022-02-24  8:34 ` [f2fs-dev] [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode Chao Yu
  0 siblings, 2 replies; 6+ messages in thread
From: Jaegeuk Kim @ 2022-02-12 14:20 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

This patch fixes xfstests/generic/475 failure.

[  293.680694] F2FS-fs (dm-1): May loss orphan inode, run fsck to fix.
[  293.685358] Buffer I/O error on dev dm-1, logical block 8388592, async page read
[  293.691527] Buffer I/O error on dev dm-1, logical block 8388592, async page read
[  293.691764] sh (7615): drop_caches: 3
[  293.691819] sh (7616): drop_caches: 3
[  293.694017] Buffer I/O error on dev dm-1, logical block 1, async page read
[  293.695659] sh (7618): drop_caches: 3
[  293.696979] sh (7617): drop_caches: 3
[  293.700290] sh (7623): drop_caches: 3
[  293.708621] sh (7626): drop_caches: 3
[  293.711386] sh (7628): drop_caches: 3
[  293.711825] sh (7627): drop_caches: 3
[  293.716738] sh (7630): drop_caches: 3
[  293.719613] sh (7632): drop_caches: 3
[  293.720971] sh (7633): drop_caches: 3
[  293.727741] sh (7634): drop_caches: 3
[  293.730783] sh (7636): drop_caches: 3
[  293.732681] sh (7635): drop_caches: 3
[  293.732988] sh (7637): drop_caches: 3
[  293.738836] sh (7639): drop_caches: 3
[  293.740568] sh (7641): drop_caches: 3
[  293.743053] sh (7640): drop_caches: 3
[  293.821889] ------------[ cut here ]------------
[  293.824654] kernel BUG at fs/f2fs/node.c:3334!
[  293.826226] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[  293.828713] CPU: 0 PID: 7653 Comm: umount Tainted: G           OE     5.17.0-rc1-custom #1
[  293.830946] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[  293.832526] RIP: 0010:f2fs_destroy_node_manager+0x33f/0x350 [f2fs]
[  293.833905] Code: e8 d6 3d f9 f9 48 8b 45 d0 65 48 2b 04 25 28 00 00 00 75 1a 48 81 c4 28 03 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b
[  293.837783] RSP: 0018:ffffb04ec31e7a20 EFLAGS: 00010202
[  293.839062] RAX: 0000000000000001 RBX: ffff9df947db2eb8 RCX: 0000000080aa0072
[  293.840666] RDX: 0000000000000000 RSI: ffffe86c0432a140 RDI: ffffffffc0b72a21
[  293.842261] RBP: ffffb04ec31e7d70 R08: ffff9df94ca85780 R09: 0000000080aa0072
[  293.843909] R10: ffff9df94ca85700 R11: ffff9df94e1ccf58 R12: ffff9df947db2e00
[  293.845594] R13: ffff9df947db2ed0 R14: ffff9df947db2eb8 R15: ffff9df947db2eb8
[  293.847855] FS:  00007f5a97379800(0000) GS:ffff9dfa77c00000(0000) knlGS:0000000000000000
[  293.850647] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  293.852940] CR2: 00007f5a97528730 CR3: 000000010bc76005 CR4: 0000000000370ef0
[  293.854680] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  293.856423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  293.858380] Call Trace:
[  293.859302]  <TASK>
[  293.860311]  ? ttwu_do_wakeup+0x1c/0x170
[  293.861800]  ? ttwu_do_activate+0x6d/0xb0
[  293.863057]  ? _raw_spin_unlock_irqrestore+0x29/0x40
[  293.864411]  ? try_to_wake_up+0x9d/0x5e0
[  293.865618]  ? debug_smp_processor_id+0x17/0x20
[  293.866934]  ? debug_smp_processor_id+0x17/0x20
[  293.868223]  ? free_unref_page+0xbf/0x120
[  293.869470]  ? __free_slab+0xcb/0x1c0
[  293.870614]  ? preempt_count_add+0x7a/0xc0
[  293.871811]  ? __slab_free+0xa0/0x2d0
[  293.872918]  ? __wake_up_common_lock+0x8a/0xc0
[  293.874186]  ? __slab_free+0xa0/0x2d0
[  293.875305]  ? free_inode_nonrcu+0x20/0x20
[  293.876466]  ? free_inode_nonrcu+0x20/0x20
[  293.877650]  ? debug_smp_processor_id+0x17/0x20
[  293.878949]  ? call_rcu+0x11a/0x240
[  293.880060]  ? f2fs_destroy_stats+0x59/0x60 [f2fs]
[  293.881437]  ? kfree+0x1fe/0x230
[  293.882674]  f2fs_put_super+0x160/0x390 [f2fs]
[  293.883978]  generic_shutdown_super+0x7a/0x120
[  293.885274]  kill_block_super+0x27/0x50
[  293.886496]  kill_f2fs_super+0x7f/0x100 [f2fs]
[  293.887806]  deactivate_locked_super+0x35/0xa0
[  293.889271]  deactivate_super+0x40/0x50
[  293.890513]  cleanup_mnt+0x139/0x190
[  293.891689]  __cleanup_mnt+0x12/0x20
[  293.892850]  task_work_run+0x64/0xa0
[  293.894035]  exit_to_user_mode_prepare+0x1b7/0x1c0
[  293.895409]  syscall_exit_to_user_mode+0x27/0x50
[  293.896872]  do_syscall_64+0x48/0xc0
[  293.898090]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  293.899517] RIP: 0033:0x7f5a975cd25b

Fixes: 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/inode.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 0ec8e32a00b4..ab8e0c06c78c 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -885,6 +885,7 @@ void f2fs_handle_failed_inode(struct inode *inode)
 	err = f2fs_get_node_info(sbi, inode->i_ino, &ni, false);
 	if (err) {
 		set_sbi_flag(sbi, SBI_NEED_FSCK);
+		set_inode_flag(inode, FI_FREE_NID);
 		f2fs_warn(sbi, "May loss orphan inode, run fsck to fix.");
 		goto out;
 	}
-- 
2.35.1.265.g69c8d7142f-goog
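
The effect of the one-line change can be sketched in user space. This is a minimal model under stated assumptions, not f2fs code: every model_* name is hypothetical, and it assumes (my reading of the patch, not something the thread spells out) that eviction only gives the nid back when FI_FREE_NID is set, so skipping the flag on the error path leaves one nid reserved at umount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are f2fs APIs. */
struct model_nm {
	unsigned int reserved_nids;	/* nids handed out but not yet settled */
};

struct model_inode {
	bool free_nid_flag;		/* stands in for FI_FREE_NID */
};

/* Error path of inode creation; lookup_fails mimics a failing node-info read. */
void model_handle_failed_inode(struct model_inode *inode, bool lookup_fails,
			       bool patched)
{
	if (lookup_fails) {
		if (patched)
			inode->free_nid_flag = true;	/* the one-line fix */
		return;
	}
	/* Simplified assumption: a successful lookup always flags the nid. */
	inode->free_nid_flag = true;
}

/* Eviction gives the nid back only if the flag was set. */
void model_evict_inode(struct model_nm *nm, struct model_inode *inode)
{
	if (inode->free_nid_flag) {
		nm->reserved_nids--;
		inode->free_nid_flag = false;
	}
}

/* Returns the number of nids still reserved when "umount" runs. */
unsigned int model_leaked_nids(bool patched)
{
	struct model_nm nm = { .reserved_nids = 0 };
	struct model_inode inode = { .free_nid_flag = false };

	nm.reserved_nids++;		/* inode creation reserves a nid */
	model_handle_failed_inode(&inode, true, patched);
	model_evict_inode(&nm, &inode);
	return nm.reserved_nids;
}

/* Self-check: a nonzero result models the teardown BUG in the trace. */
void model_selfcheck(void)
{
	assert(model_leaked_nids(false) == 1);
	assert(model_leaked_nids(true) == 0);
}
```

Calling model_selfcheck() from a test main() exercises both cases: one nid left over without the flag, none with it.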



* [PATCH 2/2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes
  2022-02-12 14:20 [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode Jaegeuk Kim
@ 2022-02-12 14:20 ` Jaegeuk Kim
  2022-02-14 23:27   ` [PATCH 2/2 v2] " Jaegeuk Kim
  2022-02-24  8:34 ` [f2fs-dev] [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode Chao Yu
  1 sibling, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2022-02-12 14:20 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

If one read IO is always failing, we can fall into an infinite loop in
f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.

[  142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
...
[  382.887210]  submit_bio_noacct+0xdd/0x2a0
[  382.887213]  submit_bio+0x80/0x110
[  382.887223]  __submit_bio+0x4d/0x300 [f2fs]
[  382.887282]  f2fs_submit_page_bio+0x125/0x200 [f2fs]
[  382.887299]  __get_meta_page+0xc9/0x280 [f2fs]
[  382.887315]  f2fs_get_meta_page+0x13/0x20 [f2fs]
[  382.887331]  f2fs_get_node_info+0x317/0x3c0 [f2fs]
[  382.887350]  f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
[  382.887367]  f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
[  382.887386]  f2fs_write_cache_pages+0x302/0x890 [f2fs]
[  382.887405]  ? preempt_count_add+0x7a/0xc0
[  382.887408]  f2fs_write_data_pages+0xfd/0x320 [f2fs]
[  382.887425]  ? _raw_spin_unlock+0x1a/0x30
[  382.887428]  do_writepages+0xd3/0x1d0
[  382.887432]  filemap_fdatawrite_wbc+0x69/0x90
[  382.887434]  filemap_fdatawrite+0x50/0x70
[  382.887437]  f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
[  382.887453]  f2fs_write_checkpoint+0x189/0x1640 [f2fs]
[  382.887469]  ? schedule_timeout+0x114/0x150
[  382.887471]  ? ttwu_do_activate+0x6d/0xb0
[  382.887473]  ? preempt_count_add+0x7a/0xc0
[  382.887476]  kill_f2fs_super+0xca/0x100 [f2fs]
[  382.887491]  deactivate_locked_super+0x35/0xa0
[  382.887494]  deactivate_super+0x40/0x50
[  382.887497]  cleanup_mnt+0x139/0x190
[  382.887499]  __cleanup_mnt+0x12/0x20
[  382.887501]  task_work_run+0x64/0xa0
[  382.887505]  exit_to_user_mode_prepare+0x1b7/0x1c0
[  382.887508]  syscall_exit_to_user_mode+0x27/0x50
[  382.887510]  do_syscall_64+0x48/0xc0
[  382.887513]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/checkpoint.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 203a1577942d..756abfdf3628 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
 	struct inode *inode;
 	struct f2fs_inode_info *fi;
 	bool is_dir = (type == DIR_INODE);
-	unsigned long ino = 0;
+	unsigned long ino = 0, retry_count = DEFAULT_RETRY_IO_COUNT;
 
 	trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
 				get_pages(sbi, is_dir ?
 				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
 retry:
-	if (unlikely(f2fs_cp_error(sbi))) {
+	if (unlikely(f2fs_cp_error(sbi) || !retry_count)) {
 		trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
 				get_pages(sbi, is_dir ?
 				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
@@ -1096,10 +1096,12 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
 
 		iput(inode);
 		/* We need to give cpu to another writers. */
-		if (ino == cur_ino)
+		if (ino == cur_ino) {
+			retry_count--;
 			cond_resched();
-		else
+		} else {
 			ino = cur_ino;
+		}
 	} else {
 		/*
 		 * We should submit bio, since it exists several
-- 
2.35.1.265.g69c8d7142f-goog



* Re: [PATCH 2/2 v2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes
  2022-02-12 14:20 ` [PATCH 2/2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes Jaegeuk Kim
@ 2022-02-14 23:27   ` Jaegeuk Kim
  2022-02-25  2:07     ` [f2fs-dev] " Chao Yu
  0 siblings, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2022-02-14 23:27 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel

If one read IO is always failing, we can fall into an infinite loop in
f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.

[  142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
...
[  382.887210]  submit_bio_noacct+0xdd/0x2a0
[  382.887213]  submit_bio+0x80/0x110
[  382.887223]  __submit_bio+0x4d/0x300 [f2fs]
[  382.887282]  f2fs_submit_page_bio+0x125/0x200 [f2fs]
[  382.887299]  __get_meta_page+0xc9/0x280 [f2fs]
[  382.887315]  f2fs_get_meta_page+0x13/0x20 [f2fs]
[  382.887331]  f2fs_get_node_info+0x317/0x3c0 [f2fs]
[  382.887350]  f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
[  382.887367]  f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
[  382.887386]  f2fs_write_cache_pages+0x302/0x890 [f2fs]
[  382.887405]  ? preempt_count_add+0x7a/0xc0
[  382.887408]  f2fs_write_data_pages+0xfd/0x320 [f2fs]
[  382.887425]  ? _raw_spin_unlock+0x1a/0x30
[  382.887428]  do_writepages+0xd3/0x1d0
[  382.887432]  filemap_fdatawrite_wbc+0x69/0x90
[  382.887434]  filemap_fdatawrite+0x50/0x70
[  382.887437]  f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
[  382.887453]  f2fs_write_checkpoint+0x189/0x1640 [f2fs]
[  382.887469]  ? schedule_timeout+0x114/0x150
[  382.887471]  ? ttwu_do_activate+0x6d/0xb0
[  382.887473]  ? preempt_count_add+0x7a/0xc0
[  382.887476]  kill_f2fs_super+0xca/0x100 [f2fs]
[  382.887491]  deactivate_locked_super+0x35/0xa0
[  382.887494]  deactivate_super+0x40/0x50
[  382.887497]  cleanup_mnt+0x139/0x190
[  382.887499]  __cleanup_mnt+0x12/0x20
[  382.887501]  task_work_run+0x64/0xa0
[  382.887505]  exit_to_user_mode_prepare+0x1b7/0x1c0
[  382.887508]  syscall_exit_to_user_mode+0x27/0x50
[  382.887510]  do_syscall_64+0x48/0xc0
[  382.887513]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 Change log from v1:
  - fix a regression to report EIO too early

 fs/f2fs/checkpoint.c | 13 ++++++++-----
 fs/f2fs/f2fs.h       |  3 +++
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 203a1577942d..56c81c68ef71 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
 	struct inode *inode;
 	struct f2fs_inode_info *fi;
 	bool is_dir = (type == DIR_INODE);
-	unsigned long ino = 0;
+	unsigned long ino = 0, retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
 
 	trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
 				get_pages(sbi, is_dir ?
 				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
 retry:
-	if (unlikely(f2fs_cp_error(sbi))) {
+	if (unlikely(f2fs_cp_error(sbi) || (is_dir && !retry_count))) {
 		trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
 				get_pages(sbi, is_dir ?
 				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
@@ -1096,10 +1096,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
 
 		iput(inode);
 		/* We need to give cpu to another writers. */
-		if (ino == cur_ino)
-			cond_resched();
-		else
+		if (ino == cur_ino) {
+			retry_count--;
+			io_schedule_timeout(DEFAULT_IO_TIMEOUT);
+		} else {
+			retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
 			ino = cur_ino;
+		}
 	} else {
 		/*
 		 * We should submit bio, since it exists several
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index c9515c3c54fd..f40ef7b61965 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -577,6 +577,9 @@ enum {
 /* maximum retry quota flush count */
 #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT		8
 
+/* maximum retry sync dirty inodes */
+#define DEFAULT_RETRY_SYNC_DIR_COUNT	3000
+
 #define F2FS_LINK_MAX	0xffffffff	/* maximum link count per file */
 
 #define MAX_DIR_RA_PAGES	4	/* maximum ra pages of dir */
-- 
2.35.1.265.g69c8d7142f-goog
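
The retry accounting in v2 can be sketched outside the kernel as follows. This is a rough model under assumptions: the model_* names and the small budget of 3 are made up for illustration (the patch uses 3000), the sleep is left out, and the dirty list is replaced by an array holding the inode number seen at the head of the list on each round. It keeps the three properties visible in the diff: the budget is only enforced for directory inodes, it shrinks while the same inode stays at the head, and it is reset as soon as a different inode shows up.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RETRY_BUDGET 3UL	/* hypothetical; the patch uses 3000 */

/*
 * heads[i] is the (nonzero) inode number at the head of the dirty list
 * on round i; the list counts as drained after n rounds.
 * Returns 0 when drained, -EIO when the budget runs out on one inode.
 */
int model_sync_dirty(const unsigned long *heads, size_t n, bool is_dir)
{
	unsigned long ino = 0, retry_count = MODEL_RETRY_BUDGET;
	size_t i;

	for (i = 0; ; i++) {
		if (is_dir && !retry_count)
			return -EIO;	/* budget spent on one stuck inode */
		if (i == n)
			return 0;	/* dirty list drained */

		if (heads[i] == ino) {
			retry_count--;	/* same inode again: no progress */
		} else {
			retry_count = MODEL_RETRY_BUDGET;	/* progress: reset */
			ino = heads[i];
		}
	}
}

/* Self-check covering a moving list, a stuck dir list, a stuck file list. */
void model_sync_selfcheck(void)
{
	const unsigned long moving[] = { 5, 7, 9 };
	const unsigned long stuck[] = { 5, 5, 5, 5, 5, 5 };

	assert(model_sync_dirty(moving, 3, true) == 0);
	assert(model_sync_dirty(stuck, 6, true) == -EIO);
	assert(model_sync_dirty(stuck, 6, false) == 0);
}
```

In this model, v1 corresponds to dropping both the is_dir test and the reset in the else branch, which is one way the count could run out while the list was still making progress.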



* Re: [f2fs-dev] [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode
  2022-02-12 14:20 [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode Jaegeuk Kim
  2022-02-12 14:20 ` [PATCH 2/2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes Jaegeuk Kim
@ 2022-02-24  8:34 ` Chao Yu
  1 sibling, 0 replies; 6+ messages in thread
From: Chao Yu @ 2022-02-24  8:34 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

On 2022/2/12 22:20, Jaegeuk Kim wrote:
> This patch fixes xfstests/generic/475 failure.
> 
> [  293.680694] F2FS-fs (dm-1): May loss orphan inode, run fsck to fix.
> [  293.685358] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> [  293.691527] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> [  293.691764] sh (7615): drop_caches: 3
> [  293.691819] sh (7616): drop_caches: 3
> [  293.694017] Buffer I/O error on dev dm-1, logical block 1, async page read
> [  293.695659] sh (7618): drop_caches: 3
> [  293.696979] sh (7617): drop_caches: 3
> [  293.700290] sh (7623): drop_caches: 3
> [  293.708621] sh (7626): drop_caches: 3
> [  293.711386] sh (7628): drop_caches: 3
> [  293.711825] sh (7627): drop_caches: 3
> [  293.716738] sh (7630): drop_caches: 3
> [  293.719613] sh (7632): drop_caches: 3
> [  293.720971] sh (7633): drop_caches: 3
> [  293.727741] sh (7634): drop_caches: 3
> [  293.730783] sh (7636): drop_caches: 3
> [  293.732681] sh (7635): drop_caches: 3
> [  293.732988] sh (7637): drop_caches: 3
> [  293.738836] sh (7639): drop_caches: 3
> [  293.740568] sh (7641): drop_caches: 3
> [  293.743053] sh (7640): drop_caches: 3
> [  293.821889] ------------[ cut here ]------------
> [  293.824654] kernel BUG at fs/f2fs/node.c:3334!
> [  293.826226] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> [  293.828713] CPU: 0 PID: 7653 Comm: umount Tainted: G           OE     5.17.0-rc1-custom #1
> [  293.830946] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> [  293.832526] RIP: 0010:f2fs_destroy_node_manager+0x33f/0x350 [f2fs]
> [  293.833905] Code: e8 d6 3d f9 f9 48 8b 45 d0 65 48 2b 04 25 28 00 00 00 75 1a 48 81 c4 28 03 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b
> [  293.837783] RSP: 0018:ffffb04ec31e7a20 EFLAGS: 00010202
> [  293.839062] RAX: 0000000000000001 RBX: ffff9df947db2eb8 RCX: 0000000080aa0072
> [  293.840666] RDX: 0000000000000000 RSI: ffffe86c0432a140 RDI: ffffffffc0b72a21
> [  293.842261] RBP: ffffb04ec31e7d70 R08: ffff9df94ca85780 R09: 0000000080aa0072
> [  293.843909] R10: ffff9df94ca85700 R11: ffff9df94e1ccf58 R12: ffff9df947db2e00
> [  293.845594] R13: ffff9df947db2ed0 R14: ffff9df947db2eb8 R15: ffff9df947db2eb8
> [  293.847855] FS:  00007f5a97379800(0000) GS:ffff9dfa77c00000(0000) knlGS:0000000000000000
> [  293.850647] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  293.852940] CR2: 00007f5a97528730 CR3: 000000010bc76005 CR4: 0000000000370ef0
> [  293.854680] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  293.856423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  293.858380] Call Trace:
> [  293.859302]  <TASK>
> [  293.860311]  ? ttwu_do_wakeup+0x1c/0x170
> [  293.861800]  ? ttwu_do_activate+0x6d/0xb0
> [  293.863057]  ? _raw_spin_unlock_irqrestore+0x29/0x40
> [  293.864411]  ? try_to_wake_up+0x9d/0x5e0
> [  293.865618]  ? debug_smp_processor_id+0x17/0x20
> [  293.866934]  ? debug_smp_processor_id+0x17/0x20
> [  293.868223]  ? free_unref_page+0xbf/0x120
> [  293.869470]  ? __free_slab+0xcb/0x1c0
> [  293.870614]  ? preempt_count_add+0x7a/0xc0
> [  293.871811]  ? __slab_free+0xa0/0x2d0
> [  293.872918]  ? __wake_up_common_lock+0x8a/0xc0
> [  293.874186]  ? __slab_free+0xa0/0x2d0
> [  293.875305]  ? free_inode_nonrcu+0x20/0x20
> [  293.876466]  ? free_inode_nonrcu+0x20/0x20
> [  293.877650]  ? debug_smp_processor_id+0x17/0x20
> [  293.878949]  ? call_rcu+0x11a/0x240
> [  293.880060]  ? f2fs_destroy_stats+0x59/0x60 [f2fs]
> [  293.881437]  ? kfree+0x1fe/0x230
> [  293.882674]  f2fs_put_super+0x160/0x390 [f2fs]
> [  293.883978]  generic_shutdown_super+0x7a/0x120
> [  293.885274]  kill_block_super+0x27/0x50
> [  293.886496]  kill_f2fs_super+0x7f/0x100 [f2fs]
> [  293.887806]  deactivate_locked_super+0x35/0xa0
> [  293.889271]  deactivate_super+0x40/0x50
> [  293.890513]  cleanup_mnt+0x139/0x190
> [  293.891689]  __cleanup_mnt+0x12/0x20
> [  293.892850]  task_work_run+0x64/0xa0
> [  293.894035]  exit_to_user_mode_prepare+0x1b7/0x1c0
> [  293.895409]  syscall_exit_to_user_mode+0x27/0x50
> [  293.896872]  do_syscall_64+0x48/0xc0
> [  293.898090]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  293.899517] RIP: 0033:0x7f5a975cd25b
> 
> Fixes: 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()")
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Reviewed-by: Chao Yu <chao@kernel.org>

Thanks,


* Re: [f2fs-dev] [PATCH 2/2 v2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes
  2022-02-14 23:27   ` [PATCH 2/2 v2] " Jaegeuk Kim
@ 2022-02-25  2:07     ` Chao Yu
  2022-02-25 19:11       ` Jaegeuk Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Chao Yu @ 2022-02-25  2:07 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

On 2022/2/15 7:27, Jaegeuk Kim wrote:
> If one read IO is always failing, we can fall into an infinite loop in
> f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.
> 
> [  142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> ...
> [  382.887210]  submit_bio_noacct+0xdd/0x2a0
> [  382.887213]  submit_bio+0x80/0x110
> [  382.887223]  __submit_bio+0x4d/0x300 [f2fs]
> [  382.887282]  f2fs_submit_page_bio+0x125/0x200 [f2fs]
> [  382.887299]  __get_meta_page+0xc9/0x280 [f2fs]
> [  382.887315]  f2fs_get_meta_page+0x13/0x20 [f2fs]
> [  382.887331]  f2fs_get_node_info+0x317/0x3c0 [f2fs]
> [  382.887350]  f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
> [  382.887367]  f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
> [  382.887386]  f2fs_write_cache_pages+0x302/0x890 [f2fs]
> [  382.887405]  ? preempt_count_add+0x7a/0xc0
> [  382.887408]  f2fs_write_data_pages+0xfd/0x320 [f2fs]
> [  382.887425]  ? _raw_spin_unlock+0x1a/0x30
> [  382.887428]  do_writepages+0xd3/0x1d0
> [  382.887432]  filemap_fdatawrite_wbc+0x69/0x90
> [  382.887434]  filemap_fdatawrite+0x50/0x70
> [  382.887437]  f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
> [  382.887453]  f2fs_write_checkpoint+0x189/0x1640 [f2fs]
> [  382.887469]  ? schedule_timeout+0x114/0x150
> [  382.887471]  ? ttwu_do_activate+0x6d/0xb0
> [  382.887473]  ? preempt_count_add+0x7a/0xc0
> [  382.887476]  kill_f2fs_super+0xca/0x100 [f2fs]
> [  382.887491]  deactivate_locked_super+0x35/0xa0
> [  382.887494]  deactivate_super+0x40/0x50
> [  382.887497]  cleanup_mnt+0x139/0x190
> [  382.887499]  __cleanup_mnt+0x12/0x20
> [  382.887501]  task_work_run+0x64/0xa0
> [  382.887505]  exit_to_user_mode_prepare+0x1b7/0x1c0
> [  382.887508]  syscall_exit_to_user_mode+0x27/0x50
> [  382.887510]  do_syscall_64+0x48/0xc0
> [  382.887513]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>   Change log from v1:
>    - fix a regression to report EIO too early
> 
>   fs/f2fs/checkpoint.c | 13 ++++++++-----
>   fs/f2fs/f2fs.h       |  3 +++
>   2 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> index 203a1577942d..56c81c68ef71 100644
> --- a/fs/f2fs/checkpoint.c
> +++ b/fs/f2fs/checkpoint.c
> @@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
>   	struct inode *inode;
>   	struct f2fs_inode_info *fi;
>   	bool is_dir = (type == DIR_INODE);
> -	unsigned long ino = 0;
> +	unsigned long ino = 0, retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
>   
>   	trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
>   				get_pages(sbi, is_dir ?
>   				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
>   retry:
> -	if (unlikely(f2fs_cp_error(sbi))) {
> +	if (unlikely(f2fs_cp_error(sbi) || (is_dir && !retry_count))) {
>   		trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
>   				get_pages(sbi, is_dir ?
>   				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
> @@ -1096,10 +1096,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
>   
>   		iput(inode);
>   		/* We need to give cpu to another writers. */
> -		if (ino == cur_ino)
> -			cond_resched();
> -		else
> +		if (ino == cur_ino) {
> +			retry_count--;
> +			io_schedule_timeout(DEFAULT_IO_TIMEOUT);
> +		} else {
> +			retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
>   			ino = cur_ino;
> +		}
>   	} else {
>   		/*
>   		 * We should submit bio, since it exists several
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index c9515c3c54fd..f40ef7b61965 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -577,6 +577,9 @@ enum {
>   /* maximum retry quota flush count */
>   #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT		8
>   
> +/* maximum retry sync dirty inodes */
> +#define DEFAULT_RETRY_SYNC_DIR_COUNT	3000

3000 * 20ms/round = 60sec

How about just trying 5 or 10 sec?

Thanks,

> +
>   #define F2FS_LINK_MAX	0xffffffff	/* maximum link count per file */
>   
>   #define MAX_DIR_RA_PAGES	4	/* maximum ra pages of dir */
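
The arithmetic in the review comment can be checked with a few lines of C. This is only a sketch: the model_* names are hypothetical, and the 20 ms per round is taken from the comment above rather than from the headers.

```c
#include <assert.h>

#define MODEL_ROUND_MS 20U	/* per-round wait quoted in the review */

/* Worst-case wait, in milliseconds, for a given retry budget. */
unsigned int model_window_ms(unsigned int rounds)
{
	return rounds * MODEL_ROUND_MS;
}

/* Retry budget needed to cap the wait at window_ms milliseconds. */
unsigned int model_rounds_for_ms(unsigned int window_ms)
{
	return window_ms / MODEL_ROUND_MS;
}

/* Self-check of the figures discussed in the review. */
void model_budget_selfcheck(void)
{
	assert(model_window_ms(3000) == 60000);		/* 60 sec */
	assert(model_rounds_for_ms(5000) == 250);	/* 5 sec */
	assert(model_rounds_for_ms(10000) == 500);	/* 10 sec */
}
```

Under that assumption, the suggested 5 or 10 sec window would correspond to a budget of 250 or 500 rounds instead of 3000.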


* Re: [f2fs-dev] [PATCH 2/2 v2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes
  2022-02-25  2:07     ` [f2fs-dev] " Chao Yu
@ 2022-02-25 19:11       ` Jaegeuk Kim
  0 siblings, 0 replies; 6+ messages in thread
From: Jaegeuk Kim @ 2022-02-25 19:11 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 02/25, Chao Yu wrote:
> On 2022/2/15 7:27, Jaegeuk Kim wrote:
> > If one read IO is always failing, we can fall into an infinite loop in
> > f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.
> > 
> > [  142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> > ...
> > [  382.887210]  submit_bio_noacct+0xdd/0x2a0
> > [  382.887213]  submit_bio+0x80/0x110
> > [  382.887223]  __submit_bio+0x4d/0x300 [f2fs]
> > [  382.887282]  f2fs_submit_page_bio+0x125/0x200 [f2fs]
> > [  382.887299]  __get_meta_page+0xc9/0x280 [f2fs]
> > [  382.887315]  f2fs_get_meta_page+0x13/0x20 [f2fs]
> > [  382.887331]  f2fs_get_node_info+0x317/0x3c0 [f2fs]
> > [  382.887350]  f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
> > [  382.887367]  f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
> > [  382.887386]  f2fs_write_cache_pages+0x302/0x890 [f2fs]
> > [  382.887405]  ? preempt_count_add+0x7a/0xc0
> > [  382.887408]  f2fs_write_data_pages+0xfd/0x320 [f2fs]
> > [  382.887425]  ? _raw_spin_unlock+0x1a/0x30
> > [  382.887428]  do_writepages+0xd3/0x1d0
> > [  382.887432]  filemap_fdatawrite_wbc+0x69/0x90
> > [  382.887434]  filemap_fdatawrite+0x50/0x70
> > [  382.887437]  f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
> > [  382.887453]  f2fs_write_checkpoint+0x189/0x1640 [f2fs]
> > [  382.887469]  ? schedule_timeout+0x114/0x150
> > [  382.887471]  ? ttwu_do_activate+0x6d/0xb0
> > [  382.887473]  ? preempt_count_add+0x7a/0xc0
> > [  382.887476]  kill_f2fs_super+0xca/0x100 [f2fs]
> > [  382.887491]  deactivate_locked_super+0x35/0xa0
> > [  382.887494]  deactivate_super+0x40/0x50
> > [  382.887497]  cleanup_mnt+0x139/0x190
> > [  382.887499]  __cleanup_mnt+0x12/0x20
> > [  382.887501]  task_work_run+0x64/0xa0
> > [  382.887505]  exit_to_user_mode_prepare+0x1b7/0x1c0
> > [  382.887508]  syscall_exit_to_user_mode+0x27/0x50
> > [  382.887510]  do_syscall_64+0x48/0xc0
> > [  382.887513]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >   Change log from v1:
> >    - fix a regression to report EIO too early
> > 
> >   fs/f2fs/checkpoint.c | 13 ++++++++-----
> >   fs/f2fs/f2fs.h       |  3 +++
> >   2 files changed, 11 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > index 203a1577942d..56c81c68ef71 100644
> > --- a/fs/f2fs/checkpoint.c
> > +++ b/fs/f2fs/checkpoint.c
> > @@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
> >   	struct inode *inode;
> >   	struct f2fs_inode_info *fi;
> >   	bool is_dir = (type == DIR_INODE);
> > -	unsigned long ino = 0;
> > +	unsigned long ino = 0, retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
> >   	trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
> >   				get_pages(sbi, is_dir ?
> >   				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
> >   retry:
> > -	if (unlikely(f2fs_cp_error(sbi))) {
> > +	if (unlikely(f2fs_cp_error(sbi) || (is_dir && !retry_count))) {
> >   		trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
> >   				get_pages(sbi, is_dir ?
> >   				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
> > @@ -1096,10 +1096,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
> >   		iput(inode);
> >   		/* We need to give cpu to another writers. */
> > -		if (ino == cur_ino)
> > -			cond_resched();
> > -		else
> > +		if (ino == cur_ino) {
> > +			retry_count--;
> > +			io_schedule_timeout(DEFAULT_IO_TIMEOUT);
> > +		} else {
> > +			retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
> >   			ino = cur_ino;
> > +		}
> >   	} else {
> >   		/*
> >   		 * We should submit bio, since it exists several
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index c9515c3c54fd..f40ef7b61965 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -577,6 +577,9 @@ enum {
> >   /* maximum retry quota flush count */
> >   #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT		8
> > +/* maximum retry sync dirty inodes */
> > +#define DEFAULT_RETRY_SYNC_DIR_COUNT	3000
> 
> 3000 * 20ms/round = 60sec
> 
> How about just trying 5 or 10 sec?

It seems this causes another EIO issue in a different test. Let me drop this patch for now.

> 
> Thanks,
> 
> > +
> >   #define F2FS_LINK_MAX	0xffffffff	/* maximum link count per file */
> >   #define MAX_DIR_RA_PAGES	4	/* maximum ra pages of dir */


