* [PATCH v2 0/6] fix error flag covered by journal recovery
@ 2023-02-10  3:20 Ye Bin
  2023-02-10  3:20 ` [PATCH v2 1/6] jbd2: introduce callback for recovery journal Ye Bin
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Ye Bin @ 2023-02-10  3:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

From: Ye Bin <yebin10@huawei.com>

Diff v2 vs v1:
Move the calls to 'j_replay_prepare_callback' and 'j_replay_end_callback'
from ext4_load_journal() to jbd2_journal_recover().

While running fault injection tests, we hit the following issue:
EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
EXT4-fs (dm-5): recovery complete
EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro

EXT4-fs (dm-5): recovery complete
EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro

Without running a file system check, the file system is clean on the second
mount. In theory, the kernel never clears the fs error flag. In
errors=remount-ro mode the last super block is committed directly rather
than through the journal, so the super block in the journal is not up to
date. During journal recovery, the up-to-date super block is overwritten by
the journal data. If every super block write after journal recovery fails,
the file system error flag is lost, and "fsck -a" then does not perform a
deep repair of the file system.
To solve this, we need extra handling when the super block is recovered
from the journal.
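
To make the failure mode concrete, here is a minimal user-space sketch, not
kernel code. It only models the ordering described above: a direct super
block write sets the error flag, and a later replay copies the stale
journalled block over it. The names toy_sb, TOY_ERROR_FS, toy_replay() and
toy_flag_lost_after_replay() are hypothetical and exist only for this
illustration; the real structures and replay path are far more involved.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for EXT4_ERROR_FS; the value is illustrative. */
#define TOY_ERROR_FS 0x0002

/* Hypothetical, heavily reduced model of a super block. */
struct toy_sb {
	uint16_t s_state;
	uint32_t s_error_count;
};

/* Replay copies the journalled block over the on-disk block wholesale. */
static void toy_replay(struct toy_sb *disk, const struct toy_sb *journalled)
{
	memcpy(disk, journalled, sizeof(*disk));
}

/* Returns 1 if the error flag set by a direct write is gone after replay. */
int toy_flag_lost_after_replay(void)
{
	struct toy_sb journalled = { .s_state = 0, .s_error_count = 0 };
	struct toy_sb disk = journalled;

	/* Error handling writes the super block directly, not via the journal. */
	disk.s_state |= TOY_ERROR_FS;
	disk.s_error_count = 1;

	/* Next mount: replay overwrites the newer block with the stale copy. */
	toy_replay(&disk, &journalled);

	return (disk.s_state & TOY_ERROR_FS) == 0;
}
```

In this model toy_flag_lost_after_replay() reports that the flag is gone,
which corresponds to the second mount in the log above showing no warning.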

Ye Bin (6):
  jbd2: introduce callback for recovery journal
  ext4: introudce helper for jounral recover handle
  jbd2: do extra handle when do journal recovery
  ext4: remove backup for super block when recovery journal
  ext4: fix super block checksum error
  ext4: make sure fs error flag setted before clear journal error

 fs/ext4/ext4_jbd2.c  | 66 ++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/ext4_jbd2.h  |  2 ++
 fs/ext4/super.c      | 18 ++++--------
 fs/jbd2/recovery.c   | 27 ++++++++++++++++++
 include/linux/jbd2.h | 11 ++++++++
 5 files changed, 112 insertions(+), 12 deletions(-)

-- 
2.31.1



* [PATCH v2 1/6] jbd2: introduce callback for recovery journal
  2023-02-10  3:20 [PATCH v2 0/6] fix error flag covered by journal recovery Ye Bin
@ 2023-02-10  3:20 ` Ye Bin
  2023-02-10  3:20 ` [PATCH v2 2/6] ext4: introudce helper for jounral recover handle Ye Bin
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ye Bin @ 2023-02-10  3:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

From: Ye Bin <yebin10@huawei.com>

The ext4 super block may be submitted through the journal, but it may
also be submitted directly, for example during error handling. As a
result, the super block in the journal may not be up to date, and some
extra handling is needed during journal recovery.

Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 include/linux/jbd2.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 5962072a4b19..ab0e1a435a50 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1308,6 +1308,17 @@ struct journal_s
 				    struct buffer_head *bh,
 				    enum passtype pass, int off,
 				    tid_t expected_commit_id);
+	/*
+	 * The ext4 super block may be submitted through the journal, but it
+	 * may also be submitted directly during error handling, so the super
+	 * block in the journal may not be up to date. Some extra handling is
+	 * therefore needed during journal recovery.
+	 */
+	void *j_replay_private_data;
+	int (*j_replay_prepare_callback)(struct journal_s *journal);
+	int (*j_replay_callback)(struct journal_s *journal,
+				  struct buffer_head *bh);
+	void (*j_replay_end_callback)(struct journal_s *journal);
 };
 
 #define jbd2_might_wait_for_commit(j) \
-- 
2.31.1



* [PATCH v2 2/6] ext4: introudce helper for jounral recover handle
  2023-02-10  3:20 [PATCH v2 0/6] fix error flag covered by journal recovery Ye Bin
  2023-02-10  3:20 ` [PATCH v2 1/6] jbd2: introduce callback for recovery journal Ye Bin
@ 2023-02-10  3:20 ` Ye Bin
  2023-02-10  3:20 ` [PATCH v2 3/6] jbd2: do extra handle when do journal recovery Ye Bin
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ye Bin @ 2023-02-10  3:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

From: Ye Bin <yebin10@huawei.com>

For now, ext4 only needs to handle the super block during journal
recovery.

Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 fs/ext4/ext4_jbd2.c | 65 +++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/ext4_jbd2.h |  2 ++
 fs/ext4/super.c     |  1 +
 3 files changed, 68 insertions(+)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 77f318ec8abb..af03035606e1 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -395,3 +395,68 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
 	}
 	return err;
 }
+
+static void ext4_replay_end_callback(struct journal_s *journal)
+{
+	kfree(journal->j_replay_private_data);
+	journal->j_replay_private_data = NULL;
+	journal->j_replay_callback = NULL;
+	journal->j_replay_end_callback = NULL;
+}
+
+static int ext4_replay_callback(struct journal_s *journal,
+				struct buffer_head *bh)
+{
+	struct super_block *sb = journal->j_private;
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct ext4_super_block *es = sbi->s_es;
+	struct ext4_super_block *nes;
+	unsigned long offset;
+
+	if (likely(sbi->s_sbh != bh))
+		return 0;
+
+	offset = (void*)es - (void*)sbi->s_sbh->b_data;
+	nes = (struct ext4_super_block*)(bh->b_data + offset);
+	/*
+	 * If the journalled super block already has the error flag set, keep
+	 * its error information: that is the errors=continue case, where
+	 * error handling submits the super block through the journal.
+	 */
+	if (le16_to_cpu(nes->s_state) & EXT4_ERROR_FS)
+		return 0;
+
+	memcpy(((char *)es) + EXT4_S_ERR_START,
+	       journal->j_replay_private_data, EXT4_S_ERR_LEN);
+	if (sbi->s_mount_state & EXT4_ERROR_FS)
+		es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
+
+	return 0;
+}
+
+static int ext4_replay_prepare_callback(struct journal_s *journal)
+{
+	struct super_block *sb = journal->j_private;
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	char *private;
+	struct ext4_super_block *es = sbi->s_es;
+
+	if (!(sbi->s_mount_state & EXT4_ERROR_FS))
+		return 0;
+
+	private = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
+	if (!private)
+		return -ENOMEM;
+	memcpy(private, ((char *)es) + EXT4_S_ERR_START, EXT4_S_ERR_LEN);
+
+	journal->j_replay_private_data = private;
+	journal->j_replay_callback = ext4_replay_callback;
+	journal->j_replay_end_callback = ext4_replay_end_callback;
+
+	return 0;
+}
+
+void ext4_init_replay(journal_t *journal)
+{
+	journal->j_replay_prepare_callback = ext4_replay_prepare_callback;
+}
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 0c77697d5e90..8dcc7ef5028c 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -513,4 +513,6 @@ static inline int ext4_should_dioread_nolock(struct inode *inode)
 	return 1;
 }
 
+void ext4_init_replay(journal_t *journal);
+
 #endif	/* _EXT4_JBD2_H */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index dc3907dff13a..ea0fea04907c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5677,6 +5677,7 @@ static void ext4_init_journal_params(struct super_block *sb, journal_t *journal)
 	journal->j_commit_interval = sbi->s_commit_interval;
 	journal->j_min_batch_time = sbi->s_min_batch_time;
 	journal->j_max_batch_time = sbi->s_max_batch_time;
+	ext4_init_replay(journal);
 	ext4_fc_init(sb, journal);
 
 	write_lock(&journal->j_state_lock);
-- 
2.31.1



* [PATCH v2 3/6] jbd2: do extra handle when do journal recovery
  2023-02-10  3:20 [PATCH v2 0/6] fix error flag covered by journal recovery Ye Bin
  2023-02-10  3:20 ` [PATCH v2 1/6] jbd2: introduce callback for recovery journal Ye Bin
  2023-02-10  3:20 ` [PATCH v2 2/6] ext4: introudce helper for jounral recover handle Ye Bin
@ 2023-02-10  3:20 ` Ye Bin
  2023-02-10  3:20 ` [PATCH v2 4/6] ext4: remove backup for super block when recovery journal Ye Bin
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ye Bin @ 2023-02-10  3:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

From: Ye Bin <yebin10@huawei.com>

The ext4 super block in the journal may not be up to date. When the
file system has errors, we need to set the error information while
recovering the super block.

Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 fs/jbd2/recovery.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 8286a9ec122f..83b1a9689984 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -309,6 +309,15 @@ int jbd2_journal_recover(journal_t *journal)
 		return 0;
 	}
 
+	if (journal->j_replay_prepare_callback) {
+		err = journal->j_replay_prepare_callback(journal);
+		if (err) {
+			jbd2_debug(1, "JBD2: failed to prepare replay %d",
+				   err);
+			return err;
+		}
+	}
+
 	err = do_one_pass(journal, &info, PASS_SCAN);
 	if (!err)
 		err = do_one_pass(journal, &info, PASS_REVOKE);
@@ -335,6 +344,10 @@ int jbd2_journal_recover(journal_t *journal)
 		if (!err)
 			err = err2;
 	}
+
+	if (journal->j_replay_end_callback)
+		journal->j_replay_end_callback(journal);
+
 	return err;
 }
 
@@ -687,6 +700,20 @@ static int do_one_pass(journal_t *journal,
 						*((__be32 *)nbh->b_data) =
 						cpu_to_be32(JBD2_MAGIC_NUMBER);
 					}
+					if (unlikely(journal->j_replay_callback)) {
+						err = journal->j_replay_callback(
+							journal, nbh);
+						if (err) {
+							printk(KERN_ERR
+							       "JBD2: replay "
+							       "call back "
+							       "failed.\n");
+							unlock_buffer(nbh);
+							brelse(obh);
+							brelse(nbh);
+							goto failed;
+						}
+					}
 
 					BUFFER_TRACE(nbh, "marking dirty");
 					set_buffer_uptodate(nbh);
-- 
2.31.1



* [PATCH v2 4/6] ext4: remove backup for super block when recovery journal
  2023-02-10  3:20 [PATCH v2 0/6] fix error flag covered by journal recovery Ye Bin
                   ` (2 preceding siblings ...)
  2023-02-10  3:20 ` [PATCH v2 3/6] jbd2: do extra handle when do journal recovery Ye Bin
@ 2023-02-10  3:20 ` Ye Bin
  2023-02-10  3:20 ` [PATCH v2 5/6] ext4: fix super block checksum error Ye Bin
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ye Bin @ 2023-02-10  3:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

From: Ye Bin <yebin10@huawei.com>

The previous commit "jbd2: do extra handle when do journal recovery"
already does the extra handling for the super block, so there is no
need to do it in ext4_load_journal(). Remove it.

Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 fs/ext4/super.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ea0fea04907c..d86ee5af2db9 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5916,17 +5916,8 @@ static int ext4_load_journal(struct super_block *sb,
 
 	if (!ext4_has_feature_journal_needs_recovery(sb))
 		err = jbd2_journal_wipe(journal, !really_read_only);
-	if (!err) {
-		char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
-		if (save)
-			memcpy(save, ((char *) es) +
-			       EXT4_S_ERR_START, EXT4_S_ERR_LEN);
+	if (!err)
 		err = jbd2_journal_load(journal);
-		if (save)
-			memcpy(((char *) es) + EXT4_S_ERR_START,
-			       save, EXT4_S_ERR_LEN);
-		kfree(save);
-	}
 
 	if (err) {
 		ext4_msg(sb, KERN_ERR, "error loading journal");
-- 
2.31.1



* [PATCH v2 5/6] ext4: fix super block checksum error
  2023-02-10  3:20 [PATCH v2 0/6] fix error flag covered by journal recovery Ye Bin
                   ` (3 preceding siblings ...)
  2023-02-10  3:20 ` [PATCH v2 4/6] ext4: remove backup for super block when recovery journal Ye Bin
@ 2023-02-10  3:20 ` Ye Bin
  2023-02-10  3:20 ` [PATCH v2 6/6] ext4: make sure fs error flag setted before clear journal error Ye Bin
  2023-02-10 11:56 ` [PATCH v2 0/6] fix error flag covered by journal recovery Jan Kara
  6 siblings, 0 replies; 10+ messages in thread
From: Ye Bin @ 2023-02-10  3:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

From: Ye Bin <yebin10@huawei.com>

Commit "ext4: fix error flag covered by journal recovery" updates the
error record during journal recovery. The super block checksum must be
recalculated after the error record is updated, otherwise the checksum
will no longer match the data.

Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 fs/ext4/ext4_jbd2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index af03035606e1..ffcb0d58d407 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -430,6 +430,7 @@ static int ext4_replay_callback(struct journal_s *journal,
 	       journal->j_replay_private_data, EXT4_S_ERR_LEN);
 	if (sbi->s_mount_state & EXT4_ERROR_FS)
 		es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
+	ext4_superblock_csum_set(sb);
 
 	return 0;
 }
-- 
2.31.1



* [PATCH v2 6/6] ext4: make sure fs error flag setted before clear journal error
  2023-02-10  3:20 [PATCH v2 0/6] fix error flag covered by journal recovery Ye Bin
                   ` (4 preceding siblings ...)
  2023-02-10  3:20 ` [PATCH v2 5/6] ext4: fix super block checksum error Ye Bin
@ 2023-02-10  3:20 ` Ye Bin
  2023-02-10 11:56 ` [PATCH v2 0/6] fix error flag covered by journal recovery Jan Kara
  6 siblings, 0 replies; 10+ messages in thread
From: Ye Bin @ 2023-02-10  3:20 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

From: Ye Bin <yebin10@huawei.com>

Currently, the journal error number may be cleared even though
ext4_commit_super() failed. The error flag can then be lost, and fsck
will skip the deep check of the file system.

Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 fs/ext4/super.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d86ee5af2db9..b458af1cbf5c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6143,11 +6143,13 @@ static int ext4_clear_journal_err(struct super_block *sb,
 		errstr = ext4_decode_error(sb, j_errno, nbuf);
 		ext4_warning(sb, "Filesystem error recorded "
 			     "from previous mount: %s", errstr);
-		ext4_warning(sb, "Marking fs in need of filesystem check.");
 
 		EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS;
 		es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
-		ext4_commit_super(sb);
+		j_errno = ext4_commit_super(sb);
+		if (j_errno)
+			return j_errno;
+		ext4_warning(sb, "Marked fs in need of filesystem check.");
 
 		jbd2_journal_clear_err(journal);
 		jbd2_journal_update_sb_errno(journal);
-- 
2.31.1



* Re: [PATCH v2 0/6] fix error flag covered by journal recovery
  2023-02-10  3:20 [PATCH v2 0/6] fix error flag covered by journal recovery Ye Bin
                   ` (5 preceding siblings ...)
  2023-02-10  3:20 ` [PATCH v2 6/6] ext4: make sure fs error flag setted before clear journal error Ye Bin
@ 2023-02-10 11:56 ` Jan Kara
  2023-02-10 12:47   ` Zhang Yi
  2023-02-15  1:14   ` yebin (H)
  6 siblings, 2 replies; 10+ messages in thread
From: Jan Kara @ 2023-02-10 11:56 UTC (permalink / raw)
  To: Ye Bin; +Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, jack, Ye Bin

Hello!

On Fri 10-02-23 11:20:38, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
> 
> Diff v2 vs v1:
> Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
> ext4_load_journal() to jbd2_journal_recover().
> 
> When do fault injection test, got issue as follows:
> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
> 
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
> 
> Without do file system check, file system is clean when do second mount.
> Theoretically, the kernel will not clear fs error flag. In errors=remount-ro
> mode the last super block is commit directly. So super block in journal is
> not uptodate. When do jounral recovery, the uptodate super block will be
> covered by jounral data. If super block submit all failed after recover
> journal, then file system error flag is lost. When do "fsck -a" couldn't
> repair file system deeply.
> To solve above issue we need to do extra handle when do super block journal
> recovery.

Thanks for the patches. Looking through them, I think this is a bit of
overengineering for the problem at hand. The only thing that is really
worth preserving so that it is not lost after journal replay is the
error information. So in ext4_load_journal() I would just save that if
EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
after journal replay. Sure, if the superblock write during journal replay
succeeds but the write restoring the error information fails, we will lose
the error information, but that is so unlikely in practice that I don't
think it is really worth complicating the code for it. Also the only
downside is we will lose the information that there is some error in the
filesystem - we'll soon find that out again anyway :).
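
The save-and-restore idea described above can be sketched in user space as
follows. This is only an illustration under stated assumptions, not the
actual ext4_load_journal() change: toy_es, TOY_ERROR_FS, TOY_ERR_LEN,
toy_replay_fn, toy_replay_stale() and toy_load_journal() are hypothetical
names, and the error area is modelled as a small byte array rather than the
real EXT4_S_ERR_START/EXT4_S_ERR_LEN region.

```c
#include <stdint.h>
#include <string.h>

#define TOY_ERROR_FS 0x0002	/* hypothetical stand-in for EXT4_ERROR_FS */
#define TOY_ERR_LEN  16		/* hypothetical size of the error area */

/* Hypothetical, reduced model of the in-memory super block. */
struct toy_es {
	uint16_t s_state;
	unsigned char s_err_info[TOY_ERR_LEN];
};

/* Hypothetical replay hook: may overwrite *es, returns 0 on success. */
typedef int (*toy_replay_fn)(struct toy_es *es, const void *arg);

/* Toy replay that copies a stale journalled super block over *es. */
static int toy_replay_stale(struct toy_es *es, const void *arg)
{
	memcpy(es, arg, sizeof(*es));
	return 0;
}

/* Save the error info only if the flag is set, replay, then restore. */
int toy_load_journal(struct toy_es *es, toy_replay_fn replay, const void *arg)
{
	unsigned char save[TOY_ERR_LEN];
	int had_error = (es->s_state & TOY_ERROR_FS) != 0;
	int err;

	if (had_error)
		memcpy(save, es->s_err_info, TOY_ERR_LEN);

	err = replay(es, arg);

	if (had_error) {
		memcpy(es->s_err_info, save, TOY_ERR_LEN);
		es->s_state |= TOY_ERROR_FS;
	}
	return err;
}

/* Returns 1 if both the flag and the error area survive a stale replay. */
int toy_error_info_survives(void)
{
	struct toy_es stale;
	struct toy_es cur;

	memset(&stale, 0, sizeof(stale));
	cur = stale;
	cur.s_state |= TOY_ERROR_FS;
	memset(cur.s_err_info, 0xab, TOY_ERR_LEN);

	toy_load_journal(&cur, toy_replay_stale, &stale);

	return (cur.s_state & TOY_ERROR_FS) && cur.s_err_info[0] == 0xab;
}
```

The sketch restores only the in-memory copy; as noted above, the write that
puts the restored information back on disk can still fail, and that case is
deliberately left unhandled here.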

								Honza

> 
> Ye Bin (6):
>   jbd2: introduce callback for recovery journal
>   ext4: introudce helper for jounral recover handle
>   jbd2: do extra handle when do journal recovery
>   ext4: remove backup for super block when recovery journal
>   ext4: fix super block checksum error
>   ext4: make sure fs error flag setted before clear journal error
> 
>  fs/ext4/ext4_jbd2.c  | 66 ++++++++++++++++++++++++++++++++++++++++++++
>  fs/ext4/ext4_jbd2.h  |  2 ++
>  fs/ext4/super.c      | 18 ++++--------
>  fs/jbd2/recovery.c   | 27 ++++++++++++++++++
>  include/linux/jbd2.h | 11 ++++++++
>  5 files changed, 112 insertions(+), 12 deletions(-)
> 
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 0/6] fix error flag covered by journal recovery
  2023-02-10 11:56 ` [PATCH v2 0/6] fix error flag covered by journal recovery Jan Kara
@ 2023-02-10 12:47   ` Zhang Yi
  2023-02-15  1:14   ` yebin (H)
  1 sibling, 0 replies; 10+ messages in thread
From: Zhang Yi @ 2023-02-10 12:47 UTC (permalink / raw)
  To: Jan Kara, Ye Bin; +Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, Ye Bin

On 2023/2/10 19:56, Jan Kara wrote:
> Hello!
> 
> On Fri 10-02-23 11:20:38, Ye Bin wrote:
>> From: Ye Bin <yebin10@huawei.com>
>>
>> Diff v2 vs v1:
>> Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
>> ext4_load_journal() to jbd2_journal_recover().
>>
>> When do fault injection test, got issue as follows:
>> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
>> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> Without do file system check, file system is clean when do second mount.
>> Theoretically, the kernel will not clear fs error flag. In errors=remount-ro
>> mode the last super block is commit directly. So super block in journal is
>> not uptodate. When do jounral recovery, the uptodate super block will be
>> covered by jounral data. If super block submit all failed after recover
>> journal, then file system error flag is lost. When do "fsck -a" couldn't
>> repair file system deeply.
>> To solve above issue we need to do extra handle when do super block journal
>> recovery.
> 
> Thanks for the patches. Looking through the patches, I think this is a bit
> of an overengineering for the problem at hand. The only thing that is
> really worth preserving so that it is not lost after journal replay is the
> error information. So in ext4_load_journal() I would just save that if
> EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
> after journal replay. Sure if the superblock write during journal replay
> succeeds but the write restoring the error information fails, we will loose
> the error information but that is so unlikely in practice that I don't
> think it is really worth complicating the code for it. Also the only
> downside is we will loose the information there is some error in the
> filesystem - we'll soon find that out again anyway :).
> 

I think so too. We should also add an error message if we fail to restore
the error information, so that we know what happened.

Thanks,
Yi.


* Re: [PATCH v2 0/6] fix error flag covered by journal recovery
  2023-02-10 11:56 ` [PATCH v2 0/6] fix error flag covered by journal recovery Jan Kara
  2023-02-10 12:47   ` Zhang Yi
@ 2023-02-15  1:14   ` yebin (H)
  1 sibling, 0 replies; 10+ messages in thread
From: yebin (H) @ 2023-02-15  1:14 UTC (permalink / raw)
  To: Jan Kara, Ye Bin; +Cc: tytso, adilger.kernel, linux-ext4, linux-kernel



On 2023/2/10 19:56, Jan Kara wrote:
> Hello!
>
> On Fri 10-02-23 11:20:38, Ye Bin wrote:
>> From: Ye Bin <yebin10@huawei.com>
>>
>> Diff v2 vs v1:
>> Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
>> ext4_load_journal() to jbd2_journal_recover().
>>
>> When do fault injection test, got issue as follows:
>> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
>> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> Without do file system check, file system is clean when do second mount.
>> Theoretically, the kernel will not clear fs error flag. In errors=remount-ro
>> mode the last super block is commit directly. So super block in journal is
>> not uptodate. When do jounral recovery, the uptodate super block will be
>> covered by jounral data. If super block submit all failed after recover
>> journal, then file system error flag is lost. When do "fsck -a" couldn't
>> repair file system deeply.
>> To solve above issue we need to do extra handle when do super block journal
>> recovery.
> Thanks for the patches. Looking through the patches, I think this is a bit
> of an overengineering for the problem at hand. The only thing that is
> really worth preserving so that it is not lost after journal replay is the
> error information. So in ext4_load_journal() I would just save that if
> EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
> after journal replay. Sure if the superblock write during journal replay
> succeeds but the write restoring the error information fails, we will loose
> the error information but that is so unlikely in practice that I don't
> think it is really worth complicating the code for it. Also the only
> downside is we will loose the information there is some error in the
> filesystem - we'll soon find that out again anyway :).
>
> 								Honza
Yes, this solution seems a little cumbersome, but it is the only one I can
think of that solves the loss of error information.
I re-analyzed the issue scenario. The error information was not recorded
in the last super block written to the journal, so the error flag is not
updated when the super block is submitted later. However, while processing
the orphan list, the file system errors were recorded in memory and the
orphan list was cleared directly, resulting in file system inconsistencies.
To solve the above issue, I sent a V3 patch.
>> Ye Bin (6):
>>    jbd2: introduce callback for recovery journal
>>    ext4: introudce helper for jounral recover handle
>>    jbd2: do extra handle when do journal recovery
>>    ext4: remove backup for super block when recovery journal
>>    ext4: fix super block checksum error
>>    ext4: make sure fs error flag setted before clear journal error
>>
>>   fs/ext4/ext4_jbd2.c  | 66 ++++++++++++++++++++++++++++++++++++++++++++
>>   fs/ext4/ext4_jbd2.h  |  2 ++
>>   fs/ext4/super.c      | 18 ++++--------
>>   fs/jbd2/recovery.c   | 27 ++++++++++++++++++
>>   include/linux/jbd2.h | 11 ++++++++
>>   5 files changed, 112 insertions(+), 12 deletions(-)
>>
>> -- 
>> 2.31.1
>>



end of thread, other threads:[~2023-02-15  1:15 UTC | newest]
