All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] jbd2: avoid transaction reuse after reformatting
       [not found] <tencent_2341B065211F204FA07C3ADDA1AE07706405@qq.com>
@ 2020-09-10 12:06 ` 常凤楠
  2020-09-10 18:45 ` Andreas Dilger
  2020-09-11 10:06 ` Jan Kara
  2 siblings, 0 replies; 10+ messages in thread
From: 常凤楠 @ 2020-09-10 12:06 UTC (permalink / raw)
  To: adilger; +Cc: darrick.wong, jack, linux-ext4, tytso

When format ext4 with lazy_journal_init=1, the previous transaction is still on disk,
it is possible that the previous transaction will be used again during jbd2 recovery.
Because the seed is changed, the CRC check will fail.

Signed-off-by: Fengnan Chang <changfengnan@hikvision.com>
---
 fs/jbd2/recovery.c | 60 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 54 insertions(+), 6 deletions(-)

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index a4967b27ffb6..8a6bc322a06d 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -33,6 +33,8 @@ struct recovery_info
 intnr_replays;
 intnr_revokes;
 intnr_revoke_hits;
+unsigned long  ri_commit_block;
+__be64  last_trans_commit_time;
 };

 enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY}; @@ -412,7 +414,27 @@ static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
 else
 return tag->t_checksum == cpu_to_be16(csum32);  }
+/*
+ * We check the commit time and compare it with the commit time of
+ * the previous transaction, if it's smaller than previous,
+ * We think it's not belong to same journal.
+ */
+static bool is_same_journal(journal_t *journal, struct buffer_head *bh,
+unsigned long blocknr, __u64 last_commit_sec) {
+unsigned long commit_block = blocknr + count_tags(journal, bh) + 1;
+struct buffer_head *nbh;
+struct commit_header *cbh;
+__u64commit_sec;
+int err = jread(&nbh, journal, commit_block);

+if (err)
+return true;
+
+cbh = (struct commit_header *)nbh->b_data;
+commit_sec = be64_to_cpu(cbh->h_commit_sec);
+
+return commit_sec >= last_commit_sec;
+}
 static int do_one_pass(journal_t *journal,
 struct recovery_info *info, enum passtype pass)  { @@ -514,18 +536,29 @@ static int do_one_pass(journal_t *journal,
 switch(blocktype) {
 case JBD2_DESCRIPTOR_BLOCK:
 /* Verify checksum first */
+if (pass == PASS_SCAN)
+info->ri_commit_block = 0;
+
 if (jbd2_journal_has_csum_v2or3(journal))
 descr_csum_size =
 sizeof(struct jbd2_journal_block_tail);
 if (descr_csum_size > 0 &&
     !jbd2_descriptor_block_csum_verify(journal,
        bh->b_data)) {
-printk(KERN_ERR "JBD2: Invalid checksum "
-       "recovering block %lu in log\n",
+if (is_same_journal(journal, bh, next_log_block-1, info->last_trans_commit_time)) {
+printk(KERN_ERR "JBD2: Invalid checksum recovering block %lu in
+log\n",
        next_log_block);
-err = -EFSBADCRC;
-brelse(bh);
-goto failed;
+err = -EFSBADCRC;
+brelse(bh);
+goto failed;
+} else {
+/*it's not belong to same journal, just end this recovery with success*/
+jbd_debug(1, "JBD2: Invalid checksum found in block %lu in log, but not same journal %d\n",
+       next_log_block, next_commit_ID);
+err = 0;
+brelse(bh);
+goto done;
+}
 }

 /* If it is a valid descriptor block, replay it @@ -688,6 +721,17 @@ static int do_one_pass(journal_t *journal,
  * are present verify them in PASS_SCAN; else not
  * much to do other than move on to the next sequence
  * number. */
+if (pass == PASS_SCAN) {
+struct commit_header *cbh =
+(struct commit_header *)bh->b_data;
+if (info->ri_commit_block) {
+jbd_debug(1, "invalid commit block found in %lu, stop here.\n", next_log_block);
+brelse(bh);
+goto done;
+}
+info->ri_commit_block = next_log_block;
+info->last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
+}
 if (pass == PASS_SCAN &&
     jbd2_has_feature_checksum(journal)) {
 int chksum_err, chksum_seen;
@@ -761,7 +805,11 @@ static int do_one_pass(journal_t *journal,
 brelse(bh);
 continue;
 }
-
+if (pass != PASS_SCAN && info->ri_commit_block) {
+jbd_debug(1, "invalid revoke block found in %lu, stop here.\n", next_log_block);
+brelse(bh);
+goto done;
+}
 err = scan_revoke_records(journal, bh,
   next_commit_ID, info);
 brelse(bh);
--
2.27.0.windows.1



________________________________

CONFIDENTIALITY NOTICE: This electronic message is intended to be viewed only by the individual or entity to whom it is addressed. It may contain information that is privileged, confidential and exempt from disclosure under applicable law. Any dissemination, distribution or copying of this communication is strictly prohibited without our prior permission. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering the message to the intended recipient, or if you have received this communication in error, please notify us immediately by return e-mail and delete the original message and any copies of it from your computer system. For further information about Hikvision company. please see our website at www.hikvision.com<http://www.hikvision.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] jbd2: avoid transaction reuse after reformatting
       [not found] <tencent_2341B065211F204FA07C3ADDA1AE07706405@qq.com>
  2020-09-10 12:06 ` [PATCH] jbd2: avoid transaction reuse after reformatting 常凤楠
@ 2020-09-10 18:45 ` Andreas Dilger
  2020-09-11 10:06 ` Jan Kara
  2 siblings, 0 replies; 10+ messages in thread
From: Andreas Dilger @ 2020-09-10 18:45 UTC (permalink / raw)
  To: changfengnan; +Cc: darrick.wong, jack, linux-ext4, tytso, Fengnan Chang

[-- Attachment #1: Type: text/plain, Size: 4741 bytes --]

On Sep 10, 2020, at 5:55 AM, changfengnan <changfengnan@qq.com> wrote:
> 
> From: Fengnan Chang <changfengnan@hikvision.com>
> 
> When format ext4 with lazy_journal_init=1, the previous transaction is
> still on disk, it is possible that the previous transaction will be
> used again during jbd2 recovery.Because the seed is changed, the CRC
> check will fail.
> 
> Signed-off-by: Fengnan Chang <changfengnan@hikvision.com>

This version of the patch looks OK, with one trivial indentation problem
after the "jbd_debug(1, "JBD2: Invalid checksum" line that could be fixed.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/jbd2/recovery.c | 60 +++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 54 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
> index a4967b27ffb6..8a6bc322a06d 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -33,6 +33,8 @@ struct recovery_info
> 	int		nr_replays;
> 	int		nr_revokes;
> 	int		nr_revoke_hits;
> +	unsigned long  ri_commit_block;
> +	__be64  last_trans_commit_time;
> };
> 
> enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY};
> @@ -412,7 +414,27 @@ static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
> 	else
> 		return tag->t_checksum == cpu_to_be16(csum32);
> }
> +/*
> + * We check the commit time and compare it with the commit time of
> + * the previous transaction, if it's smaller than previous,
> + * We think it's not belong to same journal.
> + */
> +static bool is_same_journal(journal_t *journal, struct buffer_head *bh, unsigned long blocknr, __u64 last_commit_sec)
> +{
> +	unsigned long commit_block = blocknr + count_tags(journal, bh) + 1;
> +	struct buffer_head *nbh;
> +	struct commit_header *cbh;
> +	__u64	commit_sec;
> +	int err = jread(&nbh, journal, commit_block);
> 
> +	if (err)
> +		return true;
> +
> +	cbh = (struct commit_header *)nbh->b_data;
> +	commit_sec = be64_to_cpu(cbh->h_commit_sec);
> +
> +	return commit_sec >= last_commit_sec;
> +}
> static int do_one_pass(journal_t *journal,
> 			struct recovery_info *info, enum passtype pass)
> {
> @@ -514,18 +536,29 @@ static int do_one_pass(journal_t *journal,
> 		switch(blocktype) {
> 		case JBD2_DESCRIPTOR_BLOCK:
> 			/* Verify checksum first */
> +			if (pass == PASS_SCAN)
> +				info->ri_commit_block = 0;
> +
> 			if (jbd2_journal_has_csum_v2or3(journal))
> 				descr_csum_size =
> 					sizeof(struct jbd2_journal_block_tail);
> 			if (descr_csum_size > 0 &&
> 			    !jbd2_descriptor_block_csum_verify(journal,
> 							       bh->b_data)) {
> -				printk(KERN_ERR "JBD2: Invalid checksum "
> -				       "recovering block %lu in log\n",
> +				if (is_same_journal(journal, bh, next_log_block-1, info->last_trans_commit_time)) {
> +					printk(KERN_ERR "JBD2: Invalid checksum recovering block %lu in log\n",
> 				       next_log_block);
> -				err = -EFSBADCRC;
> -				brelse(bh);
> -				goto failed;
> +					err = -EFSBADCRC;
> +					brelse(bh);
> +					goto failed;
> +				} else {
> +					/*it's not belong to same journal, just end this recovery with success*/
> +					jbd_debug(1, "JBD2: Invalid checksum found in block %lu in log, but not same journal %d\n",
> +				       next_log_block, next_commit_ID);

The continued line should be indented one more tab.

> +					err = 0;
> +					brelse(bh);
> +					goto done;
> +				}
> 			}
> 
> 			/* If it is a valid descriptor block, replay it
> @@ -688,6 +721,17 @@ static int do_one_pass(journal_t *journal,
> 			 * are present verify them in PASS_SCAN; else not
> 			 * much to do other than move on to the next sequence
> 			 * number. */
> +			if (pass == PASS_SCAN) {
> +				struct commit_header *cbh =
> +					(struct commit_header *)bh->b_data;
> +				if (info->ri_commit_block) {
> +					jbd_debug(1, "invalid commit block found in %lu, stop here.\n", next_log_block);
> +					brelse(bh);
> +					goto done;
> +				}
> +				info->ri_commit_block = next_log_block;
> +				info->last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
> +			}
> 			if (pass == PASS_SCAN &&
> 			    jbd2_has_feature_checksum(journal)) {
> 				int chksum_err, chksum_seen;
> @@ -761,7 +805,11 @@ static int do_one_pass(journal_t *journal,
> 				brelse(bh);
> 				continue;
> 			}
> -
> +			if (pass != PASS_SCAN && info->ri_commit_block) {
> +				jbd_debug(1, "invalid revoke block found in %lu, stop here.\n", next_log_block);
> +				brelse(bh);
> +				goto done;
> +			}
> 			err = scan_revoke_records(journal, bh,
> 						  next_commit_ID, info);
> 			brelse(bh);
> --
> 2.27.0.windows.1
> 
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] jbd2: avoid transaction reuse after reformatting
       [not found] <tencent_2341B065211F204FA07C3ADDA1AE07706405@qq.com>
  2020-09-10 12:06 ` [PATCH] jbd2: avoid transaction reuse after reformatting 常凤楠
  2020-09-10 18:45 ` Andreas Dilger
@ 2020-09-11 10:06 ` Jan Kara
  2020-09-14 11:50   ` 答复: " 常凤楠
  2 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2020-09-11 10:06 UTC (permalink / raw)
  To: changfengnan
  Cc: adilger, darrick.wong, jack, linux-ext4, tytso, Fengnan Chang

On Thu 10-09-20 19:55:21, changfengnan wrote:
> From: Fengnan Chang <changfengnan@hikvision.com>
> 
> When format ext4 with lazy_journal_init=1, the previous transaction is
> still on disk, it is possible that the previous transaction will be
> used again during jbd2 recovery.Because the seed is changed, the CRC
> check will fail.
> 
> Signed-off-by: Fengnan Chang <changfengnan@hikvision.com>

Thanks for the patch! I have a few functional comments below, also on a
coding style side your patch contains lines a lot longer than 80
characters. Although that is not a strict requirement anymore it is still
prefered that lines are wrapped to fit 80 columns to make code more easily
readable...

> ---
>  fs/jbd2/recovery.c | 60 +++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 54 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
> index a4967b27ffb6..8a6bc322a06d 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -33,6 +33,8 @@ struct recovery_info
>  	int		nr_replays;
>  	int		nr_revokes;
>  	int		nr_revoke_hits;
> +	unsigned long  ri_commit_block;
> +	__be64  last_trans_commit_time;

Is there a need for these in recovery_info? They are needed only in
PASS_SCAN AFAICT so they can be as well just local variables in
do_one_pass()...

>  };
>  
>  enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY};
> @@ -412,7 +414,27 @@ static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
>  	else
>  		return tag->t_checksum == cpu_to_be16(csum32);
>  }
> +/*
> + * We check the commit time and compare it with the commit time of
> + * the previous transaction, if it's smaller than previous,
> + * We think it's not belong to same journal.
> + */
> +static bool is_same_journal(journal_t *journal, struct buffer_head *bh, unsigned long blocknr, __u64 last_commit_sec)
> +{
> +	unsigned long commit_block = blocknr + count_tags(journal, bh) + 1;
> +	struct buffer_head *nbh;
> +	struct commit_header *cbh;
> +	__u64	commit_sec;
> +	int err = jread(&nbh, journal, commit_block);

So there are two flaws in this logic. First, the computed commit_block may
be beyond end of the journal. You need to wrap the computed value using
wrap() macro. Second, you seem to assume that each transaction has only a
single descriptor block. That is definitely not true. There can be
arbitrary number (well, up to a maximum transaction size which depends on
the total size of the journal) of descriptor blocks in a transaction before
a commit block. So to locate commit block, you need to read the block,
check its type (it can be another descriptor block, revoke block, or commit
block) and then decide accordingly what to do. And you have to be rather
careful when going through the journal like this because you cannot check
checksums and blocks can be containing arbitrary garbage. Probably the
cleanest way how to implement this would be to use the scanning loop in
do_one_pass(), just indicate in some state variable that "we are now
searching for next commit block to determine whether a checksum failure is
due to lazy init or storage error". In this state we will continue
PASS_SCAN, just not checking checksums anymore until we find a commit block
or reach end of journal and then based on what we found decide whether to
report checksum error or just silently end PASS_SCAN pass...

>  
> +	if (err)
> +		return true;
> +
> +	cbh = (struct commit_header *)nbh->b_data;
> +	commit_sec = be64_to_cpu(cbh->h_commit_sec);
> +
> +	return commit_sec >= last_commit_sec;
> +}
>  static int do_one_pass(journal_t *journal,
>  			struct recovery_info *info, enum passtype pass)
>  {
> @@ -514,18 +536,29 @@ static int do_one_pass(journal_t *journal,
>  		switch(blocktype) {
>  		case JBD2_DESCRIPTOR_BLOCK:
>  			/* Verify checksum first */
> +			if (pass == PASS_SCAN)
> +				info->ri_commit_block = 0;
> +
>  			if (jbd2_journal_has_csum_v2or3(journal))
>  				descr_csum_size =
>  					sizeof(struct jbd2_journal_block_tail);
>  			if (descr_csum_size > 0 &&
>  			    !jbd2_descriptor_block_csum_verify(journal,
>  							       bh->b_data)) {
> -				printk(KERN_ERR "JBD2: Invalid checksum "
> -				       "recovering block %lu in log\n",
> +				if (is_same_journal(journal, bh, next_log_block-1, info->last_trans_commit_time)) {
> +					printk(KERN_ERR "JBD2: Invalid checksum recovering block %lu in log\n",
>  				       next_log_block);
> -				err = -EFSBADCRC;
> -				brelse(bh);
> -				goto failed;
> +					err = -EFSBADCRC;
> +					brelse(bh);
> +					goto failed;
> +				} else {
> +					/*it's not belong to same journal, just end this recovery with success*/
> +					jbd_debug(1, "JBD2: Invalid checksum found in block %lu in log, but not same journal %d\n",
> +				       next_log_block, next_commit_ID);
> +					err = 0;
> +					brelse(bh);
> +					goto done;
> +				}
>  			}
>  
>  			/* If it is a valid descriptor block, replay it
> @@ -688,6 +721,17 @@ static int do_one_pass(journal_t *journal,
>  			 * are present verify them in PASS_SCAN; else not
>  			 * much to do other than move on to the next sequence
>  			 * number. */
> +			if (pass == PASS_SCAN) {
> +				struct commit_header *cbh =
> +					(struct commit_header *)bh->b_data;
> +				if (info->ri_commit_block) {
> +					jbd_debug(1, "invalid commit block found in %lu, stop here.\n", next_log_block);
> +					brelse(bh);
> +					goto done;
> +				}
> +				info->ri_commit_block = next_log_block;
> +				info->last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
> +			}

I don't think you need to update ri_commit_block and last_trans_commit_time
if we are not checking checksums (lazy journal init cannot happen without
journal checksumming being used). So I'd just move this into the

                        if (pass == PASS_SCAN &&
                            jbd2_has_feature_checksum(journal)) {

branch below. Also we could validate the commit block the same way (using
timestamp) as the descriptor block. Then you will not need the
'ri_commit_block' logic and things will be nicely uniform...

>  			if (pass == PASS_SCAN &&
>  			    jbd2_has_feature_checksum(journal)) {
>  				int chksum_err, chksum_seen;
> @@ -761,7 +805,11 @@ static int do_one_pass(journal_t *journal,
>  				brelse(bh);
>  				continue;
>  			}
> -
> +			if (pass != PASS_SCAN && info->ri_commit_block) {

I don't understand this. We get here only if "pass == PASS_REVOKE" so "pass
!= PASS_SCAN" is guaranteed to be true. But also info->ri_commit_block will
likely be != 0 from the previous PASS_SCAN pass. So this will always skip
revoke block handling?

> +				jbd_debug(1, "invalid revoke block found in %lu, stop here.\n", next_log_block);
> +				brelse(bh);
> +				goto done;
> +			}
>  			err = scan_revoke_records(journal, bh,
>  						  next_commit_ID, info);
>  			brelse(bh);

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
  2020-09-11 10:06 ` Jan Kara
@ 2020-09-14 11:50   ` 常凤楠
  2020-09-17 10:44     ` Jan Kara
  0 siblings, 1 reply; 10+ messages in thread
From: 常凤楠 @ 2020-09-14 11:50 UTC (permalink / raw)
  To: Jan Kara, changfengnan; +Cc: adilger, darrick.wong, jack, linux-ext4, tytso

Thank you for your review comments, there are indeed some problems.
Let's discuss the function problems first, and I will modify the code style later.
1. About add two members in struct recovery_info, it can be just local variables in do_one_pass().
2. About is_same_journal(), I indded I did not consider the situation of serverl
descriptor blocks in a transaction,I have moved the work of is_same_journal() outside to do it as you suggest.
3. You suggest add jbd2_has_feature_checksum() judge in PASS_SCAN, But jbd2_has_feature_checksum() only check checksum
VER1, not include VER2 and VER3. And I don't understand why lazy journal init cannot happen without journal checksumming being used,
so i didn't add jbd2_has_feature_checksum() for now.
4. The pass == PASS_REVOKE case, this is a low-level error, my previous test did not cover it. I have changed it now.

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index a4967b27ffb6..f7702e14077f 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,
 struct recovery_info *info, enum passtype pass)
 {
 unsigned intfirst_commit_ID, next_commit_ID;
-unsigned longnext_log_block;
+unsigned longnext_log_block, ri_commit_block = 0;
 interr, success = 0;
 journal_superblock_t *sb;
 journal_header_t *tmp;
@@ -428,7 +428,9 @@ static int do_one_pass(journal_t *journal,
 __u32crc32_sum = ~0; /* Transactional Checksums */
 intdescr_csum_size = 0;
 intblock_error = 0;
-
+boolneed_check_commit_time = false;
+__be64last_trans_commit_time;
+
 /*
  * First thing is to establish what we expect to find in the log
  * (in terms of transaction IDs), and where (in terms of log
@@ -514,18 +516,18 @@ static int do_one_pass(journal_t *journal,
 switch(blocktype) {
 case JBD2_DESCRIPTOR_BLOCK:
 /* Verify checksum first */
+if(pass == PASS_SCAN)
+ri_commit_block = 0;
+
 if (jbd2_journal_has_csum_v2or3(journal))
 descr_csum_size =
 sizeof(struct jbd2_journal_block_tail);
 if (descr_csum_size > 0 &&
     !jbd2_descriptor_block_csum_verify(journal,
        bh->b_data)) {
-printk(KERN_ERR "JBD2: Invalid checksum "
-       "recovering block %lu in log\n",
-       next_log_block);
-err = -EFSBADCRC;
-brelse(bh);
-goto failed;
+need_check_commit_time = true;
+jbd_debug(1, "invalid descriptor block found in %lu, continue recovery first.\n",next_log_block);
+
 }

 /* If it is a valid descriptor block, replay it
@@ -535,6 +537,7 @@ static int do_one_pass(journal_t *journal,
 if (pass != PASS_REPLAY) {
 if (pass == PASS_SCAN &&
     jbd2_has_feature_checksum(journal) &&
+    !need_check_commit_time &&
     !info->end_transaction) {
 if (calc_chksums(journal, bh,
 &next_log_block,
@@ -688,6 +691,36 @@ static int do_one_pass(journal_t *journal,
  * are present verify them in PASS_SCAN; else not
  * much to do other than move on to the next sequence
  * number. */
+if(pass == PASS_SCAN) {
+struct commit_header *cbh =
+(struct commit_header *)bh->b_data;
+if(need_check_commit_time) {
+__be64 commit_time = be64_to_cpu(cbh->h_commit_sec);
+if(commit_time >= last_trans_commit_time) {
+printk(KERN_ERR "JBD2: Invalid checksum found in log, %d\n",
+next_commit_ID);
+err = -EFSBADCRC;
+brelse(bh);
+goto failed;
+}
+else
+{
+/*it's not belong to same journal, just end this recovery with success*/
+jbd_debug(1, "JBD2: Invalid checksum found in block in log, but not same journal %d\n",
+next_commit_ID);
+err = 0;
+brelse(bh);
+goto done;
+}
+}
+if(ri_commit_block) {
+jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
+brelse(bh);
+goto done;
+}
+ri_commit_block = next_log_block;
+last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
+}
 if (pass == PASS_SCAN &&
     jbd2_has_feature_checksum(journal)) {
 int chksum_err, chksum_seen;
@@ -755,6 +788,12 @@ static int do_one_pass(journal_t *journal,
 continue;

 case JBD2_REVOKE_BLOCK:
+if (pass == PASS_SCAN &&
+ri_commit_block) {
+jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
+brelse(bh);
+goto done;
+}
 /* If we aren't in the REVOKE pass, then we can
  * just skip over this block. */
 if (pass != PASS_REVOKE) {

-----邮件原件-----
发件人: Jan Kara <jack@suse.cz>
发送时间: 2020年9月11日 18:06
收件人: changfengnan <changfengnan@qq.com>
抄送: adilger@dilger.ca; darrick.wong@oracle.com; jack@suse.com; linux-ext4@vger.kernel.org; tytso@mit.edu; 常凤楠 <changfengnan@hikvision.com>
主题: Re: [PATCH] jbd2: avoid transaction reuse after reformatting

On Thu 10-09-20 19:55:21, changfengnan wrote:
> From: Fengnan Chang <changfengnan@hikvision.com>
>
> When format ext4 with lazy_journal_init=1, the previous transaction is
> still on disk, it is possible that the previous transaction will be
> used again during jbd2 recovery.Because the seed is changed, the CRC
> check will fail.
>
> Signed-off-by: Fengnan Chang <changfengnan@hikvision.com>

Thanks for the patch! I have a few functional comments below, also on a coding style side your patch contains lines a lot longer than 80 characters. Although that is not a strict requirement anymore it is still prefered that lines are wrapped to fit 80 columns to make code more easily readable...

> ---
>  fs/jbd2/recovery.c | 60
> +++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 54 insertions(+), 6 deletions(-)
>
> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index
> a4967b27ffb6..8a6bc322a06d 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -33,6 +33,8 @@ struct recovery_info
>  intnr_replays;
>  intnr_revokes;
>  intnr_revoke_hits;
> +unsigned long  ri_commit_block;
> +__be64  last_trans_commit_time;

Is there a need for these in recovery_info? They are needed only in PASS_SCAN AFAICT so they can be as well just local variables in do_one_pass()...

>  };
>
>  enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY}; @@ -412,7
> +414,27 @@ static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
>  else
>  return tag->t_checksum == cpu_to_be16(csum32);  }
> +/*
> + * We check the commit time and compare it with the commit time of
> + * the previous transaction, if it's smaller than previous,
> + * We think it's not belong to same journal.
> + */
> +static bool is_same_journal(journal_t *journal, struct buffer_head
> +*bh, unsigned long blocknr, __u64 last_commit_sec) {
> +unsigned long commit_block = blocknr + count_tags(journal, bh) + 1;
> +struct buffer_head *nbh;
> +struct commit_header *cbh;
> +__u64commit_sec;
> +int err = jread(&nbh, journal, commit_block);

So there are two flaws in this logic. First, the computed commit_block may be beyond end of the journal. You need to wrap the computed value using
wrap() macro. Second, you seem to assume that each transaction has only a single descriptor block. That is definitely not true. There can be arbitrary number (well, up to a maximum transaction size which depends on the total size of the journal) of descriptor blocks in a transaction before a commit block. So to locate commit block, you need to read the block, check its type (it can be another descriptor block, revoke block, or commit
block) and then decide accordingly what to do. And you have to be rather careful when going through the journal like this because you cannot check checksums and blocks can be containing arbitrary garbage. Probably the cleanest way how to implement this would be to use the scanning loop in do_one_pass(), just indicate in some state variable that "we are now searching for next commit block to determine whether a checksum failure is due to lazy init or storage error". In this state we will continue PASS_SCAN, just not checking checksums anymore until we find a commit block or reach end of journal and then based on what we found decide whether to report checksum error or just silently end PASS_SCAN pass...

>
> +if (err)
> +return true;
> +
> +cbh = (struct commit_header *)nbh->b_data;
> +commit_sec = be64_to_cpu(cbh->h_commit_sec);
> +
> +return commit_sec >= last_commit_sec; }
>  static int do_one_pass(journal_t *journal,
>  struct recovery_info *info, enum passtype pass)  { @@ -514,18
> +536,29 @@ static int do_one_pass(journal_t *journal,
>  switch(blocktype) {
>  case JBD2_DESCRIPTOR_BLOCK:
>  /* Verify checksum first */
> +if (pass == PASS_SCAN)
> +info->ri_commit_block = 0;
> +
>  if (jbd2_journal_has_csum_v2or3(journal))
>  descr_csum_size =
>  sizeof(struct jbd2_journal_block_tail);
>  if (descr_csum_size > 0 &&
>      !jbd2_descriptor_block_csum_verify(journal,
>         bh->b_data)) {
> -printk(KERN_ERR "JBD2: Invalid checksum "
> -       "recovering block %lu in log\n",
> +if (is_same_journal(journal, bh, next_log_block-1, info->last_trans_commit_time)) {
> +printk(KERN_ERR "JBD2: Invalid checksum recovering block %lu in
> +log\n",
>         next_log_block);
> -err = -EFSBADCRC;
> -brelse(bh);
> -goto failed;
> +err = -EFSBADCRC;
> +brelse(bh);
> +goto failed;
> +} else {
> +/*it's not belong to same journal, just end this recovery with success*/
> +jbd_debug(1, "JBD2: Invalid checksum found in block %lu in log, but not same journal %d\n",
> +       next_log_block, next_commit_ID);
> +err = 0;
> +brelse(bh);
> +goto done;
> +}
>  }
>
>  /* If it is a valid descriptor block, replay it @@ -688,6 +721,17
> @@ static int do_one_pass(journal_t *journal,
>   * are present verify them in PASS_SCAN; else not
>   * much to do other than move on to the next sequence
>   * number. */
> +if (pass == PASS_SCAN) {
> +struct commit_header *cbh =
> +(struct commit_header *)bh->b_data;
> +if (info->ri_commit_block) {
> +jbd_debug(1, "invalid commit block found in %lu, stop here.\n", next_log_block);
> +brelse(bh);
> +goto done;
> +}
> +info->ri_commit_block = next_log_block;
> +info->last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
> +}

I don't think you need to update ri_commit_block and last_trans_commit_time if we are not checking checksums (lazy journal init cannot happen without journal checksumming being used). So I'd just move this into the

                        if (pass == PASS_SCAN &&
                            jbd2_has_feature_checksum(journal)) {

branch below. Also we could validate the commit block the same way (using
timestamp) as the descriptor block. Then you will not need the 'ri_commit_block' logic and things will be nicely uniform...

>  if (pass == PASS_SCAN &&
>      jbd2_has_feature_checksum(journal)) {
>  int chksum_err, chksum_seen;
> @@ -761,7 +805,11 @@ static int do_one_pass(journal_t *journal,
>  brelse(bh);
>  continue;
>  }
> -
> +if (pass != PASS_SCAN && info->ri_commit_block) {

I don't understand this. We get here only if "pass == PASS_REVOKE" so "pass != PASS_SCAN" is guaranteed to be true. But also info->ri_commit_block will likely be != 0 from the previous PASS_SCAN pass. So this will always skip revoke block handling?

> +jbd_debug(1, "invalid revoke block found in %lu, stop here.\n", next_log_block);
> +brelse(bh);
> +goto done;
> +}
>  err = scan_revoke_records(journal, bh,
>    next_commit_ID, info);
>  brelse(bh);

Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR

________________________________

CONFIDENTIALITY NOTICE: This electronic message is intended to be viewed only by the individual or entity to whom it is addressed. It may contain information that is privileged, confidential and exempt from disclosure under applicable law. Any dissemination, distribution or copying of this communication is strictly prohibited without our prior permission. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering the message to the intended recipient, or if you have received this communication in error, please notify us immediately by return e-mail and delete the original message and any copies of it from your computer system. For further information about Hikvision company. please see our website at www.hikvision.com<http://www.hikvision.com>

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
  2020-09-14 11:50   ` 答复: " 常凤楠
@ 2020-09-17 10:44     ` Jan Kara
  2020-09-18  1:49       ` 答复: " 常凤楠
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2020-09-17 10:44 UTC (permalink / raw)
  To: 常凤楠
  Cc: Jan Kara, changfengnan, adilger, darrick.wong, jack, linux-ext4, tytso

On Mon 14-09-20 11:50:04, 常凤楠 wrote:
> Thank you for your review comments, there are indeed some problems.
> Let's discuss the function problems first, and I will modify the code style later.
> 1. About add two members in struct recovery_info, it can be just local variables in do_one_pass().
> 2. About is_same_journal(), I indded I did not consider the situation of serverl
> descriptor blocks in a transaction,I have moved the work of is_same_journal() outside to do it as you suggest.
> 3. You suggest add jbd2_has_feature_checksum() judge in PASS_SCAN, But jbd2_has_feature_checksum() only check checksum
> VER1, not include VER2 and VER3. And I don't understand why lazy journal init cannot happen without journal checksumming being used,
> so i didn't add jbd2_has_feature_checksum() for now.
> 4. The pass == PASS_REVOKE case, this is a low-level error, my previous test did not cover it. I have changed it now.

Please, send the patch as an attachment if your mailer is mangling
whitespace like this. The patch is very hard to read in this form. Thanks!

								Honza

> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
> index a4967b27ffb6..f7702e14077f 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,
>  struct recovery_info *info, enum passtype pass)
>  {
>  unsigned intfirst_commit_ID, next_commit_ID;
> -unsigned longnext_log_block;
> +unsigned longnext_log_block, ri_commit_block = 0;
>  interr, success = 0;
>  journal_superblock_t *sb;
>  journal_header_t *tmp;
> @@ -428,7 +428,9 @@ static int do_one_pass(journal_t *journal,
>  __u32crc32_sum = ~0; /* Transactional Checksums */
>  intdescr_csum_size = 0;
>  intblock_error = 0;
> -
> +boolneed_check_commit_time = false;
> +__be64last_trans_commit_time;
> +
>  /*
>   * First thing is to establish what we expect to find in the log
>   * (in terms of transaction IDs), and where (in terms of log
> @@ -514,18 +516,18 @@ static int do_one_pass(journal_t *journal,
>  switch(blocktype) {
>  case JBD2_DESCRIPTOR_BLOCK:
>  /* Verify checksum first */
> +if(pass == PASS_SCAN)
> +ri_commit_block = 0;
> +
>  if (jbd2_journal_has_csum_v2or3(journal))
>  descr_csum_size =
>  sizeof(struct jbd2_journal_block_tail);
>  if (descr_csum_size > 0 &&
>      !jbd2_descriptor_block_csum_verify(journal,
>         bh->b_data)) {
> -printk(KERN_ERR "JBD2: Invalid checksum "
> -       "recovering block %lu in log\n",
> -       next_log_block);
> -err = -EFSBADCRC;
> -brelse(bh);
> -goto failed;
> +need_check_commit_time = true;
> +jbd_debug(1, "invalid descriptor block found in %lu, continue recovery first.\n",next_log_block);
> +
>  }
> 
>  /* If it is a valid descriptor block, replay it
> @@ -535,6 +537,7 @@ static int do_one_pass(journal_t *journal,
>  if (pass != PASS_REPLAY) {
>  if (pass == PASS_SCAN &&
>      jbd2_has_feature_checksum(journal) &&
> +    !need_check_commit_time &&
>      !info->end_transaction) {
>  if (calc_chksums(journal, bh,
>  &next_log_block,
> @@ -688,6 +691,36 @@ static int do_one_pass(journal_t *journal,
>   * are present verify them in PASS_SCAN; else not
>   * much to do other than move on to the next sequence
>   * number. */
> +if(pass == PASS_SCAN) {
> +struct commit_header *cbh =
> +(struct commit_header *)bh->b_data;
> +if(need_check_commit_time) {
> +__be64 commit_time = be64_to_cpu(cbh->h_commit_sec);
> +if(commit_time >= last_trans_commit_time) {
> +printk(KERN_ERR "JBD2: Invalid checksum found in log, %d\n",
> +next_commit_ID);
> +err = -EFSBADCRC;
> +brelse(bh);
> +goto failed;
> +}
> +else
> +{
> +/*it's not belong to same journal, just end this recovery with success*/
> +jbd_debug(1, "JBD2: Invalid checksum found in block in log, but not same journal %d\n",
> +next_commit_ID);
> +err = 0;
> +brelse(bh);
> +goto done;
> +}
> +}
> +if(ri_commit_block) {
> +jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
> +brelse(bh);
> +goto done;
> +}
> +ri_commit_block = next_log_block;
> +last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
> +}
>  if (pass == PASS_SCAN &&
>      jbd2_has_feature_checksum(journal)) {
>  int chksum_err, chksum_seen;
> @@ -755,6 +788,12 @@ static int do_one_pass(journal_t *journal,
>  continue;
> 
>  case JBD2_REVOKE_BLOCK:
> +if (pass == PASS_SCAN &&
> +ri_commit_block) {
> +jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
> +brelse(bh);
> +goto done;
> +}
>  /* If we aren't in the REVOKE pass, then we can
>   * just skip over this block. */
>  if (pass != PASS_REVOKE) {
> 
> -----邮件原件-----
> 发件人: Jan Kara <jack@suse.cz>
> 发送时间: 2020年9月11日 18:06
> 收件人: changfengnan <changfengnan@qq.com>
> 抄送: adilger@dilger.ca; darrick.wong@oracle.com; jack@suse.com; linux-ext4@vger.kernel.org; tytso@mit.edu; 常凤楠 <changfengnan@hikvision.com>
> 主题: Re: [PATCH] jbd2: avoid transaction reuse after reformatting
> 
> On Thu 10-09-20 19:55:21, changfengnan wrote:
> > From: Fengnan Chang <changfengnan@hikvision.com>
> >
> > When format ext4 with lazy_journal_init=1, the previous transaction is
> > still on disk, it is possible that the previous transaction will be
> > used again during jbd2 recovery.Because the seed is changed, the CRC
> > check will fail.
> >
> > Signed-off-by: Fengnan Chang <changfengnan@hikvision.com>
> 
> Thanks for the patch! I have a few functional comments below, also on a coding style side your patch contains lines a lot longer than 80 characters. Although that is not a strict requirement anymore it is still prefered that lines are wrapped to fit 80 columns to make code more easily readable...
> 
> > ---
> >  fs/jbd2/recovery.c | 60
> > +++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 54 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index
> > a4967b27ffb6..8a6bc322a06d 100644
> > --- a/fs/jbd2/recovery.c
> > +++ b/fs/jbd2/recovery.c
> > @@ -33,6 +33,8 @@ struct recovery_info
> >  intnr_replays;
> >  intnr_revokes;
> >  intnr_revoke_hits;
> > +unsigned long  ri_commit_block;
> > +__be64  last_trans_commit_time;
> 
> Is there a need for these in recovery_info? They are needed only in PASS_SCAN AFAICT so they can be as well just local variables in do_one_pass()...
> 
> >  };
> >
> >  enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY}; @@ -412,7
> > +414,27 @@ static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
> >  else
> >  return tag->t_checksum == cpu_to_be16(csum32);  }
> > +/*
> > + * We check the commit time and compare it with the commit time of
> > + * the previous transaction, if it's smaller than previous,
> > + * We think it's not belong to same journal.
> > + */
> > +static bool is_same_journal(journal_t *journal, struct buffer_head
> > +*bh, unsigned long blocknr, __u64 last_commit_sec) {
> > +unsigned long commit_block = blocknr + count_tags(journal, bh) + 1;
> > +struct buffer_head *nbh;
> > +struct commit_header *cbh;
> > +__u64commit_sec;
> > +int err = jread(&nbh, journal, commit_block);
> 
> So there are two flaws in this logic. First, the computed commit_block may be beyond end of the journal. You need to wrap the computed value using
> wrap() macro. Second, you seem to assume that each transaction has only a single descriptor block. That is definitely not true. There can be arbitrary number (well, up to a maximum transaction size which depends on the total size of the journal) of descriptor blocks in a transaction before a commit block. So to locate commit block, you need to read the block, check its type (it can be another descriptor block, revoke block, or commit
> block) and then decide accordingly what to do. And you have to be rather careful when going through the journal like this because you cannot check checksums and blocks can be containing arbitrary garbage. Probably the cleanest way how to implement this would be to use the scanning loop in do_one_pass(), just indicate in some state variable that "we are now searching for next commit block to determine whether a checksum failure is due to lazy init or storage error". In this state we will continue PASS_SCAN, just not checking checksums anymore until we find a commit block or reach end of journal and then based on what we found decide whether to report checksum error or just silently end PASS_SCAN pass...
> 
> >
> > +if (err)
> > +return true;
> > +
> > +cbh = (struct commit_header *)nbh->b_data;
> > +commit_sec = be64_to_cpu(cbh->h_commit_sec);
> > +
> > +return commit_sec >= last_commit_sec; }
> >  static int do_one_pass(journal_t *journal,
> >  struct recovery_info *info, enum passtype pass)  { @@ -514,18
> > +536,29 @@ static int do_one_pass(journal_t *journal,
> >  switch(blocktype) {
> >  case JBD2_DESCRIPTOR_BLOCK:
> >  /* Verify checksum first */
> > +if (pass == PASS_SCAN)
> > +info->ri_commit_block = 0;
> > +
> >  if (jbd2_journal_has_csum_v2or3(journal))
> >  descr_csum_size =
> >  sizeof(struct jbd2_journal_block_tail);
> >  if (descr_csum_size > 0 &&
> >      !jbd2_descriptor_block_csum_verify(journal,
> >         bh->b_data)) {
> > -printk(KERN_ERR "JBD2: Invalid checksum "
> > -       "recovering block %lu in log\n",
> > +if (is_same_journal(journal, bh, next_log_block-1, info->last_trans_commit_time)) {
> > +printk(KERN_ERR "JBD2: Invalid checksum recovering block %lu in
> > +log\n",
> >         next_log_block);
> > -err = -EFSBADCRC;
> > -brelse(bh);
> > -goto failed;
> > +err = -EFSBADCRC;
> > +brelse(bh);
> > +goto failed;
> > +} else {
> > +/*it's not belong to same journal, just end this recovery with success*/
> > +jbd_debug(1, "JBD2: Invalid checksum found in block %lu in log, but not same journal %d\n",
> > +       next_log_block, next_commit_ID);
> > +err = 0;
> > +brelse(bh);
> > +goto done;
> > +}
> >  }
> >
> >  /* If it is a valid descriptor block, replay it @@ -688,6 +721,17
> > @@ static int do_one_pass(journal_t *journal,
> >   * are present verify them in PASS_SCAN; else not
> >   * much to do other than move on to the next sequence
> >   * number. */
> > +if (pass == PASS_SCAN) {
> > +struct commit_header *cbh =
> > +(struct commit_header *)bh->b_data;
> > +if (info->ri_commit_block) {
> > +jbd_debug(1, "invalid commit block found in %lu, stop here.\n", next_log_block);
> > +brelse(bh);
> > +goto done;
> > +}
> > +info->ri_commit_block = next_log_block;
> > +info->last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
> > +}
> 
> I don't think you need to update ri_commit_block and last_trans_commit_time if we are not checking checksums (lazy journal init cannot happen without journal checksumming being used). So I'd just move this into the
> 
>                         if (pass == PASS_SCAN &&
>                             jbd2_has_feature_checksum(journal)) {
> 
> branch below. Also we could validate the commit block the same way (using
> timestamp) as the descriptor block. Then you will not need the 'ri_commit_block' logic and things will be nicely uniform...
> 
> >  if (pass == PASS_SCAN &&
> >      jbd2_has_feature_checksum(journal)) {
> >  int chksum_err, chksum_seen;
> > @@ -761,7 +805,11 @@ static int do_one_pass(journal_t *journal,
> >  brelse(bh);
> >  continue;
> >  }
> > -
> > +if (pass != PASS_SCAN && info->ri_commit_block) {
> 
> I don't understand this. We get here only if "pass == PASS_REVOKE" so "pass != PASS_SCAN" is guaranteed to be true. But also info->ri_commit_block will likely be != 0 from the previous PASS_SCAN pass. So this will always skip revoke block handling?
> 
> > +jbd_debug(1, "invalid revoke block found in %lu, stop here.\n", next_log_block);
> > +brelse(bh);
> > +goto done;
> > +}
> >  err = scan_revoke_records(journal, bh,
> >    next_commit_ID, info);
> >  brelse(bh);
> 
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> 
> ________________________________
> 
> CONFIDENTIALITY NOTICE: This electronic message is intended to be viewed only by the individual or entity to whom it is addressed. It may contain information that is privileged, confidential and exempt from disclosure under applicable law. Any dissemination, distribution or copying of this communication is strictly prohibited without our prior permission. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering the message to the intended recipient, or if you have received this communication in error, please notify us immediately by return e-mail and delete the original message and any copies of it from your computer system. For further information about Hikvision company. please see our website at www.hikvision.com<http://www.hikvision.com>
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* 答复: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
  2020-09-17 10:44     ` Jan Kara
@ 2020-09-18  1:49       ` 常凤楠
  2020-09-18 13:02         ` Jan Kara
  0 siblings, 1 reply; 10+ messages in thread
From: 常凤楠 @ 2020-09-18  1:49 UTC (permalink / raw)
  To: Jan Kara; +Cc: changfengnan, adilger, darrick.wong, jack, linux-ext4, tytso

[-- Attachment #1: Type: text/plain, Size: 14237 bytes --]

Sorry about my mailer, the patch is in the attachment.


-----邮件原件-----
发件人: Jan Kara <jack@suse.cz>
发送时间: 2020年9月17日 18:45
收件人: 常凤楠 <changfengnan@hikvision.com>
抄送: Jan Kara <jack@suse.cz>; changfengnan <changfengnan@qq.com>; adilger@dilger.ca; darrick.wong@oracle.com; jack@suse.com; linux-ext4@vger.kernel.org; tytso@mit.edu
主题: Re: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting

On Mon 14-09-20 11:50:04, 常凤楠 wrote:
> Thank you for your review comments, there are indeed some problems.
> Let's discuss the function problems first, and I will modify the code style later.
> 1. About add two members in struct recovery_info, it can be just local variables in do_one_pass().
> 2. About is_same_journal(), I indded I did not consider the situation
> of serverl descriptor blocks in a transaction,I have moved the work of is_same_journal() outside to do it as you suggest.
> 3. You suggest add jbd2_has_feature_checksum() judge in PASS_SCAN, But
> jbd2_has_feature_checksum() only check checksum VER1, not include VER2
> and VER3. And I don't understand why lazy journal init cannot happen without journal checksumming being used, so i didn't add jbd2_has_feature_checksum() for now.
> 4. The pass == PASS_REVOKE case, this is a low-level error, my previous test did not cover it. I have changed it now.

Please, send the patch as an attachment if your mailer is mangling whitespace like this. The patch is very hard to read in this form. Thanks!

Honza

> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index
> a4967b27ffb6..f7702e14077f 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,  struct
> recovery_info *info, enum passtype pass)  {  unsigned
> intfirst_commit_ID, next_commit_ID; -unsigned longnext_log_block;
> +unsigned longnext_log_block, ri_commit_block = 0;
>  interr, success = 0;
>  journal_superblock_t *sb;
>  journal_header_t *tmp;
> @@ -428,7 +428,9 @@ static int do_one_pass(journal_t *journal,
> __u32crc32_sum = ~0; /* Transactional Checksums */  intdescr_csum_size
> = 0;  intblock_error = 0;
> -
> +boolneed_check_commit_time = false;
> +__be64last_trans_commit_time;
> +
>  /*
>   * First thing is to establish what we expect to find in the log
>   * (in terms of transaction IDs), and where (in terms of log @@
> -514,18 +516,18 @@ static int do_one_pass(journal_t *journal,
>  switch(blocktype) {
>  case JBD2_DESCRIPTOR_BLOCK:
>  /* Verify checksum first */
> +if(pass == PASS_SCAN)
> +ri_commit_block = 0;
> +
>  if (jbd2_journal_has_csum_v2or3(journal))
>  descr_csum_size =
>  sizeof(struct jbd2_journal_block_tail);  if (descr_csum_size > 0 &&
>      !jbd2_descriptor_block_csum_verify(journal,
>         bh->b_data)) {
> -printk(KERN_ERR "JBD2: Invalid checksum "
> -       "recovering block %lu in log\n",
> -       next_log_block);
> -err = -EFSBADCRC;
> -brelse(bh);
> -goto failed;
> +need_check_commit_time = true;
> +jbd_debug(1, "invalid descriptor block found in %lu, continue
> +recovery first.\n",next_log_block);
> +
>  }
>
>  /* If it is a valid descriptor block, replay it @@ -535,6 +537,7 @@
> static int do_one_pass(journal_t *journal,  if (pass != PASS_REPLAY) {
> if (pass == PASS_SCAN &&
>      jbd2_has_feature_checksum(journal) &&
> +    !need_check_commit_time &&
>      !info->end_transaction) {
>  if (calc_chksums(journal, bh,
>  &next_log_block,
> @@ -688,6 +691,36 @@ static int do_one_pass(journal_t *journal,
>   * are present verify them in PASS_SCAN; else not
>   * much to do other than move on to the next sequence
>   * number. */
> +if(pass == PASS_SCAN) {
> +struct commit_header *cbh =
> +(struct commit_header *)bh->b_data;
> +if(need_check_commit_time) {
> +__be64 commit_time = be64_to_cpu(cbh->h_commit_sec); if(commit_time
> +>= last_trans_commit_time) { printk(KERN_ERR "JBD2: Invalid checksum
> +found in log, %d\n", next_commit_ID); err = -EFSBADCRC; brelse(bh);
> +goto failed; } else { /*it's not belong to same journal, just end
> +this recovery with success*/ jbd_debug(1, "JBD2: Invalid checksum
> +found in block in log, but not same journal %d\n", next_commit_ID);
> +err = 0; brelse(bh); goto done; } }
> +if(ri_commit_block) {
> +jbd_debug(1, "invalid commit block found in %lu, stop
> +here.\n",next_log_block); brelse(bh); goto done; } ri_commit_block =
> +next_log_block; last_trans_commit_time =
> +be64_to_cpu(cbh->h_commit_sec); }
>  if (pass == PASS_SCAN &&
>      jbd2_has_feature_checksum(journal)) {  int chksum_err,
> chksum_seen; @@ -755,6 +788,12 @@ static int do_one_pass(journal_t
> *journal,  continue;
>
>  case JBD2_REVOKE_BLOCK:
> +if (pass == PASS_SCAN &&
> +ri_commit_block) {
> +jbd_debug(1, "invalid revoke block found in %lu, stop
> +here.\n",next_log_block); brelse(bh); goto done; }
>  /* If we aren't in the REVOKE pass, then we can
>   * just skip over this block. */
>  if (pass != PASS_REVOKE) {
>
> -----邮件原件-----
> 发件人: Jan Kara <jack@suse.cz>
> 发送时间: 2020年9月11日 18:06
> 收件人: changfengnan <changfengnan@qq.com>
> 抄送: adilger@dilger.ca; darrick.wong@oracle.com; jack@suse.com;
> linux-ext4@vger.kernel.org; tytso@mit.edu; 常凤楠
> <changfengnan@hikvision.com>
> 主题: Re: [PATCH] jbd2: avoid transaction reuse after reformatting
>
> On Thu 10-09-20 19:55:21, changfengnan wrote:
> > From: Fengnan Chang <changfengnan@hikvision.com>
> >
> > When format ext4 with lazy_journal_init=1, the previous transaction
> > is still on disk, it is possible that the previous transaction will
> > be used again during jbd2 recovery.Because the seed is changed, the
> > CRC check will fail.
> >
> > Signed-off-by: Fengnan Chang <changfengnan@hikvision.com>
>
> Thanks for the patch! I have a few functional comments below, also on a coding style side your patch contains lines a lot longer than 80 characters. Although that is not a strict requirement anymore it is still prefered that lines are wrapped to fit 80 columns to make code more easily readable...
>
> > ---
> >  fs/jbd2/recovery.c | 60
> > +++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 54 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index
> > a4967b27ffb6..8a6bc322a06d 100644
> > --- a/fs/jbd2/recovery.c
> > +++ b/fs/jbd2/recovery.c
> > @@ -33,6 +33,8 @@ struct recovery_info  intnr_replays;
> > intnr_revokes;  intnr_revoke_hits;
> > +unsigned long  ri_commit_block;
> > +__be64  last_trans_commit_time;
>
> Is there a need for these in recovery_info? They are needed only in PASS_SCAN AFAICT so they can be as well just local variables in do_one_pass()...
>
> >  };
> >
> >  enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY}; @@ -412,7
> > +414,27 @@ static int jbd2_block_tag_csum_verify(journal_t *j,
> > +journal_block_tag_t *tag,
> >  else
> >  return tag->t_checksum == cpu_to_be16(csum32);  }
> > +/*
> > + * We check the commit time and compare it with the commit time of
> > + * the previous transaction, if it's smaller than previous,
> > + * We think it's not belong to same journal.
> > + */
> > +static bool is_same_journal(journal_t *journal, struct buffer_head
> > +*bh, unsigned long blocknr, __u64 last_commit_sec) { unsigned long
> > +commit_block = blocknr + count_tags(journal, bh) + 1; struct
> > +buffer_head *nbh; struct commit_header *cbh; __u64commit_sec; int
> > +err = jread(&nbh, journal, commit_block);
>
> So there are two flaws in this logic. First, the computed commit_block
> may be beyond end of the journal. You need to wrap the computed value
> using
> wrap() macro. Second, you seem to assume that each transaction has
> only a single descriptor block. That is definitely not true. There can
> be arbitrary number (well, up to a maximum transaction size which
> depends on the total size of the journal) of descriptor blocks in a
> transaction before a commit block. So to locate commit block, you need
> to read the block, check its type (it can be another descriptor block,
> revoke block, or commit
> block) and then decide accordingly what to do. And you have to be rather careful when going through the journal like this because you cannot check checksums and blocks can be containing arbitrary garbage. Probably the cleanest way how to implement this would be to use the scanning loop in do_one_pass(), just indicate in some state variable that "we are now searching for next commit block to determine whether a checksum failure is due to lazy init or storage error". In this state we will continue PASS_SCAN, just not checking checksums anymore until we find a commit block or reach end of journal and then based on what we found decide whether to report checksum error or just silently end PASS_SCAN pass...
>
> >
> > +if (err)
> > +return true;
> > +
> > +cbh = (struct commit_header *)nbh->b_data; commit_sec =
> > +be64_to_cpu(cbh->h_commit_sec);
> > +
> > +return commit_sec >= last_commit_sec; }
> >  static int do_one_pass(journal_t *journal,  struct recovery_info
> > *info, enum passtype pass)  { @@ -514,18
> > +536,29 @@ static int do_one_pass(journal_t *journal,
> >  switch(blocktype) {
> >  case JBD2_DESCRIPTOR_BLOCK:
> >  /* Verify checksum first */
> > +if (pass == PASS_SCAN)
> > +info->ri_commit_block = 0;
> > +
> >  if (jbd2_journal_has_csum_v2or3(journal))
> >  descr_csum_size =
> >  sizeof(struct jbd2_journal_block_tail);  if (descr_csum_size > 0 &&
> >      !jbd2_descriptor_block_csum_verify(journal,
> >         bh->b_data)) {
> > -printk(KERN_ERR "JBD2: Invalid checksum "
> > -       "recovering block %lu in log\n",
> > +if (is_same_journal(journal, bh, next_log_block-1,
> > +info->last_trans_commit_time)) { printk(KERN_ERR "JBD2: Invalid
> > +checksum recovering block %lu in log\n",
> >         next_log_block);
> > -err = -EFSBADCRC;
> > -brelse(bh);
> > -goto failed;
> > +err = -EFSBADCRC;
> > +brelse(bh);
> > +goto failed;
> > +} else {
> > +/*it's not belong to same journal, just end this recovery with
> > +success*/ jbd_debug(1, "JBD2: Invalid checksum found in block %lu in log, but not same journal %d\n",
> > +       next_log_block, next_commit_ID); err = 0; brelse(bh); goto
> > +done; }
> >  }
> >
> >  /* If it is a valid descriptor block, replay it @@ -688,6 +721,17
> > @@ static int do_one_pass(journal_t *journal,
> >   * are present verify them in PASS_SCAN; else not
> >   * much to do other than move on to the next sequence
> >   * number. */
> > +if (pass == PASS_SCAN) {
> > +struct commit_header *cbh =
> > +(struct commit_header *)bh->b_data; if (info->ri_commit_block) {
> > +jbd_debug(1, "invalid commit block found in %lu, stop here.\n",
> > +next_log_block); brelse(bh); goto done; }
> > +info->ri_commit_block = next_log_block; last_trans_commit_time =
> > +info->be64_to_cpu(cbh->h_commit_sec);
> > +}
>
> I don't think you need to update ri_commit_block and
> last_trans_commit_time if we are not checking checksums (lazy journal
> init cannot happen without journal checksumming being used). So I'd
> just move this into the
>
>                         if (pass == PASS_SCAN &&
>                             jbd2_has_feature_checksum(journal)) {
>
> branch below. Also we could validate the commit block the same way
> (using
> timestamp) as the descriptor block. Then you will not need the 'ri_commit_block' logic and things will be nicely uniform...
>
> >  if (pass == PASS_SCAN &&
> >      jbd2_has_feature_checksum(journal)) {  int chksum_err,
> > chksum_seen; @@ -761,7 +805,11 @@ static int do_one_pass(journal_t
> > *journal,  brelse(bh);  continue;  }
> > -
> > +if (pass != PASS_SCAN && info->ri_commit_block) {
>
> I don't understand this. We get here only if "pass == PASS_REVOKE" so "pass != PASS_SCAN" is guaranteed to be true. But also info->ri_commit_block will likely be != 0 from the previous PASS_SCAN pass. So this will always skip revoke block handling?
>
> > +jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",
> > +next_log_block); brelse(bh); goto done; }
> >  err = scan_revoke_records(journal, bh,
> >    next_commit_ID, info);
> >  brelse(bh);
>
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
>
> ________________________________
>
> CONFIDENTIALITY NOTICE: This electronic message is intended to be
> viewed only by the individual or entity to whom it is addressed. It
> may contain information that is privileged, confidential and exempt
> from disclosure under applicable law. Any dissemination, distribution
> or copying of this communication is strictly prohibited without our
> prior permission. If the reader of this message is not the intended
> recipient, or the employee or agent responsible for delivering the
> message to the intended recipient, or if you have received this
> communication in error, please notify us immediately by return e-mail
> and delete the original message and any copies of it from your
> computer system. For further information about Hikvision company.
> please see our website at www.hikvision.com<http://www.hikvision.com>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR

________________________________

CONFIDENTIALITY NOTICE: This electronic message is intended to be viewed only by the individual or entity to whom it is addressed. It may contain information that is privileged, confidential and exempt from disclosure under applicable law. Any dissemination, distribution or copying of this communication is strictly prohibited without our prior permission. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering the message to the intended recipient, or if you have received this communication in error, please notify us immediately by return e-mail and delete the original message and any copies of it from your computer system. For further information about Hikvision company. please see our website at www.hikvision.com<http://www.hikvision.com>

[-- Attachment #2: jbd2.patch --]
[-- Type: application/octet-stream, Size: 3652 bytes --]

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index a4967b27ffb6..f7702e14077f 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,
 			struct recovery_info *info, enum passtype pass)
 {
 	unsigned int		first_commit_ID, next_commit_ID;
-	unsigned long		next_log_block;
+	unsigned long		next_log_block, ri_commit_block = 0;
 	int			err, success = 0;
 	journal_superblock_t *	sb;
 	journal_header_t *	tmp;
@@ -428,7 +428,9 @@ static int do_one_pass(journal_t *journal,
 	__u32			crc32_sum = ~0; /* Transactional Checksums */
 	int			descr_csum_size = 0;
 	int			block_error = 0;
-
+	bool		need_check_commit_time = false;
+	__be64		last_trans_commit_time;
+	
 	/*
 	 * First thing is to establish what we expect to find in the log
 	 * (in terms of transaction IDs), and where (in terms of log
@@ -514,18 +516,18 @@ static int do_one_pass(journal_t *journal,
 		switch(blocktype) {
 		case JBD2_DESCRIPTOR_BLOCK:
 			/* Verify checksum first */
+			if(pass == PASS_SCAN) 
+				ri_commit_block = 0;
+
 			if (jbd2_journal_has_csum_v2or3(journal))
 				descr_csum_size =
 					sizeof(struct jbd2_journal_block_tail);
 			if (descr_csum_size > 0 &&
 			    !jbd2_descriptor_block_csum_verify(journal,
 							       bh->b_data)) {
-				printk(KERN_ERR "JBD2: Invalid checksum "
-				       "recovering block %lu in log\n",
-				       next_log_block);
-				err = -EFSBADCRC;
-				brelse(bh);
-				goto failed;
+				need_check_commit_time = true;
+				jbd_debug(1, "invalid descriptor block found in %lu, continue recovery first.\n",next_log_block);
+				
 			}
 
 			/* If it is a valid descriptor block, replay it
@@ -535,6 +537,7 @@ static int do_one_pass(journal_t *journal,
 			if (pass != PASS_REPLAY) {
 				if (pass == PASS_SCAN &&
 				    jbd2_has_feature_checksum(journal) &&
+				    !need_check_commit_time &&
 				    !info->end_transaction) {
 					if (calc_chksums(journal, bh,
 							&next_log_block,
@@ -688,6 +691,36 @@ static int do_one_pass(journal_t *journal,
 			 * are present verify them in PASS_SCAN; else not
 			 * much to do other than move on to the next sequence
 			 * number. */
+			if(pass == PASS_SCAN) {
+				struct commit_header *cbh =
+					(struct commit_header *)bh->b_data;
+				if(need_check_commit_time) {
+					__be64 commit_time = be64_to_cpu(cbh->h_commit_sec);
+					if(commit_time >= last_trans_commit_time) {
+						printk(KERN_ERR "JBD2: Invalid checksum found in log, %d\n",
+						next_commit_ID);
+						err = -EFSBADCRC;
+						brelse(bh);
+						goto failed;
+					}
+					else
+					{
+						/*it's not belong to same journal, just end this recovery with success*/
+						jbd_debug(1, "JBD2: Invalid checksum found in block in log, but not same journal %d\n",
+						next_commit_ID);
+						err = 0;
+						brelse(bh);
+						goto done;
+					}
+				}
+				if(ri_commit_block) {
+					jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
+					brelse(bh);
+					goto done;
+				}
+				ri_commit_block = next_log_block;
+				last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
+			}
 			if (pass == PASS_SCAN &&
 			    jbd2_has_feature_checksum(journal)) {
 				int chksum_err, chksum_seen;
@@ -755,6 +788,12 @@ static int do_one_pass(journal_t *journal,
 			continue;
 
 		case JBD2_REVOKE_BLOCK:
+			if (pass == PASS_SCAN && 
+				ri_commit_block) {
+				jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
+				brelse(bh);
+				goto done;
+			}
 			/* If we aren't in the REVOKE pass, then we can
 			 * just skip over this block. */
 			if (pass != PASS_REVOKE) {

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: 答复: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
  2020-09-18  1:49       ` 答复: " 常凤楠
@ 2020-09-18 13:02         ` Jan Kara
  2020-09-23  6:29           ` 答复: " 常凤楠
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2020-09-18 13:02 UTC (permalink / raw)
  To: 常凤楠
  Cc: Jan Kara, changfengnan, adilger, darrick.wong, jack, linux-ext4, tytso

Hello,

On Fri 18-09-20 01:49:09, 常凤楠 wrote:
> Sorry about my mailer, the patch is in the attachment.

Thanks for the patch. Functionally the patch looks mostly OK now. The only
concern I have is that it handles checksum failures only in
JBD2_DESCRIPTOR_BLOCK. This is the most likely case but it could also
happen that JBD2_REVOKE_BLOCK or JBD2_COMMIT_BLOCK is the first one you see
with mismatching checksum. So I think you need to handle these cases as
well. I think your ri_commit_block logic below is an attempt to deal with
these cases (but it's difficult to be sure because of complete lack of
comments) but it is not reliable. A valid transaction can begin both with a
descriptor or with a revoke block.

A few other comments mostly about coding style below:

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index a4967b27ffb6..f7702e14077f 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,
 			struct recovery_info *info, enum passtype pass)
 {
 	unsigned int		first_commit_ID, next_commit_ID;
-	unsigned long		next_log_block;
+	unsigned long		next_log_block, ri_commit_block = 0;
 	int			err, success = 0;
 	journal_superblock_t *	sb;
 	journal_header_t *	tmp;
@@ -428,7 +428,9 @@ static int do_one_pass(journal_t *journal,
 	__u32			crc32_sum = ~0; /* Transactional Checksums */
 	int			descr_csum_size = 0;
 	int			block_error = 0;
-
+	bool		need_check_commit_time = false;
+	__be64		last_trans_commit_time;

All variable names in this function seem to be indented by one more
column. Please keep the indentation.

+	

This empty line has whitespace on it. Please delete.

 	/*
 	 * First thing is to establish what we expect to find in the log
 	 * (in terms of transaction IDs), and where (in terms of log
@@ -514,18 +516,18 @@ static int do_one_pass(journal_t *journal,
 		switch(blocktype) {
 		case JBD2_DESCRIPTOR_BLOCK:
 			/* Verify checksum first */
+			if(pass == PASS_SCAN) 
			  ^ Coding style requires space before opening (.
You have this problem at multiple places.

+				ri_commit_block = 0;
+
 			if (jbd2_journal_has_csum_v2or3(journal))
 				descr_csum_size =
 					sizeof(struct jbd2_journal_block_tail);
 			if (descr_csum_size > 0 &&
 			    !jbd2_descriptor_block_csum_verify(journal,
 							       bh->b_data)) {
-				printk(KERN_ERR "JBD2: Invalid checksum "
-				       "recovering block %lu in log\n",
-				       next_log_block);
-				err = -EFSBADCRC;
-				brelse(bh);
-				goto failed;
+				need_check_commit_time = true;
+				jbd_debug(1, "invalid descriptor block found in %lu, continue recovery first.\n",next_log_block);
+				
 			}
 
 			/* If it is a valid descriptor block, replay it
@@ -535,6 +537,7 @@ static int do_one_pass(journal_t *journal,
 			if (pass != PASS_REPLAY) {
 				if (pass == PASS_SCAN &&
 				    jbd2_has_feature_checksum(journal) &&
+				    !need_check_commit_time &&
 				    !info->end_transaction) {
 					if (calc_chksums(journal, bh,
 							&next_log_block,
@@ -688,6 +691,36 @@ static int do_one_pass(journal_t *journal,
 			 * are present verify them in PASS_SCAN; else not
 			 * much to do other than move on to the next sequence
 			 * number. */
+			if(pass == PASS_SCAN) {
+				struct commit_header *cbh =
+					(struct commit_header *)bh->b_data;
+				if(need_check_commit_time) {
+					__be64 commit_time = be64_to_cpu(cbh->h_commit_sec);
+					if(commit_time >= last_trans_commit_time) {
+						printk(KERN_ERR "JBD2: Invalid checksum found in log, %d\n",
+						next_commit_ID);
+						err = -EFSBADCRC;
+						brelse(bh);
+						goto failed;
+					}
+					else
+					{
  Coding style requires to put opening { on the same line as 'else'. Like:
					else {
+						/*it's not belong to same journal, just end this recovery with success*/
+						jbd_debug(1, "JBD2: Invalid checksum found in block in log, but not same journal %d\n",
+						next_commit_ID);
+						err = 0;
+						brelse(bh);
+						goto done;
+					}
+				}
+				if(ri_commit_block) {
+					jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
+					brelse(bh);
+					goto done;
+				}
+				ri_commit_block = next_log_block;

Why does the ri_commit_block logic exist? I don't see it bringing any
benefit...

+				last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
+			}
 			if (pass == PASS_SCAN &&
 			    jbd2_has_feature_checksum(journal)) {
 				int chksum_err, chksum_seen;
@@ -755,6 +788,12 @@ static int do_one_pass(journal_t *journal,
 			continue;
 
 		case JBD2_REVOKE_BLOCK:
+			if (pass == PASS_SCAN && 
+				ri_commit_block) {
+				jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
+				brelse(bh);
+				goto done;
+			}

This is wrong. A valid transaction can start with a revoke block...

 			/* If we aren't in the REVOKE pass, then we can
 			 * just skip over this block. */
 			if (pass != PASS_REVOKE) {

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* 答复: 答复: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
  2020-09-18 13:02         ` Jan Kara
@ 2020-09-23  6:29           ` 常凤楠
  2020-09-23 12:24             ` Jan Kara
  0 siblings, 1 reply; 10+ messages in thread
From: 常凤楠 @ 2020-09-23  6:29 UTC (permalink / raw)
  To: Jan Kara; +Cc: changfengnan, adilger, darrick.wong, jack, linux-ext4, tytso

[-- Attachment #1: Type: text/plain, Size: 6542 bytes --]

The attachment is new patch.
I have fix the logic in JBD2_REVOKE_BLOCK and JBD2_COMMIT_BLOCK case.
If the revoke block is the first block after valid transaction, I set the flag like descriptor block ,and check it in commit block.
If the commit block is the first block after valid transaction, I use ri_commit_block to judge whether this commit block is next to another
commit block, if so it is illegal. I did't use time to judge commit block, because of the possibility of time calibration, I think use
ri_commit_block is more reliable.

-----邮件原件-----
发件人: Jan Kara <jack@suse.cz>
发送时间: 2020年9月18日 21:03
收件人: 常凤楠 <changfengnan@hikvision.com>
抄送: Jan Kara <jack@suse.cz>; changfengnan <changfengnan@qq.com>; adilger@dilger.ca; darrick.wong@oracle.com; jack@suse.com; linux-ext4@vger.kernel.org; tytso@mit.edu
主题: Re: 答复: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting

Hello,

On Fri 18-09-20 01:49:09, 常凤楠 wrote:
> Sorry about my mailer, the patch is in the attachment.

Thanks for the patch. Functionally the patch looks mostly OK now. The only concern I have is that it handles checksum failures only in JBD2_DESCRIPTOR_BLOCK. This is the most likely case but it could also happen that JBD2_REVOKE_BLOCK or JBD2_COMMIT_BLOCK is the first one you see with mismatching checksum. So I think you need to handle these cases as well. I think your ri_commit_block logic below is an attempt to deal with these cases (but it's difficult to be sure because of complete lack of
comments) but it is not reliable. A valid transaction can begin both with a descriptor or with a revoke block.

A few other comments mostly about coding style below:

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index a4967b27ffb6..f7702e14077f 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,
 struct recovery_info *info, enum passtype pass)  {
 unsigned intfirst_commit_ID, next_commit_ID;
-unsigned longnext_log_block;
+unsigned longnext_log_block, ri_commit_block = 0;
 interr, success = 0;
 journal_superblock_t *sb;
 journal_header_t *tmp;
@@ -428,7 +428,9 @@ static int do_one_pass(journal_t *journal,
 __u32crc32_sum = ~0; /* Transactional Checksums */
 intdescr_csum_size = 0;
 intblock_error = 0;
-
+boolneed_check_commit_time = false;
+__be64last_trans_commit_time;

All variable names in this function seem to be indented by one more column. Please keep the indentation.

+

This empty line has whitespace on it. Please delete.

 /*
  * First thing is to establish what we expect to find in the log
  * (in terms of transaction IDs), and where (in terms of log @@ -514,18 +516,18 @@ static int do_one_pass(journal_t *journal,
 switch(blocktype) {
 case JBD2_DESCRIPTOR_BLOCK:
 /* Verify checksum first */
+if(pass == PASS_SCAN)
  ^ Coding style requires space before opening (.
You have this problem at multiple places.

+ri_commit_block = 0;
+
 if (jbd2_journal_has_csum_v2or3(journal))
 descr_csum_size =
 sizeof(struct jbd2_journal_block_tail);
 if (descr_csum_size > 0 &&
     !jbd2_descriptor_block_csum_verify(journal,
        bh->b_data)) {
-printk(KERN_ERR "JBD2: Invalid checksum "
-       "recovering block %lu in log\n",
-       next_log_block);
-err = -EFSBADCRC;
-brelse(bh);
-goto failed;
+need_check_commit_time = true;
+jbd_debug(1, "invalid descriptor block found in %lu, continue
+recovery first.\n",next_log_block);
+
 }

 /* If it is a valid descriptor block, replay it @@ -535,6 +537,7 @@ static int do_one_pass(journal_t *journal,
 if (pass != PASS_REPLAY) {
 if (pass == PASS_SCAN &&
     jbd2_has_feature_checksum(journal) &&
+    !need_check_commit_time &&
     !info->end_transaction) {
 if (calc_chksums(journal, bh,
 &next_log_block,
@@ -688,6 +691,36 @@ static int do_one_pass(journal_t *journal,
  * are present verify them in PASS_SCAN; else not
  * much to do other than move on to the next sequence
  * number. */
+if(pass == PASS_SCAN) {
+struct commit_header *cbh =
+(struct commit_header *)bh->b_data;
+if(need_check_commit_time) {
+__be64 commit_time = be64_to_cpu(cbh->h_commit_sec);
+if(commit_time >= last_trans_commit_time) {
+printk(KERN_ERR "JBD2: Invalid checksum found in log, %d\n",
+next_commit_ID);
+err = -EFSBADCRC;
+brelse(bh);
+goto failed;
+}
+else
+{
  Coding style requires to put opening { on the same line as 'else'. Like:
else {
+/*it's not belong to same journal, just end this recovery with success*/
+jbd_debug(1, "JBD2: Invalid checksum found in block in log, but not same journal %d\n",
+next_commit_ID);
+err = 0;
+brelse(bh);
+goto done;
+}
+}
+if(ri_commit_block) {
+jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
+brelse(bh);
+goto done;
+}
+ri_commit_block = next_log_block;

Why does the ri_commit_block logic exist? I don't see it bringing any benefit...

+last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
+}
 if (pass == PASS_SCAN &&
     jbd2_has_feature_checksum(journal)) {
 int chksum_err, chksum_seen;
@@ -755,6 +788,12 @@ static int do_one_pass(journal_t *journal,
 continue;

 case JBD2_REVOKE_BLOCK:
+if (pass == PASS_SCAN &&
+ri_commit_block) {
+jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
+brelse(bh);
+goto done;
+}

This is wrong. A valid transaction can start with a revoke block...

 /* If we aren't in the REVOKE pass, then we can
  * just skip over this block. */
 if (pass != PASS_REVOKE) {

Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR

________________________________

CONFIDENTIALITY NOTICE: This electronic message is intended to be viewed only by the individual or entity to whom it is addressed. It may contain information that is privileged, confidential and exempt from disclosure under applicable law. Any dissemination, distribution or copying of this communication is strictly prohibited without our prior permission. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering the message to the intended recipient, or if you have received this communication in error, please notify us immediately by return e-mail and delete the original message and any copies of it from your computer system. For further information about Hikvision company. please see our website at www.hikvision.com<http://www.hikvision.com>

[-- Attachment #2: jbd2.patch --]
[-- Type: application/octet-stream, Size: 4579 bytes --]

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index a4967b27ffb6..631a42b50516 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,
 			struct recovery_info *info, enum passtype pass)
 {
 	unsigned int		first_commit_ID, next_commit_ID;
-	unsigned long		next_log_block;
+	unsigned long		next_log_block, ri_commit_block = 0;
 	int			err, success = 0;
 	journal_superblock_t *	sb;
 	journal_header_t *	tmp;
@@ -428,6 +428,8 @@ static int do_one_pass(journal_t *journal,
 	__u32			crc32_sum = ~0; /* Transactional Checksums */
 	int			descr_csum_size = 0;
 	int			block_error = 0;
+	bool		need_check_commit_time = false;
+	__be64		last_trans_commit_time = 0;
 
 	/*
 	 * First thing is to establish what we expect to find in the log
@@ -514,18 +516,17 @@ static int do_one_pass(journal_t *journal,
 		switch(blocktype) {
 		case JBD2_DESCRIPTOR_BLOCK:
 			/* Verify checksum first */
+			if (pass == PASS_SCAN)
+				ri_commit_block = 0;
 			if (jbd2_journal_has_csum_v2or3(journal))
 				descr_csum_size =
 					sizeof(struct jbd2_journal_block_tail);
 			if (descr_csum_size > 0 &&
 			    !jbd2_descriptor_block_csum_verify(journal,
 							       bh->b_data)) {
-				printk(KERN_ERR "JBD2: Invalid checksum "
-				       "recovering block %lu in log\n",
-				       next_log_block);
-				err = -EFSBADCRC;
-				brelse(bh);
-				goto failed;
+				need_check_commit_time = true;
+				jbd_debug(1, "invalid descriptor block found in %lu, continue recovery first.\n",
+						next_log_block);
 			}
 
 			/* If it is a valid descriptor block, replay it
@@ -535,6 +536,7 @@ static int do_one_pass(journal_t *journal,
 			if (pass != PASS_REPLAY) {
 				if (pass == PASS_SCAN &&
 				    jbd2_has_feature_checksum(journal) &&
+				    !need_check_commit_time &&
 				    !info->end_transaction) {
 					if (calc_chksums(journal, bh,
 							&next_log_block,
@@ -688,6 +690,47 @@ static int do_one_pass(journal_t *journal,
 			 * are present verify them in PASS_SCAN; else not
 			 * much to do other than move on to the next sequence
 			 * number. */
+			if (pass == PASS_SCAN) {
+				struct commit_header *cbh =
+					(struct commit_header *)bh->b_data;
+				/*
+				 * When need check commit time, it means csum
+				 * verify failed before, if commit time is
+				 * increasing, it's same journal, otherwise
+				 * not same journal, just end this recovery.
+				 */
+				if (need_check_commit_time) {
+					__be64 commit_time =
+						be64_to_cpu(cbh->h_commit_sec);
+
+					if (commit_time >= last_trans_commit_time) {
+						pr_err("JBD2: Invalid checksum found in log, %d\n",
+						next_commit_ID);
+						err = -EFSBADCRC;
+						brelse(bh);
+						goto failed;
+					} else {
+						/*
+						 * it's not belong to same journal, just
+						 * end this recovery with success.
+						 */
+						jbd_debug(1, "JBD2: Invalid checksum found in block in log, but not same journal %d\n",
+						next_commit_ID);
+						err = 0;
+						brelse(bh);
+						goto done;
+					}
+				}
+				if (ri_commit_block) {
+					jbd_debug(1, "invalid commit block found in %lu, stop here.\n",
+						next_log_block);
+					brelse(bh);
+					goto done;
+				}
+				ri_commit_block = next_log_block;
+				last_trans_commit_time =
+					be64_to_cpu(cbh->h_commit_sec);
+			}
 			if (pass == PASS_SCAN &&
 			    jbd2_has_feature_checksum(journal)) {
 				int chksum_err, chksum_seen;
@@ -755,6 +798,21 @@ static int do_one_pass(journal_t *journal,
 			continue;
 
 		case JBD2_REVOKE_BLOCK:
+			/*
+			 * Check revoke block crc in pass_scan, if csum verify
+			 * failed, check commit block time later.
+			 */
+			if (pass == PASS_SCAN) {
+				jbd2_journal_revoke_header_t *header =
+				(jbd2_journal_revoke_header_t *)bh->b_data;
+				ri_commit_block = 0;
+				if (!jbd2_descriptor_block_csum_verify(journal,
+						header)) {
+					jbd_debug(1, "invalid revoke block found in %lu, continue recovery first.\n",
+						next_log_block);
+					need_check_commit_time = true;
+				}
+			}
 			/* If we aren't in the REVOKE pass, then we can
 			 * just skip over this block. */
 			if (pass != PASS_REVOKE) {
@@ -822,9 +880,6 @@ static int scan_revoke_records(journal_t *journal, struct buffer_head *bh,
 	offset = sizeof(jbd2_journal_revoke_header_t);
 	rcount = be32_to_cpu(header->r_count);
 
-	if (!jbd2_descriptor_block_csum_verify(journal, header))
-		return -EFSBADCRC;
-
 	if (jbd2_journal_has_csum_v2or3(journal))
 		csum_size = sizeof(struct jbd2_journal_block_tail);
 	if (rcount > journal->j_blocksize - csum_size)

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: 答复: 答复: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
  2020-09-23  6:29           ` 答复: " 常凤楠
@ 2020-09-23 12:24             ` Jan Kara
  2020-10-03  4:24               ` Theodore Y. Ts'o
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2020-09-23 12:24 UTC (permalink / raw)
  To: 常凤楠
  Cc: Jan Kara, changfengnan, adilger, darrick.wong, jack, linux-ext4, tytso

On Wed 23-09-20 06:29:12, 常凤楠 wrote:
> The attachment is new patch.

Thanks!

> I have fix the logic in JBD2_REVOKE_BLOCK and JBD2_COMMIT_BLOCK case.
> If the revoke block is the first block after valid transaction, I set the
> flag like descriptor block ,and check it in commit block.  If the commit
> block is the first block after valid transaction, I use ri_commit_block
> to judge whether this commit block is next to another commit block, if so
> it is illegal. I did't use time to judge commit block, because of the
> possibility of time calibration, I think use ri_commit_block is more
> reliable.

If the time is unreliable, your logic with ri_commit_block will detect only
a small minority of the cases. The majority of cases will fail with
checksum error. So I still don't think the ri_commit_block logic is really
worth the additional code. Other than that the current version looks OK to
me. So please do next submission with full changelog, Signed-off-by line,
and the coding style issues fixed (you can use scripts/checkpatch.pl for
patch verification). Thanks!

								Honza

> 
> -----邮件原件-----
> 发件人: Jan Kara <jack@suse.cz>
> 发送时间: 2020年9月18日 21:03
> 收件人: 常凤楠 <changfengnan@hikvision.com>
> 抄送: Jan Kara <jack@suse.cz>; changfengnan <changfengnan@qq.com>; adilger@dilger.ca; darrick.wong@oracle.com; jack@suse.com; linux-ext4@vger.kernel.org; tytso@mit.edu
> 主题: Re: 答复: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
> 
> Hello,
> 
> On Fri 18-09-20 01:49:09, 常凤楠 wrote:
> > Sorry about my mailer, the patch is in the attachment.
> 
> Thanks for the patch. Functionally the patch looks mostly OK now. The only concern I have is that it handles checksum failures only in JBD2_DESCRIPTOR_BLOCK. This is the most likely case but it could also happen that JBD2_REVOKE_BLOCK or JBD2_COMMIT_BLOCK is the first one you see with mismatching checksum. So I think you need to handle these cases as well. I think your ri_commit_block logic below is an attempt to deal with these cases (but it's difficult to be sure because of complete lack of
> comments) but it is not reliable. A valid transaction can begin both with a descriptor or with a revoke block.
> 
> A few other comments mostly about coding style below:
> 
> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index a4967b27ffb6..f7702e14077f 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -417,7 +417,7 @@ static int do_one_pass(journal_t *journal,
>  struct recovery_info *info, enum passtype pass)  {
>  unsigned intfirst_commit_ID, next_commit_ID;
> -unsigned longnext_log_block;
> +unsigned longnext_log_block, ri_commit_block = 0;
>  interr, success = 0;
>  journal_superblock_t *sb;
>  journal_header_t *tmp;
> @@ -428,7 +428,9 @@ static int do_one_pass(journal_t *journal,
>  __u32crc32_sum = ~0; /* Transactional Checksums */
>  intdescr_csum_size = 0;
>  intblock_error = 0;
> -
> +boolneed_check_commit_time = false;
> +__be64last_trans_commit_time;
> 
> All variable names in this function seem to be indented by one more column. Please keep the indentation.
> 
> +
> 
> This empty line has whitespace on it. Please delete.
> 
>  /*
>   * First thing is to establish what we expect to find in the log
>   * (in terms of transaction IDs), and where (in terms of log @@ -514,18 +516,18 @@ static int do_one_pass(journal_t *journal,
>  switch(blocktype) {
>  case JBD2_DESCRIPTOR_BLOCK:
>  /* Verify checksum first */
> +if(pass == PASS_SCAN)
>   ^ Coding style requires space before opening (.
> You have this problem at multiple places.
> 
> +ri_commit_block = 0;
> +
>  if (jbd2_journal_has_csum_v2or3(journal))
>  descr_csum_size =
>  sizeof(struct jbd2_journal_block_tail);
>  if (descr_csum_size > 0 &&
>      !jbd2_descriptor_block_csum_verify(journal,
>         bh->b_data)) {
> -printk(KERN_ERR "JBD2: Invalid checksum "
> -       "recovering block %lu in log\n",
> -       next_log_block);
> -err = -EFSBADCRC;
> -brelse(bh);
> -goto failed;
> +need_check_commit_time = true;
> +jbd_debug(1, "invalid descriptor block found in %lu, continue
> +recovery first.\n",next_log_block);
> +
>  }
> 
>  /* If it is a valid descriptor block, replay it @@ -535,6 +537,7 @@ static int do_one_pass(journal_t *journal,
>  if (pass != PASS_REPLAY) {
>  if (pass == PASS_SCAN &&
>      jbd2_has_feature_checksum(journal) &&
> +    !need_check_commit_time &&
>      !info->end_transaction) {
>  if (calc_chksums(journal, bh,
>  &next_log_block,
> @@ -688,6 +691,36 @@ static int do_one_pass(journal_t *journal,
>   * are present verify them in PASS_SCAN; else not
>   * much to do other than move on to the next sequence
>   * number. */
> +if(pass == PASS_SCAN) {
> +struct commit_header *cbh =
> +(struct commit_header *)bh->b_data;
> +if(need_check_commit_time) {
> +__be64 commit_time = be64_to_cpu(cbh->h_commit_sec);
> +if(commit_time >= last_trans_commit_time) {
> +printk(KERN_ERR "JBD2: Invalid checksum found in log, %d\n",
> +next_commit_ID);
> +err = -EFSBADCRC;
> +brelse(bh);
> +goto failed;
> +}
> +else
> +{
>   Coding style requires to put opening { on the same line as 'else'. Like:
> else {
> +/*it's not belong to same journal, just end this recovery with success*/
> +jbd_debug(1, "JBD2: Invalid checksum found in block in log, but not same journal %d\n",
> +next_commit_ID);
> +err = 0;
> +brelse(bh);
> +goto done;
> +}
> +}
> +if(ri_commit_block) {
> +jbd_debug(1, "invalid commit block found in %lu, stop here.\n",next_log_block);
> +brelse(bh);
> +goto done;
> +}
> +ri_commit_block = next_log_block;
> 
> Why does the ri_commit_block logic exist? I don't see it bringing any benefit...
> 
> +last_trans_commit_time = be64_to_cpu(cbh->h_commit_sec);
> +}
>  if (pass == PASS_SCAN &&
>      jbd2_has_feature_checksum(journal)) {
>  int chksum_err, chksum_seen;
> @@ -755,6 +788,12 @@ static int do_one_pass(journal_t *journal,
>  continue;
> 
>  case JBD2_REVOKE_BLOCK:
> +if (pass == PASS_SCAN &&
> +ri_commit_block) {
> +jbd_debug(1, "invalid revoke block found in %lu, stop here.\n",next_log_block);
> +brelse(bh);
> +goto done;
> +}
> 
> This is wrong. A valid transaction can start with a revoke block...
> 
>  /* If we aren't in the REVOKE pass, then we can
>   * just skip over this block. */
>  if (pass != PASS_REVOKE) {
> 
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> 
> ________________________________
> 
> CONFIDENTIALITY NOTICE: This electronic message is intended to be viewed only by the individual or entity to whom it is addressed. It may contain information that is privileged, confidential and exempt from disclosure under applicable law. Any dissemination, distribution or copying of this communication is strictly prohibited without our prior permission. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering the message to the intended recipient, or if you have received this communication in error, please notify us immediately by return e-mail and delete the original message and any copies of it from your computer system. For further information about Hikvision company. please see our website at www.hikvision.com<http://www.hikvision.com>


-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 答复: 答复: 答复: [PATCH] jbd2: avoid transaction reuse after reformatting
  2020-09-23 12:24             ` Jan Kara
@ 2020-10-03  4:24               ` Theodore Y. Ts'o
  0 siblings, 0 replies; 10+ messages in thread
From: Theodore Y. Ts'o @ 2020-10-03  4:24 UTC (permalink / raw)
  To: Jan Kara
  Cc: 常凤楠,
	changfengnan, adilger, darrick.wong, jack, linux-ext4

On Wed, Sep 23, 2020 at 02:24:35PM +0200, Jan Kara wrote:
> On Wed 23-09-20 06:29:12, 常凤楠 wrote:
> > The attachment is new patch.
> 
> Thanks!
> 
> > I have fix the logic in JBD2_REVOKE_BLOCK and JBD2_COMMIT_BLOCK case.
> > If the revoke block is the first block after valid transaction, I set the
> > flag like descriptor block ,and check it in commit block.  If the commit
> > block is the first block after valid transaction, I use ri_commit_block
> > to judge whether this commit block is next to another commit block, if so
> > it is illegal. I did't use time to judge commit block, because of the
> > possibility of time calibration, I think use ri_commit_block is more
> > reliable.
> 
> If the time is unreliable, your logic with ri_commit_block will detect only
> a small minority of the cases. The majority of cases will fail with
> checksum error. So I still don't think the ri_commit_block logic is really
> worth the additional code. Other than that the current version looks OK to
> me. So please do next submission with full changelog, Signed-off-by line,
> and the coding style issues fixed (you can use scripts/checkpatch.pl for
> patch verification). Thanks!

Hi changfengnan,

Have you had a chance to send a revised patch with the changes
requested by Jan?

Thanks,

					- Ted

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-10-03  4:25 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <tencent_2341B065211F204FA07C3ADDA1AE07706405@qq.com>
2020-09-10 12:06 ` [PATCH] jbd2: avoid transaction reuse after reformatting 常凤楠
2020-09-10 18:45 ` Andreas Dilger
2020-09-11 10:06 ` Jan Kara
2020-09-14 11:50   ` 答复: " 常凤楠
2020-09-17 10:44     ` Jan Kara
2020-09-18  1:49       ` 答复: " 常凤楠
2020-09-18 13:02         ` Jan Kara
2020-09-23  6:29           ` 答复: " 常凤楠
2020-09-23 12:24             ` Jan Kara
2020-10-03  4:24               ` Theodore Y. Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.