All of lore.kernel.org
 help / color / mirror / Atom feed
From: Manfred Spraul <manfred@colorfullife.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, 1vier1@web.de
Subject: Re: JBD2: journal transaction 6943 on loop0-8 is corrupt.
Date: Thu, 30 Dec 2021 09:16:13 +0100	[thread overview]
Message-ID: <8fe067d0-6d57-9dd7-2c10-5a2c34037ee1@colorfullife.com> (raw)
In-Reply-To: <Yc0NUYyRhLdtapq+@mit.edu>

[-- Attachment #1: Type: text/plain, Size: 4552 bytes --]

Hi Ted,

On 12/30/21 02:37, Theodore Ts'o wrote:
> On Tue, Dec 28, 2021 at 09:36:22PM +0100, Manfred Spraul wrote:
>> Hi,
>>
>> with simulated power failures, I see a corrupted journal
>>
>> [39056.200845] JBD2: journal transaction 6943 on loop0-8 is corrupt.
>> [39056.200851] EXT4-fs (loop0): error loading journal
> This means that the journal replay found a commit which was *not* the
> last commit, and which contained a CRC error.  If it's the last commit
> (e.g., there is no valid subsequent commit block), then it's possible
> that the journal commit was never completed before the system crashed
> --- e.g., it was an interrupted commit.

It is the last commit, there are no valid subsequent commit blocks.

The current failure model is simple: all blocks up to block <n> are 
written, the blocks starting from <n+1> are discarded.

What I can't rule out, but I think this is not what I see:
The image is small (512 MB) and everything is in memory. Thus I would 
not rule out that the whole journal is filled within less than one second.
The commit header contains h_commit_sec and h_commit_nsec, but from what 
I see, do_one_pass() evaluates only h_commit_sec.


> Your test is aborting the commit at various points in the write I/O
> stream, so it should be simulating an interrupted commit (assuming
> that it's not corrupting any I/O.  So the jbd2 layer should have
> understood it was the last commit in the journal, and been OK with the
> checksum failure.

I think the jbd2 layer understood that it was the last commit - but it 
nevertheless failed the recovery.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/jbd2/recovery.c?h=v5.16-rc7#n809

> |
> 		if  (pass  ==  PASS_SCAN  &&
> 			!jbd2_commit_block_csum_verify(journal,
> 							bh->b_data))  {
> 			chksum_error:
> 				if  (commit_time  <  last_trans_commit_time)
> 					goto  ignore_crc_mismatch;
> 				info->end_transaction  =  next_commit_ID;
>
> 				if  (!jbd2_has_feature_async_commit(journal))  {
> 					journal->j_failed_commit  =
> 						next_commit_ID;
> 					brelse(bh);
> 					break;
> 				}
> 			}
> |

async_commit() is false.

journal->j_failed_commit is set, and thus after journal playback, a 
failure is reported :-(


What I have done:

- I have removed the line journal->j_failed_commit = next_commit_ID, 
then the image is mounted.

- I had added pr_info() lines, and this confirms that it starts with a 
failure of jbd2_block_csum_verify().

dmesg with JBD2 debug output and additional pr_info lines:

 >>>

[  748.591348] next_commit_id increased: 6943.
[  748.593397] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for 
sequence ID 6943 at 4544/8192
[  748.594322] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking 
block 4544
[  748.595879] fs/jbd2/recovery.c: (do_one_pass, 549): Found magic 5, 
sequence 6943
[  748.596800] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for 
sequence ID 6943 at 4545/8192
[  748.600073] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking 
block 4545
[  748.601398] fs/jbd2/recovery.c: (do_one_pass, 549): Found magic 1, 
sequence 6943
[  748.603481] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for 
sequence ID 6943 at 4574/8192
[  748.606238] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking 
block 4574
[  748.607241] fs/jbd2/recovery.c: (do_one_pass, 549): Found magic 2, 
sequence 6943
[  748.608248] commit_csum_verify error: provided 81be34bd.
[  748.610199] direct block verify error.
[  748.612327] chksum_error.
[  748.617616] error line 2.
[  748.622085] error line 3.
[  748.623291] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for 
sequence ID 6943 at 4575/8192
[  748.624133] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking 
block 4575
[  748.625134] done, info->end_transaction is 6943.
[  748.626966] done with update, info->end_transaction is 6943.
[  748.629521] next_commit_id: initial value 6799.
[  748.630836] fs/jbd2/recovery.c: (do_one_pass, 491): Starting recovery 
pass 1
[  748.631529] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for 
sequence ID 6799 at 6734/8192
[  748.633346] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking 
block 6734

<<<

What do you think?
Is JBD2 too aggressive in declaring something a journal corruption?

What is definitively correct is that the failure is minor. fsck -f was 
able to correct it.
Thus: What is your understanding:
If a mount command fails due to a journal corruption, should user space 
first try a fsck -f before giving up?


--

     Manfred

[-- Attachment #2: 0001-jbd2-recovery.c-Continue-on-csum-failures-for-commit.patch --]
[-- Type: text/x-patch, Size: 2744 bytes --]

From 759062a21176c46a8fc1fa4d95e20c4450a10b9a Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@colorfullife.com>
Date: Thu, 30 Dec 2021 08:22:00 +0100
Subject: [PATCH] jbd2/recovery.c: Continue on csum failures for commit record

Noticed with simulated power failures, i.e. not on real hardware:

The failure model is:
- 512 bytes writes are atomic.
- Larger writes are not atomic.
- Everything is written in order.

Since the JBD2 block size can be larger than the block size of the
physical drive, it may happen that a (JBD2) block starts with the
expected magic/block type==JBD2_COMMIT_BLOCK/sequence number/
commit time, but nevertheless the csum verification fails because
jbd2_commit_block_csum_verify() calculates a checksum over
the complete JBD2 block.

Thus: Just end the scan on a csum failure.

Note: The change is most likely incomplete. There are probably
more situations where the code assumes that an incorrect csum
is always a corruption.

---
 fs/jbd2/recovery.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 8ca3527189f8..f6d59bc204a1 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -709,6 +709,20 @@ static int do_one_pass(journal_t *journal,
 			/*     How to differentiate between interrupted commit
 			 *               and journal corruption ?
 			 *
+			 * Assume: Physical block size 512 bytes,
+			 *         j->j_blocksize=1024
+			 * If the 1st physical block of a commit block is
+			 * written, then the correct magic/block type/
+			 * sequence number/commit time will be there.
+			 * If the 2nd block is not written, then the csum
+			 * verification will fail, because the csum is
+			 * calculated over the whole JBD2 block.
+			 *
+			 * Thus: Only only async_commit, n-th transaction fails
+			 * csum check, (n+1)th transaction passes csum check,
+			 * is a journal corruption. Everything else could be
+			 * just an interrupted write.
+			 *
 			 * {nth transaction}
 			 *        Checksum Verification Failed
 			 *			 |
@@ -717,7 +731,7 @@ static int do_one_pass(journal_t *journal,
 			 * 	async_commit             sync_commit
 			 *     		|                    |
 			 *		| GO TO NEXT    "Journal Corruption"
-			 *		| TRANSACTION
+			 *		| TRANSACTION   or "Interrupted Commit"
 			 *		|
 			 * {(n+1)th transanction}
 			 *		|
@@ -806,8 +820,9 @@ static int do_one_pass(journal_t *journal,
 				info->end_transaction = next_commit_ID;
 
 				if (!jbd2_has_feature_async_commit(journal)) {
-					journal->j_failed_commit =
-						next_commit_ID;
+					/* Interrupted commit or corrupt
+					 * journal. Assume interrupted commit.
+					 */
 					brelse(bh);
 					break;
 				}
-- 
2.33.1


[-- Attachment #3: 0003-DEBUG-patch-add-printk-to-fs-jbd2-recovery.c.patch --]
[-- Type: text/x-patch, Size: 3320 bytes --]

From 693bcf65312e202c15b2f291e1d105b5486cfeb9 Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@colorfullife.com>
Date: Thu, 30 Dec 2021 08:03:04 +0100
Subject: [PATCH 3/3] DEBUG patch: add printk to fs/jbd2/recovery.c

debug printouts.

---
 fs/jbd2/journal.c  |  2 +-
 fs/jbd2/recovery.c | 13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 35302bc192eb..58545fa74b9f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -49,7 +49,7 @@
 #include <asm/page.h>
 
 #ifdef CONFIG_JBD2_DEBUG
-ushort jbd2_journal_enable_debug __read_mostly;
+ushort jbd2_journal_enable_debug __read_mostly = 99;
 EXPORT_SYMBOL(jbd2_journal_enable_debug);
 
 module_param_named(jbd2_debug, jbd2_journal_enable_debug, ushort, 0644);
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 8ca3527189f8..b66884786cb9 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -428,6 +428,9 @@ static int jbd2_commit_block_csum_verify(journal_t *j, void *buf)
 	h->h_chksum[0] = 0;
 	calculated = jbd2_chksum(j, j->j_csum_seed, buf, j->j_blocksize);
 	h->h_chksum[0] = provided;
+if (provided != cpu_to_be32(calculated)) {
+pr_info("commit_csum_verify error: provided %x.\n", provided);
+}
 
 	return provided == cpu_to_be32(calculated);
 }
@@ -478,6 +481,7 @@ static int do_one_pass(journal_t *journal,
 
 	sb = journal->j_superblock;
 	next_commit_ID = be32_to_cpu(sb->s_sequence);
+pr_info("next_commit_id: initial value %u.\n", (unsigned int) next_commit_ID);
 	next_log_block = be32_to_cpu(sb->s_start);
 
 	first_commit_ID = next_commit_ID;
@@ -783,6 +787,7 @@ static int do_one_pass(journal_t *journal,
 					brelse(bh);
 					break;
 				}
+pr_info("crc32_sum %x found_csum %x.\n", crc32_sum, found_chksum);
 
 				/* Neither checksum match nor unused? */
 				if (!((crc32_sum == found_chksum &&
@@ -800,22 +805,28 @@ static int do_one_pass(journal_t *journal,
 			if (pass == PASS_SCAN &&
 			    !jbd2_commit_block_csum_verify(journal,
 							   bh->b_data)) {
+pr_info("direct block verify error.\n");
 			chksum_error:
+pr_info("chksum_error.\n");
 				if (commit_time < last_trans_commit_time)
 					goto ignore_crc_mismatch;
+pr_info("error line 2.\n");
 				info->end_transaction = next_commit_ID;
 
 				if (!jbd2_has_feature_async_commit(journal)) {
+pr_info("error line 3.\n");
 					journal->j_failed_commit =
 						next_commit_ID;
 					brelse(bh);
 					break;
 				}
+pr_info("error line 4.\n");
 			}
 			if (pass == PASS_SCAN)
 				last_trans_commit_time = commit_time;
 			brelse(bh);
 			next_commit_ID++;
+pr_info("next_commit_id increased: %u.\n", (unsigned int) next_commit_ID);
 			continue;
 
 		case JBD2_REVOKE_BLOCK:
@@ -859,10 +870,12 @@ static int do_one_pass(journal_t *journal,
 	 * log.  If the latter happened, then we know that the "current"
 	 * transaction marks the end of the valid log.
 	 */
+pr_info("done, info->end_transaction is %ld.\n", (long)info->end_transaction);
 
 	if (pass == PASS_SCAN) {
 		if (!info->end_transaction)
 			info->end_transaction = next_commit_ID;
+pr_info("done with update, info->end_transaction is %ld.\n", (long)info->end_transaction);
 	} else {
 		/* It's really bad news if different passes end up at
 		 * different places (but possible due to IO errors). */
-- 
2.33.1


      reply	other threads:[~2021-12-30  8:16 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-28 20:36 JBD2: journal transaction 6943 on loop0-8 is corrupt Manfred Spraul
2021-12-30  1:37 ` Theodore Ts'o
2021-12-30  8:16   ` Manfred Spraul [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8fe067d0-6d57-9dd7-2c10-5a2c34037ee1@colorfullife.com \
    --to=manfred@colorfullife.com \
    --cc=1vier1@web.de \
    --cc=adilger.kernel@dilger.ca \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.