All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Zhang Yi <yi.zhang@huawei.com>
Cc: Jan Kara <jack@suse.cz>, Zhang Yi <yi.zhang@huaweicloud.com>,
	linux-ext4@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, yukuai3@huawei.com,
	ocfs2-devel@oss.oracle.com
Subject: Re: [PATCH v3 1/2] jbd2: continue to record log between each mount
Date: Fri, 17 Mar 2023 12:25:28 +0100	[thread overview]
Message-ID: <20230317112528.cig7fczuoezn23wy@quack3> (raw)
In-Reply-To: <20230315172817.egezft3msc5z4omm@quack3>

On Wed 15-03-23 18:28:17, Jan Kara wrote:
> On Wed 15-03-23 20:37:32, Zhang Yi wrote:
> > On 2023/3/15 17:48, Jan Kara wrote:
> > > On Tue 14-03-23 22:05:21, Zhang Yi wrote:
> > >> From: Zhang Yi <yi.zhang@huawei.com>
> > >>
> > >> For a newly mounted file system, the journal committing thread always
> > >> record new transactions from the start of the journal area, no matter
> > >> whether the journal was clean or just has been recovered. So the logdump
> > >> code in debugfs cannot dump continuous logs between each mount, it is
> > >> disadvantageous to analysis corrupted file system image and locate the
> > >> file system inconsistency bugs.
> > >>
> > >> If we get a corrupted file system in the running products and want to
> > >> find out what has happened, besides lookup the system log, one effective
> > >> way is to backtrack the journal log. But we may not always run e2fsck
> > >> before each mount and the default fsck -a mode also cannot always
> > >> checkout all inconsistencies, so it could left over some inconsistencies
> > >> into the next mount until we detect it. Finally, transactions in the
> > >> journal may probably discontinuous and some relatively new transactions
> > >> has been covered, it becomes hard to analyse. If we could record
> > >> transactions continuously between each mount, we could acquire more
> > >> useful info from the journal. Like this:
> > >>
> > >>  |Previous mount checkpointed/recovered logs|Current mount logs         |
> > >>  |{------}{---}{--------} ... {------}| ... |{======}{========}...000000|
> > >>
> > >> And yes the journal area is limited and cannot record everything, the
> > >> problematic transaction may also be covered even if we do this, but
> > >> this is still useful for fuzzy tests and short-running products.
> > >>
> > >> This patch save the head blocknr in the superblock after flushing the
> > >> journal or unmounting the file system, let the next mount could continue
> > >> to record new transaction behind it. This change is backward compatible
> > >> because the old kernel does not care about the head blocknr of the
> > >> journal. It is also fine if we mount a clean old image without valid
> > >> head blocknr, we fail back to set it to s_first just like before.
> > >> Finally, for the case of mount an unclean file system, we could also get
> > >> the journal head easily after scanning/replaying the journal, it will
> > >> continue to record new transaction after the recovered transactions.
> > >>
> > >> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> > > 
> > > I like this implementation! I even think we could perhaps make ext4 always
> > > behave this way to not increase size of the test matrix. Or do you see any
> > > downside to this option?
> > > 
> > 
> > Thanks for your suggestion. Indeed, I don't find any side effect on this
> > option both in theory and in the actual use tests on ext4, I added a new
> > option was just from the safe point of view and let user could disable it if
> > they don't want it. I also prefer to make ext4 always behave this way.:)
> > 
> > I would like to keep the JBD2_CYCLE_RECORD flag(ocfs2 also use jbd2, I don't
> > want to disturb it until it needs), remove EXT4_MOUNT2_JOURNAL_CYCLE_RECORD
> > and always set JBD2_CYCLE_RECORD on ext4 in patch 2 in the next iteration.
> 
> Yes, that makes sense.

FWIW yesterday I'v spoken with Ted and he also agrees that we don't need
ext4 mount option for this.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

WARNING: multiple messages have this Message-ID (diff)
From: Jan Kara via Ocfs2-devel <ocfs2-devel@oss.oracle.com>
To: Zhang Yi <yi.zhang@huawei.com>
Cc: Jan Kara <jack@suse.cz>, Zhang Yi <yi.zhang@huaweicloud.com>,
	adilger.kernel@dilger.ca, yukuai3@huawei.com, tytso@mit.edu,
	linux-ext4@vger.kernel.org, ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] [PATCH v3 1/2] jbd2: continue to record log	between each mount
Date: Fri, 17 Mar 2023 12:25:28 +0100	[thread overview]
Message-ID: <20230317112528.cig7fczuoezn23wy@quack3> (raw)
In-Reply-To: <20230315172817.egezft3msc5z4omm@quack3>

On Wed 15-03-23 18:28:17, Jan Kara wrote:
> On Wed 15-03-23 20:37:32, Zhang Yi wrote:
> > On 2023/3/15 17:48, Jan Kara wrote:
> > > On Tue 14-03-23 22:05:21, Zhang Yi wrote:
> > >> From: Zhang Yi <yi.zhang@huawei.com>
> > >>
> > >> For a newly mounted file system, the journal committing thread always
> > >> record new transactions from the start of the journal area, no matter
> > >> whether the journal was clean or just has been recovered. So the logdump
> > >> code in debugfs cannot dump continuous logs between each mount, it is
> > >> disadvantageous to analysis corrupted file system image and locate the
> > >> file system inconsistency bugs.
> > >>
> > >> If we get a corrupted file system in the running products and want to
> > >> find out what has happened, besides lookup the system log, one effective
> > >> way is to backtrack the journal log. But we may not always run e2fsck
> > >> before each mount and the default fsck -a mode also cannot always
> > >> checkout all inconsistencies, so it could left over some inconsistencies
> > >> into the next mount until we detect it. Finally, transactions in the
> > >> journal may probably discontinuous and some relatively new transactions
> > >> has been covered, it becomes hard to analyse. If we could record
> > >> transactions continuously between each mount, we could acquire more
> > >> useful info from the journal. Like this:
> > >>
> > >>  |Previous mount checkpointed/recovered logs|Current mount logs         |
> > >>  |{------}{---}{--------} ... {------}| ... |{======}{========}...000000|
> > >>
> > >> And yes the journal area is limited and cannot record everything, the
> > >> problematic transaction may also be covered even if we do this, but
> > >> this is still useful for fuzzy tests and short-running products.
> > >>
> > >> This patch save the head blocknr in the superblock after flushing the
> > >> journal or unmounting the file system, let the next mount could continue
> > >> to record new transaction behind it. This change is backward compatible
> > >> because the old kernel does not care about the head blocknr of the
> > >> journal. It is also fine if we mount a clean old image without valid
> > >> head blocknr, we fail back to set it to s_first just like before.
> > >> Finally, for the case of mount an unclean file system, we could also get
> > >> the journal head easily after scanning/replaying the journal, it will
> > >> continue to record new transaction after the recovered transactions.
> > >>
> > >> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> > > 
> > > I like this implementation! I even think we could perhaps make ext4 always
> > > behave this way to not increase size of the test matrix. Or do you see any
> > > downside to this option?
> > > 
> > 
> > Thanks for your suggestion. Indeed, I don't find any side effect on this
> > option both in theory and in the actual use tests on ext4, I added a new
> > option was just from the safe point of view and let user could disable it if
> > they don't want it. I also prefer to make ext4 always behave this way.:)
> > 
> > I would like to keep the JBD2_CYCLE_RECORD flag(ocfs2 also use jbd2, I don't
> > want to disturb it until it needs), remove EXT4_MOUNT2_JOURNAL_CYCLE_RECORD
> > and always set JBD2_CYCLE_RECORD on ext4 in patch 2 in the next iteration.
> 
> Yes, that makes sense.

FWIW yesterday I'v spoken with Ted and he also agrees that we don't need
ext4 mount option for this.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

  reply	other threads:[~2023-03-17 11:25 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-14 14:05 [PATCH v3 0/2] ext4, jbd2: journal cycled record transactions between each mount Zhang Yi
2023-03-14 14:05 ` [PATCH v3 1/2] jbd2: continue to record log " Zhang Yi
2023-03-15  9:48   ` Jan Kara
2023-03-15 12:37     ` [Ocfs2-devel] " Zhang Yi via Ocfs2-devel
2023-03-15 12:37       ` Zhang Yi
2023-03-15 17:28       ` Jan Kara
2023-03-15 17:28         ` [Ocfs2-devel] " Jan Kara via Ocfs2-devel
2023-03-17 11:25         ` Jan Kara [this message]
2023-03-17 11:25           ` Jan Kara via Ocfs2-devel
2023-03-18  2:25           ` Zhang Yi
2023-03-18  2:25             ` [Ocfs2-devel] " Zhang Yi via Ocfs2-devel
2023-03-14 14:05 ` [PATCH v3 2/2] ext4: add journal cycled recording support Zhang Yi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230317112528.cig7fczuoezn23wy@quack3 \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=linux-ext4@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=tytso@mit.edu \
    --cc=yi.zhang@huawei.com \
    --cc=yi.zhang@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.