* [PATCH 0/4] ext4/jbd2: misc 3.17 bugfixes
@ 2014-09-11  0:28 Darrick J. Wong
  2014-09-11  0:28 ` [PATCH 1/4] jbd2: fix journal checksum feature flag handling Darrick J. Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Darrick J. Wong @ 2014-09-11  0:28 UTC (permalink / raw)
  To: tytso, darrick.wong; +Cc: linux-ext4

Hi all,

Here are four patches against 3.17-rc4 to fix some minor problems
in jbd2 and ext4.  None of these four depend on each other; they
fix separate small bugs.

The first patch fixes the journal_checksum feature flag handling at
mount time.  This patch has been out for review on the list for a
while.

The second patch fixes external journal mounting so that the
superblock checksum (of the ext. journal) is verified if metadata_csum
is set.  This is the same patch that has been out for review for a few
days.

The third patch fixes a journal_checksum_v3 replay bug -- if a block
is in a transaction, and then later revoked and written into another
transaction, and the block in the second transaction is corrupt, the
replay would fail to write even the block from the first transaction.
This would worsen the damage caused by a corrupt journal.

The fourth patch fixes an inline_data bug where we would release a page
but then keep using it, which resulted in complaints about freeing
locked pages at umount time or strange system crashes.

Patches are against 3.17-rc4, and have been run through xfstests and
checked against test journals created with debugfs.  There's still a
hard-to-reproduce crash when ext4_destroy_inline_data_nolock tries to
remove the inline data xattr from a corrupt inode, so we'll see if I
can nail that one down.

Comments and questions are, as always, welcome.

--D

* Re: [PATCH 3/4] jbd2: restart replay without revokes if journal block csum fails
@ 2014-09-12 13:14 TR Reardon
  2014-09-12 16:15 ` Jan Kara
  0 siblings, 1 reply; 18+ messages in thread
From: TR Reardon @ 2014-09-12 13:14 UTC (permalink / raw)
  To: Jan Kara, Darrick J. Wong; +Cc: Jan Kara, tytso, linux-ext4

Trying to follow your description below, but still have some confusion.

In the most common mount case of metadata-only journalling (no data journalling), are revokes emitted only when extent blocks or directory blocks are released and reused as data blocks?  I.e., will updating a metadata block in place (inodes, bitmaps, etc.) never yield a revoke record?

--- Original Message ---

From: "Jan Kara" <jack@suse.cz>
Sent: September 12, 2014 5:59 AM
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: "Jan Kara" <jack@suse.cz>, tytso@mit.edu, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 3/4] jbd2: restart replay without revokes if journal block csum fails

On Thu 11-09-14 10:43:29, Darrick J. Wong wrote:
> On Thu, Sep 11, 2014 at 10:30:09AM -0700, Darrick J. Wong wrote:
> > On Thu, Sep 11, 2014 at 03:15:11PM +0200, Jan Kara wrote:
> > > On Wed 10-09-14 17:28:38, Darrick J. Wong wrote:
> > > > If, during a journal_checksum_v3 replay, we encounter a block that
> > > > doesn't match the checksum in its descriptor block tag, we need to restart
> > > > the replay without the revoke table in the hopes of replaying the
> > > > newest non-corrupt version of the block that we possibly can.
> > >   Ho hum, I don't like this. If you just ignore revoke list, you'll happily
> > > overwrite freshly allocated data blocks with older metadata. Also when
> > > verifying the checksum, we already know the block hasn't been revoked
> > > so what's even the benefit of ignoring the revoke list?
> >
> > Let's say block 1 contains contents B0 and the journal contains:
> >
> >  1. write block 1 with B1
> >  2. revoke "write of block 1 (with B1)"
> >  3. write block 1 with B2
> >
> > Now say that B2 gets corrupted, which means that #3 won't get replayed.
> > Because the revoke in #2 prevented the write in #1 from being replayed, at the
> > end of replay, block 1 has contents B0, even though B1 could have been played
> > back.
> >
> > What I'm really confused about is the intent of revoke records -- do they exist
> > to say "don't replay older versions of this block; a new one will follow
> > later"?  Or do they mean only "don't replay this block if it exists in an earlier
> > transaction" either because a newer block will follow OR because that block is
> > now something non-journalled (i.e.  file data)?  I started off thinking the
> > first, but perhaps it's really the second.
>
> Ahh, I get it.  Revoke records are used only to indicate that a particular
> block that's in the journal has become an un-journalled block; a subsequent
  Yup, exactly.

> re-add to the journal removes the revoke record.
  Well, not quite. A block is revoked in some transaction (and that
information is stored in that transaction in the journal). Thus we don't
replay that block from older transactions. If in your example B2 gets
corrupted, replaying B1 makes no sense because the existence of the revoke
record means that the block has been reused for data. So the metadata in B1
is hopelessly outdated anyway.

                                                                Honza

> > Rather than dumping the entire revoke list, I think I can just erase the
> > previous revoke records for just the corrupt block and then restart the replay.
> >
> > --D
> >
> > >
> > >                                                           Honza
> > >
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  fs/jbd2/recovery.c |   19 +++++++++++++++++--
> > > >  1 file changed, 17 insertions(+), 2 deletions(-)
> > > >
> > > >
> > > > diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
> > > > index 9b329b5..0094d8b 100644
> > > > --- a/fs/jbd2/recovery.c
> > > > +++ b/fs/jbd2/recovery.c
> > > > @@ -439,6 +439,7 @@ static int do_one_pass(journal_t *journal,
> > > >          * block offsets): query the superblock.
> > > >          */
> > > >
> > > > +restart_pass:
> > > >         sb = journal->j_superblock;
> > > >         next_commit_ID = be32_to_cpu(sb->s_sequence);
> > > >         next_log_block = be32_to_cpu(sb->s_start);
> > > > @@ -585,7 +586,8 @@ static int do_one_pass(journal_t *journal,
> > > >                                         /* If the block has been
> > > >                                          * revoked, then we're all done
> > > >                                          * here. */
> > > > -                                       if (jbd2_journal_test_revoke
> > > > +                                       if (!block_error &&
> > > > +                                           jbd2_journal_test_revoke
> > > >                                             (journal, blocknr,
> > > >                                              next_commit_ID)) {
> > > >                                                 brelse(obh);
> > > > @@ -599,11 +601,24 @@ static int do_one_pass(journal_t *journal,
> > > >                                                 be32_to_cpu(tmp->h_sequence))) {
> > > >                                                 brelse(obh);
> > > >                                                 success = -EIO;
> > > > +                                               if (!block_error) {
> > > > +                                                       /* If we see a corrupt
> > > > +                                                        * block, kill the
> > > > +                                                        * revoke list and
> > > > +                                                        * restart the replay
> > > > +                                                        * so that the blocks
> > > > +                                                        * are as close to
> > > > +                                                        * accurate as
> > > > +                                                        * possible. */
> > > > +                                                       jbd2_journal_clear_revoke(journal);
> > > > +                                                       brelse(bh);
> > > > +                                                       block_error = 1;
> > > > +                                                       goto restart_pass;
> > > > +                                               }
> > > >                                                 printk(KERN_ERR "JBD2: Invalid "
> > > >                                                        "checksum recovering "
> > > >                                                        "block %llu in log\n",
> > > >                                                        blocknr);
> > > > -                                               block_error = 1;
> > > >                                                 goto skip_write;
> > > >                                         }
> > > >
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > --
> > > Jan Kara <jack@suse.cz>
> > > SUSE Labs, CR
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

Thread overview: 18+ messages
2014-09-11  0:28 [PATCH 0/4] ext4/jbd2: misc 3.17 bugfixes Darrick J. Wong
2014-09-11  0:28 ` [PATCH 1/4] jbd2: fix journal checksum feature flag handling Darrick J. Wong
2014-09-11 15:47   ` Theodore Ts'o
2014-09-11  0:28 ` [PATCH 2/4] ext4: validate external journal superblock checksum Darrick J. Wong
2014-09-11 13:25   ` Jan Kara
2014-09-11 15:47     ` Theodore Ts'o
2014-09-11 17:11     ` Darrick J. Wong
2014-09-11  0:28 ` [PATCH 3/4] jbd2: restart replay without revokes if journal block csum fails Darrick J. Wong
2014-09-11 13:15   ` Jan Kara
2014-09-11 17:30     ` Darrick J. Wong
2014-09-11 17:43       ` Darrick J. Wong
2014-09-12  9:59         ` Jan Kara
2014-09-11  0:28 ` [PATCH 4/4] ext4: don't keep using page if inline conversion fails Darrick J. Wong
2014-09-11 13:17   ` Jan Kara
2014-09-11 15:46     ` Theodore Ts'o
2014-09-11 20:35 ` [PATCH 5/4] ext4: check EA value offset when loading Darrick J. Wong
2014-09-12 13:14 [PATCH 3/4] jbd2: restart replay without revokes if journal block csum fails TR Reardon
2014-09-12 16:15 ` Jan Kara
