linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hugo Mills <hugo@carfax.org.uk>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org, Leonard Lausen <leonard@lausen.nl>
Subject: Re: [PATCH] btrfs: Do mandatory tree block check before submitting bio
Date: Wed, 16 Jan 2019 13:07:42 +0000	[thread overview]
Message-ID: <20190116130742.GH5134@carfax.org.uk> (raw)
In-Reply-To: <702957e0-b7c9-bc5f-2409-bdc5ff8d39b6@gmx.com>

[-- Attachment #1: Type: text/plain, Size: 3963 bytes --]

On Wed, Jan 16, 2019 at 08:26:35PM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/1/16 下午8:01, Hugo Mills wrote:
> >    Hi, Qu,
> > 
> > On Wed, Jan 16, 2019 at 07:53:08PM +0800, Qu Wenruo wrote:
> >> There are at least 2 reports about memory bit flip sneaking into on-disk
> >> data.
> >>
> >> Currently we only have a relaxed check triggered at
> >> btrfs_mark_buffer_dirty() time, as it's not mandatory, only for
> >> CONFIG_BTRFS_FS_CHECK_INTEGRITY enabled build.
> >>
> >> This patch will address the hole by triggering comprehensive check on
> >> tree blocks before writing it back to disk.
> >>
> >> The timing is set to csum_tree_block() where @verify == 0.
> >> At that timing, we're generation csum for tree blocks before submitting
> >> the metadata bio, so we could avoid all the unnecessary calls at
> >> btrfs_mark_buffer_dirty(), but still catch enough error.
> > 
> >    I agree wholeheartedly with the idea of this change. Just one
> > question:
> > 
> >    How does this get reported to the user/administrator? As I
> > understand it, a detectably corrupt metadata page will generate an I/O
> > error from the filesystem before it's written to disk? How will this
> > show up in kernel logs?
> 
> Well, you caught me.
> 
> I haven't try the error case, and in fact if it fails, it fails by
> triggering kernel BUG_ON(), thus you may not have a chance to see the
> error message from btrfs module.
> 
> > Is it distinguishable in any way from a
> > similar error that was generated on reading such a corrupt metadata
> > node from the disk?
> 
> Kind of distinguishable for this patch, when you hit kernel BUG_ON() at
> fs/btrfs/extent_io.c:4016 then it's definitely from this patch. :P

   Haha. :)

> >    Basically, I want to be able to distinguish this case (error
> > detected when writing) from the existing case (error detected when
> > reading) when someone shows up on IRC with a "broken filesystem".
> 
> Definitely, I'll address this and the BUG_ON() in next version.

   The error-on-read case gives us a pretty good report on what's
wrong and where. Typically something like "bad key order" or "wrong
item size", plus the logical address of the metadata chunk so we can
dump it with debug-tree -b.

   Having the same messages from this call-site to indicate the kind
of error found, and also something to indicate that it was detected
before hitting disk (i.e. which call-site for the checks triggered the
error) would be, I think, the minimal information needed. If we could
also have the human-readable dump of the full metadata page logged as
well, that would be ideal -- we won't be able to use debug-tree to
diagnose the issue afterwards, as it won't have reached the disk.

   Thanks,
   Hugo.

> Thanks,
> Qu
> 
> > 
> >    Hugo.
> > 
> >> Reported-by: Leonard Lausen <leonard@lausen.nl>
> >> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >> ---
> >>  fs/btrfs/disk-io.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> >> index 8da2f380d3c0..45bf6be9e751 100644
> >> --- a/fs/btrfs/disk-io.c
> >> +++ b/fs/btrfs/disk-io.c
> >> @@ -313,6 +313,16 @@ static int csum_tree_block(struct btrfs_fs_info *fs_info,
> >>  			return -EUCLEAN;
> >>  		}
> >>  	} else {
> >> +		/*
> >> +		 * Here we're calculating csum before writing it to disk,
> >> +		 * do comprenhensive check here to catch memory corruption
> >> +		 */
> >> +		if (btrfs_header_level(buf))
> >> +			err = btrfs_check_node(fs_info, buf);
> >> +		else
> >> +			err = btrfs_check_leaf_full(fs_info, buf);
> >> +		if (err < 0)
> >> +			return err;
> >>  		write_extent_buffer(buf, result, 0, csum_size);
> >>  	}
> >>  
> > 
> 




-- 
Hugo Mills             | Friends: people who know you well, but like you
hugo@... carfax.org.uk | anyway.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

      reply	other threads:[~2019-01-16 13:07 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-16 11:53 [PATCH] btrfs: Do mandatory tree block check before submitting bio Qu Wenruo
2019-01-16 12:01 ` Hugo Mills
2019-01-16 12:26   ` Qu Wenruo
2019-01-16 13:07     ` Hugo Mills [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190116130742.GH5134@carfax.org.uk \
    --to=hugo@carfax.org.uk \
    --cc=leonard@lausen.nl \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).