From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v5 00/12] btrfs: Enhancement to tree block validation
Date: Fri, 15 Feb 2019 18:50:32 +0800
Message-ID: <20190215105044.17619-1-wqu@suse.com>

The patchset can be fetched from GitHub:
https://github.com/adam900710/linux/tree/write_time_tree_checker
It is based on the v5.0-rc1 tag, and it rebases onto misc-next without
conflicts.

This patchset has the following 3 features:
- Tree block validation output enhancement
  * Output validation failure timing (write time or read time)
  * Always output tree block level/key mismatch error message
    This part is already submitted and reviewed.

- Write time tree block validation check
  To catch memory corruption caused by either the hardware or the kernel.
  Example output would be:

    BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 169 0)
    BTRFS error (device dm-3): write time tree block corruption detected
    BTRFS: error (device dm-3) in btrfs_commit_transaction:2220: errno=-5 IO failure (Error while writing out transaction)
    BTRFS info (device dm-3): forced readonly
    BTRFS warning (device dm-3): Skipping commit of aborted transaction.
    BTRFS: error (device dm-3) in cleanup_transaction:1839: errno=-5 IO failure
    BTRFS info (device dm-3): delayed_refs has NO entry

- Better error handling before calling flush_write_bio()
  One hidden reason for calling flush_write_bio() in all cases is that
  flush_write_bio() triggers the endio function, and the endio function
  of epd->bio always frees the bio.
  So we're in fact abusing flush_write_bio() as a cleanup helper.

  Now that flush_write_bio() has its own return value, we shouldn't call
  it unconditionally. Instead we introduce a proper cleanup helper,
  end_write_bio(). The call pattern changes as follows:
              New                 |           Old
  --------------------------------------------------------------
  ret = do_some_evil(&epd);       | ret = do_some_evil(&epd);
  if (ret < 0) {                  | flush_write_bio(&epd);
          end_write_bio(&epd, ret); | ^^^ submitting half-baked epd->bio?
          return ret;               | return ret;
  }                               |
  ret = flush_write_bio(&epd);    |
  return ret;                     |

  The new pattern should make the error handling path more streamlined.

Changelog:
v2:
- Unlock locked pages in lock_extent_buffer_for_io() for error handling.
- Added Reviewed-by tags.

v3:
- Remove duplicated error message.
- Use IS_ENABLED() macro to replace #ifdef.
- Added Reviewed-by tags.

v4:
- Reorganized the patch split
  Now each BUG_ON() cleanup has its own patch.
- Dug much further into the call sites to eliminate unexpected >0 returns
  This may be a little paranoid and may overuse ASSERT(), but it should be
  much safer against future code changes.
- Fixed the false alerts caused by balance and memory pressure
  The fix is to skip the owner check for non-essential trees at write time,
  since the owner root isn't always reliable, either due to a commit root
  created in the current transaction or due to balance + memory pressure.

v5:
- Do proper error-out handling rather than relying on flush_write_bio()
  to clean up.
  As a side effect, the modified patches no longer carry Reviewed-by tags.
- New comment explaining why we don't need to do anything about epd->bio
  when submit_one_bio() fails.
- Added some Reviewed-by tags.

Qu Wenruo (12):
  btrfs: Always output error message when key/level verification fails
  btrfs: extent_io: Kill the forward declaration of flush_write_bio()
  btrfs: disk-io: Show the timing of corrupted tree block explicitly
  btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up
  btrfs: extent_io: Handle error better in extent_write_full_page()
  btrfs: extent_io: Handle error better in btree_write_cache_pages()
  btrfs: extent_io: Kill the dead branch in extent_write_cache_pages()
  btrfs: extent_io: Handle error better in extent_write_locked_range()
  btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io()
  btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages()
  btrfs: extent_io: Handle error better in extent_writepages()
  btrfs: Do mandatory tree block check before submitting bio

 fs/btrfs/disk-io.c      |  21 +++--
 fs/btrfs/extent_io.c    | 168 ++++++++++++++++++++++++++++------------
 fs/btrfs/tree-checker.c |  24 +++++-
 fs/btrfs/tree-checker.h |   8 ++
 4 files changed, 162 insertions(+), 59 deletions(-)

-- 
2.20.1


