* [PATCH v2 0/7] btrfs: read-repair rework based on bitmap
@ 2022-05-25 10:59 Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 1/7] btrfs: save the original bi_iter into btrfs_bio for buffered read Qu Wenruo
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig

This is the bitmap version revived, based on Christoph's
cleanup series.

The branch can be fetched from my repo:
https://github.com/adam900710/linux/tree/read_repair

Changelog:
v2:
- Still go with the bitmap version to get batched submission

- Instead of using the old failed bio to hold all the pages, here we
  use an ordinary bio to hold only the corrupted sectors.

- Unlike the previous bitmap version, this time there is no memory
  allocation. The needed bitmap is on-stack, and limited to 32 bits.

- Use synchronous submission for both read and write

- Only use the bitmap for the corrupted range.
  Unlike the previous version, which allocated a bitmap covering the
  whole bio. That was a waste of memory.

- btrfs_read_repair_add_sector() will finish the existing repair early
  for a non-contiguous new sector, or if we have reached the size limit
  (due to the bitmap size limit, currently 128K for 4K sectorsize).

  This allows us to use a fixed bitmap size.


The core function is still btrfs_read_repair_finish().

But now, btrfs_read_repair_finish() is either called without any
corrupted sectors, or it only faces a contiguous range of corrupted
sectors.

Then we handle the range by iterating over all the remaining mirrors.

For each mirror we do the following:

1) Try to add the current bad sector into our io_bio.
   If the result would not be contiguous, we just submit the current
   io_bio and wait for it.
   Then add the new sector into the fresh io_bio.

2) If the io_bio is not empty, submit it.

By 1) and 2), we will read all bad sectors from the new mirror.

3) Check if the data is fine and update our ctrl::bad_bitmap.

We either end with all sectors repaired, or all mirrors exhausted.

The advantage of the bitmap method is that we only try at most
(num_copies - 1) times, no matter what the corruption pattern is.
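
For reference, this is a condensed sketch of that loop, using the names
from btrfs_read_repair_finish() in patch 4 (error handling trimmed):

	for (mirror = get_next_mirror(ctrl->failed_mirror, ctrl->num_copies);
	     mirror != ctrl->failed_mirror;
	     mirror = get_next_mirror(mirror, ctrl->num_copies)) {
		/*
		 * Batched read from this mirror, re-check csums and write
		 * good copies back; clears bits in ctrl->bad_bitmap for
		 * every sector that got repaired.
		 */
		repair_from_mirror(ctrl, mirror);

		/* All sectors repaired, stop early. */
		if (find_first_bit(&ctrl->bad_bitmap, nr_sectors) == nr_sectors)
			break;
	}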

On the other hand, in the worst case we're still doing sector-by-sector
repair.
For the best case (aka, contiguous corruption), we still do batched bio
submission, and are thus still way better than sector-by-sector
repair.

Furthermore, all loops in the code are regular for() loops, with no
hacks in how we loop.

But I have to admit that, even though repair_from_mirror() and
btrfs_read_repair_finish() are easy to read, the details of bio page
adding and submission are all hidden inside io_add_or_submit() and
btrfs_read_repair_add_sector().

So although the bitmap method bounds the retry count, in the worst case
it is still no faster than sector-by-sector repair.

Cc: Christoph Hellwig <hch@lst.de>


Christoph Hellwig (1):
  btrfs: add a btrfs_map_bio_wait helper

Qu Wenruo (6):
  btrfs: save the original bi_iter into btrfs_bio for buffered read
  btrfs: make repair_io_failure available outside of extent_io.c
  btrfs: introduce new read-repair infrastructure
  btrfs: make buffered read path to use the new read repair
    infrastructure
  btrfs: make direct io read path to use the new read repair
    infrastructure
  btrfs: remove io_failure_record infrastructure completely

 fs/btrfs/Makefile            |   2 +-
 fs/btrfs/btrfs_inode.h       |   5 -
 fs/btrfs/extent-io-tree.h    |  15 --
 fs/btrfs/extent_io.c         | 436 +++--------------------------------
 fs/btrfs/extent_io.h         |  28 +--
 fs/btrfs/inode.c             |  60 ++---
 fs/btrfs/read-repair.c       | 328 ++++++++++++++++++++++++++
 fs/btrfs/read-repair.h       |  48 ++++
 fs/btrfs/volumes.c           |  21 ++
 fs/btrfs/volumes.h           |   2 +
 include/trace/events/btrfs.h |   1 -
 11 files changed, 458 insertions(+), 488 deletions(-)
 create mode 100644 fs/btrfs/read-repair.c
 create mode 100644 fs/btrfs/read-repair.h

-- 
2.36.1



* [PATCH v2 1/7] btrfs: save the original bi_iter into btrfs_bio for buffered read
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
@ 2022-05-25 10:59 ` Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 2/7] btrfs: make repair_io_failure available outside of extent_io.c Qu Wenruo
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig

Although we have btrfs_bio::iter, it currently has very limited usage:

- RAID56
  Where it is not needed at all

- btrfs_bio_clone()
  This is used mostly for direct IO.

For the incoming read repair patches, we want to grab the original
logical bytenr, and be able to iterate the range of the bio (no matter
if it's cloned).

So this patch will also save btrfs_bio::iter for buffered read bios at
submit_one_bio().
And for the sake of consistency, also save the btrfs_bio::iter for
direct IO at btrfs_submit_dio_bio().

The reason we don't save the iter in btrfs_map_bio() is that
btrfs_map_bio() has to handle various bios, with or without the
btrfs_bio bioset.
And we want to keep btrfs_map_bio() handling, and only handling, plain
bios without bothering with the bioset.
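
For reference, the read-repair code later in this series recovers the
logical bytenr of a sector from the saved iter like this (snippet taken
from patch 5):

	u64 logical = (failed_bbio->iter.bi_sector << SECTOR_SHIFT) +
		      bio_offset + offset;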

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/extent_io.c | 7 +++++++
 fs/btrfs/inode.c     | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1d144f655f65..1bd1b1253f9d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -188,6 +188,13 @@ static void submit_one_bio(struct bio *bio, int mirror_num,
 	/* Caller should ensure the bio has at least some range added */
 	ASSERT(bio->bi_iter.bi_size);
 
+	/*
+	 * Save the original bi_iter for read bios, as read repair wants the
+	 * original logical bytenr.
+	 */
+	if (bio_op(bio) == REQ_OP_READ)
+		btrfs_bio(bio)->iter = bio->bi_iter;
+
 	if (is_data_inode(tree->private_data))
 		btrfs_submit_data_bio(tree->private_data, bio, mirror_num,
 					    compress_type);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 34466b543ed9..dd0882e1b982 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7974,6 +7974,8 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio,
 		ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA);
 		if (ret)
 			goto err;
+		/* Check submit_one_bio() for the reason. */
+		btrfs_bio(bio)->iter = bio->bi_iter;
 	}
 
 	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
-- 
2.36.1



* [PATCH v2 2/7] btrfs: make repair_io_failure available outside of extent_io.c
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 1/7] btrfs: save the original bi_iter into btrfs_bio for buffered read Qu Wenruo
@ 2022-05-25 10:59 ` Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 3/7] btrfs: add a btrfs_map_bio_wait helper Qu Wenruo
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig, Johannes Thumshirn

Remove the static so that the function can be used by the new read
repair code, and give it a btrfs_ prefix.

Signed-off-by: Qu Wenruo <wqu@suse.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/extent_io.c | 19 ++++++++++---------
 fs/btrfs/extent_io.h |  3 +++
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1bd1b1253f9d..1083d6cfa858 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2321,9 +2321,9 @@ int free_io_failure(struct extent_io_tree *failure_tree,
  * currently, there can be no more than two copies of every data bit. thus,
  * exactly one rewrite is required.
  */
-static int repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
-			     u64 length, u64 logical, struct page *page,
-			     unsigned int pg_offset, int mirror_num)
+int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
+			    u64 length, u64 logical, struct page *page,
+			    unsigned int pg_offset, int mirror_num)
 {
 	struct btrfs_device *dev;
 	struct bio_vec bvec;
@@ -2415,8 +2415,9 @@ int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num)
 	for (i = 0; i < num_pages; i++) {
 		struct page *p = eb->pages[i];
 
-		ret = repair_io_failure(fs_info, 0, start, PAGE_SIZE, start, p,
-					start - page_offset(p), mirror_num);
+		ret = btrfs_repair_io_failure(fs_info, 0, start, PAGE_SIZE,
+					      start, p, start - page_offset(p),
+					      mirror_num);
 		if (ret)
 			break;
 		start += PAGE_SIZE;
@@ -2466,9 +2467,9 @@ int clean_io_failure(struct btrfs_fs_info *fs_info,
 		num_copies = btrfs_num_copies(fs_info, failrec->logical,
 					      failrec->len);
 		if (num_copies > 1)  {
-			repair_io_failure(fs_info, ino, start, failrec->len,
-					  failrec->logical, page, pg_offset,
-					  failrec->failed_mirror);
+			btrfs_repair_io_failure(fs_info, ino, start,
+					failrec->len, failrec->logical,
+					page, pg_offset, failrec->failed_mirror);
 		}
 	}
 
@@ -2626,7 +2627,7 @@ static bool btrfs_check_repairable(struct inode *inode,
 	 *
 	 * Since we're only doing repair for one sector, we only need to get
 	 * a good copy of the failed sector and if we succeed, we have setup
-	 * everything for repair_io_failure to do the rest for us.
+	 * everything for btrfs_repair_io_failure() to do the rest for us.
 	 */
 	ASSERT(failed_mirror);
 	failrec->failed_mirror = failed_mirror;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 956fa434df43..6cdcea1551a6 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -276,6 +276,9 @@ int btrfs_repair_one_sector(struct inode *inode,
 			    struct page *page, unsigned int pgoff,
 			    u64 start, int failed_mirror,
 			    submit_bio_hook_t *submit_bio_hook);
+int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
+			    u64 length, u64 logical, struct page *page,
+			    unsigned int pg_offset, int mirror_num);
 
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 bool find_lock_delalloc_range(struct inode *inode,
-- 
2.36.1



* [PATCH v2 3/7] btrfs: add a btrfs_map_bio_wait helper
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 1/7] btrfs: save the original bi_iter into btrfs_bio for buffered read Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 2/7] btrfs: make repair_io_failure available outside of extent_io.c Qu Wenruo
@ 2022-05-25 10:59 ` Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure Qu Wenruo
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig, Johannes Thumshirn

From: Christoph Hellwig <hch@lst.de>

This helper works like submit_bio_wait(), but goes through the btrfs
bio mapping using btrfs_map_bio().
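
For reference, the read-repair code added in patch 4 uses it to do a
synchronous, mirror-specific submission roughly like this:

	io_bio->bi_opf = opf;

	/* Returns the bio's bi_status once the IO has completed. */
	ret = btrfs_map_bio_wait(fs_info, io_bio, mirror);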

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/volumes.c | 21 +++++++++++++++++++++
 fs/btrfs/volumes.h |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0819db46dbc4..8925bc606db7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6818,6 +6818,27 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 	return BLK_STS_OK;
 }
 
+static void btrfs_end_io_sync(struct bio *bio)
+{
+	complete(bio->bi_private);
+}
+
+blk_status_t btrfs_map_bio_wait(struct btrfs_fs_info *fs_info, struct bio *bio,
+		int mirror)
+{
+	DECLARE_COMPLETION_ONSTACK(done);
+	blk_status_t ret;
+
+	bio->bi_private = &done;
+	bio->bi_end_io = btrfs_end_io_sync;
+	ret = btrfs_map_bio(fs_info, bio, mirror);
+	if (ret)
+		return ret;
+
+	wait_for_completion_io(&done);
+	return bio->bi_status;
+}
+
 static bool dev_args_match_fs_devices(const struct btrfs_dev_lookup_args *args,
 				      const struct btrfs_fs_devices *fs_devices)
 {
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 6f784d4f5466..b346f6c40151 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -555,6 +555,8 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
 void btrfs_mapping_tree_free(struct extent_map_tree *tree);
 blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 			   int mirror_num);
+blk_status_t btrfs_map_bio_wait(struct btrfs_fs_info *fs_info, struct bio *bio,
+		int mirror);
 int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		       fmode_t flags, void *holder);
 struct btrfs_device *btrfs_scan_one_device(const char *path,
-- 
2.36.1



* [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-05-25 10:59 ` [PATCH v2 3/7] btrfs: add a btrfs_map_bio_wait helper Qu Wenruo
@ 2022-05-25 10:59 ` Qu Wenruo
  2022-05-26  3:06   ` Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 5/7] btrfs: make buffered read path to use the new read repair infrastructure Qu Wenruo
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs

The new read repair infrastructure consists of the following 3 parts:

- btrfs_read_repair_ctrl
  Records a contiguous corrupted range.
  Will mostly be an on-stack structure for the top-level endio function.

- btrfs_read_repair_add_sector()
  This function is called each time we hit a bad sector.

  This function itself will check if the bad sector can be merged with
  the existing bad range.

  If not, call btrfs_read_repair_finish() to finish the current range
  first, and then add the new sector into the now empty
  btrfs_read_repair_ctrl.

  Will return -EIO if any range failed to be repaired.

- btrfs_read_repair_finish()
  This function should be called before the endio function exits.

  This function will iterate through all the mirrors, trying to grab
  the correct data.

  If we grabbed a correct sector, we will queue it for later writeback
  into the bad mirror.

To hold the original bad sectors, we have two bios. One is named
@bad_sectors; although it's a bio, we only utilize the bio_vec
infrastructure to hold all the initial bad sectors.

The other one, @io_bio, is what we really utilize to submit new reads
and writes.

For @io_bio, the usage pattern is pretty much the same as in
btrfs_read_repair_add_sector().

If we can merge the target sector, then that's the best case.
If not, then we submit the current @io_bio, wait for it, and allocate a
new bio for the next usage.
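
For reference, the intended calling pattern from an endio function looks
roughly like this (a condensed sketch based on patches 5 and 6, with the
per-sector iteration details omitted):

	struct btrfs_read_repair_ctrl ctrl = { 0 };

	/* For each sector that failed read or csum verification: */
	btrfs_read_repair_add_sector(inode, &ctrl, page, pgoff, logical,
				     file_offset, csum, failed_mirror,
				     is_dio);

	/* Before the endio function exits: */
	ret = btrfs_read_repair_finish(&ctrl);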

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/Makefile      |   2 +-
 fs/btrfs/extent_io.c   |   2 +-
 fs/btrfs/extent_io.h   |   1 +
 fs/btrfs/read-repair.c | 328 +++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/read-repair.h |  48 ++++++
 5 files changed, 379 insertions(+), 2 deletions(-)
 create mode 100644 fs/btrfs/read-repair.c
 create mode 100644 fs/btrfs/read-repair.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 99f9995670ea..0b2605c750ca 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -31,7 +31,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
 	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \
-	   subpage.o tree-mod-log.o
+	   subpage.o tree-mod-log.o read-repair.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1083d6cfa858..160dedb078fd 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2735,7 +2735,7 @@ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
 		btrfs_subpage_end_reader(fs_info, page, start, len);
 }
 
-static void end_sector_io(struct page *page, u64 offset, bool uptodate)
+void end_sector_io(struct page *page, u64 offset, bool uptodate)
 {
 	struct inode *inode = page->mapping->host;
 	u32 sectorsize = btrfs_sb(inode->i_sb)->sectorsize;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 6cdcea1551a6..e3f9db50983d 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -250,6 +250,7 @@ struct bio *btrfs_bio_alloc(unsigned int nr_iovecs);
 struct bio *btrfs_bio_clone(struct block_device *bdev, struct bio *bio);
 struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
 
+void end_sector_io(struct page *page, u64 offset, bool uptodate);
 void end_extent_writepage(struct page *page, int err, u64 start, u64 end);
 int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num);
 
diff --git a/fs/btrfs/read-repair.c b/fs/btrfs/read-repair.c
new file mode 100644
index 000000000000..26f59439fc5c
--- /dev/null
+++ b/fs/btrfs/read-repair.c
@@ -0,0 +1,328 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bio.h>
+#include "ctree.h"
+#include "volumes.h"
+#include "read-repair.h"
+#include "btrfs_inode.h"
+
+static int get_next_mirror(int mirror, int num_copies)
+{
+	if (mirror + 1 > num_copies)
+		return mirror + 1 - num_copies;
+	return mirror + 1;
+}
+
+static int get_prev_mirror(int mirror, int num_copies)
+{
+	if (mirror - 1 == 0)
+		return num_copies;
+	return mirror - 1;
+}
+static void init_new_ctrl(struct inode *inode,
+			  struct btrfs_read_repair_ctrl *ctrl,
+			  u64 logical, u64 file_offset, u8 *csum,
+			  int failed_mirror, bool is_dio)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+
+	ASSERT(!ctrl->inode);
+	ASSERT(!ctrl->logical);
+	ASSERT(!ctrl->bad_sectors);
+	ASSERT(!ctrl->io_bio);
+	ASSERT(!ctrl->len);
+
+	ctrl->inode = inode;
+	ctrl->logical = logical;
+	ctrl->num_copies = btrfs_num_copies(fs_info, logical,
+					    fs_info->sectorsize);
+	ctrl->is_raid56 = btrfs_is_parity_mirror(fs_info, logical,
+						 fs_info->sectorsize);
+	ctrl->is_dio = is_dio;
+	ctrl->file_offset = file_offset;
+	ctrl->failed_mirror = failed_mirror;
+	ctrl->csum = csum;
+	ctrl->bad_sectors = bio_alloc(NULL, BITS_PER_LONG, REQ_OP_READ, GFP_NOFS);
+	ctrl->bad_sectors->bi_iter.bi_sector = logical >> SECTOR_SHIFT;
+
+	ctrl->io_bio = btrfs_bio_alloc(BITS_PER_LONG);
+}
+
+int btrfs_read_repair_add_sector(struct inode *inode,
+				 struct btrfs_read_repair_ctrl *ctrl,
+				 struct page *page, unsigned int pgoff,
+				 u64 logical, u64 file_offset, u8 *csum,
+				 int failed_mirror, bool is_dio)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	const u32 size_limit = BITS_PER_LONG << fs_info->sectorsize_bits;
+	int ret = 0;
+
+	/* Mirror number should not be 0. */
+	ASSERT(failed_mirror);
+
+	/* Not initialized, initialize an empty one. */
+	if (!ctrl->inode) {
+		const int num_copies = btrfs_num_copies(fs_info, logical,
+							fs_info->sectorsize);
+
+		/* No more copies, can not repair. */
+		if (num_copies <= 1) {
+			if (!is_dio)
+				end_sector_io(page, file_offset, false);
+			return -EIO;
+		}
+
+		init_new_ctrl(inode, ctrl, logical, file_offset, csum,
+			      failed_mirror, is_dio);
+	}
+
+	/* Not contiguous with the existing range, finish the current one. */
+	if (ctrl->logical + ctrl->len != logical ||
+	    ctrl->failed_mirror != failed_mirror) {
+		ret = btrfs_read_repair_finish(ctrl);
+		init_new_ctrl(inode, ctrl, logical, file_offset, csum,
+			      failed_mirror, is_dio);
+	}
+
+	/* Contiguous, just add the page into the current bio. */
+	ASSERT(ctrl->bad_sectors);
+	ASSERT(ctrl->len < size_limit);
+	bio_add_page(ctrl->bad_sectors, page, fs_info->sectorsize, pgoff);
+
+	ctrl->len += fs_info->sectorsize;
+	set_bit((logical - ctrl->logical) >> fs_info->sectorsize_bits,
+		&ctrl->bad_bitmap);
+	if (ctrl->len >= size_limit)
+		ret = btrfs_read_repair_finish(ctrl);
+	return ret;
+}
+
+/*
+ * Iterate through a bio on a per-sector basis.
+ */
+#define bio_for_each_sector(fs_info, bvl, bio, iter, bio_offset)	\
+	for ((iter) = bio->bi_iter, (bio_offset) = 0;			\
+	     (iter).bi_size &&						\
+	     (((bvl) = bio_iter_iovec((bio), (iter))), 1);		\
+	     (bio_offset) += fs_info->sectorsize,			\
+	     bio_advance_iter_single(bio, &(iter),			\
+	     (fs_info)->sectorsize))
+
+static void io_bio_submit(struct btrfs_read_repair_ctrl *ctrl, int mirror,
+			  int opf)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(ctrl->inode->i_sb);
+	struct bio *io_bio = ctrl->io_bio;
+	u64 io_logical;
+	u32 io_size;
+	int ret;
+
+	ASSERT(io_bio);
+	io_logical = io_bio->bi_iter.bi_sector << SECTOR_SHIFT;
+	io_size = io_bio->bi_iter.bi_size;
+	/* Not yet utilized, just keep it for later usage. */
+	if (io_size == 0) {
+		io_bio->bi_iter.bi_sector = 0;
+		return;
+	}
+
+	io_bio->bi_opf = opf;
+	ret = btrfs_map_bio_wait(fs_info, ctrl->io_bio, mirror);
+	/* Read succeeded, clear the bad bits. */
+	if ((opf & REQ_OP_MASK) == REQ_OP_READ && !ret)
+		bitmap_clear(&ctrl->bad_bitmap,
+			(io_logical - ctrl->logical) >> fs_info->sectorsize_bits,
+			io_size >> fs_info->sectorsize_bits);
+	ctrl->io_bio = btrfs_bio_alloc(BITS_PER_LONG);
+	/* Set bi_sector to 0 to mark the new bio as uninitialized. */
+	ctrl->io_bio->bi_iter.bi_sector = 0;
+}
+
+static void io_add_or_submit(struct btrfs_read_repair_ctrl *ctrl, int mirror,
+			   u64 logical, struct page *page, unsigned int pgoff,
+			   int opf)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(ctrl->inode->i_sb);
+	struct bio *io_bio = ctrl->io_bio;
+
+	/* Uninitialized. */
+	if (io_bio->bi_iter.bi_sector == 0) {
+		ASSERT(io_bio->bi_iter.bi_size == 0);
+		io_bio->bi_iter.bi_sector = logical >> SECTOR_SHIFT;
+		io_bio->bi_opf = opf;
+		bio_add_page(io_bio, page, fs_info->sectorsize, pgoff);
+		return;
+	}
+
+	/* Contiguous, add the page. */
+	if ((io_bio->bi_iter.bi_sector << SECTOR_SHIFT) +
+	     io_bio->bi_iter.bi_size == logical) {
+		bio_add_page(io_bio, page, fs_info->sectorsize, pgoff);
+		return;
+	}
+
+	/* Not contiguous, submit first. */
+	io_bio_submit(ctrl, mirror, opf);
+	io_bio = ctrl->io_bio;
+	io_bio->bi_iter.bi_sector = logical >> SECTOR_SHIFT;
+	bio_add_page(io_bio, page, fs_info->sectorsize, pgoff);
+}
+
+static void writeback_good_mirror(struct btrfs_read_repair_ctrl *ctrl,
+				  int mirror, u64 logical,
+				  struct page *page, unsigned int pgoff)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(ctrl->inode->i_sb);
+	struct bio *io_bio = ctrl->io_bio;
+
+
+	if (btrfs_repair_one_zone(fs_info, ctrl->logical))
+		return;
+
+	/*
+	 * For RAID56, we can not just write the bad data back, as
+	 * any write will trigger RMW and read back the corrupted
+	 * on-disk stripe, causing further damage.
+	 * So here we do special repair for raid56.
+	 *
+	 * And unfortunately, this repair is very low level and not
+	 * compatible with the rest of the mirror based repair.
+	 * So it's still done in synchronous mode using
+	 * btrfs_repair_io_failure().
+	 */
+	if (ctrl->is_raid56) {
+		const u64 file_offset = logical - ctrl->logical +
+					ctrl->file_offset;
+		btrfs_repair_io_failure(fs_info,
+				btrfs_ino(BTRFS_I(ctrl->inode)), file_offset,
+				fs_info->sectorsize, logical, page, pgoff,
+				mirror);
+		return;
+	}
+
+	ASSERT(io_bio);
+	io_add_or_submit(ctrl, mirror, logical, page, pgoff, REQ_OP_WRITE);
+}
+
+static void repair_from_mirror(struct btrfs_read_repair_ctrl *ctrl, int mirror)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(ctrl->inode->i_sb);
+	struct bvec_iter iter;
+	struct bio_vec bv;
+	unsigned long old_bitmap = ctrl->bad_bitmap;
+	const int prev_mirror = get_prev_mirror(mirror, ctrl->num_copies);
+	int nr_sector;
+	u32 offset;
+	int ret;
+
+	/*
+	 * Reset the io_bio logical bytenr so that later io_add_or_submit()
+	 * can do the correct check on the logical bytenr.
+	 */
+	ctrl->io_bio->bi_iter.bi_sector = 0;
+
+	/* Add all bad sectors into io_bio. */
+	bio_for_each_sector(fs_info, bv, ctrl->bad_sectors, iter, offset) {
+		u64 logical = ctrl->logical + offset;
+
+		nr_sector = offset >> fs_info->sectorsize_bits;
+
+		/* Good sector, no need to handle. */
+		if (!test_bit(nr_sector, &ctrl->bad_bitmap))
+			continue;
+
+		io_add_or_submit(ctrl, mirror, logical, bv.bv_page,
+				 bv.bv_offset, REQ_OP_READ | REQ_SYNC);
+	}
+	io_bio_submit(ctrl, mirror, REQ_OP_READ | REQ_SYNC);
+
+	/* Check the newly read data. */
+	bio_for_each_sector(fs_info, bv, ctrl->bad_sectors, iter, offset) {
+		u8 *csum_expected;
+		u8 csum[BTRFS_CSUM_SIZE];
+
+		nr_sector = offset >> fs_info->sectorsize_bits;
+
+		/* Originally good sector or read failed, skip. */
+		if (!test_bit(nr_sector, &old_bitmap) ||
+		    test_bit(nr_sector, &ctrl->bad_bitmap))
+			continue;
+
+		/* No data csum, only need to repair. */
+		if (!ctrl->csum)
+			goto repair;
+
+		/*
+		 * The remaining case is successful read with csum, need
+		 * recheck the csum.
+		 */
+		csum_expected = btrfs_csum_ptr(fs_info, ctrl->csum, offset);
+		ret = btrfs_check_sector_csum(fs_info, bv.bv_page,
+				bv.bv_offset, csum, csum_expected);
+		if (ret) {
+			set_bit(nr_sector, &ctrl->bad_bitmap);
+			continue;
+		}
+repair:
+		/*
+		 * This sector is properly fixed, write it back to previous
+		 * bad mirror.
+		 */
+		writeback_good_mirror(ctrl, prev_mirror, ctrl->logical + offset,
+				bv.bv_page, bv.bv_offset);
+	}
+	/* Submit the last write bio. */
+	io_bio_submit(ctrl, prev_mirror, REQ_OP_WRITE);
+}
+
+int btrfs_read_repair_finish(struct btrfs_read_repair_ctrl *ctrl)
+{
+	struct btrfs_fs_info *fs_info;
+	struct bvec_iter iter;
+	struct bio_vec bv;
+	u32 offset;
+	int nr_sectors;
+	int mirror;
+	int ret = -EIO;
+
+	if (!ctrl->inode)
+		return 0;
+
+	fs_info = btrfs_sb(ctrl->inode->i_sb);
+	nr_sectors = ctrl->len >> fs_info->sectorsize_bits;
+	ASSERT(ctrl->len);
+	/* All sectors should be bad initially. */
+	ASSERT(find_first_zero_bit(&ctrl->bad_bitmap, nr_sectors) == nr_sectors);
+
+	for (mirror = get_next_mirror(ctrl->failed_mirror, ctrl->num_copies);
+	     mirror != ctrl->failed_mirror;
+	     mirror = get_next_mirror(mirror, ctrl->num_copies)) {
+		repair_from_mirror(ctrl, mirror);
+
+		/* All repaired. */
+		if (find_first_bit(&ctrl->bad_bitmap, nr_sectors) == nr_sectors) {
+			ret = 0;
+			break;
+		}
+	}
+
+	/* DIO doesn't need any page status/extent update. */
+	if (!ctrl->is_dio) {
+		/* Unlock all the pages and unlock the extent range. */
+		bio_for_each_sector(fs_info, bv, ctrl->bad_sectors, iter,
+				    offset) {
+			bool uptodate = !test_bit(offset >>
+						  fs_info->sectorsize_bits,
+						  &ctrl->bad_bitmap);
+
+			end_sector_io(bv.bv_page, ctrl->file_offset + offset,
+				      uptodate);
+		}
+	}
+	bio_put(ctrl->bad_sectors);
+	if (ctrl->io_bio)
+		bio_put(ctrl->io_bio);
+	memset(ctrl, 0, sizeof(*ctrl));
+	return ret;
+}
diff --git a/fs/btrfs/read-repair.h b/fs/btrfs/read-repair.h
new file mode 100644
index 000000000000..87219c786109
--- /dev/null
+++ b/fs/btrfs/read-repair.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef BTRFS_READ_REPAIR_H
+#define BTRFS_READ_REPAIR_H
+
+#include <linux/blk_types.h>
+#include <linux/fs.h>
+
+struct btrfs_read_repair_ctrl {
+	struct inode *inode;
+
+	/* The logical bytenr of the first corrupted sector. */
+	u64 logical;
+
+	/* The file offset of the first corrupted sector. */
+	u64 file_offset;
+
+	/* The checksum for the corrupted sectors. */
+	u8 *csum;
+
+	/* Current length of the corrupted range. */
+	u32 len;
+
+	int failed_mirror;
+	int num_copies;
+	unsigned long bad_bitmap;
+	bool is_raid56;
+	bool is_dio;
+
+	/* This is only to hold all the initial bad contiguous sectors. */
+	struct bio *bad_sectors;
+
+	/*
+	 * The bio we use to do the real IO.
+	 * This bio has to be a btrfs_bio, as btrfs_map_bio() will utilize
+	 * btrfs_bio()->device.
+	 */
+	struct bio *io_bio;
+};
+
+int btrfs_read_repair_add_sector(struct inode *inode,
+				 struct btrfs_read_repair_ctrl *ctrl,
+				 struct page *page, unsigned int pgoff,
+				 u64 logical, u64 file_offset, u8 *csum,
+				 int failed_mirror, bool is_dio);
+int btrfs_read_repair_finish(struct btrfs_read_repair_ctrl *ctrl);
+
+#endif
-- 
2.36.1



* [PATCH v2 5/7] btrfs: make buffered read path to use the new read repair infrastructure
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
                   ` (3 preceding siblings ...)
  2022-05-25 10:59 ` [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure Qu Wenruo
@ 2022-05-25 10:59 ` Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 6/7] btrfs: make direct io " Qu Wenruo
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs

We only need to prepare the logical bytenr and file offset of the
corrupted sector, and pass them to btrfs_read_repair_add_sector().

Then call btrfs_read_repair_finish() before we free the csum.
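
The core of the per-sector change is the following (taken from the diff
below):

	u64 logical = (failed_bbio->iter.bi_sector << SECTOR_SHIFT) +
		      bio_offset + offset;
	u8 *csum = NULL;

	if (failed_bbio->csum)
		csum = btrfs_csum_ptr(fs_info, failed_bbio->csum,
				      bio_offset + offset);

	/* The function will release the page on failure. */
	btrfs_read_repair_add_sector(inode, ctrl, page, pgoff + offset,
				     logical, start + offset, csum,
				     failed_mirror, false);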

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 44 ++++++++++++++++++++------------------------
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 160dedb078fd..c14699b5758b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -30,6 +30,7 @@
 #include "zoned.h"
 #include "block-group.h"
 #include "compression.h"
+#include "read-repair.h"
 
 static struct kmem_cache *extent_state_cache;
 static struct kmem_cache *extent_buffer_cache;
@@ -2749,13 +2750,15 @@ void end_sector_io(struct page *page, u64 offset, bool uptodate)
 			offset + sectorsize - 1, &cached);
 }
 
-static void submit_data_read_repair(struct inode *inode, struct bio *failed_bio,
+static void submit_data_read_repair(struct inode *inode,
+		struct btrfs_read_repair_ctrl *ctrl, struct bio *failed_bio,
 		u32 bio_offset, const struct bio_vec *bvec, int failed_mirror,
 		unsigned int error_bitmap)
 {
 	const unsigned int pgoff = bvec->bv_offset;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct page *page = bvec->bv_page;
+	struct btrfs_bio *failed_bbio = btrfs_bio(failed_bio);
 	const u64 start = page_offset(bvec->bv_page) + bvec->bv_offset;
 	const u64 end = start + bvec->bv_len - 1;
 	const u32 sectorsize = fs_info->sectorsize;
@@ -2780,7 +2783,9 @@ static void submit_data_read_repair(struct inode *inode, struct bio *failed_bio,
 	for (i = 0; i < nr_bits; i++) {
 		const unsigned int offset = i * sectorsize;
 		bool uptodate = false;
-		int ret;
+		u64 logical = (failed_bbio->iter.bi_sector << SECTOR_SHIFT) +
+			      bio_offset + offset;
+		u8 *csum = NULL;
 
 		if (!(error_bitmap & (1U << i))) {
 			/*
@@ -2788,28 +2793,17 @@ static void submit_data_read_repair(struct inode *inode, struct bio *failed_bio,
 			 * and unlock the range.
 			 */
 			uptodate = true;
-			goto next;
-		}
-
-		ret = btrfs_repair_one_sector(inode, failed_bio,
-				bio_offset + offset,
-				page, pgoff + offset, start + offset,
-				failed_mirror, btrfs_submit_data_bio);
-		if (!ret) {
-			/*
-			 * We have submitted the read repair, the page release
-			 * will be handled by the endio function of the
-			 * submitted repair bio.
-			 * Thus we don't need to do any thing here.
-			 */
+			end_sector_io(page, start + offset, uptodate);
 			continue;
 		}
-		/*
-		 * Continue on failed repair, otherwise the remaining sectors
-		 * will not be properly unlocked.
-		 */
-next:
-		end_sector_io(page, start + offset, uptodate);
+		if (failed_bbio->csum)
+			csum = btrfs_csum_ptr(fs_info, failed_bbio->csum,
+					      bio_offset + offset);
+
+		/* The function will release the page on failure. */
+		btrfs_read_repair_add_sector(inode, ctrl, page, pgoff + offset,
+					     logical, start + offset, csum,
+					     failed_mirror, false);
 	}
 }
 
@@ -3017,6 +3011,7 @@ static void end_bio_extent_readpage(struct bio *bio)
 	struct btrfs_bio *bbio = btrfs_bio(bio);
 	struct extent_io_tree *tree, *failure_tree;
 	struct processed_extent processed = { 0 };
+	struct btrfs_read_repair_ctrl ctrl = { 0 };
 	/*
 	 * The offset to the beginning of a bio, since one bio can never be
 	 * larger than UINT_MAX, u32 here is enough.
@@ -3126,8 +3121,8 @@ static void end_bio_extent_readpage(struct bio *bio)
 			 * submit_data_read_repair() will handle all the good
 			 * and bad sectors, we just continue to the next bvec.
 			 */
-			submit_data_read_repair(inode, bio, bio_offset, bvec,
-						mirror, error_bitmap);
+			submit_data_read_repair(inode, &ctrl, bio, bio_offset,
+						bvec, mirror, error_bitmap);
 		} else {
 			/* Update page status and unlock */
 			end_page_read(page, uptodate, start, len);
@@ -3142,6 +3137,7 @@ static void end_bio_extent_readpage(struct bio *bio)
 	}
 	/* Release the last extent */
 	endio_readpage_release_extent(&processed, NULL, 0, 0, false);
+	btrfs_read_repair_finish(&ctrl);
 	btrfs_bio_free_csum(bbio);
 	bio_put(bio);
 }
-- 
2.36.1



* [PATCH v2 6/7] btrfs: make direct io read path to use the new read repair infrastructure
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
                   ` (4 preceding siblings ...)
  2022-05-25 10:59 ` [PATCH v2 5/7] btrfs: make buffered read path to use the new read repair infrastructure Qu Wenruo
@ 2022-05-25 10:59 ` Qu Wenruo
  2022-05-25 10:59 ` [PATCH v2 7/7] btrfs: remove io_failure_record infrastructure completely Qu Wenruo
  2022-05-25 14:55 ` [PATCH v2 0/7] btrfs: read-repair rework based on bitmap David Sterba
  7 siblings, 0 replies; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs

By doing this, we can also remove the function submit_dio_repair_bio().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 51 +++++++++++++++++-------------------------------
 1 file changed, 18 insertions(+), 33 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index dd0882e1b982..26718ec1e07b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -55,6 +55,7 @@
 #include "zoned.h"
 #include "subpage.h"
 #include "inode-item.h"
+#include "read-repair.h"
 
 struct btrfs_iget_args {
 	u64 ino;
@@ -7863,58 +7864,42 @@ static void btrfs_dio_private_put(struct btrfs_dio_private *dip)
 	bio_endio(&dip->bio);
 }
 
-static void submit_dio_repair_bio(struct inode *inode, struct bio *bio,
-				  int mirror_num,
-				  enum btrfs_compression_type compress_type)
-{
-	struct btrfs_dio_private *dip = bio->bi_private;
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-
-	BUG_ON(bio_op(bio) == REQ_OP_WRITE);
-
-	if (btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA))
-		return;
-
-	refcount_inc(&dip->refs);
-	if (btrfs_map_bio(fs_info, bio, mirror_num))
-		refcount_dec(&dip->refs);
-}
-
 static blk_status_t btrfs_check_read_dio_bio(struct btrfs_dio_private *dip,
 					     struct btrfs_bio *bbio,
 					     const bool uptodate)
 {
 	struct inode *inode = dip->inode;
 	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
-	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
-	struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
+	struct btrfs_read_repair_ctrl ctrl = {0};
 	const bool csum = !(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM);
 	blk_status_t err = BLK_STS_OK;
 	struct bvec_iter iter;
 	struct bio_vec bv;
+	int ret;
 	u32 offset;
 
 	btrfs_bio_for_each_sector(fs_info, bv, bbio, iter, offset) {
 		u64 start = bbio->file_offset + offset;
 
-		if (uptodate &&
-		    (!csum || !check_data_csum(inode, bbio, offset, bv.bv_page,
-				bv.bv_offset, start))) {
-			clean_io_failure(fs_info, failure_tree, io_tree, start,
-					 bv.bv_page, btrfs_ino(BTRFS_I(inode)),
-					 bv.bv_offset);
-		} else {
-			int ret;
-
-			ret = btrfs_repair_one_sector(inode, &bbio->bio, offset,
-					bv.bv_page, bv.bv_offset, start,
-					bbio->mirror_num,
-					submit_dio_repair_bio);
+		if (!uptodate ||
+		    (csum && check_data_csum(inode, bbio, offset, bv.bv_page,
+					     bv.bv_offset, start))) {
+			u64 logical = (bbio->iter.bi_sector << SECTOR_SHIFT) +
+				      offset;
+			u8 *csum = NULL;
+
+			if (bbio->csum)
+				csum = btrfs_csum_ptr(fs_info, bbio->csum, offset);
+			ret = btrfs_read_repair_add_sector(inode, &ctrl,
+					bv.bv_page, bv.bv_offset, logical,
+					start, csum, bbio->mirror_num, true);
 			if (ret)
 				err = errno_to_blk_status(ret);
 		}
 	}
-
+	ret = btrfs_read_repair_finish(&ctrl);
+	if (ret)
+		err = errno_to_blk_status(ret);
 	return err;
 }
 
-- 
2.36.1



* [PATCH v2 7/7] btrfs: remove io_failure_record infrastructure completely
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
                   ` (5 preceding siblings ...)
  2022-05-25 10:59 ` [PATCH v2 6/7] btrfs: make direct io " Qu Wenruo
@ 2022-05-25 10:59 ` Qu Wenruo
  2022-05-25 14:55 ` [PATCH v2 0/7] btrfs: read-repair rework based on bitmap David Sterba
  7 siblings, 0 replies; 23+ messages in thread
From: Qu Wenruo @ 2022-05-25 10:59 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig

Our read repair is now always handled by btrfs_read_repair_ctrl, whose
lifespan is confined to the endio function.

This means we no longer need to record which range failed and its
mirror number.

Now if we fail to read some data page, we have already tried every
mirror we have, thus there is no need to record the failed range.

So this patch removes the whole io_failure_record structure and its
related functions.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/btrfs_inode.h       |   5 -
 fs/btrfs/extent-io-tree.h    |  15 --
 fs/btrfs/extent_io.c         | 372 -----------------------------------
 fs/btrfs/extent_io.h         |  24 ---
 fs/btrfs/inode.c             |   7 -
 include/trace/events/btrfs.h |   1 -
 6 files changed, 424 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 33811e896623..3eeba0eb9f16 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -91,11 +91,6 @@ struct btrfs_inode {
 	/* the io_tree does range state (DIRTY, LOCKED etc) */
 	struct extent_io_tree io_tree;
 
-	/* special utility tree used to record which mirrors have already been
-	 * tried when checksums fail for a given block
-	 */
-	struct extent_io_tree io_failure_tree;
-
 	/*
 	 * Keep track of where the inode has extent items mapped in order to
 	 * make sure the i_size adjustments are accurate
diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h
index c3eb52dbe61c..8ab9b6cd53ed 100644
--- a/fs/btrfs/extent-io-tree.h
+++ b/fs/btrfs/extent-io-tree.h
@@ -56,7 +56,6 @@ enum {
 	IO_TREE_FS_EXCLUDED_EXTENTS,
 	IO_TREE_BTREE_INODE_IO,
 	IO_TREE_INODE_IO,
-	IO_TREE_INODE_IO_FAILURE,
 	IO_TREE_RELOC_BLOCKS,
 	IO_TREE_TRANS_DIRTY_PAGES,
 	IO_TREE_ROOT_DIRTY_LOG_PAGES,
@@ -250,18 +249,4 @@ bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start,
 			       u64 *end, u64 max_bytes,
 			       struct extent_state **cached_state);
 
-/* This should be reworked in the future and put elsewhere. */
-struct io_failure_record *get_state_failrec(struct extent_io_tree *tree, u64 start);
-int set_state_failrec(struct extent_io_tree *tree, u64 start,
-		      struct io_failure_record *failrec);
-void btrfs_free_io_failure_record(struct btrfs_inode *inode, u64 start,
-		u64 end);
-int free_io_failure(struct extent_io_tree *failure_tree,
-		    struct extent_io_tree *io_tree,
-		    struct io_failure_record *rec);
-int clean_io_failure(struct btrfs_fs_info *fs_info,
-		     struct extent_io_tree *failure_tree,
-		     struct extent_io_tree *io_tree, u64 start,
-		     struct page *page, u64 ino, unsigned int pg_offset);
-
 #endif /* BTRFS_EXTENT_IO_TREE_H */
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c14699b5758b..893ee81c1dfd 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2172,66 +2172,6 @@ u64 count_range_bits(struct extent_io_tree *tree,
 	return total_bytes;
 }
 
-/*
- * set the private field for a given byte offset in the tree.  If there isn't
- * an extent_state there already, this does nothing.
- */
-int set_state_failrec(struct extent_io_tree *tree, u64 start,
-		      struct io_failure_record *failrec)
-{
-	struct rb_node *node;
-	struct extent_state *state;
-	int ret = 0;
-
-	spin_lock(&tree->lock);
-	/*
-	 * this search will find all the extents that end after
-	 * our range starts.
-	 */
-	node = tree_search(tree, start);
-	if (!node) {
-		ret = -ENOENT;
-		goto out;
-	}
-	state = rb_entry(node, struct extent_state, rb_node);
-	if (state->start != start) {
-		ret = -ENOENT;
-		goto out;
-	}
-	state->failrec = failrec;
-out:
-	spin_unlock(&tree->lock);
-	return ret;
-}
-
-struct io_failure_record *get_state_failrec(struct extent_io_tree *tree, u64 start)
-{
-	struct rb_node *node;
-	struct extent_state *state;
-	struct io_failure_record *failrec;
-
-	spin_lock(&tree->lock);
-	/*
-	 * this search will find all the extents that end after
-	 * our range starts.
-	 */
-	node = tree_search(tree, start);
-	if (!node) {
-		failrec = ERR_PTR(-ENOENT);
-		goto out;
-	}
-	state = rb_entry(node, struct extent_state, rb_node);
-	if (state->start != start) {
-		failrec = ERR_PTR(-ENOENT);
-		goto out;
-	}
-
-	failrec = state->failrec;
-out:
-	spin_unlock(&tree->lock);
-	return failrec;
-}
-
 /*
  * searches a range in the state tree for a given mask.
  * If 'filled' == 1, this returns 1 only if every extent in the tree
@@ -2288,30 +2228,6 @@ int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end,
 	return bitset;
 }
 
-int free_io_failure(struct extent_io_tree *failure_tree,
-		    struct extent_io_tree *io_tree,
-		    struct io_failure_record *rec)
-{
-	int ret;
-	int err = 0;
-
-	set_state_failrec(failure_tree, rec->start, NULL);
-	ret = clear_extent_bits(failure_tree, rec->start,
-				rec->start + rec->len - 1,
-				EXTENT_LOCKED | EXTENT_DIRTY);
-	if (ret)
-		err = ret;
-
-	ret = clear_extent_bits(io_tree, rec->start,
-				rec->start + rec->len - 1,
-				EXTENT_DAMAGED);
-	if (ret && !err)
-		err = ret;
-
-	kfree(rec);
-	return err;
-}
-
 /*
  * this bypasses the standard btrfs submit functions deliberately, as
  * the standard behavior is to write all copies in a raid setup. here we only
@@ -2427,287 +2343,6 @@ int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num)
 	return ret;
 }
 
-/*
- * each time an IO finishes, we do a fast check in the IO failure tree
- * to see if we need to process or clean up an io_failure_record
- */
-int clean_io_failure(struct btrfs_fs_info *fs_info,
-		     struct extent_io_tree *failure_tree,
-		     struct extent_io_tree *io_tree, u64 start,
-		     struct page *page, u64 ino, unsigned int pg_offset)
-{
-	u64 private;
-	struct io_failure_record *failrec;
-	struct extent_state *state;
-	int num_copies;
-	int ret;
-
-	private = 0;
-	ret = count_range_bits(failure_tree, &private, (u64)-1, 1,
-			       EXTENT_DIRTY, 0);
-	if (!ret)
-		return 0;
-
-	failrec = get_state_failrec(failure_tree, start);
-	if (IS_ERR(failrec))
-		return 0;
-
-	BUG_ON(!failrec->this_mirror);
-
-	if (sb_rdonly(fs_info->sb))
-		goto out;
-
-	spin_lock(&io_tree->lock);
-	state = find_first_extent_bit_state(io_tree,
-					    failrec->start,
-					    EXTENT_LOCKED);
-	spin_unlock(&io_tree->lock);
-
-	if (state && state->start <= failrec->start &&
-	    state->end >= failrec->start + failrec->len - 1) {
-		num_copies = btrfs_num_copies(fs_info, failrec->logical,
-					      failrec->len);
-		if (num_copies > 1)  {
-			btrfs_repair_io_failure(fs_info, ino, start,
-					failrec->len, failrec->logical,
-					page, pg_offset, failrec->failed_mirror);
-		}
-	}
-
-out:
-	free_io_failure(failure_tree, io_tree, failrec);
-
-	return 0;
-}
-
-/*
- * Can be called when
- * - hold extent lock
- * - under ordered extent
- * - the inode is freeing
- */
-void btrfs_free_io_failure_record(struct btrfs_inode *inode, u64 start, u64 end)
-{
-	struct extent_io_tree *failure_tree = &inode->io_failure_tree;
-	struct io_failure_record *failrec;
-	struct extent_state *state, *next;
-
-	if (RB_EMPTY_ROOT(&failure_tree->state))
-		return;
-
-	spin_lock(&failure_tree->lock);
-	state = find_first_extent_bit_state(failure_tree, start, EXTENT_DIRTY);
-	while (state) {
-		if (state->start > end)
-			break;
-
-		ASSERT(state->end <= end);
-
-		next = next_state(state);
-
-		failrec = state->failrec;
-		free_extent_state(state);
-		kfree(failrec);
-
-		state = next;
-	}
-	spin_unlock(&failure_tree->lock);
-}
-
-static struct io_failure_record *btrfs_get_io_failure_record(struct inode *inode,
-							     u64 start)
-{
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	struct io_failure_record *failrec;
-	struct extent_map *em;
-	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
-	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
-	struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree;
-	const u32 sectorsize = fs_info->sectorsize;
-	int ret;
-	u64 logical;
-
-	failrec = get_state_failrec(failure_tree, start);
-	if (!IS_ERR(failrec)) {
-		btrfs_debug(fs_info,
-	"Get IO Failure Record: (found) logical=%llu, start=%llu, len=%llu",
-			failrec->logical, failrec->start, failrec->len);
-		/*
-		 * when data can be on disk more than twice, add to failrec here
-		 * (e.g. with a list for failed_mirror) to make
-		 * clean_io_failure() clean all those errors at once.
-		 */
-
-		return failrec;
-	}
-
-	failrec = kzalloc(sizeof(*failrec), GFP_NOFS);
-	if (!failrec)
-		return ERR_PTR(-ENOMEM);
-
-	failrec->start = start;
-	failrec->len = sectorsize;
-	failrec->this_mirror = 0;
-	failrec->compress_type = BTRFS_COMPRESS_NONE;
-
-	read_lock(&em_tree->lock);
-	em = lookup_extent_mapping(em_tree, start, failrec->len);
-	if (!em) {
-		read_unlock(&em_tree->lock);
-		kfree(failrec);
-		return ERR_PTR(-EIO);
-	}
-
-	if (em->start > start || em->start + em->len <= start) {
-		free_extent_map(em);
-		em = NULL;
-	}
-	read_unlock(&em_tree->lock);
-	if (!em) {
-		kfree(failrec);
-		return ERR_PTR(-EIO);
-	}
-
-	logical = start - em->start;
-	logical = em->block_start + logical;
-	if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
-		logical = em->block_start;
-		failrec->compress_type = em->compress_type;
-	}
-
-	btrfs_debug(fs_info,
-		    "Get IO Failure Record: (new) logical=%llu, start=%llu, len=%llu",
-		    logical, start, failrec->len);
-
-	failrec->logical = logical;
-	free_extent_map(em);
-
-	/* Set the bits in the private failure tree */
-	ret = set_extent_bits(failure_tree, start, start + sectorsize - 1,
-			      EXTENT_LOCKED | EXTENT_DIRTY);
-	if (ret >= 0) {
-		ret = set_state_failrec(failure_tree, start, failrec);
-		/* Set the bits in the inode's tree */
-		ret = set_extent_bits(tree, start, start + sectorsize - 1,
-				      EXTENT_DAMAGED);
-	} else if (ret < 0) {
-		kfree(failrec);
-		return ERR_PTR(ret);
-	}
-
-	return failrec;
-}
-
-static bool btrfs_check_repairable(struct inode *inode,
-				   struct io_failure_record *failrec,
-				   int failed_mirror)
-{
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	int num_copies;
-
-	num_copies = btrfs_num_copies(fs_info, failrec->logical, failrec->len);
-	if (num_copies == 1) {
-		/*
-		 * we only have a single copy of the data, so don't bother with
-		 * all the retry and error correction code that follows. no
-		 * matter what the error is, it is very likely to persist.
-		 */
-		btrfs_debug(fs_info,
-			"Check Repairable: cannot repair, num_copies=%d, next_mirror %d, failed_mirror %d",
-			num_copies, failrec->this_mirror, failed_mirror);
-		return false;
-	}
-
-	/* The failure record should only contain one sector */
-	ASSERT(failrec->len == fs_info->sectorsize);
-
-	/*
-	 * There are two premises:
-	 * a) deliver good data to the caller
-	 * b) correct the bad sectors on disk
-	 *
-	 * Since we're only doing repair for one sector, we only need to get
-	 * a good copy of the failed sector and if we succeed, we have setup
-	 * everything for btrfs_repair_io_failure() to do the rest for us.
-	 */
-	ASSERT(failed_mirror);
-	failrec->failed_mirror = failed_mirror;
-	failrec->this_mirror++;
-	if (failrec->this_mirror == failed_mirror)
-		failrec->this_mirror++;
-
-	if (failrec->this_mirror > num_copies) {
-		btrfs_debug(fs_info,
-			"Check Repairable: (fail) num_copies=%d, next_mirror %d, failed_mirror %d",
-			num_copies, failrec->this_mirror, failed_mirror);
-		return false;
-	}
-
-	return true;
-}
-
-int btrfs_repair_one_sector(struct inode *inode,
-			    struct bio *failed_bio, u32 bio_offset,
-			    struct page *page, unsigned int pgoff,
-			    u64 start, int failed_mirror,
-			    submit_bio_hook_t *submit_bio_hook)
-{
-	struct io_failure_record *failrec;
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
-	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
-	struct btrfs_bio *failed_bbio = btrfs_bio(failed_bio);
-	const int icsum = bio_offset >> fs_info->sectorsize_bits;
-	struct bio *repair_bio;
-	struct btrfs_bio *repair_bbio;
-
-	btrfs_debug(fs_info,
-		   "repair read error: read error at %llu", start);
-
-	BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
-
-	failrec = btrfs_get_io_failure_record(inode, start);
-	if (IS_ERR(failrec))
-		return PTR_ERR(failrec);
-
-
-	if (!btrfs_check_repairable(inode, failrec, failed_mirror)) {
-		free_io_failure(failure_tree, tree, failrec);
-		return -EIO;
-	}
-
-	repair_bio = btrfs_bio_alloc(1);
-	repair_bbio = btrfs_bio(repair_bio);
-	repair_bbio->file_offset = start;
-	repair_bio->bi_opf = REQ_OP_READ;
-	repair_bio->bi_end_io = failed_bio->bi_end_io;
-	repair_bio->bi_iter.bi_sector = failrec->logical >> 9;
-	repair_bio->bi_private = failed_bio->bi_private;
-
-	if (failed_bbio->csum) {
-		const u32 csum_size = fs_info->csum_size;
-
-		repair_bbio->csum = repair_bbio->csum_inline;
-		memcpy(repair_bbio->csum,
-		       failed_bbio->csum + csum_size * icsum, csum_size);
-	}
-
-	bio_add_page(repair_bio, page, failrec->len, pgoff);
-	repair_bbio->iter = repair_bio->bi_iter;
-
-	btrfs_debug(btrfs_sb(inode->i_sb),
-		    "repair read error: submitting new read to mirror %d",
-		    failrec->this_mirror);
-
-	/*
-	 * At this point we have a bio, so any errors from submit_bio_hook()
-	 * will be handled by the endio on the repair_bio, so we can't return an
-	 * error here.
-	 */
-	submit_bio_hook(inode, repair_bio, failrec->this_mirror, failrec->compress_type);
-	return BLK_STS_OK;
-}
-
 static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
@@ -3009,7 +2644,6 @@ static void end_bio_extent_readpage(struct bio *bio)
 {
 	struct bio_vec *bvec;
 	struct btrfs_bio *bbio = btrfs_bio(bio);
-	struct extent_io_tree *tree, *failure_tree;
 	struct processed_extent processed = { 0 };
 	struct btrfs_read_repair_ctrl ctrl = { 0 };
 	/*
@@ -3037,8 +2671,6 @@ static void end_bio_extent_readpage(struct bio *bio)
 			"end_bio_extent_readpage: bi_sector=%llu, err=%d, mirror=%u",
 			bio->bi_iter.bi_sector, bio->bi_status,
 			bbio->mirror_num);
-		tree = &BTRFS_I(inode)->io_tree;
-		failure_tree = &BTRFS_I(inode)->io_failure_tree;
 
 		/*
 		 * We always issue full-sector reads, but if some block in a
@@ -3079,10 +2711,6 @@ static void end_bio_extent_readpage(struct bio *bio)
 			loff_t i_size = i_size_read(inode);
 			pgoff_t end_index = i_size >> PAGE_SHIFT;
 
-			clean_io_failure(BTRFS_I(inode)->root->fs_info,
-					 failure_tree, tree, start, page,
-					 btrfs_ino(BTRFS_I(inode)), 0);
-
 			/*
 			 * Zero out the remaining part if this range straddles
 			 * i_size.
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index e3f9db50983d..b64e8cf81405 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -61,7 +61,6 @@ struct btrfs_root;
 struct btrfs_inode;
 struct btrfs_io_bio;
 struct btrfs_fs_info;
-struct io_failure_record;
 struct extent_io_tree;
 
 typedef void (submit_bio_hook_t)(struct inode *inode, struct bio *bio,
@@ -254,29 +253,6 @@ void end_sector_io(struct page *page, u64 offset, bool uptodate);
 void end_extent_writepage(struct page *page, int err, u64 start, u64 end);
 int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num);
 
-/*
- * When IO fails, either with EIO or csum verification fails, we
- * try other mirrors that might have a good copy of the data.  This
- * io_failure_record is used to record state as we go through all the
- * mirrors.  If another mirror has good data, the sector is set up to date
- * and things continue.  If a good mirror can't be found, the original
- * bio end_io callback is called to indicate things have failed.
- */
-struct io_failure_record {
-	struct page *page;
-	u64 start;
-	u64 len;
-	u64 logical;
-	enum btrfs_compression_type compress_type;
-	int this_mirror;
-	int failed_mirror;
-};
-
-int btrfs_repair_one_sector(struct inode *inode,
-			    struct bio *failed_bio, u32 bio_offset,
-			    struct page *page, unsigned int pgoff,
-			    u64 start, int failed_mirror,
-			    submit_bio_hook_t *submit_bio_hook);
 int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
 			    u64 length, u64 logical, struct page *page,
 			    unsigned int pg_offset, int mirror_num);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 26718ec1e07b..466f3359f95f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3143,8 +3143,6 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent)
 					ordered_extent->disk_num_bytes);
 	}
 
-	btrfs_free_io_failure_record(inode, start, end);
-
 	if (test_bit(BTRFS_ORDERED_TRUNCATED, &ordered_extent->flags)) {
 		truncated = true;
 		logical_len = ordered_extent->truncated_len;
@@ -5355,8 +5353,6 @@ void btrfs_evict_inode(struct inode *inode)
 	if (is_bad_inode(inode))
 		goto no_delete;
 
-	btrfs_free_io_failure_record(BTRFS_I(inode), 0, (u64)-1);
-
 	if (test_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags))
 		goto no_delete;
 
@@ -8847,12 +8843,9 @@ struct inode *btrfs_alloc_inode(struct super_block *sb)
 	inode = &ei->vfs_inode;
 	extent_map_tree_init(&ei->extent_tree);
 	extent_io_tree_init(fs_info, &ei->io_tree, IO_TREE_INODE_IO, inode);
-	extent_io_tree_init(fs_info, &ei->io_failure_tree,
-			    IO_TREE_INODE_IO_FAILURE, inode);
 	extent_io_tree_init(fs_info, &ei->file_extent_tree,
 			    IO_TREE_INODE_FILE_EXTENT, inode);
 	ei->io_tree.track_uptodate = true;
-	ei->io_failure_tree.track_uptodate = true;
 	atomic_set(&ei->sync_writers, 0);
 	mutex_init(&ei->log_mutex);
 	btrfs_ordered_inode_tree_init(&ei->ordered_tree);
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 290f07eb050a..764e9643c123 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -82,7 +82,6 @@ struct btrfs_space_info;
 	EM( IO_TREE_FS_EXCLUDED_EXTENTS,  "EXCLUDED_EXTENTS")	    \
 	EM( IO_TREE_BTREE_INODE_IO,	  "BTREE_INODE_IO")	    \
 	EM( IO_TREE_INODE_IO,		  "INODE_IO")		    \
-	EM( IO_TREE_INODE_IO_FAILURE,	  "INODE_IO_FAILURE")	    \
 	EM( IO_TREE_RELOC_BLOCKS,	  "RELOC_BLOCKS")	    \
 	EM( IO_TREE_TRANS_DIRTY_PAGES,	  "TRANS_DIRTY_PAGES")      \
 	EM( IO_TREE_ROOT_DIRTY_LOG_PAGES, "ROOT_DIRTY_LOG_PAGES")   \
-- 
2.36.1



* Re: [PATCH v2 0/7] btrfs: read-repair rework based on bitmap
  2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
                   ` (6 preceding siblings ...)
  2022-05-25 10:59 ` [PATCH v2 7/7] btrfs: remove io_failure_record infrastructure completely Qu Wenruo
@ 2022-05-25 14:55 ` David Sterba
  7 siblings, 0 replies; 23+ messages in thread
From: David Sterba @ 2022-05-25 14:55 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Christoph Hellwig

On Wed, May 25, 2022 at 06:59:10PM +0800, Qu Wenruo wrote:
> This is the bitmap version revivied, and based on Christoph's
> cleanup series.

So this is another patchset for raid repair, it may take me some time
to understand the different approaches and discussions.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-25 10:59 ` [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure Qu Wenruo
@ 2022-05-26  3:06   ` Qu Wenruo
  2022-05-26  7:30     ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-26  3:06 UTC (permalink / raw)
  To: linux-btrfs, Christoph Hellwig



On 2022/5/25 18:59, Qu Wenruo wrote:
> The new read repair infrastructure consists of the following 3 parts:
> 
[...]
> +static void io_add_or_submit(struct btrfs_read_repair_ctrl *ctrl, int mirror,
> +			   u64 logical, struct page *page, unsigned int pgoff,
> +			   int opf)
> +{
> +	struct btrfs_fs_info *fs_info = btrfs_sb(ctrl->inode->i_sb);
> +	struct bio *io_bio = ctrl->io_bio;
> +
> +	/* Uninitialized. */
> +	if (io_bio->bi_iter.bi_sector == 0) {
> +		ASSERT(io_bio->bi_iter.bi_size == 0);
> +		io_bio->bi_iter.bi_sector = logical >> SECTOR_SHIFT;
> +		io_bio->bi_opf = opf;
> +		bio_add_page(io_bio, page, fs_info->sectorsize, pgoff);
> +		return;
> +	}
> +
> +	/* Continuous, add the page */
> +	if ((io_bio->bi_iter.bi_sector << SECTOR_SHIFT) +
> +	     io_bio->bi_iter.bi_size == logical) {
> +		bio_add_page(io_bio, page, fs_info->sectorsize, pgoff);
> +		return;
> +	}
> +
> +	/* Not continuous, submit first. */

Hi Christoph, I'm pretty sure the non-continuous bio problem is present in
all of our attempts to rework read-repair.

I'm wondering if there is some "dummy" page provided by the block layer
that we can utilize?

E.g. we have the following checker pattern:

mirror 1	|X|X|X|X|
mirror 2	|X| |X| |
mirror 3	| |X| |X|

After reading all 4 sectors from mirror 2, we know the 2nd and 4th
are good and do not need to be re-read.

Then reading from mirror 3 requires us to submit two bios.

But if we had some "dummy" pages added into the bio for sectors 2
and 4, we would only need one bio submission.
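
In bitmap terms (a labeling assumption: bit n stands for sector n + 1),
the range above starts as bad_bitmap = 0b1111; the pass over mirror 2
clears the bits for sectors 2 and 4, leaving 0b0101, so only sectors 1
and 3 still need to be read from mirror 3, in two separate bios unless
the gap between them is filled.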

Is there such a convenient page for us to utilize? Or do we have to
allocate one globally?

Thanks,
Qu

> +	io_bio_submit(ctrl, mirror, opf);
> +	io_bio = ctrl->io_bio;
> +	io_bio->bi_iter.bi_sector = logical >> SECTOR_SHIFT;
> +	bio_add_page(io_bio, page, fs_info->sectorsize, pgoff);
> +}
> +
> +static void writeback_good_mirror(struct btrfs_read_repair_ctrl *ctrl,
> +				  int mirror, u64 logical,
> +				  struct page *page, unsigned int pgoff)
> +{
> +	struct btrfs_fs_info *fs_info = btrfs_sb(ctrl->inode->i_sb);
> +	struct bio *io_bio = ctrl->io_bio;
> +
> +
> +	if (btrfs_repair_one_zone(fs_info, ctrl->logical))
> +		return;
> +
> +	/*
> +	 * For RAID56, we can not just write the bad data back, as
> +	 * any write will trigger RMW and read back the corrupted
> +	 * on-disk stripe, causing further damage.
> +	 * So here we do special repair for raid56.
> +	 *
> +	 * And unfortunately, this repair is very low level and not
> +	 * compatible with the rest of the mirror based repair.
> +	 * So it's still done in synchronous mode using
> +	 * btrfs_repair_io_failure().
> +	 */
> +	if (ctrl->is_raid56) {
> +		const u64 file_offset = logical - ctrl->logical +
> +					ctrl->file_offset;
> +		btrfs_repair_io_failure(fs_info,
> +				btrfs_ino(BTRFS_I(ctrl->inode)), file_offset,
> +				fs_info->sectorsize, logical, page, pgoff,
> +				mirror);
> +		return;
> +	}
> +
> +	ASSERT(io_bio);
> +	io_add_or_submit(ctrl, mirror, logical, page, pgoff, REQ_OP_WRITE);
> +}
> +
> +static void repair_from_mirror(struct btrfs_read_repair_ctrl *ctrl, int mirror)
> +{
> +	struct btrfs_fs_info *fs_info = btrfs_sb(ctrl->inode->i_sb);
> +	struct bvec_iter iter;
> +	struct bio_vec bv;
> +	unsigned long old_bitmap = ctrl->bad_bitmap;
> +	const int prev_mirror = get_prev_mirror(mirror, ctrl->num_copies);
> +	int nr_sector;
> +	u32 offset;
> +	int ret;
> +
> +	/*
> +	 * Reset the io_bio logical bytenr so later io_add_or_submit() can do
> +	 * correct check on the logical bytenr.
> +	 */
> +	ctrl->io_bio->bi_iter.bi_sector = 0;
> +
> +	/* Add all bad sectors into io_bio. */
> +	bio_for_each_sector(fs_info, bv, ctrl->bad_sectors, iter, offset) {
> +		u64 logical = ctrl->logical + offset;
> +
> +		nr_sector = offset >> fs_info->sectorsize_bits;
> +
> +		/* Good sectors, no need to handle. */
> +		if (!test_bit(nr_sector, &ctrl->bad_bitmap))
> +			continue;
> +
> +		io_add_or_submit(ctrl, mirror, logical, bv.bv_page,
> +				 bv.bv_offset, REQ_OP_READ | REQ_SYNC);
> +	}
> +	io_bio_submit(ctrl, mirror, REQ_OP_READ | REQ_SYNC);
> +
> +	/* Check the newly read data. */
> +	bio_for_each_sector(fs_info, bv, ctrl->bad_sectors, iter, offset) {
> +		u8 *csum_expected;
> +		u8 csum[BTRFS_CSUM_SIZE];
> +
> +		nr_sector = offset >> fs_info->sectorsize_bits;
> +
> +		/* Originally good sector or read failed, skip. */
> +		if (!test_bit(nr_sector, &old_bitmap) ||
> +		    test_bit(nr_sector, &ctrl->bad_bitmap))
> +			continue;
> +
> +		/* No data csum, only need to repair. */
> +		if (!ctrl->csum)
> +			goto repair;
> +
> +		/*
> +		 * The remaining case is a successful read with csum; we
> +		 * need to recheck the csum.
> +		 */
> +		csum_expected = btrfs_csum_ptr(fs_info, ctrl->csum, offset);
> +		ret = btrfs_check_sector_csum(fs_info, bv.bv_page,
> +				bv.bv_offset, csum, csum_expected);
> +		if (ret) {
> +			set_bit(nr_sector, &ctrl->bad_bitmap);
> +			continue;
> +		}
> +repair:
> +		/*
> +		 * This sector is properly fixed, write it back to previous
> +		 * bad mirror.
> +		 */
> +		writeback_good_mirror(ctrl, prev_mirror, ctrl->logical + offset,
> +				bv.bv_page, bv.bv_offset);
> +	}
> +	/* Submit the last write bio. */
> +	io_bio_submit(ctrl, mirror, REQ_OP_WRITE);
> +}
> +
> +int btrfs_read_repair_finish(struct btrfs_read_repair_ctrl *ctrl)
> +{
> +	struct btrfs_fs_info *fs_info;
> +	struct bvec_iter iter;
> +	struct bio_vec bv;
> +	u32 offset;
> +	int nr_sectors;
> +	int mirror;
> +	int ret = -EIO;
> +
> +	if (!ctrl->inode)
> +		return 0;
> +
> +	fs_info = btrfs_sb(ctrl->inode->i_sb);
> +	nr_sectors = ctrl->len >> fs_info->sectorsize_bits;
> +	ASSERT(ctrl->len);
> +	/* All sectors should be bad initially. */
> +	ASSERT(find_first_zero_bit(&ctrl->bad_bitmap, nr_sectors) == nr_sectors);
> +
> +	for (mirror = get_next_mirror(ctrl->failed_mirror, ctrl->num_copies);
> +	     mirror != ctrl->failed_mirror;
> +	     mirror = get_next_mirror(mirror, ctrl->num_copies)) {
> +		repair_from_mirror(ctrl, mirror);
> +
> +		/* All repaired. */
> +		if (find_first_bit(&ctrl->bad_bitmap, nr_sectors) == nr_sectors) {
> +			ret = 0;
> +			break;
> +		}
> +	}
> +
> +	/* DIO doesn't need any page status/extent update. */
> +	if (!ctrl->is_dio) {
> +		/* Unlock all the pages and unlock the extent range. */
> +		bio_for_each_sector(fs_info, bv, ctrl->bad_sectors, iter,
> +				    offset) {
> +			bool uptodate = !test_bit(offset >>
> +						  fs_info->sectorsize_bits,
> +						  &ctrl->bad_bitmap);
> +
> +			end_sector_io(bv.bv_page, ctrl->file_offset + offset,
> +				      uptodate);
> +		}
> +	}
> +	bio_put(ctrl->bad_sectors);
> +	if (ctrl->io_bio)
> +		bio_put(ctrl->io_bio);
> +	memset(ctrl, 0, sizeof(*ctrl));
> +	return ret;
> +}
> diff --git a/fs/btrfs/read-repair.h b/fs/btrfs/read-repair.h
> new file mode 100644
> index 000000000000..87219c786109
> --- /dev/null
> +++ b/fs/btrfs/read-repair.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef BTRFS_READ_REPAIR_H
> +#define BTRFS_READ_REPAIR_H
> +
> +#include <linux/blk_types.h>
> +#include <linux/fs.h>
> +
> +struct btrfs_read_repair_ctrl {
> +	struct inode *inode;
> +
> +	/* The logical bytenr of the first corrupted sector. */
> +	u64 logical;
> +
> +	/* The file offset of the first corrupted sector. */
> +	u64 file_offset;
> +
> +	/* The checksum for the corrupted sectors. */
> +	u8 *csum;
> +
> +	/* Current length of the corrupted range. */
> +	u32 len;
> +
> +	int failed_mirror;
> +	int num_copies;
> +	unsigned long bad_bitmap;
> +	bool is_raid56;
> +	bool is_dio;
> +
> +	/* This is only to hold all the initial bad continuous sectors. */
> +	struct bio *bad_sectors;
> +
> +	/*
> +	 * The bio we use to do the real IO.
> +	 * This bio has to be btrfs_bio, as btrfs_map_bio() will utilize
> +	 * btrfs_bio()->device.
> +	 */
> +	struct bio *io_bio;
> +};
> +
> +int btrfs_read_repair_add_sector(struct inode *inode,
> +				 struct btrfs_read_repair_ctrl *ctrl,
> +				 struct page *page, unsigned int pgoff,
> +				 u64 logical, u64 file_offset, u8 *csum,
> +				 int failed_mirror, bool is_dio);
> +int btrfs_read_repair_finish(struct btrfs_read_repair_ctrl *ctrl);
> +
> +#endif


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  3:06   ` Qu Wenruo
@ 2022-05-26  7:30     ` Christoph Hellwig
  2022-05-26  7:37       ` Qu Wenruo
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2022-05-26  7:30 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Christoph Hellwig

On Thu, May 26, 2022 at 11:06:31AM +0800, Qu Wenruo wrote:
> Hi Christoph, I'm pretty sure the non-continuous bio problem is present in
> all of our attempts to rework read-repair.

Why is it a problem?  Multiple discontiguous errors in the same bio
are a very unusual error pattern.  We obviously need to handle it, but
it doesn't need to be optimized as it is so rare.  The most common error
pattern is that the entire read will return an error, followed by a single
corrupted sector.

> I'm wondering if there is some "dummy" page provided by the block layer
> that we can utilize?

For reads nvme (and a few SCSI HBAs) support a bit bucket SGL for reads
that discard parts of the data.  Right now upstream none of this is
supported, although Keith has been looking into it (for a rather different
use case) in nvme.  This does not help with writes, never mind the fact
that I would not want to use exotic and barely tested code and hardware
features for a non-time-critical and rarely used error handling path.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  7:30     ` Christoph Hellwig
@ 2022-05-26  7:37       ` Qu Wenruo
  2022-05-26  7:45         ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-26  7:37 UTC (permalink / raw)
  To: Christoph Hellwig, Qu Wenruo; +Cc: linux-btrfs



On 2022/5/26 15:30, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 11:06:31AM +0800, Qu Wenruo wrote:
>> Hi Christoph, I'm pretty sure the non-continuous bio problem is present in
>> all of our attempts to rework read-repair.
>
> Why is it a problem?  Multiple discontiguous errors in the same bio
> are a very unusual error pattern.  We obviously need to handle it, but
> it doesn't need to be optimized as it is so rare.  The most common error
> pattern is that the entire read will return an error, followed by a single
> corrupted sector.

Rare case doesn't mean it won't happen.

We still need to address it anyway.

Furthermore, if we can submit one bio to read the whole mirror range,
without letting the corrupted sectors pollute the repaired result, it
also means we read each range at most (num_copies - 1) times, without
going back to the initial mirror.
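
To make that bound concrete, here is a minimal sketch of the mirror
rotation, assuming mirrors are numbered 1..num_copies (the real
get_next_mirror() is not in the quoted hunks, so this body is only my
assumption of it):

	/*
	 * Hypothetical helper: advance to the next mirror, wrapping
	 * from num_copies back to mirror 1.
	 */
	static int get_next_mirror(int cur_mirror, int num_copies)
	{
		return (cur_mirror % num_copies) + 1;
	}

Starting the loop at get_next_mirror(failed_mirror, num_copies) and
stopping once it wraps back to failed_mirror visits every other mirror
exactly once, which is where the (num_copies - 1) bound comes from.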

>
>> I'm wondering if there is some "dummy" page provided by the block layer
>> that we can utilize?
>
> For reads nvme (and a few SCSI HBAs) support a bit bucket SGL for reads
> that discard parts of the data.  Right now upstream none of this is
> supported, although Keith has been looking into it (for a rather different
> use case) in nvme.  This does not help with writes, never mind the fact
> that I would not want to use exotic and barely tested code and hardware
> features for a non-time-critical and rarely used error handling path.

I'm not proposing the SGL method, but still doing a full range read; the
only difference is, the page ranges we don't care about will be written
to a dustbin page, and only the ranges we care about go into the real pages.

E.g. we allocate a dedicated page per-fs (or even for the whole btrfs
module) as a dustbin page.

When we don't want to read some range, we just add that page into the
bio (this means we may put the same page into the bio several times, and
the page may be utilized by several different bios at the same time),
and then submit the bio.

I'm not sure the current code base can handle the case though.


For writes, it's pretty simple: we only write back the fully corrected range.
If we didn't recover the full corrupted range, we just skip the writeback.
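
A minimal sketch of the read side of the idea, assuming a module-wide
dustbin page (btrfs_dustbin_page and the helper below are hypothetical,
and whether the block layer copes with the same page sitting in one bio
several times is exactly the open question):

	/* Hypothetical: one shared page that absorbs sectors we discard. */
	static struct page *btrfs_dustbin_page;

	static void add_sector_or_dustbin(struct bio *bio,
					  struct btrfs_fs_info *fs_info,
					  struct page *page, unsigned int pgoff,
					  bool wanted)
	{
		if (wanted) {
			bio_add_page(bio, page, fs_info->sectorsize, pgoff);
			return;
		}
		/* Keep the bio contiguous; data read here is thrown away. */
		bio_add_page(bio, btrfs_dustbin_page, fs_info->sectorsize, 0);
	}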

Thanks,
Qu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  7:37       ` Qu Wenruo
@ 2022-05-26  7:45         ` Christoph Hellwig
  2022-05-26  7:52           ` Qu Wenruo
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2022-05-26  7:45 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs

On Thu, May 26, 2022 at 03:37:47PM +0800, Qu Wenruo wrote:
> Rare case doesn't mean it won't happen.
>
> We still need to address it anyway.

address != build overly complicated code to optimize for it

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  7:45         ` Christoph Hellwig
@ 2022-05-26  7:52           ` Qu Wenruo
  2022-05-26  8:00             ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-26  7:52 UTC (permalink / raw)
  To: Christoph Hellwig, Qu Wenruo; +Cc: linux-btrfs



On 2022/5/26 15:45, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 03:37:47PM +0800, Qu Wenruo wrote:
>> Rare case doesn't mean it won't happen.
>>
>> We still need to address it anyway.
> 
> address != build overly complicated code to optimize for it
> 
Well, using seemingly simple code that can lead to way more read loops and
way more data to read is not a good way either.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  7:52           ` Qu Wenruo
@ 2022-05-26  8:00             ` Christoph Hellwig
  2022-05-26  8:07               ` Qu Wenruo
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2022-05-26  8:00 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs

On Thu, May 26, 2022 at 03:52:03PM +0800, Qu Wenruo wrote:
>
>
> On 2022/5/26 15:45, Christoph Hellwig wrote:
>> On Thu, May 26, 2022 at 03:37:47PM +0800, Qu Wenruo wrote:
>>> Rare case doesn't mean it won't happen.
>>>
>>> We still need to address it anyway.
>>
>> address != build overly complicated code to optimize for it
>>
> Well, using seemingly simple code that can lead to way more read loops and
> way more data to read is not a good way either.

Again, having checkered corruption is an extremely unlikely event.
I'd rather deal with it by doing more reads than code complexity.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  8:00             ` Christoph Hellwig
@ 2022-05-26  8:07               ` Qu Wenruo
  2022-05-26  8:17                 ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-26  8:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Qu Wenruo, linux-btrfs



On 2022/5/26 16:00, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 03:52:03PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2022/5/26 15:45, Christoph Hellwig wrote:
>>> On Thu, May 26, 2022 at 03:37:47PM +0800, Qu Wenruo wrote:
>>>> Rare case doesn't mean it won't happen.
>>>>
>>>> We still need to address it anyway.
>>>
>>> address != build overly complicated code to optimize for it
>>>
>> Well, using seemingly simple code that can lead to way more read loops and
>> way more data to read is not a good way either.
> 
> Again, having checkered corruption is an extremely unlikely event.
> I'd rather deal with it by doing more reads than code complexity.
> 

Then the same can be said of almost all the ENOSPC error handling code.
It's less than a 1% chance, but we spend over 10% of the code on it.

And if you really want to go that path, I see no reason why we shouldn't
go with sector-by-sector repair.


Furthermore, if "more reads" means over 10 times the amount we need, I
strongly doubt that it's sane.

Take the same RAID1C3 case: mirror 1 all corrupted, mirrors 2 and 3 in a
checker pattern, filled into a 4MiB range. Try running your version of
the code starting with mirror 1, and see how many loops we need,
especially how many times we read mirror 1 unnecessarily.
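
For scale, a rough count for that layout, assuming 4K sectors (so 4MiB
is 1024 sectors) and ignoring the bitmap size limit that would split
this into smaller batches: the bitmap version does one batched pass over
mirror 2 (reading all 1024 bad sectors, half of which verify good) and
one pass over mirror 3 for the remaining 512, and it never touches
mirror 1 again.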

Thanks,
Qu


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  8:07               ` Qu Wenruo
@ 2022-05-26  8:17                 ` Christoph Hellwig
  2022-05-26  8:26                   ` Qu Wenruo
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2022-05-26  8:17 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs

On Thu, May 26, 2022 at 04:07:49PM +0800, Qu Wenruo wrote:
> Then the same can be said of almost all the ENOSPC error handling code.

ENOSPC is a lot more common.

> It's less than a 1% chance, but we spend over 10% of the code on it.
>
> And if you really want to go that path, I see no reason why we shouldn't
> go with sector-by-sector repair.

Because that really sucks for the case where the whole I/O fails.
Which is the common failure scenario.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  8:17                 ` Christoph Hellwig
@ 2022-05-26  8:26                   ` Qu Wenruo
  2022-05-26  8:28                     ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-26  8:26 UTC (permalink / raw)
  To: Christoph Hellwig, Qu Wenruo; +Cc: linux-btrfs



On 2022/5/26 16:17, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 04:07:49PM +0800, Qu Wenruo wrote:
>> Then the same can be said of almost all the ENOSPC error handling code.
>
> ENOSPC is a lot more common.

Sorry, I mean ENOMEM.

>
>> It's less than a 1% chance, but we spend over 10% of the code on it.
>>
>> And if you really want to go that path, I see no reason why we shouldn't
>> go with sector-by-sector repair.
>
> Because that really sucks for the case where the whole I/O fails.
> Which is the common failure scenario.

But it's just a performance problem, which is not that critical.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  8:26                   ` Qu Wenruo
@ 2022-05-26  8:28                     ` Christoph Hellwig
  2022-05-26  8:49                       ` Qu Wenruo
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2022-05-26  8:28 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs

On Thu, May 26, 2022 at 04:26:30PM +0800, Qu Wenruo wrote:
>
>
> On 2022/5/26 16:17, Christoph Hellwig wrote:
>> On Thu, May 26, 2022 at 04:07:49PM +0800, Qu Wenruo wrote:
>>> Then the same can be said of almost all the ENOSPC error handling code.
>>
>> ENOSPC is a lot more common.
>
> Sorry, I mean ENOMEM.
>
>>
>>> It's less than a 1% chance, but we spend over 10% of the code on it.
>>>
>>> And if you really want to go that path, I see no reason why we shouldn't
>>> go with sector-by-sector repair.
>>
>> Because that really sucks for the case where the whole I/O fails.
>> Which is the common failure scenario.
>
> But it's just a performance problem, which is not that critical.

I'm officially lost now.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  8:28                     ` Christoph Hellwig
@ 2022-05-26  8:49                       ` Qu Wenruo
  2022-05-26  8:54                         ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-26  8:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Qu Wenruo, linux-btrfs



On 2022/5/26 16:28, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 04:26:30PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2022/5/26 16:17, Christoph Hellwig wrote:
>>> On Thu, May 26, 2022 at 04:07:49PM +0800, Qu Wenruo wrote:
>>>> Then the same can be said of almost all the ENOSPC error handling code.
>>>
>>> ENOSPC is a lot more common.
>>
>> Sorry, I mean ENOMEM.
>>
>>>
>>>> It's less than a 1% chance, but we spend over 10% of the code on it.
>>>>
>>>> And if you really want to go that path, I see no reason why we shouldn't
>>>> go with sector-by-sector repair.
>>>
>>> Because that really sucks for the case where the whole I/O fails.
>>> Which is the common failure scenario.
>>
>> But it's just a performance problem, which is not that critical.
>
> I'm officially lost now.

Why? If you care so much about code simplicity, sector-by-sector is
the best.
If you care so much about performance, the latest bitmap version is the
best, no matter whether it's the worst checker pattern or not.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  8:49                       ` Qu Wenruo
@ 2022-05-26  8:54                         ` Christoph Hellwig
  2022-05-26  9:13                           ` Qu Wenruo
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2022-05-26  8:54 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs

On Thu, May 26, 2022 at 04:49:15PM +0800, Qu Wenruo wrote:
>>>> Because that really sucks for the case where the whole I/O fails.
>>>> Which is the common failure scenario.
>>>
>>> But it's just a performance problem, which is not that critical.
>>
>> I'm officially lost now.
>
> Why? If you care so much about code simplicity, sector-by-sector is
> the best.
> If you care so much about performance, the latest bitmap version is the
> best, no matter whether it's the worst checker pattern or not.

Because you tell me that handling the most common and important case
in read repair is just a performance issue, while you keep arguing for
micro-optimizing a corner case.  And no, for the case of failing a
large bio (which arguably can only happen for buffered I/O at the
moment, but that is another thing to look into) the bitmaps will only
help you for up to 64 sectors.  That is way better than just doing
single-sector synchronous I/O, but not exactly nice, while still being
a fair amount of code compared to just doing variable-sized synchronous
I/O.
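
(For reference, a quick check on that limit, assuming 4K sectors and the
unsigned long bad_bitmap from the quoted struct: BITS_PER_LONG is 64 on
64-bit, so one batch covers at most 64 * 4K = 256K, and a larger failed
bio has to be handled in multiple batches.)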

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  8:54                         ` Christoph Hellwig
@ 2022-05-26  9:13                           ` Qu Wenruo
  2022-05-27  8:10                             ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Qu Wenruo @ 2022-05-26  9:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Qu Wenruo, linux-btrfs



On 2022/5/26 16:54, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 04:49:15PM +0800, Qu Wenruo wrote:
>>>>> Because that really sucks for the case where the whole I/O fails.
>>>>> Which is the common failure scenario.
>>>>
>>>> But it's just a performance problem, which is not that critical.
>>>
>>> I'm officially lost now.
>>
>> Why? If you care so much about the code simplicity, sector-by-sector is
>> the best.
>> If you care so much about the performance, the latest bitmap is the
>> best, no matter if it's the worst checker patter or not.
>
> Because you tell me that handling the most common and important case
> in read repair is just a performance issue, while you keep arguing for
> micro-optimizing a corner case.

Because I'm fine either way, but not fine with the middle ground.

I proposed both versions, to fulfill the different requirements.
The tradeoff is unavoidable; we have to pick our poison.

If we want code simplicity, then my argument in support of
sector-by-sector is that repair is already a cold path (the same
argument you make about the checker pattern), thus overall the
performance drop is not that critical.

If we want the best performance (even though corruption is already a
corner case), then there is the bitmap version, handling all cases at
the cost of more complex code.

If you can find a simpler version that still handles the checker pattern
sanely (aka, without reading the same bad mirror again and again), sure,
I'm always fine with going with that version.

Otherwise, sector-by-sector and bitmap seem the more sane choices to
pick between.

Thanks,
Qu

>  And no, for the case of failing a
> large bio (which arguably can only happen for buffered I/O at the
> moment, but that is another thing to look into) the bitmaps will only
> help you for up to 64 sectors.  That is way better than just doing
> single-sector synchronous I/O, but not exactly nice, while still being
> a fair amount of code compared to just doing variable-sized synchronous
> I/O.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure
  2022-05-26  9:13                           ` Qu Wenruo
@ 2022-05-27  8:10                             ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2022-05-27  8:10 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs

On Thu, May 26, 2022 at 05:13:30PM +0800, Qu Wenruo wrote:
>> Because you tell me that handling the most common and important case
>> in read repair is just a performance issue, while you keep arguing for
>> micro-optimizing a corner case.
>
> Because I'm fine either way, but not fine with the middle ground.

Where the middle ground is optimizing for the common case while handling
the non-common cases in a non-optimized way?

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-05-27  8:10 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-25 10:59 [PATCH v2 0/7] btrfs: read-repair rework based on bitmap Qu Wenruo
2022-05-25 10:59 ` [PATCH v2 1/7] btrfs: save the original bi_iter into btrfs_bio for buffered read Qu Wenruo
2022-05-25 10:59 ` [PATCH v2 2/7] btrfs: make repair_io_failure available outside of extent_io.c Qu Wenruo
2022-05-25 10:59 ` [PATCH v2 3/7] btrfs: add a btrfs_map_bio_wait helper Qu Wenruo
2022-05-25 10:59 ` [PATCH v2 4/7] btrfs: introduce new read-repair infrastructure Qu Wenruo
2022-05-26  3:06   ` Qu Wenruo
2022-05-26  7:30     ` Christoph Hellwig
2022-05-26  7:37       ` Qu Wenruo
2022-05-26  7:45         ` Christoph Hellwig
2022-05-26  7:52           ` Qu Wenruo
2022-05-26  8:00             ` Christoph Hellwig
2022-05-26  8:07               ` Qu Wenruo
2022-05-26  8:17                 ` Christoph Hellwig
2022-05-26  8:26                   ` Qu Wenruo
2022-05-26  8:28                     ` Christoph Hellwig
2022-05-26  8:49                       ` Qu Wenruo
2022-05-26  8:54                         ` Christoph Hellwig
2022-05-26  9:13                           ` Qu Wenruo
2022-05-27  8:10                             ` Christoph Hellwig
2022-05-25 10:59 ` [PATCH v2 5/7] btrfs: make buffered read path to use the new read repair infrastructure Qu Wenruo
2022-05-25 10:59 ` [PATCH v2 6/7] btrfs: make direct io " Qu Wenruo
2022-05-25 10:59 ` [PATCH v2 7/7] btrfs: remove io_failure_record infrastructure completely Qu Wenruo
2022-05-25 14:55 ` [PATCH v2 0/7] btrfs: read-repair rework based on bitmap David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.