* [PATCH v2 00/17] btrfs: add subpage support for RAID56
From: Qu Wenruo @ 2022-04-12  9:32 UTC
  To: linux-btrfs

The branch can be fetched from GitHub. It is based on the latest
misc-next branch (with the bio and memory allocation refactors):
https://github.com/adam900710/linux/tree/subpage_raid56

[CHANGELOG]
v2:
- Rebased to latest misc-next
  There were several conflicts caused by the bio interface change and
  the page allocation update.

- A new patch to reduce the width of @stripe_len to u32
  Currently @stripe_len is fixed to 64K, and even if we choose to
  enlarge the value in the future, I see no reason to go beyond 4G for
  the stripe length.

  Thus change it to u32 to avoid some u64-divided-by-u32 situations.

  This will reduce memory usage for map_lookup (which has a lifespan as
  long as the mounted fs) and btrfs_io_geometry (which only has a very
  short lifespan, mostly bound to a bio).

  Furthermore, add some extra alignment checks and use right bit shifts
  to replace the involved divisions, to avoid possible problems on
  32bit systems.

- Pack sector_ptr::pgoff and sector_ptr::uptodate into one u32
  This will reduce memory usage and reduce unaligned memory access.

  Please note that, even with them packed, we still have 4 bytes of
  padding (it's u64 + u32, thus not perfectly aligned).
  Without the packed attribute, the structure will cost more memory
  anyway.

- Call kunmap_local() using the address with pgoff
  kunmap_local() can handle such an address without problem, so there
  is no need for an extra search just to find the pgoff.

- Use "= { 0 }" for structure initialization

- Reduce comment updates to a minimum
  If a comment line is not really touched, don't touch it just to fix
  some bad style.
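
As a rough illustration of the @stripe_len item above, the following
user-space sketch shows how a power-of-two stripe length lets a right
shift and a mask stand in for a 64-bit division and modulo. The helper
names (stripe_len_shift, stripe_nr, stripe_offset) are hypothetical and
are not taken from the patches. The power-of-two requirement is an
assumption of the sketch, which is why it is asserted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: log2 of a power-of-two stripe length. */
static inline uint32_t stripe_len_shift(uint32_t stripe_len)
{
	uint32_t shift = 0;

	/* A shift only equals a division when stripe_len is a power of two. */
	assert(stripe_len != 0 && (stripe_len & (stripe_len - 1)) == 0);
	while ((stripe_len >> shift) != 1)
		shift++;
	return shift;
}

/* Stripe number of a 64-bit offset, computed without a u64 division. */
static inline uint64_t stripe_nr(uint64_t offset, uint32_t stripe_len)
{
	return offset >> stripe_len_shift(stripe_len);
}

/* Offset inside the stripe, computed with a mask instead of a modulo. */
static inline uint32_t stripe_offset(uint64_t offset, uint32_t stripe_len)
{
	assert(stripe_len != 0 && (stripe_len & (stripe_len - 1)) == 0);
	return (uint32_t)(offset & (stripe_len - 1));
}
```

With the current 64K stripe length the shift is 16, so an offset of
3 * 64K + 4K maps to stripe number 3 and an in-stripe offset of 4K.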

[DESIGN]
To make the RAID56 code support subpage, we need to make every sector of
a RAID56 full stripe (including P/Q) addressable.

Previously we used page pointers directly for things like stripe_pages:

Full stripe is 64K * 3, 2 data stripes, one P stripe (RAID5), 4K page size

stripe_pages:   | 0 | 1 | 2 |  .. | 46 | 47 |

Each of those 48 entries points to a page we allocated.


The new subpage support will introduce a sector layer, based on the old
stripe_pages[] array:

The same 64K * 3, RAID5 layout, but 64K page size and 4K sector size:

stripe_sectors: | 0 | 1 | .. |15 |16 |  ...  |31 |32 | ..    |47 |
stripe_pages:   |      Page 0    |     Page 1    |    Page 2     |

Each sector_ptr of stripe_sectors[] will include:

- One page pointer
  Points back to the page inside stripe_pages[].

- One pgoff member
  To indicate the offset inside the page.

- One uptodate member
  To indicate if the sector is uptodate, replacing the old PageUptodate
  flag.
  Introducing a btrfs_subpage structure for stripe_pages[] looks like
  overkill, as we only care about the Uptodate flag.

The same applies to the bio_sectors[] array, which is going to completely
replace the old bio_pages[] array.
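
The sector layer described above can be sketched in user-space C as
follows. This is only an illustration under stated assumptions: the
struct name sector_ptr_sketch, the 24/8 bit split between pgoff and
uptodate, and the helpers sector_to_page_nr(), sector_to_pgoff() and
sector_ptr_init() are all hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

struct page;	/* opaque stand-in for the kernel's struct page */

/*
 * Sketch of one stripe_sectors[] entry. The 24/8 bit split is an
 * assumption made for illustration only.
 */
struct sector_ptr_sketch {
	struct page *page;		/* points back into stripe_pages[] */
	unsigned int pgoff : 24;	/* byte offset of the sector in the page */
	unsigned int uptodate : 8;	/* replaces the old PageUptodate flag */
};

/* Index of the stripe_pages[] entry that backs sector @sector_nr. */
static inline uint32_t sector_to_page_nr(uint32_t sector_nr,
					 uint32_t sectorsize,
					 uint32_t page_size)
{
	assert(sectorsize != 0 && page_size % sectorsize == 0);
	return sector_nr / (page_size / sectorsize);
}

/* Byte offset of sector @sector_nr inside its backing page. */
static inline uint32_t sector_to_pgoff(uint32_t sector_nr,
				       uint32_t sectorsize,
				       uint32_t page_size)
{
	assert(sectorsize != 0 && page_size % sectorsize == 0);
	return (sector_nr % (page_size / sectorsize)) * sectorsize;
}

/* Fill one entry so that it points into the existing page array. */
static inline void sector_ptr_init(struct sector_ptr_sketch *sector,
				   struct page **stripe_pages,
				   uint32_t sector_nr,
				   uint32_t sectorsize,
				   uint32_t page_size)
{
	sector->page = stripe_pages[sector_to_page_nr(sector_nr, sectorsize,
						      page_size)];
	sector->pgoff = sector_to_pgoff(sector_nr, sectorsize, page_size);
	sector->uptodate = 0;	/* nothing has been read or computed yet */
}
```

For the 64K page size, 4K sector size layout in the diagram, sector 17
lands in Page 1 at offset 4K, and sector 47 lands in Page 2 at offset
60K. With 4K pages every sector maps to its own page at offset 0.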

[SIDE EFFECT]
Besides the obvious new ability for subpage to utilize btrfs RAID56
profiles, this will cause more memory usage for the real btrfs_raid_bio
structure.

We allocate extra memory based on the stripe size and the number of
stripes, and update the pointer arrays to utilize the extra memory.

To compare, we use a pretty standard setup, 3 disks RAID5, 4K page size
on x86_64:

 Before: 1176 bytes
 After:  2320 bytes (+97.3%)

The reason for such a big bump is:

- Extra size for sector_ptr.
  Instead of just a page pointer, now it's twice the size of a pointer
  (a page pointer + 2 * unsigned int)

  This means although we replace bio_pages[] with bio_sectors[], we are
  still enlarging the size.

- A completely new array for stripe_sectors[]
  And the array itself is also twice the size of the old stripe_pages[].

- Extra padding for sector_ptr
  Since we don't use the packed attribute anymore, the real size of
  a sector_ptr is 16 bytes, not 12 bytes.
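
The padding and array-size points above can be checked with a small
user-space sketch. The struct names and the helpers
sector_array_bytes() and page_array_bytes() are hypothetical, and
__attribute__((packed)) is a GCC/Clang extension. The sketch only
covers the pointer arrays and does not try to reproduce the
1176 / 2320 totals, which also include the rest of btrfs_raid_bio.

```c
#include <stddef.h>
#include <stdint.h>

struct page;	/* opaque stand-in for the kernel's struct page */

/* Pointer plus one u32: padded up to pointer alignment. */
struct sector_ptr_padded_sketch {
	struct page *page;
	uint32_t pgoff_and_uptodate;
};

/* Same members with the packed attribute: no tail padding. */
struct sector_ptr_packed_sketch {
	struct page *page;
	uint32_t pgoff_and_uptodate;
} __attribute__((packed));

/* Bytes used by an array of @nr_sectors unpacked sector entries. */
static inline size_t sector_array_bytes(size_t nr_sectors)
{
	return nr_sectors * sizeof(struct sector_ptr_padded_sketch);
}

/* Bytes used by an array of @nr_pages plain page pointers. */
static inline size_t page_array_bytes(size_t nr_pages)
{
	return nr_pages * sizeof(struct page *);
}
```

On x86_64 the unpacked struct is 16 bytes and the packed one is 12
bytes. For the 3 disks RAID5, 4K page size case there are 48 sectors and
48 pages per full stripe, so the sketch gives 768 bytes for a
stripe_sectors[]-like array against 384 bytes for stripe_pages[].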

There are some attempts to reduce the size of btrfs_raid_bio itself, but
the big impact still comes from the new sector_ptr arrays.

Without exotic macros or the packed attribute, I don't have any better
ideas for reducing the real size of btrfs_raid_bio.

[TESTS]
Due to a recently exposed error path in generic/475, only the btrfs
test groups are tested. Otherwise the run will always hang at
generic/475 due to unrelated bugs.

Both x86_64 and aarch64 (64K page size) pass the full btrfs test cases
without new regressions.

[PATCHSET LAYOUT]
The patchset layout takes several things into consideration:

- Every patch can be tested independently on x86_64
  No more tons of unused helpers followed by a big switch.
  Every change can be verified on x86_64.

- More temporary sanity checks than the previous code
  For example, when rbio_add_io_page() is converted to be subpage
  compatible, an extra ASSERT() is added to ensure no subpage range
  can ever be added.

  Such temporary checks are removed in the last enablement patch.
  This is to make testing on x86_64 more comprehensive.

- Mostly small changes in each patch
  The only exception is the conversion of rbio_add_io_page().
  But most of the changes in that patch come from variable renaming.
  The overall number of lines changed in each patch should still be
  small enough for review.

Qu Wenruo (17):
  btrfs: reduce width for stripe_len from u64 to u32
  btrfs: open-code rbio_nr_pages()
  btrfs: make btrfs_raid_bio more compact
  btrfs: introduce new cached members for btrfs_raid_bio
  btrfs: introduce btrfs_raid_bio::stripe_sectors
  btrfs: introduce btrfs_raid_bio::bio_sectors
  btrfs: make rbio_add_io_page() subpage compatible
  btrfs: make finish_parity_scrub() subpage compatible
  btrfs: make __raid_recover_endio_io() subpage compatibable
  btrfs: make finish_rmw() subpage compatible
  btrfs: open-code rbio_stripe_page_index()
  btrfs: make raid56_add_scrub_pages() subpage compatible
  btrfs: remove btrfs_raid_bio::bio_pages array
  btrfs: make set_bio_pages_uptodate() subpage compatible
  btrfs: make steal_rbio() subpage compatible
  btrfs: make alloc_rbio_essential_pages() subpage compatible
  btrfs: enable subpage support for RAID56

 fs/btrfs/disk-io.c |   8 -
 fs/btrfs/raid56.c  | 749 +++++++++++++++++++++++++++------------------
 fs/btrfs/raid56.h  |   8 +-
 fs/btrfs/scrub.c   |   6 +-
 fs/btrfs/volumes.c |  27 +-
 fs/btrfs/volumes.h |   8 +-
 6 files changed, 479 insertions(+), 327 deletions(-)

-- 
2.35.1


Thread overview: 34+ messages
2022-04-12  9:32 [PATCH v2 00/17] btrfs: add subpage support for RAID56 Qu Wenruo
2022-04-12  9:32 ` [PATCH v2 01/17] btrfs: reduce width for stripe_len from u64 to u32 Qu Wenruo
2022-04-12  9:32 ` [PATCH v2 02/17] btrfs: open-code rbio_nr_pages() Qu Wenruo
2022-04-12  9:32 ` [PATCH v2 03/17] btrfs: make btrfs_raid_bio more compact Qu Wenruo
2022-04-12  9:32 ` [PATCH v2 04/17] btrfs: introduce new cached members for btrfs_raid_bio Qu Wenruo
2022-04-15  5:31   ` Christoph Hellwig
2022-04-15  5:34     ` Qu Wenruo
2022-04-15  5:45       ` Christoph Hellwig
2022-04-12  9:32 ` [PATCH v2 05/17] btrfs: introduce btrfs_raid_bio::stripe_sectors Qu Wenruo
2022-04-12  9:32 ` [PATCH v2 06/17] btrfs: introduce btrfs_raid_bio::bio_sectors Qu Wenruo
2022-04-12  9:32 ` [PATCH v2 07/17] btrfs: make rbio_add_io_page() subpage compatible Qu Wenruo
2022-04-13 19:14   ` David Sterba
2022-04-13 23:28     ` Qu Wenruo
2022-04-14  0:43       ` Qu Wenruo
2022-04-14 10:59         ` Qu Wenruo
2022-04-14 15:51           ` David Sterba
2022-04-14 22:48             ` Qu Wenruo
2022-04-21 15:44               ` David Sterba
2022-04-14 15:43         ` David Sterba
2022-04-14 17:51           ` David Sterba
2022-04-14 22:28             ` Qu Wenruo
2022-04-21 16:24               ` David Sterba
2022-04-12  9:32 ` [PATCH v2 08/17] btrfs: make finish_parity_scrub() " Qu Wenruo
2022-04-12  9:32 ` [PATCH v2 09/17] btrfs: make __raid_recover_endio_io() subpage compatibable Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 10/17] btrfs: make finish_rmw() subpage compatible Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 11/17] btrfs: open-code rbio_stripe_page_index() Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 12/17] btrfs: make raid56_add_scrub_pages() subpage compatible Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 13/17] btrfs: remove btrfs_raid_bio::bio_pages array Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 14/17] btrfs: make set_bio_pages_uptodate() subpage compatible Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 15/17] btrfs: make steal_rbio() " Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 16/17] btrfs: make alloc_rbio_essential_pages() " Qu Wenruo
2022-04-12  9:33 ` [PATCH v2 17/17] btrfs: enable subpage support for RAID56 Qu Wenruo
2022-04-12 17:42 ` [PATCH v2 00/17] btrfs: add " David Sterba
2022-04-13 14:46   ` David Sterba
