linux-btrfs.vger.kernel.org archive mirror
* [PATCH RFC v2 0/8] btrfs: raid-stripe-tree draft patches
@ 2022-06-29 14:41 Johannes Thumshirn
From: Johannes Thumshirn @ 2022-06-29 14:41 UTC
  To: linux-btrfs
  Cc: Naohiro Aota, Damien Le Moal, Johannes Thumshirn, Qu Wenruo,
	Christoph Hellwig, Josef Bacik

Here's a second draft of my btrfs zoned RAID1 patches.

Updates of the raid-stripe-tree are done at delayed-ref time to save on
bandwidth, while for reading we do the stripe-tree lookup at bio mapping time,
i.e. when the logical to physical translation happens for regular btrfs RAID
as well.

The stripe tree is keyed by an extent's disk_bytenr and disk_num_bytes and
its contents are the respective physical device id and position.
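
To make the keying scheme concrete, here is a minimal Python model of it. This
is only a sketch and not the kernel code: StripeTree, insert() and lookup() are
names made up for illustration, and it assumes that a logical address inside an
extent maps to each stripe's physical start plus the same offset into the
extent. The sample values are the first two items of the dump shown further
down.

```python
from bisect import bisect_right


class StripeTree:
    """Toy in-memory model of the raid-stripe-tree (hypothetical, not kernel code)."""

    def __init__(self):
        self._keys = []    # sorted list of (disk_bytenr, length) keys
        self._items = {}   # key -> list of (devid, physical_offset)

    def insert(self, disk_bytenr, length, stripes):
        key = (disk_bytenr, length)
        self._keys.insert(bisect_right(self._keys, key), key)
        self._items[key] = list(stripes)

    def lookup(self, logical):
        """Return [(devid, physical), ...] for a logical address, or None."""
        # Find the last key whose start is <= logical.
        idx = bisect_right(self._keys, (logical, float("inf"))) - 1
        if idx < 0:
            return None
        start, length = self._keys[idx]
        if logical >= start + length:
            return None  # address falls into a hole after the extent
        delta = logical - start
        # Assumption: the offset into the extent carries over to every stripe.
        return [(devid, phys + delta) for devid, phys in self._items[(start, length)]]


tree = StripeTree()
tree.insert(939524096, 126976, [(1, 939524096), (2, 536870912)])
tree.insert(939651072, 126976, [(1, 939651072), (2, 536997888)])

# Translate a logical address 4 KiB into the first extent.
print(tree.lookup(939524096 + 4096))  # [(1, 939528192), (2, 536875008)]
```

A sorted key list with a bisect search stands in for the b-tree search here;
the point is only that one key lookup yields the physical location on every
mirror.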

For an example 1M write (split into 124K segments due to zone-append):
rapido2:/home/johannes/src/fstests# xfs_io -fdc "pwrite -b 1M 0 1M" -c fsync /mnt/test/test
wrote 1048576/1048576 bytes at offset 0
1 MiB, 1 ops; 0.0065 sec (151.538 MiB/sec and 151.5381 ops/sec)

The tree will look as follows:

rapido2:/home/johannes/src/fstests# btrfs inspect-internal dump-tree -t raid_stripe /dev/nullb0
btrfs-progs v5.16.1 
raid stripe tree key (RAID_STRIPE_TREE ROOT_ITEM 0)
leaf 805847040 items 9 free space 15770 generation 9 owner RAID_STRIPE_TREE
leaf 805847040 flags 0x1(WRITTEN) backref revision 1
checksum stored 1b22e13800000000000000000000000000000000000000000000000000000000
checksum calced 1b22e13800000000000000000000000000000000000000000000000000000000
fs uuid e4f523d1-89a1-41f9-ab75-6ba3c42a28fb
chunk uuid 6f2d8aaa-d348-4bf2-9b5e-141a37ba4c77
        item 0 key (939524096 RAID_STRIPE_KEY 126976) itemoff 16251 itemsize 32
                        stripe 0 devid 1 offset 939524096
                        stripe 1 devid 2 offset 536870912
        item 1 key (939651072 RAID_STRIPE_KEY 126976) itemoff 16219 itemsize 32
                        stripe 0 devid 1 offset 939651072
                        stripe 1 devid 2 offset 536997888
        item 2 key (939778048 RAID_STRIPE_KEY 126976) itemoff 16187 itemsize 32
                        stripe 0 devid 1 offset 939778048
                        stripe 1 devid 2 offset 537124864
        item 3 key (939905024 RAID_STRIPE_KEY 126976) itemoff 16155 itemsize 32
                        stripe 0 devid 1 offset 939905024
                        stripe 1 devid 2 offset 537251840
        item 4 key (940032000 RAID_STRIPE_KEY 126976) itemoff 16123 itemsize 32
                        stripe 0 devid 1 offset 940032000
                        stripe 1 devid 2 offset 537378816
        item 5 key (940158976 RAID_STRIPE_KEY 126976) itemoff 16091 itemsize 32
                        stripe 0 devid 1 offset 940158976
                        stripe 1 devid 2 offset 537505792
        item 6 key (940285952 RAID_STRIPE_KEY 126976) itemoff 16059 itemsize 32
                        stripe 0 devid 1 offset 940285952
                        stripe 1 devid 2 offset 537632768
        item 7 key (940412928 RAID_STRIPE_KEY 126976) itemoff 16027 itemsize 32
                        stripe 0 devid 1 offset 940412928
                        stripe 1 devid 2 offset 537759744
        item 8 key (940539904 RAID_STRIPE_KEY 32768) itemoff 15995 itemsize 32
                        stripe 0 devid 1 offset 940539904
                        stripe 1 devid 2 offset 537886720
total bytes 26843545600
bytes used 1245184
uuid e4f523d1-89a1-41f9-ab75-6ba3c42a28fb
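
The nine item keys in the dump follow from simple arithmetic, sketched below in
Python. The 126976-byte (124 KiB) segment limit is read off the key lengths in
the dump rather than queried from the device, and split_write() is a
hypothetical helper, not a function from the patches. The devid 2 offsets are
assumed to advance in lockstep with the logical address, which is what the dump
shows.

```python
def split_write(start, length, max_segment):
    """Split [start, start + length) into segments of at most max_segment bytes."""
    segments = []
    offset = start
    end = start + length
    while offset < end:
        seg_len = min(max_segment, end - offset)
        segments.append((offset, seg_len))
        offset += seg_len
    return segments


MAX_SEGMENT = 126976      # 124 KiB, taken from the key lengths in the dump
LOGICAL_START = 939524096  # disk_bytenr of item 0
MIRROR_START = 536870912   # devid 2 offset of item 0

segments = split_write(LOGICAL_START, 1024 * 1024, MAX_SEGMENT)

for logical, seg_len in segments:
    # Assumption: the mirror copy advances by the same amount as the logical address.
    mirror = MIRROR_START + (logical - LOGICAL_START)
    print(f"key ({logical} RAID_STRIPE_KEY {seg_len}) devid 2 offset {mirror}")
```

Eight full segments of 126976 bytes plus a 32768-byte tail add up to the
1048576 bytes written, which matches items 0 to 8.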

The performance deviation is measurable but overall not too bad for a first shot:

RAID1:
READ: bw=81.6MiB/s (85.6MB/s), 81.6MiB/s-81.6MiB/s (85.6MB/s-85.6MB/s), io=496MiB (520MB), run=6075-6075msec
WRITE: bw=86.9MiB/s (91.1MB/s), 86.9MiB/s-86.9MiB/s (91.1MB/s-91.1MB/s), io=528MiB (554MB), run=6075-6075msec

Single:
READ: bw=92.5MiB/s (97.0MB/s), 92.5MiB/s-92.5MiB/s (97.0MB/s-97.0MB/s), io=496MiB (520MB), run=5360-5360msec
WRITE: bw=98.5MiB/s (103MB/s), 98.5MiB/s-98.5MiB/s (103MB/s-103MB/s), io=528MiB (554MB), run=5360-5360msec
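
For a rough sense of scale, the relative slowdown can be computed from the
bandwidth figures above. This is only a back-of-the-envelope sketch over the
single run per profile quoted here; overhead_percent() is a hypothetical
helper.

```python
def overhead_percent(single_mib_s, raid1_mib_s):
    """Bandwidth lost relative to the single profile, in percent."""
    return (single_mib_s - raid1_mib_s) / single_mib_s * 100.0


# Bandwidth figures in MiB/s, copied from the results above.
read_overhead = overhead_percent(92.5, 81.6)
write_overhead = overhead_percent(98.5, 86.9)

print(f"read:  {read_overhead:.1f}% slower than single")
print(f"write: {write_overhead:.1f}% slower than single")
```

Both directions come out at roughly 12% below the single profile.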

Changes to v1:
- Write the stripe-tree at delayed-ref time (Qu)
- Add a different write path for preallocation

v1 of the patchset can be found here:
https://lore.kernel.org/linux-btrfs/cover.1652711187.git.johannes.thumshirn@wdc.com/

Johannes Thumshirn (8):
  btrfs: add raid stripe tree definitions
  btrfs: read raid-stripe-tree from disk
  btrfs: add boilerplate code to insert raid extent
  btrfs: add boilerplate code to insert stripe entries for preallocated
    extents
  btrfs: add code to delete raid extent
  btrfs: add code to read raid extent
  btrfs: zoned: allow zoned RAID1
  btrfs: add raid stripe tree pretty printer

 fs/btrfs/Makefile               |   2 +-
 fs/btrfs/block-rsv.c            |   1 +
 fs/btrfs/ctree.h                |  33 ++++
 fs/btrfs/disk-io.c              |  15 ++
 fs/btrfs/extent-tree.c          |  53 ++++++
 fs/btrfs/inode.c                |   6 +
 fs/btrfs/print-tree.c           |  21 +++
 fs/btrfs/raid-stripe-tree.c     | 318 ++++++++++++++++++++++++++++++++
 fs/btrfs/raid-stripe-tree.h     |  72 ++++++++
 fs/btrfs/volumes.c              |  35 +++-
 fs/btrfs/volumes.h              |   4 +
 fs/btrfs/zoned.c                |  39 ++++
 include/uapi/linux/btrfs.h      |   1 +
 include/uapi/linux/btrfs_tree.h |  17 ++
 14 files changed, 614 insertions(+), 3 deletions(-)
 create mode 100644 fs/btrfs/raid-stripe-tree.c
 create mode 100644 fs/btrfs/raid-stripe-tree.h

-- 
2.35.3

