From: Paul Jones <paul@pauljones.id.au>
To: Goffredo Baroncelli <kreijack@libero.it>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Cc: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.cz>,
	Sinnamohideen Shafeeq <shafeeqs@panasas.com>,
	Goffredo Baroncelli <kreijack@inwind.it>
Subject: RE: [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode
Date: Mon, 13 Dec 2021 09:39:15 +0000
Message-ID: <SYXPR01MB1918689AF49BE6E6E031C8B69E749@SYXPR01MB1918.ausprd01.prod.outlook.com>
In-Reply-To: <cover.1635089352.git.kreijack@inwind.it>

> -----Original Message-----
> From: Goffredo Baroncelli <kreijack@tiscali.it>
> Sent: Monday, 25 October 2021 2:31 AM
> To: linux-btrfs@vger.kernel.org
> Cc: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>; Josef Bacik
> <josef@toxicpanda.com>; David Sterba <dsterba@suse.cz>; Sinnamohideen
> Shafeeq <shafeeqs@panasas.com>; Goffredo Baroncelli
> <kreijack@inwind.it>
> Subject: [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode
> 
> From: Goffredo Baroncelli <kreijack@inwind.it>
> 
> Hi all,
> 
> This patch set was born after some discussion between me, Zygo and
> Josef.
> Some details can be found in https://github.com/btrfs/btrfs-todo/issues/19.
> 
> Some further information about a real use case can be found in
> https://lore.kernel.org/linux-btrfs/20210116002533.GE31381@hungrycats.org/
> 
> Recently Shafeeq told me that he is interested too, due to the performance
> gain.
> 
> In this revision I switched away from an ioctl API in favor of a sysfs API (see
> patches #2 and #3).
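
For anyone who wants to poke at the sysfs side from a script, here is a minimal Python sketch. The path layout is the one named in the title of patch 2 of this series (/sys/fs/btrfs/<uuid>/devinfo/<devid>/type). The helper name read_dev_type and the sysfs_root parameter are hypothetical, and since the format of the file contents is not described in this mail, the value is returned as a raw string.

```python
from pathlib import Path


def read_dev_type(fs_uuid, devid, sysfs_root="/sys/fs/btrfs"):
    """Return the raw contents of the per-device 'type' file.

    Assumption: the file lives at <sysfs_root>/<uuid>/devinfo/<devid>/type,
    as the title of patch 2 suggests. No parsing is attempted because the
    value format is not specified in this thread.
    """
    path = Path(sysfs_root) / fs_uuid / "devinfo" / str(devid) / "type"
    return path.read_text().strip()
```

The sysfs_root argument is only there so the helper can be pointed at a scratch directory when testing on a kernel without the patches.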
> 
> The idea behind this patch set is to dedicate some disks (the fastest ones)
> to the metadata chunks. My initial idea was a "soft" hint. However, Zygo asked
> for an option for a "strong" hint (== mandatory). The result is that each disk can
> be "tagged" with one of the following flags:
> - BTRFS_DEV_ALLOCATION_METADATA_ONLY
> - BTRFS_DEV_ALLOCATION_PREFERRED_METADATA
> - BTRFS_DEV_ALLOCATION_PREFERRED_DATA
> - BTRFS_DEV_ALLOCATION_DATA_ONLY
> 
> When the chunk allocator searches for disks on which to allocate a chunk, it scans the disks in
> an order decided by these tags. For metadata, the order is:
> *_METADATA_ONLY
> *_PREFERRED_METADATA
> *_PREFERRED_DATA
> 
> The *_DATA_ONLY disks are not eligible for metadata chunk allocation.
> 
> For data chunks, the order is reversed, and the *_METADATA_ONLY disks are
> excluded.
> 
> The exact sort logic is to sort first by "tag", and then by the space
> available. If there is no space available, the disks with the next "tag" are selected.
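
The ordering described above can be modelled roughly as follows. This is a user-space Python sketch of the described behaviour, not the kernel code from the patches: the Disk class, the candidate_disks helper and the free_mib field are all made up for illustration, and real allocation also has to satisfy the RAID profile's device count, which is ignored here.

```python
from dataclasses import dataclass

# Scan order per chunk type, as described in the cover letter.
# Tags missing from a list are not eligible for that chunk type.
SCAN_ORDER = {
    "metadata": ["METADATA_ONLY", "PREFERRED_METADATA", "PREFERRED_DATA"],
    "data": ["DATA_ONLY", "PREFERRED_DATA", "PREFERRED_METADATA"],
}


@dataclass
class Disk:
    name: str
    hint: str      # one of the four allocation_hint tags
    free_mib: int  # unallocated space (illustrative unit)


def candidate_disks(disks, chunk_type):
    """Sort eligible disks by tag priority first, then by free space (largest first)."""
    order = SCAN_ORDER[chunk_type]
    eligible = [d for d in disks if d.hint in order and d.free_mib > 0]
    return sorted(eligible, key=lambda d: (order.index(d.hint), -d.free_mib))
```

Disks with no free space drop out of the list, so the disks of the next tag move up automatically, which is how the sketch models the fallback to the next tag.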
> 
> To set these tags, a new property called "allocation_hint" was created.
> There is a dedicated btrfs-progs patch set [[PATCH V5] btrfs-progs:
> allocation_hint disk property].
> 
> $ sudo mount /dev/loop0 /mnt/test-btrfs/
> $ for i in /dev/loop[0-9]; do sudo ./btrfs prop get $i allocation_hint; done
> devid=1, path=/dev/loop0: allocation_hint=PREFERRED_METADATA
> devid=2, path=/dev/loop1: allocation_hint=PREFERRED_METADATA
> devid=3, path=/dev/loop2: allocation_hint=PREFERRED_DATA
> devid=4, path=/dev/loop3: allocation_hint=PREFERRED_DATA
> devid=5, path=/dev/loop4: allocation_hint=PREFERRED_DATA
> devid=6, path=/dev/loop5: allocation_hint=DATA_ONLY
> devid=7, path=/dev/loop6: allocation_hint=METADATA_ONLY
> devid=8, path=/dev/loop7: allocation_hint=METADATA_ONLY
> 
> $ sudo ./btrfs fi us /mnt/test-btrfs/
> Overall:
>     Device size:           2.75GiB
>     Device allocated:           1.34GiB
>     Device unallocated:           1.41GiB
>     Device missing:             0.00B
>     Used:             400.89MiB
>     Free (estimated):           1.04GiB    (min: 1.04GiB)
>     Data ratio:                  2.00
>     Metadata ratio:              1.00
>     Global reserve:           3.25MiB    (used: 0.00B)
>     Multiple profiles:                no
> 
> Data,RAID1: Size:542.00MiB, Used:200.25MiB (36.95%)
>    /dev/loop0     288.00MiB
>    /dev/loop1     288.00MiB
>    /dev/loop2     127.00MiB
>    /dev/loop3     127.00MiB
>    /dev/loop4     127.00MiB
>    /dev/loop5     127.00MiB
> 
> Metadata,single: Size:256.00MiB, Used:384.00KiB (0.15%)
>    /dev/loop1     256.00MiB
> 
> System,single: Size:32.00MiB, Used:16.00KiB (0.05%)
>    /dev/loop0      32.00MiB
> 
> Unallocated:
>    /dev/loop0     704.00MiB
>    /dev/loop1     480.00MiB
>    /dev/loop2       1.00MiB
>    /dev/loop3       1.00MiB
>    /dev/loop4       1.00MiB
>    /dev/loop5       1.00MiB
>    /dev/loop6     128.00MiB
>    /dev/loop7     128.00MiB
> 
> # change the tag of some disks
> 
> $ sudo ./btrfs prop set /dev/loop0 allocation_hint DATA_ONLY
> $ sudo ./btrfs prop set /dev/loop1 allocation_hint DATA_ONLY
> $ sudo ./btrfs prop set /dev/loop5 allocation_hint METADATA_ONLY
> 
> $ for i in /dev/loop[0-9]; do sudo ./btrfs prop get $i allocation_hint; done
> devid=1, path=/dev/loop0: allocation_hint=DATA_ONLY
> devid=2, path=/dev/loop1: allocation_hint=DATA_ONLY
> devid=3, path=/dev/loop2: allocation_hint=PREFERRED_DATA
> devid=4, path=/dev/loop3: allocation_hint=PREFERRED_DATA
> devid=5, path=/dev/loop4: allocation_hint=PREFERRED_DATA
> devid=6, path=/dev/loop5: allocation_hint=METADATA_ONLY
> devid=7, path=/dev/loop6: allocation_hint=METADATA_ONLY
> devid=8, path=/dev/loop7: allocation_hint=METADATA_ONLY
> 
> $ sudo btrfs bal start --full-balance /mnt/test-btrfs/
> $ sudo ./btrfs fi us /mnt/test-btrfs/
> Overall:
>     Device size:           2.75GiB
>     Device allocated:         735.00MiB
>     Device unallocated:           2.03GiB
>     Device missing:             0.00B
>     Used:             400.72MiB
>     Free (estimated):           1.10GiB    (min: 1.10GiB)
>     Data ratio:                  2.00
>     Metadata ratio:              1.00
>     Global reserve:           3.25MiB    (used: 0.00B)
>     Multiple profiles:                no
> 
> Data,RAID1: Size:288.00MiB, Used:200.19MiB (69.51%)
>    /dev/loop0     288.00MiB
>    /dev/loop1     288.00MiB
> 
> Metadata,single: Size:127.00MiB, Used:336.00KiB (0.26%)
>    /dev/loop5     127.00MiB
> 
> System,single: Size:32.00MiB, Used:16.00KiB (0.05%)
>    /dev/loop7      32.00MiB
> 
> Unallocated:
>    /dev/loop0     736.00MiB
>    /dev/loop1     736.00MiB
>    /dev/loop2     128.00MiB
>    /dev/loop3     128.00MiB
>    /dev/loop4     128.00MiB
>    /dev/loop5       1.00MiB
>    /dev/loop6     128.00MiB
>    /dev/loop7      96.00MiB
> 
> 
> # As you can see, all the metadata was placed on the disks loop5/loop7, even
> # though the emptiest ones are loop0 and loop1.
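
As a quick sanity check of the post-balance numbers, the per-device figures in the second "fi us" output can be summed and compared with the "Device allocated" line. The Python below is only a worked-arithmetic sketch; the dictionaries are copied by hand from that output and the helper name allocated_mib is made up.

```python
# Per-device allocations in MiB, copied from the "fi us" output after the balance.
data_raid1 = {"loop0": 288, "loop1": 288}
metadata_single = {"loop5": 127}
system_single = {"loop7": 32}


def allocated_mib(*profiles):
    """Sum the raw per-device allocation over all block-group profiles."""
    return sum(sum(profile.values()) for profile in profiles)


total = allocated_mib(data_raid1, metadata_single, system_single)
print(total)  # 735, matching "Device allocated: 735.00MiB"

# RAID1 keeps two copies, so the logical data size is half the raw allocation.
print(sum(data_raid1.values()) // 2)  # 288, matching "Data,RAID1: Size:288.00MiB"
```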
> 
> 
> 
> TODO:
> - more tests
> - the tool which shows the available space should take the tagging into account (e.g.
>   the disks tagged as _METADATA_ONLY should be excluded from the data
>   availability)
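
To illustrate the second TODO item, here is a naive Python sketch of what "consider the tagging" could mean for the data side: leave out the unallocated space on METADATA_ONLY disks. The numbers and tags are copied from the post-balance output above; the function name data_unallocated_mib is hypothetical, and the sketch deliberately ignores RAID profile constraints, so it is not how btrfs-progs computes "Free (estimated)".

```python
# (allocation_hint, unallocated MiB) per device, from the post-balance output.
devices = {
    "loop0": ("DATA_ONLY", 736),
    "loop1": ("DATA_ONLY", 736),
    "loop2": ("PREFERRED_DATA", 128),
    "loop3": ("PREFERRED_DATA", 128),
    "loop4": ("PREFERRED_DATA", 128),
    "loop5": ("METADATA_ONLY", 1),
    "loop6": ("METADATA_ONLY", 128),
    "loop7": ("METADATA_ONLY", 96),
}


def data_unallocated_mib(devs):
    """Raw unallocated space that data chunks may use (METADATA_ONLY excluded)."""
    return sum(mib for hint, mib in devs.values() if hint != "METADATA_ONLY")


all_unallocated = sum(mib for _, mib in devices.values())
print(all_unallocated)                # 2081 MiB, i.e. the reported 2.03GiB
print(data_unallocated_mib(devices))  # 1856 MiB once METADATA_ONLY disks are excluded
```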
> 
> 
> Comments are welcome
> BR
> G.Baroncelli


I've been running this patch series since about V4, and it works really well. It would be nice to have it merged eventually.

Tested-by: Paul Jones <paul@pauljones.id.au>

