From: Goffredo Baroncelli <kreijack@libero.it>
To: dsterba@suse.cz, Anand Jain <anand.jain@oracle.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/2] btrfs: create chunk device type aware
Date: Sun, 30 Jan 2022 23:28:20 +0100
Message-ID: <1639cabc-e644-885d-5201-1ef355029342@libero.it>
In-Reply-To: <20220126173838.GE14046@twin.jikos.cz>

On 26/01/2022 18.38, David Sterba wrote:
>> The advantage of this method is that data/metadata allocation distribution
>> based on the device type happens automatically without any manual
>> configuration.
> Yeah, but the default behaviour may not be suitable for all users so
> some policy will have to be done anyway.
> 
> I vaguely remember some comments regarding mixed setups, along lines
> that "if there's a fast flash device I'd rather see ENOSPC and either
> delete files or add more devices than to let everything work but with
> the risk of storing metadata on the slow devices."
> 

I confirm that. Two aspects shaped the discussions around the "allocation_hint"
patch set:
1) the criteria used to order the disks
2) whether to allow or deny a given kind of BG (block group) on a device

Regarding 1), the first set of patches ordered the disks automatically,
based on the "rotational" attribute. It was soon pointed out that the
non-rotational class itself contains different options (SSD, NVMe...).
The conclusion was that the "priority" should not be hard-coded in the btrfs
code (e.g. could we also consider reliability?).

Regarding 2), my initial patch set only ordered the disks and allowed any
disk to be used. Some users asked me for a way to prevent certain disks
from being used for (e.g.) data, so that the data BGs cannot consume all the
available space [*]

These discussions led me to create 4 "classes" for disks:
- ONLY_METADATA
- PREFERRED_METADATA
- PREFERRED_DATA
- ONLY_DATA

The last one is not suitable for metadata, and the first one is not
suitable for data. The middle ones also accept the other type, but only if
the other disks are full.
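
The four classes can be pictured with a minimal sketch. The identifiers below
(hint_class, bg_type, class_allows, class_rank) are hypothetical and are not
taken from the patches; the order used for data chunks is assumed to mirror
the one for metadata.

```c
#include <stdbool.h>

/* Hypothetical names; the actual patches may use different identifiers. */
enum hint_class {
	HINT_ONLY_METADATA,
	HINT_PREFERRED_METADATA,
	HINT_PREFERRED_DATA,
	HINT_ONLY_DATA,
};

enum bg_type { BG_DATA, BG_METADATA };

/* May a device of class 'c' host a block group of type 't' at all? */
static bool class_allows(enum hint_class c, enum bg_type t)
{
	if (t == BG_METADATA)
		return c != HINT_ONLY_DATA;
	return c != HINT_ONLY_METADATA;
}

/*
 * Rank of class 'c' for block group type 't': 0 is tried first, larger
 * values are used only when the better-ranked disks are full.
 * Returns -1 when the class must never host that type.
 * Assumption: the data order is the mirror image of the metadata order.
 */
static int class_rank(enum hint_class c, enum bg_type t)
{
	if (!class_allows(c, t))
		return -1;
	return t == BG_METADATA ? (int)c : (int)HINT_ONLY_DATA - (int)c;
}
```

With this ranking, a PREFERRED_DATA disk is the last resort for metadata
(rank 2), while an ONLY_DATA disk is never used for it.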

Another difference between Anand's patches and mine is that in
"allocation_hint", for striped profiles (like raid5, which spans all
the available disks), the devices involved should have the same classes.

I.e., if btrfs has to allocate a RAID5 metadata chunk:
- it first tries to use only ONLY_METADATA disks.
- If there are not enough disks, it tries to use ONLY_METADATA and
   PREFERRED_METADATA disks.
- If there are still not enough disks, it tries to use ONLY_METADATA,
   PREFERRED_METADATA and PREFERRED_DATA disks.
- If there are still not enough disks, it fails with -ENOSPC.
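
The widening steps above can be sketched roughly as follows. This is only an
illustration, not code from the patches: pick_metadata_classes and the enum
names are hypothetical, 'needed' stands for the number of devices the profile
requires, and every device passed in is assumed to have enough free space.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical names, ordered from most to least suitable for metadata. */
enum hint_class {
	HINT_ONLY_METADATA,
	HINT_PREFERRED_METADATA,
	HINT_PREFERRED_DATA,
	HINT_ONLY_DATA,
};

/*
 * Find the narrowest set of classes that provides at least 'needed'
 * devices for a striped metadata chunk. Returns the widest class that
 * had to be admitted, or -ENOSPC if even ONLY_METADATA +
 * PREFERRED_METADATA + PREFERRED_DATA is not enough.
 * ONLY_DATA devices are never counted.
 */
static int pick_metadata_classes(const enum hint_class *devs, size_t ndevs,
				 size_t needed)
{
	for (int widest = HINT_ONLY_METADATA; widest <= HINT_PREFERRED_DATA;
	     widest++) {
		size_t usable = 0;

		for (size_t i = 0; i < ndevs; i++)
			if ((int)devs[i] <= widest)
				usable++;
		if (usable >= needed)
			return widest;
	}
	return -ENOSPC;
}
```

The chunk is then striped only across the devices whose class is within the
returned limit, so slower classes are used only when the faster ones cannot
satisfy the profile.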

What Anand's patches have that the allocation_hint patches lack is
handling of disks with different latencies, giving higher
priority to the disks with lower latency. If this is a requirement,
we can reserve some bits to store a priority for each disk, and then
sort the disks by:

- allocation_hint class
- priority
- max avail
- free space
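
A rough sketch of such a multi-key sort, using qsort(). The struct and its
fields are hypothetical; it is assumed here that a smaller class rank means a
better class for the chunk being allocated, that a larger priority value means
a lower-latency disk, and that larger max avail / free space come first.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-device record used only for ordering candidates. */
struct dev_info {
	int class_rank;       /* 0 = best allocation_hint class for this BG */
	unsigned int priority; /* assumed: larger = lower latency */
	uint64_t max_avail;
	uint64_t free_space;
};

/* Compare by class, then priority, then max avail, then free space. */
static int cmp_dev_info(const void *pa, const void *pb)
{
	const struct dev_info *a = pa;
	const struct dev_info *b = pb;

	if (a->class_rank != b->class_rank)
		return a->class_rank < b->class_rank ? -1 : 1;
	if (a->priority != b->priority)
		return a->priority > b->priority ? -1 : 1;
	if (a->max_avail != b->max_avail)
		return a->max_avail > b->max_avail ? -1 : 1;
	if (a->free_space != b->free_space)
		return a->free_space > b->free_space ? -1 : 1;
	return 0;
}

/* Sort the candidate devices so that the best one comes first. */
static void sort_devices(struct dev_info *devs, size_t n)
{
	qsort(devs, n, sizeof(*devs), cmp_dev_info);
}
```

Each later key only breaks ties left by the earlier ones, so the priority
never overrides the allocation_hint class.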

BR
G.Baroncelli

[*] I don't want to open another discussion, but this seems to me more of a
"quota" problem than a "disk allocation" problem...

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
