All of lore.kernel.org
 help / color / mirror / Atom feed
From: Goffredo Baroncelli <kreijack@inwind.it>
To: Hans van Kranenburg <hans@knorrie.org>,
	dsterba@suse.cz, linux-btrfs@vger.kernel.org,
	Josef Bacik <josef@toxicpanda.com>
Subject: Re: [PATCH rfc v3] New ioctl BTRFS_IOC_GET_CHUNK_INFO.
Date: Sat, 30 May 2020 08:41:05 +0200	[thread overview]
Message-ID: <3ca4a81a-5009-2a8c-5604-2d4e07d90155@inwind.it> (raw)
In-Reply-To: <f5f75774-0a35-bd35-7d80-71d1f1a285a1@knorrie.org>

On 5/29/20 8:54 PM, Hans van Kranenburg wrote:
> On 5/29/20 6:23 PM, Goffredo Baroncelli wrote:
>> On 5/28/20 11:03 PM, Hans van Kranenburg wrote:
>>> Hi,
>>>
>>> On 5/26/20 10:19 PM, Goffredo Baroncelli wrote:
>>>> On 5/25/20 7:14 PM, David Sterba wrote:
>>>>> I'll start with the data structures
>> [...]
>>>>>
>>>>> This looks like a copy of btrfs_chunk and stripe, only removing items
>>>>> not needed for the chunk information. Rather than copying the
>>>>> unnecessary fileds like dev_uuid in stripe, this should be designed for
>>>>> data exchange with the usecase in mind.
>>>>
>>>> There are two clients for this api:
>>>> - btrfs fi us
>>>> - btrfs dev us
>>>>
>>>> We can get rid of:
>>>> 	- "offset" fields (2x)
>>>> 	- "uuid" fields
>>>>
>>>> However the "offset" fields can be used to understand where a logical map
>>>> is on the physical disks. I am thinking about a graphical tool to show this
>>>> mapping, which doesn't exits yet -).
>>>> The offset field may be used as key to get further information (like the chunk
>>>> usage, see below)
>>>>
>>>> Regarding the UUID field, I agree it can be removed because it is redundant (there
>>>> is already the devid)
>>>>
>>>>
>>>>>
>>>>> The format does not need follow the exact layout that kernel uses, ie.
>>>>> chunk info with one embedded stripe and then followed by variable length
>>>>> array of further stripes. This is convenient in one way but not in
>>>>> another one. Alternatively each chunk can be emitted as a single entry,
>>>>> duplicating part of the common fields and adding the stripe-specific
>>>>> ones. This is for consideration.
>>>>>
>>>>> I've looked at my old code doing the chunk dump based on the search
>>>>> ioctl and found that it also allows to read the chunk usage, with one
>>>>> extra search to the block group item where the usage is stored. As this
>>>>> is can be slow, it should be optional. Ie. the main ioctl structure
>>>>> needs flags where this can be requested.
>>>>
>>>> This info could be very useful. I think to something like a balance of
>>>> chunks which are near filled (or near empty). The question is if we
>>>> should have a different ioctl.
>>>
>>> Do you mean that you want to allow to a non root user to run btrfs balance?
>> No at all. The balance is an expensive operation that IMHO need root
>> privileges to be started.
>>
>>>
>>> Otherwise, no. IMO convenience functions for quickly retrieving a
>>> specific subset of data should be created as reusable library functions
>>> in the calling code, not as a redundant extra IOCTL that has to be
>>> maintained.
>>
>> I think that there is a misunderstood. There is no intention to take the
>> place of the current balance ioctl.
> 
> Ok, I'll rephrase. Using it to improve balance is not an argument for
> adding a new IOCTL, since it has to run as root anyway, and then you can
> use the SEARCH IOCTL. And as long as there's a few helper functions
> which do the plumbing, SEARCH isn't that bad at just getting some chunk
> and block group info.

Obviously using SEARCH IOCTL you can get rid of all other "read" btrfs ioctl(s).
However SEARCH IOCTL has some disadvantages:
1) it is a quite complex API
2) it exposes a lot of internal of a BTRFS filesystem, which could prevent
   future BTRFS evolution
3) it requires root privileges

May be that you missed my other patches sets "btrfs-progs:
use the new ioctl BTRFS_IOC_GET_CHUNK_INFO" [*] which is the use case for
which this ioctl was born.

Basically we need the chunk layout info in order to run the command
"btrfs fi us". And now as non root user this is impossible because this
command uses SEARCH IOCTL if raid5/raid6 is used.

And due to 2), I think that we should get rid of all the IOCTL SEARCH.

The discussion with David, was about which information should be exposed:
- if you exposed too much information, there is the risk that you will
have the problem 2)
- if you expose too few information, you ends to add another (similar)
ioctl or you have to extend the ioctl
- of course the other factor that has to be considered is the
composeability of the api(s)

IMHO, we need an api that exposes the CHUNK layout. And I think that we should
remove all the SEARCH IOCTL instance for more reasonable api.

BR
G.Baroncelli

[*]https://lore.kernel.org/linux-btrfs/20200315152430.7532-1-kreijack@libero.it/#t

> 
> -# python3
>>>> import btrfs
>>>> with btrfs.FileSystem('/') as fs:
> ...     for chunk in fs.chunks():
> ...         print(fs.block_group(chunk.vaddr))
> ...
> block group vaddr 13631488 length 8388608 flags DATA used 8388608
> used_pct 100
> block group vaddr 22020096 length 8388608 flags SYSTEM|DUP used 114688
> used_pct 1
> block group vaddr 30408704 length 1073741824 flags METADATA|DUP used
> 889061376 used_pct 83
> block group vaddr 1104150528 length 1073741824 flags DATA used
> 1073741824 used_pct 100
> block group vaddr 2177892352 length 1073741824 flags DATA used
> 1073741824 used_pct 100
> [...]
> 
>> The aim of this ioctl is only to get information about the chunk distribution.
>> Getting the chunk information could help to perform a better balance.
>> I.e. a balance which start from the chunk more empty the go forward
>> processing the chunk more filled.
> 
> An example of this is the existing btrfs-balance-least-used program.
> 
>> Another case use is to visulize how
>> the chunk are filled, or how the disks are used..
> 
> An example of that is btrfs-heatmap.
> 
> Hfgl,
> Hans
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

  parent reply	other threads:[~2020-05-30  6:41 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-19 20:39 [PATCH RFC V3] new ioctl BTRFS_IOC_GET_CHUNK_INFO Goffredo Baroncelli
2020-03-19 20:39 ` [PATCH rfc v3] New " Goffredo Baroncelli
2020-03-19 20:59   ` Josef Bacik
2020-03-19 21:09     ` Goffredo Baroncelli
2020-05-25 17:14   ` David Sterba
2020-05-26 20:19     ` Goffredo Baroncelli
2020-05-28 21:03       ` Hans van Kranenburg
2020-05-29 16:23         ` Goffredo Baroncelli
2020-05-29 18:54           ` Hans van Kranenburg
2020-05-29 18:59             ` Hans van Kranenburg
2020-05-30  6:41             ` Goffredo Baroncelli [this message]
2020-06-10 20:30       ` David Sterba
2020-06-11 11:50         ` Goffredo Baroncelli
2020-06-14 11:06           ` Goffredo Baroncelli
2020-05-25 16:39 ` [PATCH RFC V3] new " David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3ca4a81a-5009-2a8c-5604-2d4e07d90155@inwind.it \
    --to=kreijack@inwind.it \
    --cc=dsterba@suse.cz \
    --cc=hans@knorrie.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.