From: Nikolay Borisov <nborisov@suse.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Cc: Josef Bacik <josef@toxicpanda.com>
Subject: Re: [PATCH v7 1/5] btrfs: Introduce per-profile available space facility
Date: Thu, 20 Feb 2020 15:19:46 +0200 [thread overview]
Message-ID: <fe2705ff-c951-bfd7-a6f6-0f3385dcea1f@suse.com> (raw)
In-Reply-To: <0f3b2eb2-1af4-13fb-8d9a-867dd7e68fb8@gmx.com>
On 20.02.20 г. 14:59 ч., Qu Wenruo wrote:
>
>
> On 2020/2/20 下午8:49, Nikolay Borisov wrote:
>> <snip>
>>
>>>
>>> Suggested-by: Josef Bacik <josef@toxicpanda.com>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>> fs/btrfs/volumes.c | 216 ++++++++++++++++++++++++++++++++++++++++-----
>>> fs/btrfs/volumes.h | 11 +++
>>> 2 files changed, 207 insertions(+), 20 deletions(-)
>>>
>>
<snip>
>>> + sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
>>> + btrfs_cmp_device_info, NULL);
>>> + ndevs -= ndevs % raid_attr->devs_increment;
>>
>> nit: ndevs = rounddown(ndevs, raid_attr->devs_increment);
>
> IIRC round_down() can only be used when the alignment is power of 2.
>
> Don't forget we have RAID1C3 now.
Sure, but I'm referring to rounddown and not round_down :)
>
>> makes it more clear what's going on. Since you are working with at most
>> int it's not a problem for 32bits.
>>
>>
>>> + if (ndevs < raid_attr->devs_min)
>>> + return -ENOSPC;
>>> + if (raid_attr->devs_max)
>>> + ndevs = min(ndevs, (int)raid_attr->devs_max);
>>> + else
>>> + ndevs = min(ndevs, (int)BTRFS_MAX_DEVS(fs_info));
>>
>> Instead of casting simply use min_t(int, ndevs, BTRFS_MAX_DEVS...)
>
> That looks the same to me...
I guess it's a matter of preference so I will defer to David to decide.
>
>>
>>> +
>>> + /*
>>> + * Now allocate a virtual chunk using the unallocate space of the
>>
>> nit: missing d after 'unallocate'
>>
>>> + * device with the least unallocated space.
>>> + */
>>> + stripe_size = round_down(devices_info[ndevs - 1].total_avail,
>>> + fs_info->sectorsize);
>>> + if (stripe_size == 0)
>>> + return -ENOSPC;
>>
>> Isn't this check redundant - in the loop where you iterate the devices
>> you always ensure total_avail is at least a sector size, this guarantees
>> that stripe_size cannot be 0 at this point.
>>
>>> +
>>> + for (i = 0; i < ndevs; i++)
>>> + devices_info[i].dev->virtual_allocated += stripe_size;
>>> + *allocated = stripe_size * (ndevs - raid_attr->nparity) /
>>> + raid_attr->ncopies;
>>> + return 0;
>>> +}
>>> +
>>> +static int calc_one_profile_avail(struct btrfs_fs_info *fs_info,
>>> + enum btrfs_raid_types type,
>>> + u64 *result_ret)
>>> +{
>>> + struct btrfs_device_info *devices_info = NULL;
>>> + struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
>>> + struct btrfs_device *device;
>>> + u64 allocated;
>>> + u64 result = 0;
>>> + int ret = 0;
>>> +
>>
>> lockdep assert that chunk mutex is held since you access alloc_list.
>>
>>> + ASSERT(type >= 0 && type < BTRFS_NR_RAID_TYPES);
>>> +
>>> + /* Not enough devices, quick exit, just update the result */
>>> + if (fs_devices->rw_devices < btrfs_raid_array[type].devs_min)
>>> + goto out;
>>
>> You can directly return.
>>
>>> +
>>> + devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
>>> + GFP_NOFS);
>>> + if (!devices_info) {
>>> + ret = -ENOMEM;
>>> + goto out;
>>
>> Same here.
>>
>>> + }
>>> + /* Clear virtual chunk used space for each device */
>>> + list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list)
>>> + device->virtual_allocated = 0;
>>> + while (ret == 0) {
>>> + ret = alloc_virtual_chunk(fs_info, devices_info, type,
>>> + &allocated);
>> The 'allocated' variable is used only in this loop so declare it in the
>> loop. Ideally we want variables to have the shortest possible lifecycle.
>>
>>> + if (ret == 0)
>>> + result += allocated;
>>> + }
>>
After the explanation on IRC I think it's better if this is written as:
while(!alloc_virtual_chunk(fs_info, devices_info, type, &allocated))
result += allocated
That way it's easier (at least for me) to spot you are "draining"
something. IN this case you try to allocate/simulate multiple
allocations to calculate the real available space.
>
> For this case, we must go several loops:
> Dev1: 10G
> Dev2: 5G
> Dev3: 5G
> Type: RAID1.
>
> The first loop will use 5G from dev1, 5G from dev2.
> Then the 2nd loop will use the remaining 5G from dev1, 5G from dev3.
>
> And that's the core problem per-profile available space system want to
> address.
>
<snip>
next prev parent reply other threads:[~2020-02-20 13:19 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-11 5:11 [PATCH v7 0/5] Introduce per-profile available space array to avoid over-confident can_overcommit() Qu Wenruo
2020-02-11 5:11 ` [PATCH v7 1/5] btrfs: Introduce per-profile available space facility Qu Wenruo
2020-02-13 17:12 ` kbuild test robot
2020-02-20 12:49 ` Nikolay Borisov
2020-02-20 12:59 ` Qu Wenruo
2020-02-20 13:19 ` Nikolay Borisov [this message]
2020-02-21 5:16 ` Qu Wenruo
2020-02-11 5:11 ` [PATCH v7 2/5] btrfs: space-info: Use per-profile available space in can_overcommit() Qu Wenruo
2020-02-11 5:11 ` [PATCH v7 3/5] btrfs: statfs: Use pre-calculated per-profile available space Qu Wenruo
2020-02-11 5:11 ` [PATCH v7 4/5] btrfs: Reset device size when btrfs_update_device() failed in btrfs_grow_device() Qu Wenruo
2020-02-11 5:11 ` [PATCH v7 5/5] btrfs: volumes: Revert device used bytes when calc_per_profile_avail() failed Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fe2705ff-c951-bfd7-a6f6-0f3385dcea1f@suse.com \
--to=nborisov@suse.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=quwenruo.btrfs@gmx.com \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).