linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] btrfs-progs: match the max chunk size to the kernel
@ 2020-03-11 11:33 Anand Jain
  2020-03-11 13:01 ` Qu Wenruo
  0 siblings, 1 reply; 4+ messages in thread
From: Anand Jain @ 2020-03-11 11:33 UTC (permalink / raw)
  To: linux-btrfs

For chunk type Single, %metadata_profile and %data_profile in
create_raid_groups() is NULL, so we continue to use the initially
created 8MB chunks for both metadata and data.

8MB is too small. Kernel default chunk size for type Single is 256MB.
Further the mkfs.btrfs created chunk will stay unless relocated or
cleanup during balance. Consider an ENOSPC case due to 8MB metadata
full.

I don't see any reason that mkfs.btrfs should create 8MB chunks for
chunk type Single instead it could match it with the kernel allocation
size of 256MB for the chunk type Single.

For other chunk-types the create_one_raid_group() is called and creates
the required bigger chunks and there is no change with this patch. Also
for fs sizes (-b option) smaller than 256MB there is no issue as the
chunks sizes are 10% of the requested fs-size until the maximum of
256MB.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
The fio in generic/299 is failing to create the files which shall be
deleted in the later part of the test case and the failure happens
only with the MKFS_OPTIONS="-n64K -msingle" only and not with other types
of chunks. This is bit inconsistent. And it appears that in case of
Single chunk type it fails to create (or touch) the necessary file
as the metadata is full, so increasing the metadata chunk size to the
sizes as kernel would create will add consistency.

 volumes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/volumes.c b/volumes.c
index b46bf5988a95..d56f2fc897e3 100644
--- a/volumes.c
+++ b/volumes.c
@@ -1004,7 +1004,7 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	struct list_head *cur;
 	struct map_lookup *map;
 	int min_stripe_size = SZ_1M;
-	u64 calc_size = SZ_8M;
+	u64 calc_size = SZ_256M;
 	u64 min_free;
 	u64 max_chunk_size = 4 * calc_size;
 	u64 avail = 0;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs-progs: match the max chunk size to the kernel
  2020-03-11 11:33 [PATCH] btrfs-progs: match the max chunk size to the kernel Anand Jain
@ 2020-03-11 13:01 ` Qu Wenruo
  2020-03-12 11:12   ` Anand Jain
  0 siblings, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2020-03-11 13:01 UTC (permalink / raw)
  To: Anand Jain, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2714 bytes --]



On 2020/3/11 下午7:33, Anand Jain wrote:
> For chunk type Single, %metadata_profile and %data_profile in
> create_raid_groups() is NULL, so we continue to use the initially
> created 8MB chunks for both metadata and data.
> 
> 8MB is too small. Kernel default chunk size for type Single is 256MB.
> Further the mkfs.btrfs created chunk will stay unless relocated or
> cleanup during balance. Consider an ENOSPC case due to 8MB metadata
> full.
> 
> I don't see any reason that mkfs.btrfs should create 8MB chunks for
> chunk type Single instead it could match it with the kernel allocation
> size of 256MB for the chunk type Single.
> 
> For other chunk-types the create_one_raid_group() is called and creates
> the required bigger chunks and there is no change with this patch. Also
> for fs sizes (-b option) smaller than 256MB there is no issue as the
> chunks sizes are 10% of the requested fs-size until the maximum of
> 256MB.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> The fio in generic/299 is failing to create the files which shall be
> deleted in the later part of the test case and the failure happens
> only with the MKFS_OPTIONS="-n64K -msingle" only and not with other types
> of chunks. This is bit inconsistent. And it appears that in case of
> Single chunk type it fails to create (or touch) the necessary file
> as the metadata is full, so increasing the metadata chunk size to the
> sizes as kernel would create will add consistency.

Have you tried all existing btrfs-progs test cases?
IIRC there are some minimal device related corner cases preventing us
from using larger default chunk size.

Despite that, for generic/299 error, I believe it should be more
appropriate to address the problem in ticket space system other than
initial metadata chunk size.
As btrfs can do metadata overcommit as long as we have enough
unallocated space, thus the initial chunk size should make minimal impact.

But don't get me wrong, I'm pretty fine with the unified minimal chunk size.
Just don't believe it's the proper fix for your problem, and want to be
extra safe before we hit some strange problems.

Thanks,
Qu

> 
>  volumes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/volumes.c b/volumes.c
> index b46bf5988a95..d56f2fc897e3 100644
> --- a/volumes.c
> +++ b/volumes.c
> @@ -1004,7 +1004,7 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>  	struct list_head *cur;
>  	struct map_lookup *map;
>  	int min_stripe_size = SZ_1M;
> -	u64 calc_size = SZ_8M;
> +	u64 calc_size = SZ_256M;
>  	u64 min_free;
>  	u64 max_chunk_size = 4 * calc_size;
>  	u64 avail = 0;
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs-progs: match the max chunk size to the kernel
  2020-03-11 13:01 ` Qu Wenruo
@ 2020-03-12 11:12   ` Anand Jain
  2020-03-25  8:23     ` Anand Jain
  0 siblings, 1 reply; 4+ messages in thread
From: Anand Jain @ 2020-03-12 11:12 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On 3/11/20 9:01 PM, Qu Wenruo wrote:
> 
> 
> On 2020/3/11 下午7:33, Anand Jain wrote:
>> For chunk type Single, %metadata_profile and %data_profile in
>> create_raid_groups() is NULL, so we continue to use the initially
>> created 8MB chunks for both metadata and data.
>>
>> 8MB is too small. Kernel default chunk size for type Single is 256MB.
>> Further the mkfs.btrfs created chunk will stay unless relocated or
>> cleanup during balance. Consider an ENOSPC case due to 8MB metadata
>> full.
>>
>> I don't see any reason that mkfs.btrfs should create 8MB chunks for
>> chunk type Single instead it could match it with the kernel allocation
>> size of 256MB for the chunk type Single.
>>
>> For other chunk-types the create_one_raid_group() is called and creates
>> the required bigger chunks and there is no change with this patch. Also
>> for fs sizes (-b option) smaller than 256MB there is no issue as the
>> chunks sizes are 10% of the requested fs-size until the maximum of
>> 256MB.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> The fio in generic/299 is failing to create the files which shall be
>> deleted in the later part of the test case and the failure happens
>> only with the MKFS_OPTIONS="-n64K -msingle" only and not with other types
>> of chunks. This is bit inconsistent. And it appears that in case of
>> Single chunk type it fails to create (or touch) the necessary file
>> as the metadata is full, so increasing the metadata chunk size to the
>> sizes as kernel would create will add consistency.
> 
> Have you tried all existing btrfs-progs test cases?
> IIRC there are some minimal device related corner cases preventing us
> from using larger default chunk size.
> 

  I forgot btrfs-progs tests. Will run them.

> Despite that, for generic/299 error, I believe it should be more
> appropriate to address the problem in ticket space system

  Agreed.

> other than
> initial metadata chunk size.

> As btrfs can do metadata overcommit as long as we have enough
> unallocated space, thus the initial chunk size should make minimal impact.

  IMO problem is if all the unallocated space has been occupied by
  the data chunks leading to enospc for the metadata then we have an
  imbalance which we didn't design in the kernel.

  To further debug enospc issues, the approach should be an ability to
  tune chunk sizes on demand. Generally inconsistency makes debugging
  more difficult. IMO it ok to fix the inconsistency in the chunk sizes.

Thanks, Anand

> But don't get me wrong, I'm pretty fine with the unified minimal chunk size.
> Just don't believe it's the proper fix for your problem, and want to be
> extra safe before we hit some strange problems.
> 
> Thanks,
> Qu
> 
>>
>>   volumes.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/volumes.c b/volumes.c
>> index b46bf5988a95..d56f2fc897e3 100644
>> --- a/volumes.c
>> +++ b/volumes.c
>> @@ -1004,7 +1004,7 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>>   	struct list_head *cur;
>>   	struct map_lookup *map;
>>   	int min_stripe_size = SZ_1M;
>> -	u64 calc_size = SZ_8M;
>> +	u64 calc_size = SZ_256M;
>>   	u64 min_free;
>>   	u64 max_chunk_size = 4 * calc_size;
>>   	u64 avail = 0;
>>
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs-progs: match the max chunk size to the kernel
  2020-03-12 11:12   ` Anand Jain
@ 2020-03-25  8:23     ` Anand Jain
  0 siblings, 0 replies; 4+ messages in thread
From: Anand Jain @ 2020-03-25  8:23 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, kreijack



On 3/12/20 7:12 PM, Anand Jain wrote:
> On 3/11/20 9:01 PM, Qu Wenruo wrote:
>>
>>
>> On 2020/3/11 下午7:33, Anand Jain wrote:
>>> For chunk type Single, %metadata_profile and %data_profile in
>>> create_raid_groups() is NULL, so we continue to use the initially
>>> created 8MB chunks for both metadata and data.
>>>
>>> 8MB is too small. Kernel default chunk size for type Single is 256MB.
>>> Further the mkfs.btrfs created chunk will stay unless relocated or
>>> cleanup during balance. Consider an ENOSPC case due to 8MB metadata
>>> full.
>>>
>>> I don't see any reason that mkfs.btrfs should create 8MB chunks for
>>> chunk type Single instead it could match it with the kernel allocation
>>> size of 256MB for the chunk type Single.
>>>
>>> For other chunk-types the create_one_raid_group() is called and creates
>>> the required bigger chunks and there is no change with this patch. Also
>>> for fs sizes (-b option) smaller than 256MB there is no issue as the
>>> chunks sizes are 10% of the requested fs-size until the maximum of
>>> 256MB.
>>>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>> The fio in generic/299 is failing to create the files which shall be
>>> deleted in the later part of the test case and the failure happens
>>> only with the MKFS_OPTIONS="-n64K -msingle" only and not with other 
>>> types
>>> of chunks. This is bit inconsistent. And it appears that in case of
>>> Single chunk type it fails to create (or touch) the necessary file
>>> as the metadata is full, so increasing the metadata chunk size to the
>>> sizes as kernel would create will add consistency.
>>
>> Have you tried all existing btrfs-progs test cases?
>> IIRC there are some minimal device related corner cases preventing us
>> from using larger default chunk size.
>>
> 

The test case which checks for the min-size is
   tests/mkfs-tests/010-minimal-size

Which depends on the mkfs to report the min size. Which in turn depends
on the function btrfs_min_dev_size() which does the static calculation
for the minimum size. IMO this can be updated to match with the kernel
allocations as its not a defined API.

But we are creating chunks in the userspace just to inform the kernel
about the user preferred chunk profile types (and Single profile gets
stuck at the initial temporary 8MB chunk).

Anyways I am recalling this patch as Goffredo is working [1] on
the way to communicate the user preferred profiles to the kernel,
which means there is no need to allocate the chunks in the userspace
so indirectly shall fix the single-chunk-size-inconsistency issue
between kernel and progs.

[1]
https://www.spinics.net/lists/linux-btrfs/msg99915.html

Thanks, Anand

>   I forgot btrfs-progs tests. Will run them.
> 
>> Despite that, for generic/299 error, I believe it should be more
>> appropriate to address the problem in ticket space system
> 
>   Agreed.
> 
>> other than
>> initial metadata chunk size.
> 
>> As btrfs can do metadata overcommit as long as we have enough
>> unallocated space, thus the initial chunk size should make minimal 
>> impact.
> 
>   IMO problem is if all the unallocated space has been occupied by
>   the data chunks leading to enospc for the metadata then we have an
>   imbalance which we didn't design in the kernel.
> 
>   To further debug enospc issues, the approach should be an ability to
>   tune chunk sizes on demand. Generally inconsistency makes debugging
>   more difficult. IMO it ok to fix the inconsistency in the chunk sizes.
> 
> Thanks, Anand
> 
>> But don't get me wrong, I'm pretty fine with the unified minimal chunk 
>> size.
>> Just don't believe it's the proper fix for your problem, and want to be
>> extra safe before we hit some strange problems.
>>
>> Thanks,
>> Qu
>>
>>>
>>>   volumes.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/volumes.c b/volumes.c
>>> index b46bf5988a95..d56f2fc897e3 100644
>>> --- a/volumes.c
>>> +++ b/volumes.c
>>> @@ -1004,7 +1004,7 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle 
>>> *trans,
>>>       struct list_head *cur;
>>>       struct map_lookup *map;
>>>       int min_stripe_size = SZ_1M;
>>> -    u64 calc_size = SZ_8M;
>>> +    u64 calc_size = SZ_256M;
>>>       u64 min_free;
>>>       u64 max_chunk_size = 4 * calc_size;
>>>       u64 avail = 0;
>>>
>>
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-03-25  8:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-11 11:33 [PATCH] btrfs-progs: match the max chunk size to the kernel Anand Jain
2020-03-11 13:01 ` Qu Wenruo
2020-03-12 11:12   ` Anand Jain
2020-03-25  8:23     ` Anand Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).