* [PATCH] btrfs: fix max max_inline for pagesize=64K
@ 2021-08-10 15:23 Anand Jain
  2021-08-16 15:10 ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Anand Jain @ 2021-08-10 15:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Alexander Tsvetkov

The mount option max_inline ranges from 0 to the sectorsize (which is
equal to the page size). But we parse the mount options too early,
before the sectorsize is cached from the superblock, so the upper limit
of max_inline does not reflect the actual sectorsize and is clamped to
the temporary sectorsize 4096 (as below), even on a system where the
default sectorsize is 64K.

disk-io.c
::
2980         /* Usable values until the real ones are cached from the superblock */
2981         fs_info->nodesize = 4096;
2982         fs_info->sectorsize = 4096;

Fix this by caching the superblock sectorsize before parsing the mount options.

Reported-by: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/disk-io.c | 49 +++++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2dd56ee23b35..d9505b35c7f5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3317,6 +3317,31 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	 */
 	fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
 
+	/*
+	 * flag our filesystem as having big metadata blocks if
+	 * they are bigger than the page size
+	 */
+	if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
+		if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
+			btrfs_info(fs_info,
+				"flagging fs with big metadata feature");
+		features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
+	}
+
+	/* btrfs_parse_options uses fs_info::sectorsize, so read it here */
+	nodesize = btrfs_super_nodesize(disk_super);
+	sectorsize = btrfs_super_sectorsize(disk_super);
+	stripesize = sectorsize;
+	fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));
+	fs_info->delalloc_batch = sectorsize * 512 * (1 + ilog2(nr_cpu_ids));
+
+	/* Cache block sizes */
+	fs_info->nodesize = nodesize;
+	fs_info->sectorsize = sectorsize;
+	fs_info->sectorsize_bits = ilog2(sectorsize);
+	fs_info->csums_per_leaf = BTRFS_MAX_ITEM_SIZE(fs_info) / fs_info->csum_size;
+	fs_info->stripesize = stripesize;
+
 	ret = btrfs_parse_options(fs_info, options, sb->s_flags);
 	if (ret) {
 		err = ret;
@@ -3343,30 +3368,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	if (features & BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
 		btrfs_info(fs_info, "has skinny extents");
 
-	/*
-	 * flag our filesystem as having big metadata blocks if
-	 * they are bigger than the page size
-	 */
-	if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
-		if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
-			btrfs_info(fs_info,
-				"flagging fs with big metadata feature");
-		features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
-	}
-
-	nodesize = btrfs_super_nodesize(disk_super);
-	sectorsize = btrfs_super_sectorsize(disk_super);
-	stripesize = sectorsize;
-	fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));
-	fs_info->delalloc_batch = sectorsize * 512 * (1 + ilog2(nr_cpu_ids));
-
-	/* Cache block sizes */
-	fs_info->nodesize = nodesize;
-	fs_info->sectorsize = sectorsize;
-	fs_info->sectorsize_bits = ilog2(sectorsize);
-	fs_info->csums_per_leaf = BTRFS_MAX_ITEM_SIZE(fs_info) / fs_info->csum_size;
-	fs_info->stripesize = stripesize;
-
 	/*
 	 * mixed block groups end up with duplicate but slightly offset
 	 * extent buffers for the same range.  It leads to corruptions
-- 
2.27.0



* Re: [PATCH] btrfs: fix max max_inline for pagesize=64K
  2021-08-10 15:23 [PATCH] btrfs: fix max max_inline for pagesize=64K Anand Jain
@ 2021-08-16 15:10 ` David Sterba
  2021-08-24  5:54   ` Anand Jain
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2021-08-16 15:10 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, Alexander Tsvetkov

On Tue, Aug 10, 2021 at 11:23:44PM +0800, Anand Jain wrote:
> The mount option max_inline ranges from 0 to the sectorsize (which is
> equal to pagesize). But we parse the mount options too early and before
> the sectorsize is a cache from the superblock. So the upper limit of
> max_inline is unaware of the actual sectorsize. And is limited by the
> temporary sectorsize 4096 (as below), even on a system where the default
> sectorsize is 64K.

So the question is what's a sensible value for >4K sectors, which is 64K
in this case.

Generally we allow setting values that may make sense only for some
limited use case and leave it up to the user to decide.

Inline files reduce slack space, and on 64K sectors that could be more
noticeable than on 4K sectors. It's a trade-off, as the inline data is
stored in metadata blocks, which are considered more precious.

Do you have any analysis of file size distribution on 64K systems for
some normal use case like a root partition?

I think this is worth fixing so as to be in line with the constraints we
have for 4K sectors, but some numbers would be good too.

> 
> disk-io.c
> ::
> 2980         /* Usable values until the real ones are cached from the superblock */
> 2981         fs_info->nodesize = 4096;
> 2982         fs_info->sectorsize = 4096;
> 
> Fix this by reading the superblock sectorsize before the mount option parse.
> 
> Reported-by: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/disk-io.c | 49 +++++++++++++++++++++++-----------------------
>  1 file changed, 25 insertions(+), 24 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 2dd56ee23b35..d9505b35c7f5 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3317,6 +3317,31 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>  	 */
>  	fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
>  
> +	/*
> +	 * flag our filesystem as having big metadata blocks if
> +	 * they are bigger than the page size

Please fix/reformat/improve any comments that are in moved code.

> +	 */
> +	if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
> +		if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
> +			btrfs_info(fs_info,
> +				"flagging fs with big metadata feature");
> +		features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
> +	}
> +
> +	/* btrfs_parse_options uses fs_info::sectorsize, so read it here */
> +	nodesize = btrfs_super_nodesize(disk_super);
> +	sectorsize = btrfs_super_sectorsize(disk_super);
> +	stripesize = sectorsize;
> +	fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));
> +	fs_info->delalloc_batch = sectorsize * 512 * (1 + ilog2(nr_cpu_ids));
> +
> +	/* Cache block sizes */
> +	fs_info->nodesize = nodesize;
> +	fs_info->sectorsize = sectorsize;
> +	fs_info->sectorsize_bits = ilog2(sectorsize);
> +	fs_info->csums_per_leaf = BTRFS_MAX_ITEM_SIZE(fs_info) / fs_info->csum_size;
> +	fs_info->stripesize = stripesize;

It looks that there are no invariants broken by moving that code, so ok.

> +
>  	ret = btrfs_parse_options(fs_info, options, sb->s_flags);
>  	if (ret) {
>  		err = ret;
> @@ -3343,30 +3368,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>  	if (features & BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
>  		btrfs_info(fs_info, "has skinny extents");
>  
> -	/*
> -	 * flag our filesystem as having big metadata blocks if
> -	 * they are bigger than the page size
> -	 */
> -	if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
> -		if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
> -			btrfs_info(fs_info,
> -				"flagging fs with big metadata feature");
> -		features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
> -	}
> -
> -	nodesize = btrfs_super_nodesize(disk_super);
> -	sectorsize = btrfs_super_sectorsize(disk_super);
> -	stripesize = sectorsize;
> -	fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));
> -	fs_info->delalloc_batch = sectorsize * 512 * (1 + ilog2(nr_cpu_ids));
> -
> -	/* Cache block sizes */
> -	fs_info->nodesize = nodesize;
> -	fs_info->sectorsize = sectorsize;
> -	fs_info->sectorsize_bits = ilog2(sectorsize);
> -	fs_info->csums_per_leaf = BTRFS_MAX_ITEM_SIZE(fs_info) / fs_info->csum_size;
> -	fs_info->stripesize = stripesize;
> -
>  	/*
>  	 * mixed block groups end up with duplicate but slightly offset
>  	 * extent buffers for the same range.  It leads to corruptions
> -- 
> 2.27.0


* Re: [PATCH] btrfs: fix max max_inline for pagesize=64K
  2021-08-16 15:10 ` David Sterba
@ 2021-08-24  5:54   ` Anand Jain
  2021-08-26 17:53     ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Anand Jain @ 2021-08-24  5:54 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Alexander Tsvetkov

On 16/08/2021 23:10, David Sterba wrote:
> On Tue, Aug 10, 2021 at 11:23:44PM +0800, Anand Jain wrote:
>> The mount option max_inline ranges from 0 to the sectorsize (which is
>> equal to pagesize). But we parse the mount options too early and before
>> the sectorsize is a cache from the superblock. So the upper limit of
>> max_inline is unaware of the actual sectorsize. And is limited by the
>> temporary sectorsize 4096 (as below), even on a system where the default
>> sectorsize is 64K.
> 

> So the question is what's a sensible value for >4K sectors, which is 64K
> in this case.
> 
> Generally we allow setting values that may make sense only for some
> limited usecase and leave it up to the user to decide.
>
> The inline files reduce the slack space and on 64K sectors it could be
> more noticeable than on 4K sectors. It's a trade off as the inline data
> are stored in metadata blocks that are considered more precious.
>
> Do you have any analysis of file size distribution on 64K systems for
> some normal use case like root partition?
>
> I think this is worth fixing so to be in line with constraints we have
> for 4K sectors but some numbers would be good too.


The default max_inline for sectorsize=64K is an interesting and probably
long topic. If time permits, I will look into it.
Furthermore, we need test cases and a repo that can hold them (and
also add read_policy test cases there).

IMO there is no need to hold this patch while searching for the optimum
default max_inline for 64K systems.

This patch reports and fixes a bug due to which we are limited to
testing max_inline only up to 4K on a 64K pagesize system, not as our
documentation claims below:
-----------------
  man -s 5 btrfs
  ::
         max_inline=bytes
            (default: min(2048, page size) )
  ::
	In practice, this value is limited by the filesystem block size
	(named sectorsize at mkfs time), and memory page size of the
	system.
-----------------


  more below.

>>
>> disk-io.c
>> ::
>> 2980         /* Usable values until the real ones are cached from the superblock */
>> 2981         fs_info->nodesize = 4096;
>> 2982         fs_info->sectorsize = 4096;
>>
>> Fix this by reading the superblock sectorsize before the mount option parse.
>>
>> Reported-by: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/disk-io.c | 49 +++++++++++++++++++++++-----------------------
>>   1 file changed, 25 insertions(+), 24 deletions(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 2dd56ee23b35..d9505b35c7f5 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -3317,6 +3317,31 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>>   	 */
>>   	fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
>>   
>> +	/*
>> +	 * flag our filesystem as having big metadata blocks if
>> +	 * they are bigger than the page size
> 
> Please fix/reformat/improve any comments that are in moved code.

  I think you are pointing to s/f/F and the 80-char line limit? Will fix.

Thanks, Anand

> 
>> +	 */
>> +	if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
>> +		if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
>> +			btrfs_info(fs_info,
>> +				"flagging fs with big metadata feature");
>> +		features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
>> +	}
>> +
>> +	/* btrfs_parse_options uses fs_info::sectorsize, so read it here */
>> +	nodesize = btrfs_super_nodesize(disk_super);
>> +	sectorsize = btrfs_super_sectorsize(disk_super);
>> +	stripesize = sectorsize;
>> +	fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));
>> +	fs_info->delalloc_batch = sectorsize * 512 * (1 + ilog2(nr_cpu_ids));
>> +
>> +	/* Cache block sizes */
>> +	fs_info->nodesize = nodesize;
>> +	fs_info->sectorsize = sectorsize;
>> +	fs_info->sectorsize_bits = ilog2(sectorsize);
>> +	fs_info->csums_per_leaf = BTRFS_MAX_ITEM_SIZE(fs_info) / fs_info->csum_size;
>> +	fs_info->stripesize = stripesize;
> 
> It looks that there are no invariants broken by moving that code, so ok.
> 
>> +
>>   	ret = btrfs_parse_options(fs_info, options, sb->s_flags);
>>   	if (ret) {
>>   		err = ret;
>> @@ -3343,30 +3368,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>>   	if (features & BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
>>   		btrfs_info(fs_info, "has skinny extents");
>>   
>> -	/*
>> -	 * flag our filesystem as having big metadata blocks if
>> -	 * they are bigger than the page size
>> -	 */
>> -	if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
>> -		if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
>> -			btrfs_info(fs_info,
>> -				"flagging fs with big metadata feature");
>> -		features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
>> -	}
>> -
>> -	nodesize = btrfs_super_nodesize(disk_super);
>> -	sectorsize = btrfs_super_sectorsize(disk_super);
>> -	stripesize = sectorsize;
>> -	fs_info->dirty_metadata_batch = nodesize * (1 + ilog2(nr_cpu_ids));
>> -	fs_info->delalloc_batch = sectorsize * 512 * (1 + ilog2(nr_cpu_ids));
>> -
>> -	/* Cache block sizes */
>> -	fs_info->nodesize = nodesize;
>> -	fs_info->sectorsize = sectorsize;
>> -	fs_info->sectorsize_bits = ilog2(sectorsize);
>> -	fs_info->csums_per_leaf = BTRFS_MAX_ITEM_SIZE(fs_info) / fs_info->csum_size;
>> -	fs_info->stripesize = stripesize;
>> -
>>   	/*
>>   	 * mixed block groups end up with duplicate but slightly offset
>>   	 * extent buffers for the same range.  It leads to corruptions
>> -- 
>> 2.27.0



* Re: [PATCH] btrfs: fix max max_inline for pagesize=64K
  2021-08-24  5:54   ` Anand Jain
@ 2021-08-26 17:53     ` David Sterba
  2021-08-27  0:38       ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2021-08-26 17:53 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, linux-btrfs, Alexander Tsvetkov

On Tue, Aug 24, 2021 at 01:54:21PM +0800, Anand Jain wrote:
> On 16/08/2021 23:10, David Sterba wrote:
> > On Tue, Aug 10, 2021 at 11:23:44PM +0800, Anand Jain wrote:
> >> The mount option max_inline ranges from 0 to the sectorsize (which is
> >> equal to pagesize). But we parse the mount options too early and before
> >> the sectorsize is a cache from the superblock. So the upper limit of
> >> max_inline is unaware of the actual sectorsize. And is limited by the
> >> temporary sectorsize 4096 (as below), even on a system where the default
> >> sectorsize is 64K.
> > 
> 
> > So the question is what's a sensible value for >4K sectors, which is 64K
> > in this case.
> > 
> > Generally we allow setting values that may make sense only for some
> > limited usecase and leave it up to the user to decide.
> >
> > The inline files reduce the slack space and on 64K sectors it could be
> > more noticeable than on 4K sectors. It's a trade off as the inline data
> > are stored in metadata blocks that are considered more precious.
> >
> > Do you have any analysis of file size distribution on 64K systems for
> > some normal use case like root partition?
> >
> > I think this is worth fixing so to be in line with constraints we have
> > for 4K sectors but some numbers would be good too.
> 
> Default max_inline for sectorsize=64K is an interesting topic and
> probably long. If time permits, I will look into it.
> Furthermore, we need test cases and a repo that can hold it (and
> also add  read_policy test cases there).
> 
> IMO there is no need to hold this patch in search of optimum default
> max_inline for 64K systems.

Yeah, I'm more interested in some reasonable value; right now the
default is 2048, but probably it should be sectorsize/2 in general.

> This patch reports and fixes a bug due to which we are limited to test
> max_inline only up to 4K on a 64K pagesize system. Not as our document
> claimed as below.
> -----------------
>   man -s 5 btrfs
>   ::
>          max_inline=bytes
>             (default: min(2048, page size) )
>   ::
> 	In practice, this value is limited by the filesystem block size
> 	(named sectorsize at mkfs time), and memory page size of the
> 	system.
> -----------------
> 
> 
>   more below.
> 
> >>
> >> disk-io.c
> >> ::
> >> 2980         /* Usable values until the real ones are cached from the superblock */
> >> 2981         fs_info->nodesize = 4096;
> >> 2982         fs_info->sectorsize = 4096;
> >>
> >> Fix this by reading the superblock sectorsize before the mount option parse.
> >>
> >> Reported-by: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>
> >> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> >> ---
> >>   fs/btrfs/disk-io.c | 49 +++++++++++++++++++++++-----------------------
> >>   1 file changed, 25 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> >> index 2dd56ee23b35..d9505b35c7f5 100644
> >> --- a/fs/btrfs/disk-io.c
> >> +++ b/fs/btrfs/disk-io.c
> >> @@ -3317,6 +3317,31 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
> >>   	 */
> >>   	fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
> >>   
> >> +	/*
> >> +	 * flag our filesystem as having big metadata blocks if
> >> +	 * they are bigger than the page size
> > 
> > Please fix/reformat/improve any comments that are in moved code.
> 
>   I think you are pointing to s/f/F and 80 chars long? Will fix.

Yes, already fixed in the committed version in misc-next, thanks.


* Re: [PATCH] btrfs: fix max max_inline for pagesize=64K
  2021-08-26 17:53     ` David Sterba
@ 2021-08-27  0:38       ` Qu Wenruo
  2021-08-27  6:24         ` Anand Jain
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2021-08-27  0:38 UTC (permalink / raw)
  To: dsterba, Anand Jain, linux-btrfs, Alexander Tsvetkov



On 2021/8/27 上午1:53, David Sterba wrote:
> On Tue, Aug 24, 2021 at 01:54:21PM +0800, Anand Jain wrote:
>> On 16/08/2021 23:10, David Sterba wrote:
>>> On Tue, Aug 10, 2021 at 11:23:44PM +0800, Anand Jain wrote:
>>>> The mount option max_inline ranges from 0 to the sectorsize (which is
>>>> equal to pagesize). But we parse the mount options too early and before
>>>> the sectorsize is a cache from the superblock. So the upper limit of
>>>> max_inline is unaware of the actual sectorsize. And is limited by the
>>>> temporary sectorsize 4096 (as below), even on a system where the default
>>>> sectorsize is 64K.
>>>
>>
>>> So the question is what's a sensible value for >4K sectors, which is 64K
>>> in this case.
>>>
>>> Generally we allow setting values that may make sense only for some
>>> limited usecase and leave it up to the user to decide.
>>>
>>> The inline files reduce the slack space and on 64K sectors it could be
>>> more noticeable than on 4K sectors. It's a trade off as the inline data
>>> are stored in metadata blocks that are considered more precious.
>>>
>>> Do you have any analysis of file size distribution on 64K systems for
>>>> some normal use case like root partition?
>>>
>>> I think this is worth fixing so to be in line with constraints we have
>>> for 4K sectors but some numbers would be good too.
>>
>> Default max_inline for sectorsize=64K is an interesting topic and
>> probably long. If time permits, I will look into it.
>> Furthermore, we need test cases and a repo that can hold it (and
>> also add  read_policy test cases there).
>>
>> IMO there is no need to hold this patch in search of optimum default
>> max_inline for 64K systems.
>
> Yeah, I'm more interested in some reasonable value, now the default is
> 2048 but probably it should be sectorsize/2 in general.

Half of the sectorsize sounds pretty solid to me.

But I'm afraid this is a little too late, especially considering we're
moving to 4K sectorsize as the default for all page sizes.

Thanks,
Qu

>
>> This patch reports and fixes a bug due to which we are limited to test
>> max_inline only up to 4K on a 64K pagesize system. Not as our document
>> claimed as below.
>> -----------------
>>    man -s 5 btrfs
>>    ::
>>           max_inline=bytes
>>              (default: min(2048, page size) )
>>    ::
>> 	In practice, this value is limited by the filesystem block size
>> 	(named sectorsize at mkfs time), and memory page size of the
>> 	system.
>> -----------------
>>
>>
>>    more below.
>>
>>>>
>>>> disk-io.c
>>>> ::
>>>> 2980         /* Usable values until the real ones are cached from the superblock */
>>>> 2981         fs_info->nodesize = 4096;
>>>> 2982         fs_info->sectorsize = 4096;
>>>>
>>>> Fix this by reading the superblock sectorsize before the mount option parse.
>>>>
>>>> Reported-by: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>
>>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>>> ---
>>>>    fs/btrfs/disk-io.c | 49 +++++++++++++++++++++++-----------------------
>>>>    1 file changed, 25 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>>>> index 2dd56ee23b35..d9505b35c7f5 100644
>>>> --- a/fs/btrfs/disk-io.c
>>>> +++ b/fs/btrfs/disk-io.c
>>>> @@ -3317,6 +3317,31 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>>>>    	 */
>>>>    	fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
>>>>
>>>> +	/*
>>>> +	 * flag our filesystem as having big metadata blocks if
>>>> +	 * they are bigger than the page size
>>>
>>> Please fix/reformat/improve any comments that are in moved code.
>>
>>    I think you are pointing to s/f/F and 80 chars long? Will fix.
>
> Yes, already fixed in the committed version in misc-next, thanks.
>


* Re: [PATCH] btrfs: fix max max_inline for pagesize=64K
  2021-08-27  0:38       ` Qu Wenruo
@ 2021-08-27  6:24         ` Anand Jain
  2021-08-27  7:05           ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: Anand Jain @ 2021-08-27  6:24 UTC (permalink / raw)
  To: Qu Wenruo, dsterba, linux-btrfs, Alexander Tsvetkov



On 27/08/2021 08:38, Qu Wenruo wrote:
> 
> 
> On 2021/8/27 上午1:53, David Sterba wrote:
>> On Tue, Aug 24, 2021 at 01:54:21PM +0800, Anand Jain wrote:
>>> On 16/08/2021 23:10, David Sterba wrote:
>>>> On Tue, Aug 10, 2021 at 11:23:44PM +0800, Anand Jain wrote:
>>>>> The mount option max_inline ranges from 0 to the sectorsize (which is
>>>>> equal to pagesize). But we parse the mount options too early and 
>>>>> before
>>>>> the sectorsize is a cache from the superblock. So the upper limit of
>>>>> max_inline is unaware of the actual sectorsize. And is limited by the
>>>>> temporary sectorsize 4096 (as below), even on a system where the 
>>>>> default
>>>>> sectorsize is 64K.
>>>>
>>>
>>>> So the question is what's a sensible value for >4K sectors, which is 
>>>> 64K
>>>> in this case.
>>>>
>>>> Generally we allow setting values that may make sense only for some
>>>> limited usecase and leave it up to the user to decide.
>>>>
>>>> The inline files reduce the slack space and on 64K sectors it could be
>>>> more noticeable than on 4K sectors. It's a trade off as the inline data
>>>> are stored in metadata blocks that are considered more precious.
>>>>
>>>> Do you have any analysis of file size distribution on 64K systems for
>>>> some normal use case like root partition?
>>>>
>>>> I think this is worth fixing so to be in line with constraints we have
>>>> for 4K sectors but some numbers would be good too.
>>>
>>> Default max_inline for sectorsize=64K is an interesting topic and
>>> probably long. If time permits, I will look into it.
>>> Furthermore, we need test cases and a repo that can hold it (and
>>> also add  read_policy test cases there).
>>>
>>> IMO there is no need to hold this patch in search of optimum default
>>> max_inline for 64K systems.
>>
>> Yeah, I'm more interested in some reasonable value, now the default is
>> 2048 but probably it should be sectorsize/2 in general.
> 
> Half of sectorsize is pretty solid to me.
> 
> But I'm afraid this is a little too late, especially considering we're
> moving to 4K sectorsize as default for all page sizes.

I am writing a patch to autotune it to sectorsize/2 by default.

To test this, we need a filesystem with files of various sizes (so that
we have both inline and regular extents) and a read/write workload over
them. It looks like no regular workload generator (fio/sysbench) can do
that, and I am stuck on that. Any inputs?

>>>> Please fix/reformat/improve any comments that are in moved code.
>>>
>>>    I think you are pointing to s/f/F and 80 chars long? Will fix.
>>
>> Yes, already fixed in the committed version in misc-next, thanks.

  Thanks.

- Anand


* Re: [PATCH] btrfs: fix max max_inline for pagesize=64K
  2021-08-27  6:24         ` Anand Jain
@ 2021-08-27  7:05           ` Qu Wenruo
  2021-08-27  7:31             ` Anand Jain
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2021-08-27  7:05 UTC (permalink / raw)
  To: Anand Jain, dsterba, linux-btrfs, Alexander Tsvetkov



On 2021/8/27 下午2:24, Anand Jain wrote:
>
>
> On 27/08/2021 08:38, Qu Wenruo wrote:
>>
>>
>> On 2021/8/27 上午1:53, David Sterba wrote:
>>> On Tue, Aug 24, 2021 at 01:54:21PM +0800, Anand Jain wrote:
>>>> On 16/08/2021 23:10, David Sterba wrote:
>>>>> On Tue, Aug 10, 2021 at 11:23:44PM +0800, Anand Jain wrote:
>>>>>> The mount option max_inline ranges from 0 to the sectorsize (which is
>>>>>> equal to pagesize). But we parse the mount options too early and
>>>>>> before
>>>>>> the sectorsize is a cache from the superblock. So the upper limit of
>>>>>> max_inline is unaware of the actual sectorsize. And is limited by the
>>>>>> temporary sectorsize 4096 (as below), even on a system where the
>>>>>> default
>>>>>> sectorsize is 64K.
>>>>>
>>>>
>>>>> So the question is what's a sensible value for >4K sectors, which
>>>>> is 64K
>>>>> in this case.
>>>>>
>>>>> Generally we allow setting values that may make sense only for some
>>>>> limited usecase and leave it up to the user to decide.
>>>>>
>>>>> The inline files reduce the slack space and on 64K sectors it could be
>>>>> more noticeable than on 4K sectors. It's a trade off as the inline
>>>>> data
>>>>> are stored in metadata blocks that are considered more precious.
>>>>>
>>>>> Do you have any analysis of file size distribution on 64K systems for
>>>>> some normal use case like root partition?
>>>>>
>>>>> I think this is worth fixing so to be in line with constraints we have
>>>>> for 4K sectors but some numbers would be good too.
>>>>
>>>> Default max_inline for sectorsize=64K is an interesting topic and
>>>> probably long. If time permits, I will look into it.
>>>> Furthermore, we need test cases and a repo that can hold it (and
>>>> also add  read_policy test cases there).
>>>>
>>>> IMO there is no need to hold this patch in search of optimum default
>>>> max_inline for 64K systems.
>>>
>>> Yeah, I'm more interested in some reasonable value, now the default is
>>> 2048 but probably it should be sectorsize/2 in general.
>>
>> Half of sectorsize is pretty solid to me.
>>
>> But I'm afraid this is a little too late, especially considering we're
>> moving to 4K sectorsize as default for all page sizes.
>
> I am writing a patch to autotune it to sectorsize/2 by default.

That would be pretty good.

>
> To test this, we need to have a filesystem with file sizes of various
> sizes (so that we have both inline and regular extents) and run rw. It
> looks like no regular workload (fio/sysbench) can do that and, I am
> stuck on that. Any inputs?

Is that a performance benchmark or just a functional test?

For the former, I guess you can specify the file sizes for
fio/sysbench.

For the latter, it's pretty simple: just write a bunch of files of
different sizes and use fiemap to check their block ranges.

Inline extents should report a block range the same as their file offset.

Thanks,
Qu

>
>>>>> Please fix/reformat/improve any comments that are in moved code.
>>>>
>>>>    I think you are pointing to s/f/F and 80 chars long? Will fix.
>>>
>>> Yes, already fixed in the committed version in misc-next, thanks.
>
>   Thanks.
>
> - Anand


* Re: [PATCH] btrfs: fix max max_inline for pagesize=64K
  2021-08-27  7:05           ` Qu Wenruo
@ 2021-08-27  7:31             ` Anand Jain
  0 siblings, 0 replies; 8+ messages in thread
From: Anand Jain @ 2021-08-27  7:31 UTC (permalink / raw)
  To: Qu Wenruo, dsterba, linux-btrfs, Alexander Tsvetkov



>>>> Yeah, I'm more interested in some reasonable value, now the default is
>>>> 2048 but probably it should be sectorsize/2 in general.
>>>
>>> Half of sectorsize is pretty solid to me.
>>>
>>> But I'm afraid this is a little too late, especially considering we're
>>> moving to 4K sectorsize as default for all page sizes.
>>
>> I am writing a patch to autotune it to sectorsize/2 by default.
> 
> That would be pretty good.
> 
>>
>> To test this, we need to have a filesystem with file sizes of various
>> sizes (so that we have both inline and regular extents) and run rw. It
>> looks like no regular workload (fio/sysbench) can do that and, I am
>> stuck on that. Any inputs?
> 
> Is that a performance benchmark or just function tests?
> 
> For the former one, I guess you can specific the file sizes for
> fio/sysbench.


  Right, a performance benchmark.
  I just found that in fio you can specify a range for the file size; I
  am planning to specify 1K to 10K, as that is the most common file size
  range in a Linux root fs.

  sysbench does not allow specifying a range of file sizes.
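A fio job along these lines could generate that mix (a sketch only; the option names follow fio's documented job-file syntax, but the mount point and size range are example values):

```ini
; Sketch of a fio job creating many small files with per-file sizes
; drawn at random from 1K-10K, so a run exercises both inline and
; regular extents depending on the max_inline setting.
[small-files]
; example btrfs mount point
directory=/mnt/btrfs
ioengine=psync
rw=randrw
bs=4k
nrfiles=1000
; per-file size range (fio picks sizes within this range)
filesize=1k-10k
```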


> For the latter part, it's pretty simple, just write a bunch of files
> with different sizes, and use fiemap to check their block range.
> 
> Inline extents should report block range the same as their file offset.

  I didn't plan to write a functional test case for this,
  but now I am reconsidering.

Thanks, Anand

> Thanks,
> Qu
> 
>>
>>>>>> Please fix/reformat/improve any comments that are in moved code.
>>>>>
>>>>>    I think you are pointing to s/f/F and 80 chars long? Will fix.
>>>>
>>>> Yes, already fixed in the committed version in misc-next, thanks.
>>
>>   Thanks.
>>
>> - Anand

