* max_inline: alternative values?
@ 2021-08-09 11:15 Serhat Sevki Dincer
  2021-08-09 12:00 ` Qu Wenruo
  0 siblings, 1 reply; 4+ messages in thread
From: Serhat Sevki Dincer @ 2021-08-09 11:15 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I was reading about the btrfs mount options, and the default
max_inline=2048 caught my attention.
I could not find any benchmarks on the internet comparing different
values for this parameter.
The most detailed information I could find is the commit message below,
from May 2016, when 2048 was made the default.

So on a new-ish 64-bit system (amd64 or arm64) with an "SSD" (memory/file
blocks are 4K,
metadata profile "single" by default), how would max_inline=2048
compare to, say, 3072?
Do you know of, or have, any benchmarks comparing different values on a
typical Linux installation in terms of:
- performance
- total disk usage
What would be the "optimal" value for SSD on a typical desktop? server?
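
To make the comparison concrete, here is roughly what I have in mind
(just a sketch, not a real benchmark; the device and the sample file
set are placeholders):
------
dev=/dev/sdX          # throwaway test device (placeholder)
mnt=/mnt/test
src=$HOME/sample      # representative set of small files (placeholder)

for inline in 2048 3072; do
        mkfs.btrfs -f $dev > /dev/null
        mount $dev $mnt -o ssd,max_inline=$inline
        echo "max_inline=$inline"
        time cp -a $src/. $mnt/
        sync
        btrfs filesystem df $mnt    # record data vs metadata usage
        umount $mnt
done
------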

Thanks a lot.

Note:
From: David Sterba <dsterba@suse.com>

commit f7e98a7fff8634ae655c666dc2c9fc55a48d0a73 upstream.

The current practical default is ~4k on x86_64 (the logic is more complex,
simplified for brevity); the inlined files land in the metadata group and
thus consume space that could be needed for the real metadata.

The inlining brings some usability surprises:

1) the difference in total space consumption, measured on various
   filesystems versus btrfs with DUP metadata, was quite visible because
   of the duplicated data within the metadata

2) inlined data may exhaust the metadata space, which is more precious
   when the entire device space is already allocated to chunks (i.e.
   balance cannot make the space more compact)

3) performance suffers a bit as the inlined blocks are duplicated and
   stored far away on the device.

Proposed fix: set the default to 2048

This fixes namely 1): the total filesystem space consumption will be on
par with other filesystems.

Partially fixes 2): more data is pushed to the data block groups.

The characteristics of 3) are based on actual small file size
distribution.

The change is independent of the metadata blockgroup type (though it's
most visible with DUP) or system page size as these parameters are not
trivial to find out, compared to file size.


* Re: max_inline: alternative values?
  2021-08-09 11:15 max_inline: alternative values? Serhat Sevki Dincer
@ 2021-08-09 12:00 ` Qu Wenruo
  2021-08-09 12:36   ` Serhat Sevki Dincer
  0 siblings, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2021-08-09 12:00 UTC (permalink / raw)
  To: Serhat Sevki Dincer, linux-btrfs



On 2021/8/9 7:15 PM, Serhat Sevki Dincer wrote:
> Hi,
>
> I was reading btrfs mount options and max_inline=2048 (by default)
> caught my attention.
> I could not find any benchmarks on the internet comparing different
> values for this parameter.
> The most detailed info I could find is below from May 2016, when 2048
> was set as default.
>
> So on a new-ish 64-bit system (amd64 or arm64) with "SSD" (memory/file
> blocks are 4K,

For 64-bit arm64, there are three different default page sizes (4K, 16K
and 64K).
Thus it's a completely different beast, as btrfs currently doesn't
support a sectorsize other than the page size.

But we're already working on supporting a 4K sectorsize with a 64K page
size; the initial support will arrive in v5.15 upstream.

Anyway, for now we will only discuss a 4K sectorsize on supported systems
(amd64 or 4K-page aarch64), with the default 16K nodesize.
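
For reference, a quick way to check what you actually have and to create
such a filesystem explicitly (a sketch; /dev/sdX is a placeholder):
------
getconf PAGESIZE                      # 4096 on amd64 / 4K-page aarch64
mkfs.btrfs -f -s 4k -n 16k /dev/sdX   # explicit 4K sectorsize, 16K nodesize
------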


> metadata profile "single" by default), how would max_inline=2048
> compare to say 3072 ?
> Do you know/have any benchmarks comparing different values on a
> typical linux installation in terms of:
> - performance
> - total disk usage

Personally speaking, I'm also very interested in such a benchmark, as
subpage support is coming soon and, apart from RAID56, only inline
extent creation is disabled for subpage.

Thus knowing the performance impact is really important.

But there are more variables involved in such a "benchmark": not only
the inline file limit, but also things like the average file size
involved in the "typical" setup.

If we can define the "typical" setup, I guess it would be much easier to
run a benchmark.
Depending on the "typical" average file size and on how concurrent the
operations are, the result can change.
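
Just to make the concurrency point concrete, a parallel variant of a
small-file fill could look like this (a sketch; file count, file size
and worker count are arbitrary):
------
# Write 5120 3K files with 8 parallel workers instead of a single loop.
seq 0 5119 | xargs -P 8 -I{} xfs_io -f -c "pwrite 0 3K" /mnt/btrfs/file_{} > /dev/null
sync
------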


From what I know, the inline extent size affects the following things:

- Metadata size
   Obviously, but since you're mentioning the SSD default, it's less of
   a concern, as metadata is also SINGLE in that case.

   Much larger metadata will make the already slow btrfs metadata
   operations even slower.

   On the other hand, it will make such inlined data more compact,
   as we no longer need to pad the data to the sectorsize.

   So I'm not sure about the final result.

- Data writeback
   With inline extents, we don't need to submit data writes; we inline
   the data directly into the metadata.

   This means we skip things like data csum calculation, but we also
   need to do more metadata csum calculation.

   Again, no obvious conclusion. (A quick way to check both — whether a
   file gets inlined and what it costs in metadata — is sketched below.)
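
For a single small file on a test mount, something like this should show
whether it was inlined and how the space is accounted (a sketch; paths
are placeholders, and the "inline" flag name in filefrag output is from
memory):
------
xfs_io -f -c "pwrite 0 3K" /mnt/btrfs/smallfile
sync
filefrag -v /mnt/btrfs/smallfile   # an inlined file shows a single extent flagged "inline"
btrfs filesystem df /mnt/btrfs     # compare the Metadata vs Data usage
------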


> What would be the "optimal" value for SSD on a typical desktop? server?

I bet it's not a big deal, but I would be very happy to be proven wrong.

BTW, I just did a super stupid test:
------
fill_dir()
{
        local dir=$1

        # Write 5120 small (3K) files: 3072 bytes is over the 2048 inline
        # limit, but should fit when max_inline=3072.
        for (( i = 0; i < 5120 ; i++)); do
                xfs_io -f -c "pwrite 0 3K" $dir/file_$i > /dev/null
        done
        sync
}

dev="/dev/test/test"
mnt="/mnt/btrfs"

umount $dev &> /dev/null
umount $mnt &> /dev/null

mkfs.btrfs -f -s 4k -m single $dev
mount $dev $mnt -o ssd,max_inline=2048
echo "ssd,max_inline=2048"
time fill_dir $mnt
umount $mnt

mkfs.btrfs -f -s 4k -m single $dev
mount $dev $mnt -o ssd,max_inline=3072
echo "ssd,max_inline=3072"
time fill_dir $mnt
umount $mnt
------

The results are:

ssd,max_inline=2048
real    0m20.403s
user    0m4.076s
sys     0m16.607s

ssd,max_inline=3072
real    0m20.096s
user    0m4.195s
sys     0m16.213s


Apart from the generally slow nature of btrfs metadata operations, it
doesn't show much difference, at least for writeback performance.
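
The disk-usage side of your question isn't covered by the timing above;
capturing space numbers before each unmount should show it (a sketch,
to be added right after each "time fill_dir $mnt" line in the script
above):
------
btrfs filesystem df $mnt                                  # data vs metadata totals
btrfs filesystem usage $mnt | grep -E 'Data,|Metadata,'
------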

Thanks,
Qu


* Re: max_inline: alternative values?
  2021-08-09 12:00 ` Qu Wenruo
@ 2021-08-09 12:36   ` Serhat Sevki Dincer
  2021-08-09 14:13     ` Qu Wenruo
  0 siblings, 1 reply; 4+ messages in thread
From: Serhat Sevki Dincer @ 2021-08-09 12:36 UTC (permalink / raw)
  To: linux-btrfs

Also, in the case of the DUP metadata profile, how about duplicating
"only" the metadata in the two copies? The total inline data space of
2 * 2048/3072 = 4096/6144 bytes
could then carry unduplicated data.
I think that would require metadata and inline data to have separate
crc32c sums.
Is that feasible?


* Re: max_inline: alternative values?
  2021-08-09 12:36   ` Serhat Sevki Dincer
@ 2021-08-09 14:13     ` Qu Wenruo
  0 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2021-08-09 14:13 UTC (permalink / raw)
  To: Serhat Sevki Dincer, linux-btrfs



On 2021/8/9 8:36 PM, Serhat Sevki Dincer wrote:
> Also, in the case of DUP metadata profile, how about duplicating
> "only" the metadata in the two blocks? The total inline data space of
> 2 * 2048/3072 = 4096/6144 bytes

That's not that simple.

For DUP, all metadata is doubled, including the metadata
header/padding.

Thus although the nominal space usage is simply 2x, the real space
usage is a little more complex.

> can carry unduplicated data.
> That requires I think metadata and inline data have separate crc32c sums.
> Is that feasible?

Inline data is considered part of the metadata, so it's already
protected by the metadata csum.
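
If you want to see it, the inline data appears as an EXTENT_DATA item of
type inline inside the fs tree, i.e. inside metadata blocks covered by
the metadata csum. Something like this shows it (a sketch, on an
unmounted test device; the exact output format may differ between
btrfs-progs versions):
------
btrfs inspect-internal dump-tree -t fs /dev/test/test | grep -A 2 'EXTENT_DATA'
# look for "inline extent data size NNNN" under the item
------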

Thanks,
Qu


end of thread, other threads:[~2021-08-09 14:13 UTC | newest]

Thread overview: 4+ messages
2021-08-09 11:15 max_inline: alternative values? Serhat Sevki Dincer
2021-08-09 12:00 ` Qu Wenruo
2021-08-09 12:36   ` Serhat Sevki Dincer
2021-08-09 14:13     ` Qu Wenruo
