From: Serhat Sevki Dincer <jfcgauss@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: max_inline: alternative values?
Date: Mon, 9 Aug 2021 15:36:28 +0300	[thread overview]
Message-ID: <CAPqC6xQZvcg1XNeRGYW+1UAXrVXbWhxp4Hqq2nMMJXTvKYnT+g@mail.gmail.com> (raw)
In-Reply-To: <9073e835-41c2-bdab-8e05-dfc759c0e22f@gmx.com>

Also, in the case of the DUP metadata profile, how about duplicating
"only" the metadata in the two blocks? The total inline data space of
2 * 2048/3072 = 4096/6144 bytes
could then carry unduplicated data.
That would require, I think, that metadata and inline data have separate crc32c sums.
Is that feasible?

On Mon, Aug 9, 2021 at 3:00 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2021/8/9 下午7:15, Serhat Sevki Dincer wrote:
> > Hi,
> >
> > I was reading btrfs mount options and max_inline=2048 (by default)
> > caught my attention.
> > I could not find any benchmarks on the internet comparing different
> > values for this parameter.
> > The most detailed info I could find is below from May 2016, when 2048
> > was set as default.
> >
> > So on a new-ish 64-bit system (amd64 or arm64) with "SSD" (memory/file
> > blocks are 4K,
>
> For arm64, there are 3 different possible page sizes (4K, 16K and 64K).
> Thus it's a completely different beast, as btrfs currently doesn't support
> a sectorsize other than the page size.
>
> But we're already working on supporting 4K sectorsize with a 64K page size;
> the initial support will arrive in v5.15 upstream.
>
> Anyway, for now we will only discuss 4K sectorsize for supported systems
> (amd64 or 4K-page aarch64), with the default 16K nodesize.
>
>
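For reference, a minimal way to check the page size and to confirm the
sectorsize/nodesize combination being discussed might look like this
(/dev/sdX is just a placeholder for a scratch device):
------
getconf PAGESIZE                      # page size of the running kernel, e.g. 4096
mkfs.btrfs -f -s 4k -n 16k /dev/sdX   # 4K sectorsize, 16K nodesize (the defaults discussed here)
btrfs inspect-internal dump-super /dev/sdX | grep -E 'sectorsize|nodesize'
------
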
> > metadata profile "single" by default), how would max_inline=2048
> > compare to, say, 3072?
> > Do you know/have any benchmarks comparing different values on a
> > typical linux installation in terms of:
> > - performance
> > - total disk usage
>
> Personally speaking, I'm also very interested in such a benchmark, as the
> subpage support is coming soon; apart from RAID56, inline extent creation is
> the only thing still disabled for subpage.
>
> Thus knowing the performance impact is really important.
>
> But there are more variables involved in such a "benchmark":
> not only the inline file limit, but also things like the average file
> size involved in the "typical" setup.
>
> If we can define the "typical" setup, I guess it would be much easier to do
> the benchmark.
> Depending on the "typical" average file size and how concurrent the
> operations are, the result can change.
>
>
> From what I know, inline extent size affects the following things:
>
> - Metadata size
>    Obviously, but since you're mentioning the SSD default, it's less of a
>    concern, as metadata is also SINGLE in that case.
>
>    Much larger metadata will make the already slow btrfs metadata
>    operations even slower.
>
>    On the other hand, it will make such inlined data more compact,
>    as we no longer need to pad the data to the sectorsize.
>
>    So I'm not sure about the final result.
>
> - Data writeback
>    With inline extents, we don't need to submit separate data writes; we
>    inline them directly into the metadata.
>
>    This means we don't need to do things like data csum calculation, but
>    we do need to do more metadata csum calculation.
>
>    Again, no obvious conclusion.
>
>
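As a quick sanity check for this, I think one can verify whether a small file
actually ended up inline with filefrag, which reports an "inline" flag for such
extents (the paths below are just examples, on a filesystem mounted with the
max_inline value under test):
------
xfs_io -f -c "pwrite 0 1K" /mnt/btrfs/small_file
sync
filefrag -v /mnt/btrfs/small_file   # an inlined file shows "inline" among the extent flags
------
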
> > What would be the "optimal" value for SSD on a typical desktop? server?
>
> I bet it's not a big deal, but would be very happy to be proven wrong.
>
> BTW, I just did a super stupid test:
> ------
> fill_dir()
> {
>          local dir=$1
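>          # 3K files fit within max_inline=3072 but exceed max_inline=2048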
>          for (( i = 0; i < 5120 ; i++)); do
>                  xfs_io -f -c "pwrite 0 3K" $dir/file_$i > /dev/null
>          done
>          sync
> }
>
> dev="/dev/test/test"
> mnt="/mnt/btrfs"
>
> umount $dev &> /dev/null
> umount $mnt &> /dev/null
>
> mkfs.btrfs -f -s 4k -m single $dev
> mount $dev $mnt -o ssd,max_inline=2048
> echo "ssd,max_inline=2048"
> time fill_dir $mnt
> umount $mnt
>
> mkfs.btrfs -f -s 4k -m single $dev
> mount $dev $mnt -o ssd,max_inline=3072
> echo "ssd,max_inline=3072"
> time fill_dir $mnt
> umount $mnt
> ------
>
> The results are:
>
> ssd,max_inline=2048
> real    0m20.403s
> user    0m4.076s
> sys     0m16.607s
>
> ssd,max_inline=3072
> real    0m20.096s
> user    0m4.195s
> sys     0m16.213s
>
>
> Apart from the slow nature of btrfs metadata operations, it doesn't show
> much difference, at least for writeback performance.
>
> Thanks,
> Qu
>
> >
> > Thanks a lot..
> >
> > Note:
> > From: David Sterba <dsterba@suse.com>
> >
> > commit f7e98a7fff8634ae655c666dc2c9fc55a48d0a73 upstream.
> >
> > The current practical default is ~4k on x86_64 (the logic is more complex,
> > simplified for brevity); the inlined files land in the metadata group and
> > thus consume space that could be needed for the real metadata.
> >
> > The inlining brings some usability surprises:
> >
> > 1) total space consumption, measured on various filesystems vs. btrfs
> >     with DUP metadata, showed a quite visible difference because of the
> >     duplicated data within metadata
> >
> > 2) inlined data may exhaust the metadata, which is more precious in case
> >     the entire device space is allocated to chunks (i.e. balance cannot
> >     make the space more compact)
> >
> > 3) performance suffers a bit as the inlined blocks are duplicated and
> >     stored far away on the device.
> >
> > Proposed fix: set the default to 2048
> >
> > This fixes namely 1); the total filesystem space consumption will be on
> > par with other filesystems.
> >
> > Partially fixes 2), more data are pushed to the data block groups.
> >
> > The characteristics of 3) are based on actual small file size
> > distribution.
> >
> > The change is independent of the metadata blockgroup type (though it's
> > most visible with DUP) or the system page size, as these parameters are not
> > trivial to find out, compared to the file size.
> >
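Regarding point 1) above, I suppose the data vs. metadata consumption for a
given max_inline value can be compared after a test run with something like
the following (the mount point is just an example):
------
btrfs filesystem df /mnt/btrfs      # used/total per block group type (Data, Metadata, System)
btrfs filesystem usage /mnt/btrfs   # more detailed breakdown, including unallocated space
------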

