On Fri, Dec 13, 2019 at 05:32:14PM +0800, Qu Wenruo wrote:
>
>
> On 2019/12/13 12:15 AM, halfdog wrote:
> > Hello list,
> >
> > Using btrfs on
> >
> > Linux version 5.3.0-2-amd64 (debian-kernel@lists.debian.org) (gcc version 9.2.1 20191109 (Debian 9.2.1-19)) #1 SMP Debian 5.3.9-3 (2019-11-19)
> >
> > the FIDEDUPERANGE exposes weird behaviour on two identical but
> > not too large files that seems to be depending on the file size.
> > Before FIDEDUPERANGE both files have a single extent, afterwards
> > first file is still single extent, second file has all bytes sharing
> > with the extent of the first file except for the last 4096 bytes.
> >
> > Is there anything known about a bug fixed since the above mentioned
> > kernel version?
> >
> >
> >
> > If no, does following reproducer still show the same behaviour
> > on current Linux kernel (my Python test tools also attached)?
> >
> >> dd if=/dev/zero bs=1M count=32 of=disk
> >> mkfs.btrfs --mixed --metadata single --data single --nodesize 4096 disk
> >> mount disk /mnt/test
> >> mkdir /mnt/test/x
> >> dd bs=1 count=155489 if=/dev/urandom of=/mnt/test/x/file-0
>
> 155489 is not sector size aligned, thus the last extent will be padded
> with zero.
>
> >> cat /mnt/test/x/file-0 > /mnt/test/x/file-1
>
> Same for the new file.
>
> For the tailing padding part, it's not aligned, and it's smaller than
> the inode size.
>
> Thus we won't dedupe that tailing part.

We definitely *must* dedupe the trailing part on btrfs; otherwise, we can't
eliminate the reference to the last (partial) block in the last extent of
the file, and there is no way to dedupe the _entire_ file in this example.

It does pretty bad things to dedupe hit rates on uncompressed contiguous
files, where you can lose an average of 64MB of space per file.

I had been wondering why dedupe scores seemed so low on recent kernels,
and this bug would certainly contribute to that.

It worked in 4.20 and is broken in 5.0.
My guess is commit 34a28e3d77535efc7761aa8d67275c07d1fe2c58 ("Btrfs: use
generic_remap_file_range_prep() for cloning and deduplication"), but I
haven't run a test to confirm.

> Thanks,
> Qu
>
> >
> >> ./SimpleIndexer x > x.json
> >> ./IndexDeduplicationAnalyzer --IndexFile /mnt/test/x.json /mnt/test/x > dedup.list
> > Got dict: {b'/mnt/test/x/file-0': [(0, 5316608, 155648)], b'/mnt/test/x/file-1': [(0, 5472256, 155648)]}
> > ...
> >> strace -s256 -f btrfs-extent-same 155489 /mnt/test/x/file-0 0 /mnt/test/x/file-1 0 2>&1 | grep -E -e FIDEDUPERANGE
> > ioctl(3, BTRFS_IOC_FILE_EXTENT_SAME or FIDEDUPERANGE, {src_offset=0, src_length=155489, dest_count=1, info=[{dest_fd=4, dest_offset=0}]} => {info=[{bytes_deduped=155489, status=0}]}) = 0
> >> ./IndexDeduplicationAnalyzer --IndexFile /mnt/test/x.json /mnt/test/x > dedup.list
> > Got dict: {b'/mnt/test/x/file-0': [(0, 5316608, 155648)], b'/mnt/test/x/file-1': [(0, 5316608, 151552), (151552, 5623808, 4096)]}
> > ...
> >> strace -s256 -f btrfs-extent-same 155489 /mnt/test/x/file-0 0 /mnt/test/x/file-1 0 2>&1 | grep -E -e FIDEDUPERANGE
> > ioctl(3, BTRFS_IOC_FILE_EXTENT_SAME or FIDEDUPERANGE, {src_offset=0, src_length=155489, dest_count=1, info=[{dest_fd=4, dest_offset=0}]} => {info=[{bytes_deduped=155489, status=0}]}) = 0
> >> strace -s256 -f btrfs-extent-same 4096 /mnt/test/x/file-0 151552 /mnt/test/x/file-1 151552 2>&1 | grep -E -e FIDEDUPERANGE
> > ioctl(3, BTRFS_IOC_FILE_EXTENT_SAME or FIDEDUPERANGE, {src_offset=151552, src_length=4096, dest_count=1, info=[{dest_fd=4, dest_offset=151552}]}) = -1 EINVAL (Invalid argument)
> >
>
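
The numbers in the extent maps quoted above follow from splitting the file
size at the last sector boundary. Below is a minimal Python sketch of that
arithmetic, assuming a 4096-byte sector size (an assumption taken from the
4096-byte tail extent in the output, not read from the filesystem). The
helper name split_at_last_sector is hypothetical and exists only for this
illustration.

```python
SECTOR_SIZE = 4096  # assumption: inferred from the 4096-byte tail extent above


def split_at_last_sector(file_size, sector_size=SECTOR_SIZE):
    """Return (aligned_length, tail_length, padded_length) for a file size.

    aligned_length: bytes covered by whole sectors
    tail_length:    bytes in the final, partial sector (0 if aligned)
    padded_length:  on-disk length once the partial sector is zero-padded
    """
    aligned_length = (file_size // sector_size) * sector_size
    tail_length = file_size - aligned_length
    padded_length = aligned_length + (sector_size if tail_length else 0)
    return aligned_length, tail_length, padded_length


if __name__ == "__main__":
    # File size used by the reproducer in this thread.
    aligned, tail, padded = split_at_last_sector(155489)
    print(aligned, tail, padded)  # prints: 151552 3937 155648
```

So the deduped part of file-1 ends at 151552, the unshared extent covers
the remaining 3937 data bytes plus padding, and 155648 is the padded extent
length reported before dedupe. The last quoted call asks for a full 4096
bytes at offset 151552, which runs 159 bytes past the 155489-byte end of
file; whether that alone explains the EINVAL is a guess, not something
tested here.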