From: Bruno Damasceno Freire <bdamasceno@hotmail.com.br>
To: Thorsten Leemhuis <regressions@leemhuis.info>,
Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"regressions@lists.linux.dev" <regressions@lists.linux.dev>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [regression] 5.15 kernel triggering 100x more inode evictions
Date: Mon, 4 Apr 2022 05:29:18 +0000 [thread overview]
Message-ID: <MN2PR20MB251205164078C367C3FC4166D2E59@MN2PR20MB2512.namprd20.prod.outlook.com> (raw)
In-Reply-To: <35b62998-e386-2032-5a7a-07e3413b3bc1@leemhuis.info>
On 28.03.22 13:28, Thorsten Leemhuis wrote:
> Hi Btrfs developers, this is your Linux kernel regression tracker.
> Top-posting for once, to make this easily accessible to everyone.
>
> Are there any known regressions in 5.15 that cause more inode evictions?
> There is a user reporting such problems, see the msg quoted below, which
> is the last from this thread:
>
> https://lore.kernel.org/all/MN2PR20MB2512314446801B92562E26B5D2169@MN2PR20MB2512.namprd20.prod.outlook.com/
>
> @Bruno: sorry, your report is very hard to follow, at least for me. Maybe
> the btrfs developers have an idea what's wrong here, but if not you
> likely need to make this easier for all of us:
>
> Write a really simple testcase and run it on those vanilla kernels that
> matter, which are in this case: The latest 5.17.y and 5.15.y releases to
> check if it was already solved -- and if not, on 5.14 and 5.15. Maybe
> you need to bisect the issue to bring us closer to the root of the
> problem. But with a bit of luck the btrfs developers might have an idea
> already.
>
> Ciao, Thorsten
Hi Thorsten,
Thanks for the feedback.
I tried my best to make it simple this time.
I hope it is, at least, better than before.
Grazie, Bruno
*** Test case about high inode eviction on the 5.15 kernel ***
This regression was first observed during rpm operations: updates of specific packages became A LOT slower, taking from 4 to 30 minutes [1].
The symptoms are: high CPU usage, high inode eviction and much slower I/O performance.
Analyzing rpm's strace report and running some experiments, I was able to replace rpm with a script that does 3 things:
- rename a file, unlink the renamed file and create a new file.
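For illustration, the per-file pattern can be sketched as a minimal shell fragment (the paths and the -RPMDELETE suffix here mirror what rpm does according to the strace report; the temp directory is just an example):

```shell
#!/bin/sh
# Minimal sketch of the per-file syscall pattern described above:
# rename (renameat2), unlink, then create (openat with O_CREAT).
set -e
d=$(mktemp -d)
: > "$d/file_1"                       # initial create (openat)
mv "$d/file_1" "$d/file_1-RPMDELETE"  # rename (renameat2)
unlink "$d/file_1-RPMDELETE"          # unlink
: > "$d/file_1"                       # recreate (openat)
rm -r "$d"
```

Run in a loop over many files inside a btrfs directory with the compression property set, this pattern is what the full script below measures.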
This test case is designed to trigger the following regression on the 5.15 kernel:
* repeated renameat2, unlink and openat system calls hit files with the btrfs compression property enabled.
* the combination of these system calls and the btrfs compression property triggers the high inode eviction.
* the high inode eviction causes too much work for the btrfs directory logging.
* the overloaded btrfs directory logging causes the slower I/O performance.
A simplified script is supplied.
For a more capable script, more information and more test results please refer to my github account [2].
[1] https://bugzilla.opensuse.org/show_bug.cgi?id=1193549
[2] https://github.com/bdamascen0/s3e
Index:
1. Vanilla kernels
2. k5.14.21 x k5.15.32
2.1. Detailed test report for compressed files
2.2. Comparison results for compressed files
2.3. Detailed test report for uncompressed files
2.4. Comparison results for uncompressed files
2.5. Cpu usage
3. k5.17.1 and k5.16.15
3.1. Basic test report
4. Simplified test script
1. Vanilla kernels
This test case mainly covers the following vanilla kernels [3]: 5.15.32 and 5.14.21.
The 5.15.32 vanilla kernel produced abnormal results:
* high cpu usage, high inode eviction and much slower I/O performance for compressed files.
* double inode eviction and slightly worse I/O performance for uncompressed files.
The 5.14.21 vanilla kernel produced normal results and is used as a reference.
The 5.17.1 and 5.16.15 vanilla kernels [4] produced normal results which can be found at the end.
[3] https://wiki.ubuntu.com/Kernel/MainlineBuilds
[4] https://software.opensuse.org/package/kernel-vanilla
2. k5.14.21 x k5.15.32
2.1. Detailed test report for compressed files
ubuntu jammy jellyfish -- kernel 5.14.21 --- vanilla --- (kvm)
...updating 50 files on /mnt/inode-ev/zstd: Job took 226 ms @inode_evictions: 51
...updating 50 files on /mnt/inode-ev/lzo: Job took 222 ms @inode_evictions: 51
...updating 100 files on /mnt/inode-ev/zstd: Job took 384 ms @inode_evictions: 101
...updating 100 files on /mnt/inode-ev/lzo: Job took 462 ms @inode_evictions: 101
...updating 150 files on /mnt/inode-ev/zstd: Job took 493 ms @inode_evictions: 151
...updating 150 files on /mnt/inode-ev/lzo: Job took 554 ms @inode_evictions: 151
...updating 200 files on /mnt/inode-ev/zstd: Job took 804 ms @inode_evictions: 201
...updating 200 files on /mnt/inode-ev/lzo: Job took 725 ms @inode_evictions: 201
...updating 250 files on /mnt/inode-ev/zstd: Job took 745 ms @inode_evictions: 251
...updating 250 files on /mnt/inode-ev/lzo: Job took 758 ms @inode_evictions: 251
...updating 1000 files on /mnt/inode-ev/zstd: Job took 3452 ms @inode_evictions: 1001
...updating 1000 files on /mnt/inode-ev/lzo: Job took 2979 ms @inode_evictions: 1001
ubuntu jammy jellyfish -- kernel 5.15.32 --- vanilla --- (kvm)
...updating 50 files on /mnt/inode-ev/zstd: Job took 420 ms @inode_evictions: 1275
...updating 50 files on /mnt/inode-ev/lzo: Job took 488 ms @inode_evictions: 1275
...updating 100 files on /mnt/inode-ev/zstd: Job took 1649 ms @inode_evictions: 5050
...updating 100 files on /mnt/inode-ev/lzo: Job took 1566 ms @inode_evictions: 5050
...updating 150 files on /mnt/inode-ev/zstd: Job took 4448 ms @inode_evictions: 11325
...updating 150 files on /mnt/inode-ev/lzo: Job took 4136 ms @inode_evictions: 11325
...updating 200 files on /mnt/inode-ev/zstd: Job took 9177 ms @inode_evictions: 20100
...updating 200 files on /mnt/inode-ev/lzo: Job took 9070 ms @inode_evictions: 20100
...updating 250 files on /mnt/inode-ev/zstd: Job took 16191 ms @inode_evictions: 31375
...updating 250 files on /mnt/inode-ev/lzo: Job took 16062 ms @inode_evictions: 31375
...updating 1000 files on /mnt/inode-ev/zstd: Job took 132865 ms @inode_evictions: 104195
...updating 1000 files on /mnt/inode-ev/lzo: Job took 131979 ms @inode_evictions: 106639
2.2. Comparison results for compressed files
k5.15.32 vanilla (compared to: k5.14.21 vanilla)
50 files: approx. 1.8x the time and approx. 25x the inode evictions
100 files: approx. 3.3x the time and approx. 50x the inode evictions
150 files: approx. 7.4x the time and approx. 75x the inode evictions
200 files: approx. 11.4x the time and approx. 100x the inode evictions
250 files: approx. 21.1x the time and approx. 125x the inode evictions
1000 files: approx. 38.4x the time and approx. 100x the inode evictions
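One way to read these numbers (my own observation, not verified against the kernel code): on 5.15.32 the eviction counts for the compressed directories match the triangular numbers N*(N+1)/2 exactly for the 50-250 file runs, i.e. the growth is quadratic in the number of files updated:

```shell
#!/bin/sh
# Check that the 5.15.32 eviction counts reported in section 2.1 for
# compressed dirs follow N*(N+1)/2, suggesting quadratic growth.
for n in 50 100 150 200 250; do
  echo "$n files -> $(( n * (n + 1) / 2 )) evictions"
done
```

This prints 1275, 5050, 11325, 20100 and 31375 - exactly the measured values. The 1000-file runs fall below N*(N+1)/2 (which would be 500500), so the quadratic pattern may only hold while all inodes stay cached.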
2.3. Detailed test report for uncompressed files
ubuntu jammy jellyfish -- kernel 5.14.21 --- vanilla --- (kvm)
...updating 50 files on /mnt/inode-ev/uncompressed: Job took 214 ms @inode_evictions: 51
...updating 100 files on /mnt/inode-ev/uncompressed: Job took 402 ms @inode_evictions: 101
...updating 150 files on /mnt/inode-ev/uncompressed: Job took 543 ms @inode_evictions: 151
...updating 200 files on /mnt/inode-ev/uncompressed: Job took 694 ms @inode_evictions: 201
...updating 250 files on /mnt/inode-ev/uncompressed: Job took 835 ms @inode_evictions: 251
...updating 1000 files on /mnt/inode-ev/uncompressed: Job took 3162 ms @inode_evictions: 1001
ubuntu jammy jellyfish -- kernel 5.15.32 --- vanilla --- (kvm)
...updating 50 files on /mnt/inode-ev/uncompressed: Job took 269 ms @inode_evictions: 99
...updating 100 files on /mnt/inode-ev/uncompressed: Job took 359 ms @inode_evictions: 199
...updating 150 files on /mnt/inode-ev/uncompressed: Job took 675 ms @inode_evictions: 299
...updating 200 files on /mnt/inode-ev/uncompressed: Job took 752 ms @inode_evictions: 399
...updating 250 files on /mnt/inode-ev/uncompressed: Job took 1149 ms @inode_evictions: 499
...updating 1000 files on /mnt/inode-ev/uncompressed: Job took 7333 ms @inode_evictions: 1999
2.4. Comparison results for uncompressed files
k5.15.32 vanilla (compared to: k5.14.21 vanilla)
50 files: approx. 1.2x the time and approx. 2x the inode evictions
100 files: approx. 0.8x the time and approx. 2x the inode evictions
150 files: approx. 1.2x the time and approx. 2x the inode evictions
200 files: approx. 1.0x the time and approx. 2x the inode evictions
250 files: approx. 1.3x the time and approx. 2x the inode evictions
1000 files: approx. 2.3x the time and approx. 2x the inode evictions
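Similarly (again my own observation): the 5.15.32 eviction counts for the uncompressed directory in section 2.3 match 2*N - 1 for every run, so the growth stays linear, merely doubled, instead of quadratic:

```shell
#!/bin/sh
# Check that the 5.15.32 eviction counts reported in section 2.3 for
# the uncompressed dir follow 2*N - 1 (linear growth, doubled).
for n in 50 100 150 200 250 1000; do
  echo "$n files -> $(( 2 * n - 1 )) evictions"
done
```

This prints 99, 199, 299, 399, 499 and 1999 - exactly the measured values, which is why the uncompressed case only shows "double inode eviction" rather than the compressed case's blow-up.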
2.5. CPU usage
ubuntu jammy jellyfish -- kernel 5.15.32 --- vanilla --- (kvm)
...updating 1000 files on /mnt/inode-ev/zstd: Job took 132691 ms - real 2m12,731s sys 2m 7,134s
...updating 1000 files on /mnt/inode-ev/lzo: Job took 134130 ms - real 2m14,149s sys 2m 8,447s
...updating 1000 files on /mnt/inode-ev/uncompressed: Job took 7241 ms - real 0m 7,256s sys 0m 4,732s
3. k5.17.1 and k5.16.15
Just for the record, the 5.16 kernel never reproduced the regression.
The real-life workload that triggers the regression, rpm package updates, was verified to work fine since 5.16-rc6 [1].
As expected, the synthetic workload from the script also produced normal results on the 5.16 and 5.17 kernels.
[1] https://bugzilla.opensuse.org/show_bug.cgi?id=1193549
3.1. Basic test report
opensuse tumbleweed ----- kernel 5.16.15 --- vanilla --- (kvm)
...updating 250 files on /mnt/inode-ev/zstd: Job took 910 ms @inode_evictions: 250
...updating 250 files on /mnt/inode-ev/lzo: Job took 740 ms @inode_evictions: 250
...updating 250 files on /mnt/inode-ev/uncompressed: Job took 717 ms @inode_evictions: 250
opensuse tumbleweed ----- kernel 5.17.1 ---- vanilla --- (kvm)
...updating 250 files on /mnt/inode-ev/zstd: Job took 701 ms @inode_evictions: 250
...updating 250 files on /mnt/inode-ev/lzo: Job took 695 ms @inode_evictions: 250
...updating 250 files on /mnt/inode-ev/uncompressed: Job took 954 ms @inode_evictions: 250
4. Simplified test script
This simplified script sets up, formats and mounts a ramdisk block device.
It creates 3 test directories (zstd, lzo, uncompressed) and sets their btrfs compression properties.
Each time the script is executed, 3 tests are run, and bpftrace is started right before each test.
#!/bin/bash
# s3e_t3.sh (based on s3e.sh version 4.7)
# Simple Syscall Signature Emulator (test3)
# test3: populate + test. Tests renameat2/openat + unlink syscalls w/ empty files (3x)
# Copyright (c) 2022 Bruno Damasceno <bdamasceno@hotmail.com.br>
# Warning: no safety checks
dir1=zstd
dir2=lzo
dir3=uncompressed
NUM_FILES=250
DEV=/dev/ram0
MNT=/mnt/inode-ev
DIR_1="$MNT/$dir1"
DIR_2="$MNT/$dir2"
DIR_3="$MNT/$dir3"

populate() {
    local DIR=$1
    echo "...populating 1st generation of files on $DIR:"
    for ((i = 1; i <= NUM_FILES; i++)); do
        echo -n > "$DIR/file_$i"
    done
}

run_test() {
    local DIR=$1
    sync
    xfs_io -c "fsync" "$DIR"
    echo -e "\n...updating $NUM_FILES files on $DIR:"
    # dumb pause so bpftrace has time to attach its probe
    sleep 3s
    start=$(date +%s%N)
    for ((i = 1; i <= NUM_FILES; i++)); do
        mv "$DIR/file_$i" "$DIR/file_$i-RPMDELETE"
        unlink "$DIR/file_$i-RPMDELETE"
        echo -n > "$DIR/file_$i"
        echo -n "_$i"
        [ "$i" != "$NUM_FILES" ] && echo -ne "\r"
    done
    end=$(date +%s%N)
    dur=$(( (end - start) / 1000000 ))
    echo -ne "\r"
    echo "Job took $dur milliseconds"
}

modprobe brd rd_size=128000 max_part=1 rd_nr=1
mkfs.btrfs --label inode-ev --force "$DEV" > /dev/null
mkdir -p "$MNT"
mount "$DEV" "$MNT"
mkdir "$MNT/$dir1" "$MNT/$dir2" "$MNT/$dir3"
btrfs property set "$DIR_1" compression zstd:1
btrfs property set "$DIR_2" compression lzo
btrfs property set "$DIR_3" compression none

for dir in "$DIR_1" "$DIR_2" "$DIR_3"; do
    populate "$dir"
    bpftrace -e 'kprobe:btrfs_evict_inode { @inode_evictions = count(); }' & run_test "$dir"
    pkill bpftrace
    # dumb pause to wait for the bpftrace report
    sleep 2s
done

umount "$MNT"
rm --dir "$MNT"
rmmod brd