From: Thorsten Leemhuis <firstname.lastname@example.org>
To: Josef Bacik <email@example.com>, firstname.lastname@example.org, "email@example.com" <firstname.lastname@example.org>
Cc: email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com
Subject: Re: [REGRESSION] 5-10% increase in IO latencies with nohz balance patch
Date: Tue, 30 Nov 2021 08:16:40 +0100
Message-ID: <firstname.lastname@example.org>
In-Reply-To: <YaUH5GFFoLiS4email@example.com>

Hi, this is your Linux kernel regression tracker speaking.

Top-posting for once, to make this easily accessible to everyone.

Adding the regression mailing list to the list of recipients, as it
should be in the loop for all regressions, as explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

To be sure this issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced 7fd7a9e0caba
#regzbot ignore-activity

Reminder: when fixing the issue, please add a 'Link:' tag with the URL
to the report (the parent of this mail), then regzbot will automatically
mark the regression as resolved once the fix lands in the appropriate
tree. For more details about regzbot see footer.

Sending this to everyone that got the initial report, to make all aware
of the tracking. I also hope that messages like this motivate people to
directly get at least the regression mailing list and ideally even
regzbot involved when dealing with regressions, as messages like this
wouldn't be needed then.

Don't worry, I'll send further messages wrt this regression just to
the lists (with a tag in the subject so people can filter them away), as
long as they are intended just for regzbot. With a bit of luck no such
messages will be needed anyway.

Ciao, Thorsten, your Linux kernel regression tracker.

---
Additional information about regzbot:

If you want to know more about regzbot, check out its web-interface, the
getting started guide, and/or the reference documentation:

https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md

The last two documents explain how you can interact with regzbot
yourself if you want to.

Hint for reporters: when reporting a regression it's in your interest to
tell #regzbot about it in the report, as that will ensure the regression
gets on the radar of regzbot and the regression tracker. That's in your
interest, as they will make sure the report won't fall through the
cracks unnoticed.

Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include a 'Link:' tag to the report in the commit message, as explained
in Documentation/process/submitting-patches.rst
That aspect was recently made more explicit in commit 1f57bd42b77c:
https://git.kernel.org/linus/1f57bd42b77c

On 29.11.21 18:03, Josef Bacik wrote:
>
> Our nightly performance testing found a performance regression when we rebased
> our devel tree onto v5.16-rc. This took me a few days to bisect down, but this
> patch
>
> 7fd7a9e0caba ("sched/fair: Trigger nohz.next_balance updates when a CPU goes NOHZ-idle")
>
> is the one that introduces the regression. My performance testing box is a 2
> socket, with a model name "Intel(R) Xeon(R) Bronze 3204 CPU @ 1.90GHz", for a
> total of 12 CPUs reported in cpuinfo. It has 128 GiB of RAM, and these perf
> tests are being run against an SSD and a spinning rust device, but the
> regression is consistent across both configurations.
> You can see the historical graph of the completion latencies for this
> specific run
>
> http://toxicpanda.com/performance/emptyfiles500k_write_clat_ns_p99.png
>
> Or for something a little more braindead (untar firefox) you can see an increase
> in the runtime
>
> http://toxicpanda.com/performance/untarfirefox_elapsed.png
>
> These two tests are single threaded; the regression doesn't appear to affect
> multi-threaded tests. For a simple reproducer you can simply download a tarball
> of the firefox sources and untar it onto a clean btrfs file system. The time
> before and after this commit goes up ~1-2 seconds on my machine. For a less
> simple test you can create a clean btrfs file system and run
>
> fio --name emptyfiles500k --create_on_open=1 --nrfiles=31250 --readwrite=write \
>     --ioengine=filecreate --fallocate=none --filesize=4k \
>     --openfiles=1 --alloc-size 98304 --allrandrepeat=1 --randseed=12345 \
>     --directory <mount point>
>
> And you are looking for the "Write clat ns p99" metric. You'll see a 5-10%
> increase in the latency time. If you want to run our tests directly it's
> relatively easy to set up, you can clone the fsperf repo
>
> https://github.com/josefbacik/fsperf
>
> Then in the fsperf directory edit the local.cfg and add
>
> [main]
> directory=/mnt/test
>
> [btrfs]
> device=/dev/sdc
> iosched=none
> mkfs=mkfs.btrfs -f
> mount=mount -o noatime
>
> And then run the following on the baseline kernel
>
> ./fsperf -p regression -c btrfs -n 10 emptyfiles500k
>
> This will run the test 10 times and save the results to the database. Then you
> can boot into your changed kernel and run
>
> ./fsperf -p regression -c btrfs -n 10 -t emptyfiles500k
>
> This will run the test 10 times and take the average and compare it to the
> baseline and print out the values; you'll see the increased latency values there.
>
> I can reproduce this at will, if you want to just throw patches at me I'm happy
> to run it and let you know what happens.
> I'm attaching my .config as well in case that is needed, but the HZ and
> PREEMPT settings are
>
> CONFIG_NO_HZ_COMMON=y
> CONFIG_NO_HZ_FULL=y
> CONFIG_NO_HZ=y
> CONFIG_HZ_1000=y
> CONFIG_PREEMPT=y
> CONFIG_PREEMPT_COUNT=y
> CONFIG_PREEMPTION=y
> CONFIG_PREEMPT_DYNAMIC=y
> CONFIG_PREEMPT_RCU=y
> CONFIG_HAVE_PREEMPT_DYNAMIC=y
> CONFIG_PREEMPT_NOTIFIERS=y
> CONFIG_DEBUG_PREEMPT=y
>
> Thanks,
>
> Josef
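
The comparison described in the quoted report (run the test several times on each kernel, average the "Write clat ns p99" metric, compare against the baseline) can be sketched in a few lines of Python. This is a minimal sketch and not fsperf's actual code. It assumes fio was run with `--output-format=json` and that the p99 value sits under `jobs[0]["write"]["clat_ns"]["percentile"]["99.000000"]` in that output; the function names `write_clat_p99` and `percent_change` are hypothetical.

```python
from statistics import mean


def write_clat_p99(fio_result):
    """Return the write clat p99 in ns for the first job of a parsed fio JSON result.

    Assumes the layout produced by `fio --output-format=json`.
    """
    job = fio_result["jobs"][0]
    return float(job["write"]["clat_ns"]["percentile"]["99.000000"])


def percent_change(baseline, patched):
    """Percent change of the mean of `patched` relative to the mean of `baseline`.

    Both arguments are lists of per-run values, e.g. one p99 latency per run.
    A positive result means the patched kernel is slower.
    """
    base = mean(baseline)
    return (mean(patched) - base) / base * 100.0
```

To use it, load each run's JSON with `json.load()`, collect `write_clat_p99()` over the baseline runs and over the runs on the kernel under test, and pass the two lists to `percent_change()`. Averaging over several runs, as fsperf's `-n 10` does, keeps a single noisy run from masking or faking a change of a few percent.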