linux-kernel.vger.kernel.org archive mirror
From: Thorsten Leemhuis <regressions@leemhuis.info>
To: Josef Bacik <josef@toxicpanda.com>,
	Valentin Schneider <valentin.schneider@arm.com>
Cc: peterz@infradead.org, vincent.guittot@linaro.org,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, guro@fb.com, clm@fb.com
Subject: Re: [REGRESSION] 5-10% increase in IO latencies with nohz balance patch
Date: Wed, 22 Dec 2021 13:42:16 +0100
Message-ID: <99452126-661e-9a0c-6b51-d345ed0f76ee@leemhuis.info>
In-Reply-To: <YbJWBGaGAW/MenOn@localhost.localdomain>

Hi, this is your Linux kernel regression tracker speaking.

On 09.12.21 20:16, Josef Bacik wrote:
> On Thu, Dec 09, 2021 at 05:22:05PM +0000, Valentin Schneider wrote:
>> On 06/12/21 09:48, Valentin Schneider wrote:
>>> On 03/12/21 14:00, Josef Bacik wrote:
>>>> On Fri, Dec 03, 2021 at 12:03:27PM +0000, Valentin Schneider wrote:
>>>>> Could you give the 4 top patches, i.e. those above
>>>>> 8c92606ab810 ("sched/cpuacct: Make user/system times in cpuacct.stat more precise")
>>>>> a try?
>>>>>
>>>>> https://git.gitlab.arm.com/linux-arm/linux-vs.git -b mainline/sched/nohz-next-update-regression
>>>>>
>>>>> I gave that a quick test on the platform that caused me to write the patch
>>>>> you bisected and looks like it didn't break the original fix. If the above
>>>>> counter-measures aren't sufficient, I'll have to go poke at your
>>>>> reproducers...
>>>>>
>>>>
>>>> It's better but still around 6% regression.  If I compare these patches to the
>>>> average of the last few days worth of runs you're 5% better than before, so
>>>> progress but not completely erased.
>>>>
>>>
>>> Hmph, time for me to reproduce this locally then. Thanks!
>>
>> I carved out a partition out of an Ampere eMAG's HDD to play with BTRFS
>> via fsperf; this is what I get for the bisected commit (baseline is
>> bisected patchset's immediate parent, aka v5.15-rc4) via a handful of
>> ./fsperf -p before-regression -c btrfs -n 100 -t emptyfiles500k
>>
>>   metric                 baseline       current        stdev     diff
>>   write_clat_ns_p99     195395.92     198790.46      4797.01    1.74%
>>   write_iops             17305.79      17471.57       250.66    0.96%
>>
>>   write_clat_ns_p99     195395.92     197694.06      4797.01    1.18%
>>   write_iops             17305.79      17533.62       250.66    1.32%
>>
>>   write_clat_ns_p99     195395.92     197903.67      4797.01    1.28%
>>   write_iops             17305.79      17519.71       250.66    1.24%
>>
>> If I compare against tip/sched/core however:
>>
>>   write_clat_ns_p99     195395.92     202936.32      4797.01    3.86%
>>   write_iops             17305.79      17065.46       250.66   -1.39%
>>
>>   write_clat_ns_p99     195395.92     204349.44      4797.01    4.58%
>>   write_iops             17305.79      17097.79       250.66   -1.20%
>>
>>   write_clat_ns_p99     195395.92     204169.05      4797.01    4.49%
>>   write_iops             17305.79      17112.29       250.66   -1.12%
>>
>> tip/sched/core + my patches:
>>
>>   write_clat_ns_p99     195395.92     205721.60      4797.01    5.28%
>>   write_iops             17305.79      16947.59       250.66   -2.07%
>>
>>   write_clat_ns_p99     195395.92     203358.04      4797.01    4.07%
>>   write_iops             17305.79      16953.24       250.66   -2.04%
>>
>>   write_clat_ns_p99     195395.92     201830.40      4797.01    3.29%
>>   write_iops             17305.79      17041.18       250.66   -1.53%
>>
>> So tip/sched/core seems to have a much worse regression, and my patches
>> are making things worse on that system...
>>
>> I've started a bisection to see where the above leads me, unfortunately
>> this machine needs more babysitting than I thought so it's gonna take a
>> while.
>>
>> @Josef any chance you could see if the above also applies to you? tip lives
>> at https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git, though from
>> where my bisection is taking me it looks like you should see that against
>> Linus' tree as well.
>>
> 
> This has made us all curious, so we're all fucking around with schbench to see
> if we can make it show up without needing to use fsperf.  Maybe that'll help
> with the bisect, because I had to bisect twice to land on your patches, and I
> only emailed when I could see the change right before and right after your
> patch.  It would not surprise me at all if there's something else here that's
> causing us pain.

What's the status here? Just wondering, because there hasn't been any
activity in this thread for 11 days and the festive season is upon us.

Was the discussion moved elsewhere? Or is this still a mystery? And if
it is: how bad is it, does it need to be fixed before Linus releases 5.16?

Ciao, Thorsten

#regzbot poke

P.S.: As a Linux kernel regression tracker I get a lot of reports
on my desk and can only look briefly into most of them. Unfortunately,
that means I will sometimes get things wrong or miss something important.
I hope that's not the case here; if you think it is, don't hesitate to
tell me about it in a public reply. That's in everyone's interest, as
what I wrote above might mislead everyone reading this; any
suggestion I gave might thus send someone down the wrong
rabbit hole, which none of us wants.

BTW, I have no personal interest in this issue, which is tracked using
regzbot, my Linux kernel regression tracking bot
(https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
this mail to get things rolling again and hence don't need to be CCed on
further activity wrt this regression.
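
Editorial aside on the fsperf numbers quoted above: the percentage in the last column of each row appears to be the relative change of the second numeric column against the first. That reading is an assumption (the quoted tables carry no header row), but it reproduces every quoted percentage. A minimal Python sketch, with the write_clat_ns_p99 values copied from the quoted runs and the helper name `pct_diff` being purely illustrative:

```python
from statistics import mean


def pct_diff(baseline: float, current: float) -> float:
    """Relative change of `current` versus `baseline`, in percent."""
    return (current - baseline) / baseline * 100.0


# (baseline, current) pairs for write_clat_ns_p99, copied from the quoted runs.
# Assumption: first numeric column is the baseline, second is the tested kernel.
RUNS = {
    "bisected commit": [
        (195395.92, 198790.46),
        (195395.92, 197694.06),
        (195395.92, 197903.67),
    ],
    "tip/sched/core": [
        (195395.92, 202936.32),
        (195395.92, 204349.44),
        (195395.92, 204169.05),
    ],
    "tip/sched/core + patches": [
        (195395.92, 205721.60),
        (195395.92, 203358.04),
        (195395.92, 201830.40),
    ],
}

if __name__ == "__main__":
    for label, pairs in RUNS.items():
        diffs = [pct_diff(b, c) for b, c in pairs]
        per_run = ", ".join(f"{d:.2f}%" for d in diffs)
        print(f"{label}: {per_run} (mean {mean(diffs):.2f}%)")
```

Averaging the three runs per kernel, as the loop does, is only a convenience for comparing the groups; it is not something the thread itself reports.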


Thread overview: 19+ messages
2021-11-29 17:03 Josef Bacik
2021-11-29 18:03 ` Valentin Schneider
2021-11-29 18:15   ` Josef Bacik
2021-11-29 18:31     ` Valentin Schneider
2021-11-29 19:49       ` Josef Bacik
2021-11-30  0:26         ` Valentin Schneider
2021-12-03 12:03           ` Valentin Schneider
2021-12-03 19:00             ` Josef Bacik
2021-12-06  9:48               ` Valentin Schneider
2021-12-09 17:22                 ` Valentin Schneider
2021-12-09 19:16                   ` Josef Bacik
2021-12-22 12:42                     ` Thorsten Leemhuis [this message]
2021-12-22 16:07                       ` Valentin Schneider
2022-01-03 16:16                         ` Josef Bacik
2022-01-13 16:41                           ` Valentin Schneider
2022-01-13 16:57                             ` Roman Gushchin
2022-02-18 11:00                               ` Thorsten Leemhuis
2022-02-18 15:34                                 ` Josef Bacik
2021-11-30  7:16 ` Thorsten Leemhuis
