* Performance regression: thread wakeup time (latency) increased up to 3x
@ 2021-10-15  7:43 Norbert
  2021-10-15  8:52 ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Norbert @ 2021-10-15  7:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Thomas Gleixner

Performance regression: thread wakeup time (latency) increased up to 3x.

Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.

Short story:
------------
On isolated  CPUs, wakeup increased from 1080 ns to 3550 ns. (3.3x)
On non-isol. CPUs, wakeup increased from  980 ns to 1245 ns. (1.3x)

Such an increase is surely not an expected part of an intentional
change, especially considering that threads on isolated CPUs are often
latency sensitive. Also, for example, it significantly increases
throughput on contended locks in general (1.3x).

Long Story:
-----------
Time measured from before futex-wake on thread A, to after futex-wait 
returns on thread B.

Times are similar for eventfd write -> blocked-read, just a bit higher.

Threads A and B have their affinity set to two neighboring CPUs on a 
Threadripper Zen2 CPU at a fixed frequency of 4.0 GHz; on isolated CPUs 
with SCHED_FIFO, on non-isolated CPUs with SCHED_OTHER. However, that 
does not make a big difference (I also measured the other combinations).

Measured 5.13.0, 5.13.8, 5.14.0, 5.14.9 and 5.14.11.
Some on Fedora 35 Beta, some on ClearLinux 35100.
All given times are measured with multi-user.target (no GUI shell). 
Times on graphical.target (with GUI shell) are about 10% higher.

These values are not an average of separate shorter and longer times;
this is a typical distribution:
(None are less than 3300 ns, and none are more than 5099 ns.)
  count with 33nn ns: 858
  count with 34nn ns: 19359
  count with 35nn ns: 57257
  count with 36nn ns: 6135
  count with 37nn ns: 150
  count with 38nn ns: 48
  count with 39nn ns: 11
  count with 40nn ns: 10
  count with 41nn ns: 10
  count with 42nn ns: 10
  count with 43nn ns: 7
  count with 44nn ns: 11
  count with 45nn ns: 3
  count with 46nn ns: 6
  count with 47nn ns: 3
  count with 48nn ns: 4
  count with 49nn ns: 1
  count with 50nn ns: 3
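
(The rows above group latencies by their 100 ns range, so "35nn" covers
3500-3599 ns. A bucketing like this is trivial to reproduce; the helper
names below are invented, not the benchmark's actual code.)

```c
#define NBUCKETS 128            /* covers 0 ns .. 12799 ns */

static long counts[NBUCKETS];

/* 3550 ns falls into bucket 35, i.e. the "35nn" row above. */
static int bucket_of(long ns)
{
        return (int)(ns / 100);
}

/* Record one measured latency into its 100 ns bucket. */
static void record(long ns)
{
        int b = bucket_of(ns);
        if (b >= 0 && b < NBUCKETS)
                counts[b]++;
}
```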

Also the times for the futex-wake call itself increased significantly:

On isolated  CPUs, wake call increased from 510 ns to 710 ns. (1.4x)
On non-isol. CPUs, wake call increased from 420 ns to 580 ns. (1.4x)

This is my first time reporting a kernel problem, so please excuse if 
this is not the right place or form. (Also I don't yet have the know-how
to bisect arbitrary kernel versions, or to compile specific patches.)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-15  7:43 Performance regression: thread wakeup time (latency) increased up to 3x Norbert
@ 2021-10-15  8:52 ` Peter Zijlstra
  2021-10-15  9:36   ` Norbert
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2021-10-15  8:52 UTC (permalink / raw)
  To: Norbert; +Cc: linux-kernel, Ingo Molnar, Steven Rostedt, Thomas Gleixner

On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
> Performance regression: thread wakeup time (latency) increased up to 3x.
> 
> Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.

Could you git-bisect this?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-15  8:52 ` Peter Zijlstra
@ 2021-10-15  9:36   ` Norbert
  2021-10-15 14:04     ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: Norbert @ 2021-10-15  9:36 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Steven Rostedt, Thomas Gleixner

On 10/15/21 01:52, Peter Zijlstra wrote:
> On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
>> Performance regression: thread wakeup time (latency) increased up to 3x.
>>
>> Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.
> 
> Could you git-bisect this?
> 

So far I haven't built a kernel yet, I'm quite new to Linux in that way, 
so it may take me some time to figure it all out, but yes.

(By the way, of course I meant that throughput *de*creases, not increases.)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-15  9:36   ` Norbert
@ 2021-10-15 14:04     ` Steven Rostedt
  2021-10-16  4:08       ` Norbert
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2021-10-15 14:04 UTC (permalink / raw)
  To: Norbert; +Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Thomas Gleixner

On Fri, 15 Oct 2021 02:36:03 -0700
Norbert <nbrtt01@gmail.com> wrote:

> On 10/15/21 01:52, Peter Zijlstra wrote:
> > On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:  
> >> Performance regression: thread wakeup time (latency) increased up to 3x.
> >>
> >> Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.  
> > 
> > Could you git-bisect this?
> >   
> 
> So far I haven't built a kernel yet, I'm quite new to Linux in that way, 
> so it may take me some time to figure it all out, but yes.

If you do test a kernel you build yourself, you may want to use
"localmodconfig". Which will look at all the modules that are loaded on the
local system and turn off the configs for modules that are not loaded. This
will save a lot of time compiling the kernel. Especially if you're basing
off of a distro config.

  $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  $ cd linux
  $ cp /boot/config-[whatever your kernel is] .config
  $ yes '' | make localmodconfig

Then build your kernel.

  $ make -j10
  $ sudo make modules_install install

> 
> (By the way, of course I meant that throughput *de*creases, not increases.)

of course ;-)

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-15 14:04     ` Steven Rostedt
@ 2021-10-16  4:08       ` Norbert
  2021-10-18 11:25         ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Norbert @ 2021-10-16  4:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Thomas Gleixner, Yunfeng Ye

On 10/15/21 07:04, Steven Rostedt wrote:
> On Fri, 15 Oct 2021 02:36:03 -0700
> Norbert <nbrtt01@gmail.com> wrote:
> 
>> On 10/15/21 01:52, Peter Zijlstra wrote:
>>> On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
>>>> Performance regression: thread wakeup time (latency) increased up to 3x.
>>>>
>>>> Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.
>>>
>>> Could you git-bisect this?
>>>    
>>
>> So far I haven't built a kernel yet, I'm quite new to Linux in that way,
>> so it may take me some time to figure it all out, but yes.
> 
> If you do test a kernel you build yourself, you may want to use
> "localmodconfig". Which will look at all the modules that are loaded on the
> local system and turn off the configs for modules that are not loaded. This
> will save a lot of time compiling the kernel. Especially if you're basing off
> of a distro config.
> 
>    $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>    $ cd linux
>    $ cp /boot/config-[whatever your kernel is] .config
>    $ yes '' | make localmodconfig
> 
> Then build your kernel.
> 
>    $ make -j10
>    $ sudo make modules_install install
> 
>>
>> (By the way, of course I meant that throughput *de*creases, not increases.)
> 
> of course ;-)
> 
> -- Steve
> 

Thank you, localmodconfig is an enormous timesaver. Once I got some 
local build problems/modules out of the way, this method worked very well.

So git-bisect finally identified the following commit.
The performance difference came in a single step. Times were consistent 
with my first post (either the slow time or the fast time),
as far as I could tell during the bisection.

It is a bit unfortunate that this comes from an attempt to reduce OS noise.

-----------------------------------------------------
commit a5183862e76fdc25f36b39c2489b816a5c66e2e5
Author: Yunfeng Ye <yeyunfeng@huawei.com>
Date:   Thu May 13 01:29:16 2021 +0200

     tick/nohz: Conditionally restart tick on idle exit

     In nohz_full mode, switching from idle to a task will unconditionally
     issue a tick restart. If the task is alone in the runqueue or is the
     highest priority, the tick will fire once then eventually stop. But that
     alone is still undesired noise.

     Therefore, only restart the tick on idle exit when it's strictly
     necessary.

     Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
     Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
     Signed-off-by: Ingo Molnar <mingo@kernel.org>
     Acked-by: Peter Zijlstra <peterz@infradead.org>
     Link: https://lore.kernel.org/r/20210512232924.150322-3-frederic@kernel.org
-----------------------------------------------------

Is there anything else to do to complete this report?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-16  4:08       ` Norbert
@ 2021-10-18 11:25         ` Peter Zijlstra
  2021-10-19  1:56           ` Norbert
  2021-10-22 13:03           ` Rafael J. Wysocki
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2021-10-18 11:25 UTC (permalink / raw)
  To: Norbert
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Yunfeng Ye, Rafael J. Wysocki, frederic

On Fri, Oct 15, 2021 at 09:08:58PM -0700, Norbert wrote:

> > > > On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
> > > > > Performance regression: thread wakeup time (latency) increased up to 3x.
> > > > > 
> > > > > Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.

> So git-bisect finally identified the following commit.
> The performance difference came in a single step. Times were consistent with
> my first post either the slow time or the fast time,
> as far as I could tell during the bisection.
> 
> It is a bit unfortunate that this comes from an attempt to reduce OS noise.
> 
> -----------------------------------------------------
> commit a5183862e76fdc25f36b39c2489b816a5c66e2e5
> Author: Yunfeng Ye <yeyunfeng@huawei.com>
> Date:   Thu May 13 01:29:16 2021 +0200
> 
>     tick/nohz: Conditionally restart tick on idle exit
> 
>     In nohz_full mode, switching from idle to a task will unconditionally
>     issue a tick restart. If the task is alone in the runqueue or is the
>     highest priority, the tick will fire once then eventually stop. But that
>     alone is still undesired noise.
> 
>     Therefore, only restart the tick on idle exit when it's strictly
>     necessary.
> 
>     Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
>     Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
>     Acked-by: Peter Zijlstra <peterz@infradead.org>
>     Link:
> https://lore.kernel.org/r/20210512232924.150322-3-frederic@kernel.org
> -----------------------------------------------------
> 
> Is there anything else to do to complete this report?

So it _could_ be you're seeing increased use of deeper idle states due
to less noise. I'm forever forgetting what the most friendly tool is for
checking that (powertop can, I think), Rafael?

One thing to try is boot with idle=halt and see if that makes a
difference.

Also, let me Cc all the people involved.. the thread starts:

  https://lkml.kernel.org/r/035c23b4-118e-6a35-36d9-1b11e3d679f8@gmail.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-18 11:25         ` Peter Zijlstra
@ 2021-10-19  1:56           ` Norbert
  2021-10-19  7:01             ` Norbert
  2021-10-22 13:03           ` Rafael J. Wysocki
  1 sibling, 1 reply; 13+ messages in thread
From: Norbert @ 2021-10-19  1:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Yunfeng Ye, Rafael J. Wysocki, frederic

On 10/18/21 04:25, Peter Zijlstra wrote:
> On Fri, Oct 15, 2021 at 09:08:58PM -0700, Norbert wrote:
> 
>>>>> On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
>>>>>> Performance regression: thread wakeup time (latency) increased up to 3x.
>>>>>>
>>>>>> Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.
> 
>> So git-bisect finally identified the following commit.
>> The performance difference came in a single step. Times were consistent with
>> my first post either the slow time or the fast time,
>> as far as I could tell during the bisection.
>>
>> It is a bit unfortunate that this comes from an attempt to reduce OS noise.
>>
>> -----------------------------------------------------
>> commit a5183862e76fdc25f36b39c2489b816a5c66e2e5
>> Author: Yunfeng Ye <yeyunfeng@huawei.com>
>> Date:   Thu May 13 01:29:16 2021 +0200
>>
>>      tick/nohz: Conditionally restart tick on idle exit
>>
>>      In nohz_full mode, switching from idle to a task will unconditionally
>>      issue a tick restart. If the task is alone in the runqueue or is the
>>      highest priority, the tick will fire once then eventually stop. But that
>>      alone is still undesired noise.
>>
>>      Therefore, only restart the tick on idle exit when it's strictly
>>      necessary.
>>
>>      Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
>>      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>>      Signed-off-by: Ingo Molnar <mingo@kernel.org>
>>      Acked-by: Peter Zijlstra <peterz@infradead.org>
>>      Link:
>> https://lore.kernel.org/r/20210512232924.150322-3-frederic@kernel.org
>> -----------------------------------------------------
>>
>> Is there anything else to do to complete this report?
> 
> So it _could_ be you're seeing increased use of deeper idle states due
> to less noise. I'm forever forgetting what the most friendly tool is for
> checking that (powertop can I think), Rafael?
> 
> One thing to try is boot with idle=halt and see if that makes a
> difference.
> 
> Also, let me Cc all the people involved.. the thread starts:
> 
>    https://lkml.kernel.org/r/035c23b4-118e-6a35-36d9-1b11e3d679f8@gmail.com
> 


Booting with idle=halt results in a thread wakeup time of around 2000 
ns, so in the middle between the kernel 5.13 value of 1080 ns and the 
kernel 5.14/5.15 value of around 3550 ns. The wake call time remains at 
740 ns (meaning as bad as without this setting). I'm not sure how much 
that says or doesn't say. By the way, using cpufreq.off=1 seems to have 
no effect at all.

In the meantime I verified the finding from the git bisection by 
manually reverting the changes from this commit in the source code of 
the 5.15-rc5 code base. By doing so the timings for the 
isolated/nohz_full CPUs come back almost to the (good) 5.13 values (both 
wakeup and wake-call).

However the timings for the non-isolated CPUs are unaffected and remain
with the worse performance of 1.3x for the wakeup and 1.4x for the wake 
call. So this apparently requires a separate independent git-bisect and 
is probably a second separate issue (if it is also due to a single change).

I've tried a bit to narrow down the cause of the 3.3x slowdown but am 
still trying to find my way through the maze of little functions... :-).

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-19  1:56           ` Norbert
@ 2021-10-19  7:01             ` Norbert
  2021-11-07 15:56               ` Frederic Weisbecker
  0 siblings, 1 reply; 13+ messages in thread
From: Norbert @ 2021-10-19  7:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Yunfeng Ye, Rafael J. Wysocki, frederic

On 10/18/21 18:56, Norbert wrote:
> On 10/18/21 04:25, Peter Zijlstra wrote:
>> On Fri, Oct 15, 2021 at 09:08:58PM -0700, Norbert wrote:
>>
>>>>>> On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
>>>>>>> Performance regression: thread wakeup time (latency) increased up 
>>>>>>> to 3x.
>>>>>>>
>>>>>>> Happened between 5.13.8 and 5.14.0. Still happening at least on 
>>>>>>> 5.14.11.
>>
>>> So git-bisect finally identified the following commit.
>>> The performance difference came in a single step. Times were 
>>> consistent with
>>> my first post either the slow time or the fast time,
>>> as far as I could tell during the bisection.
>>>
>>> It is a bit unfortunate that this comes from an attempt to reduce OS 
>>> noise.
>>>
>>> -----------------------------------------------------
>>> commit a5183862e76fdc25f36b39c2489b816a5c66e2e5
>>> Author: Yunfeng Ye <yeyunfeng@huawei.com>
>>> Date:   Thu May 13 01:29:16 2021 +0200
>>>
>>>      tick/nohz: Conditionally restart tick on idle exit
>>>
>>>      In nohz_full mode, switching from idle to a task will 
>>> unconditionally
>>>      issue a tick restart. If the task is alone in the runqueue or is 
>>> the
>>>      highest priority, the tick will fire once then eventually stop. 
>>> But that
>>>      alone is still undesired noise.
>>>
>>>      Therefore, only restart the tick on idle exit when it's strictly
>>>      necessary.
>>>
>>>      Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
>>>      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>>>      Signed-off-by: Ingo Molnar <mingo@kernel.org>
>>>      Acked-by: Peter Zijlstra <peterz@infradead.org>
>>>      Link:
>>> https://lore.kernel.org/r/20210512232924.150322-3-frederic@kernel.org
>>> -----------------------------------------------------
>>>
>>> Is there anything else to do to complete this report?
>>
>> So it _could_ be you're seeing increased use of deeper idle states due
>> to less noise. I'm forever forgetting what the most friendly tool is for
>> checking that (powertop can I think), Rafael?
>>
>> One thing to try is boot with idle=halt and see if that makes a
>> difference.
>>
>> Also, let me Cc all the people involved.. the thread starts:
>>
>>    
>> https://lkml.kernel.org/r/035c23b4-118e-6a35-36d9-1b11e3d679f8@gmail.com
>>
> 
> 
> Booting with idle=halt results in a thread wakeup time of around 2000 
> ns, so in the middle between the kernel 5.13 value of 1080 ns and the 
> kernel 5.14/5.15 value of around 3550 ns. The wake call time remains at 
> 740 ns (meaning as bad as without this setting). I'm not sure how much 
> that says or doesn't say. By the way, using cpufreq.off=1 seems to have 
> no effect at all.
> 
> In the meantime I verified the finding from the git bisection by 
> manually reverting the changes from this commit in the source code of 
> the 5.15-rc5 code base. By doing so the timings for the 
> isolated/nohz_full CPUs come back almost to the (good) 5.13 values (both 
> wakeup and wake-call).
> 
> However the timings for the non-isolated CPUs are unaffected and remain
> with the worse performance of 1.3x for the wakeup and 1.4x for the wake 
> call. So this apparently requires a separate independent git-bisect and 
> is probably a second separate issue (if it is also due to a single change).
> 
> I've tried a bit to narrow down the cause of the 3.3x slowdown but am 
> still trying to find my way through the maze of little functions... :-).


On the thought that it might enter deeper idle/wait/sleep states:

The benchmark executes this test in a quite tight loop, except that so 
far it waited 1000 ns (with a mix of pause and rdtsc) before calling 
futex-wake, to make sure the other thread fully enters the futex-wait 
without taking any shortcuts.

Unless this "prepare time" is reduced to less than roughly 350 ns, the 
timings remain the same (below that, they first go up before they start 
going down). Surely in this situation the thread is at least not 
supposed to enter deeper states for such short waiting times.

Best, Norbert

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-18 11:25         ` Peter Zijlstra
  2021-10-19  1:56           ` Norbert
@ 2021-10-22 13:03           ` Rafael J. Wysocki
  2021-10-24  5:09             ` Norbert
  1 sibling, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2021-10-22 13:03 UTC (permalink / raw)
  To: Norbert, Peter Zijlstra
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Yunfeng Ye, frederic

On Monday, October 18, 2021 1:25:02 PM CEST Peter Zijlstra wrote:
> On Fri, Oct 15, 2021 at 09:08:58PM -0700, Norbert wrote:
> 
> > > > > On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
> > > > > > Performance regression: thread wakeup time (latency) increased up to 3x.
> > > > > > 
> > > > > > Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.
> 
> > So git-bisect finally identified the following commit.
> > The performance difference came in a single step. Times were consistent with
> > my first post either the slow time or the fast time,
> > as far as I could tell during the bisection.
> > 
> > It is a bit unfortunate that this comes from an attempt to reduce OS noise.
> > 
> > -----------------------------------------------------
> > commit a5183862e76fdc25f36b39c2489b816a5c66e2e5
> > Author: Yunfeng Ye <yeyunfeng@huawei.com>
> > Date:   Thu May 13 01:29:16 2021 +0200
> > 
> >     tick/nohz: Conditionally restart tick on idle exit
> > 
> >     In nohz_full mode, switching from idle to a task will unconditionally
> >     issue a tick restart. If the task is alone in the runqueue or is the
> >     highest priority, the tick will fire once then eventually stop. But that
> >     alone is still undesired noise.
> > 
> >     Therefore, only restart the tick on idle exit when it's strictly
> >     necessary.
> > 
> >     Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
> >     Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> >     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> >     Acked-by: Peter Zijlstra <peterz@infradead.org>
> >     Link:
> > https://lore.kernel.org/r/20210512232924.150322-3-frederic@kernel.org
> > -----------------------------------------------------
> > 
> > Is there anything else to do to complete this report?
> 
> So it _could_ be you're seeing increased use of deeper idle states due
> to less noise. I'm forever forgetting what the most friendly tool is for
> checking that (powertop can I think), Rafael?

You can use turbostat too.

> One thing to try is boot with idle=halt and see if that makes a
> difference.
> 
> Also, let me Cc all the people involved.. the thread starts:
> 
>   https://lkml.kernel.org/r/035c23b4-118e-6a35-36d9-1b11e3d679f8@gmail.com
> 





^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-22 13:03           ` Rafael J. Wysocki
@ 2021-10-24  5:09             ` Norbert
  2021-11-02  1:35               ` Norbert
  0 siblings, 1 reply; 13+ messages in thread
From: Norbert @ 2021-10-24  5:09 UTC (permalink / raw)
  To: Rafael J. Wysocki, Peter Zijlstra
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Yunfeng Ye, frederic

On 10/22/21 06:03, Rafael J. Wysocki wrote:
> On Monday, October 18, 2021 1:25:02 PM CEST Peter Zijlstra wrote:
>> On Fri, Oct 15, 2021 at 09:08:58PM -0700, Norbert wrote:
>>
>>>>>> On Fri, Oct 15, 2021 at 12:43:45AM -0700, Norbert wrote:
>>>>>>> Performance regression: thread wakeup time (latency) increased up to 3x.
>>>>>>>
>>>>>>> Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.
>>
>>> So git-bisect finally identified the following commit.
>>> The performance difference came in a single step. Times were consistent with
>>> my first post either the slow time or the fast time,
>>> as far as I could tell during the bisection.
>>>
>>> It is a bit unfortunate that this comes from an attempt to reduce OS noise.
>>>
>>> -----------------------------------------------------
>>> commit a5183862e76fdc25f36b39c2489b816a5c66e2e5
>>> Author: Yunfeng Ye <yeyunfeng@huawei.com>
>>> Date:   Thu May 13 01:29:16 2021 +0200
>>>
>>>      tick/nohz: Conditionally restart tick on idle exit
>>>
>>>      In nohz_full mode, switching from idle to a task will unconditionally
>>>      issue a tick restart. If the task is alone in the runqueue or is the
>>>      highest priority, the tick will fire once then eventually stop. But that
>>>      alone is still undesired noise.
>>>
>>>      Therefore, only restart the tick on idle exit when it's strictly
>>>      necessary.
>>>
>>>      Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
>>>      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>>>      Signed-off-by: Ingo Molnar <mingo@kernel.org>
>>>      Acked-by: Peter Zijlstra <peterz@infradead.org>
>>>      Link:
>>> https://lore.kernel.org/r/20210512232924.150322-3-frederic@kernel.org
>>> -----------------------------------------------------
>>>
>>> Is there anything else to do to complete this report?
>>
>> So it _could_ be you're seeing increased use of deeper idle states due
>> to less noise. I'm forever forgetting what the most friendly tool is for
>> checking that (powertop can I think), Rafael?
> 
> You can use turbostat too.
> 
>> One thing to try is boot with idle=halt and see if that makes a
>> difference.
>>
>> Also, let me Cc all the people involved.. the thread starts:
>>
>>    https://lkml.kernel.org/r/035c23b4-118e-6a35-36d9-1b11e3d679f8@gmail.com
>>
> 
> 
> 
> 


Measuring Idle states being used:

Since I've been running tests frequently and under various conditions, I 
wanted to integrate this into the benchmark, to automatically get 
precise numbers in so far as possible. So it is now reading sysfs values 
for idle state usage within the benchmark, right before and right after 
the test loop, and prints out the delta.
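
(Reading the per-state usage counters from sysfs can be sketched as
below. The helper name is invented, but the path layout follows the
standard cpuidle sysfs interface; it returns -1 when the file is
unavailable, e.g. when no cpuidle driver is loaded.)

```c
#include <stdio.h>

/* Read /sys/devices/system/cpu/cpu<cpu>/cpuidle/state<state>/usage,
 * i.e. how often that idle state has been entered; -1 on failure. */
static long read_idle_usage(int cpu, int state)
{
        char path[96];
        long usage = -1;
        FILE *f;

        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/usage",
                 cpu, state);
        f = fopen(path, "r");
        if (!f)
                return -1;
        if (fscanf(f, "%ld", &usage) != 1)
                usage = -1;
        fclose(f);
        return usage;
}
```

Snapshotting these counters right before and right after the test loop
and printing the difference gives the "delta usage" columns below.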

My processor/system has 3 idle states: POLL, C1 and C2.

Running the benchmark with the post-patch version of the kernel code, 
almost 100% of the idle states are C2 (more details below). This is 
regardless of the trigger time ("trigger time" means how long after the 
other thread goes into FUTEX_WAIT the FUTEX_WAKE call comes), where I 
have measured 1,000 ns and 50,000 ns.

For the pre-patch version, the older faster one, the idle state used 
depends on the trigger time. For a trigger time of 1,000 ns, the idle 
states are almost 100% POLL. For a trigger time of 50,000 ns, the idle 
states shift almost 100% to C1.

All numbers are on isolated, nohz_full CPUs with SCHED_FIFO. (The more 
important part being SCHED_FIFO.) Measured with multi-user.target right 
after boot, on a modified 5.14.14 kernel which runs old or new code 
depending on the CPU number of the isolated CPU. Also using cgroups.

Detailed data is further below.

So each idle state is being used under some condition. :)


--- More details from sysfs: (I hope this provides the data that you 
were expecting from powertop or turbostat, otherwise please let me know.)

Initial data:

idle: available_governors : menu
    idle: current_governor : menu
idle: current_governor_ro : menu
      idle: current_driver : acpi_idle

Idle States at beginning of test for cpu 31 :

           idle state name : POLL
                      desc : CPUIDLE CORE POLL IDLE
                   latency : 0
                 residency : 0
                  rejected : 0
                     usage : 0
                      time : 0

           idle state name : C1
                      desc : ACPI FFH MWAIT 0x0
                   latency : 1
                 residency : 2
                  rejected : 0
                     usage : 0
                      time : 0

           idle state name : C2
                      desc : ACPI IOPORT 0x814
                   latency : 400
                 residency : 800
                  rejected : 0
                     usage : 4954
                      time : 264964543


Below, "Delta usage" is the difference in idle state usage between 
before and after the test loop. The test loop runs exactly 1,000,000 
iterations each time.

----- Output for post-patch, trigger time 1000 ns: -----

Testrun with cpu 31, iterations 1000000, non-aborted futex-waits 999999, 
trigger time 1000 ns.

3540.954890 ns wakeup latency, 757.503257 ns wake call time.

idle state POLL   delta usage:     45    delta time:      36
idle state C1     delta usage:   1038    delta time:     781
idle state C2     delta usage: 998890    delta time: 2845938


----- Output for pre-patch, trigger time 1000 ns: -----

Testrun with cpu 28, iterations 1000000, non-aborted futex-waits 999861, 
trigger time 1000 ns.

1071.148085 ns wakeup latency, 540.017857 ns wake call time.

idle state POLL   delta usage: 997960    delta time: 682471
idle state C1     delta usage:      1    delta time:      0
idle state C2     delta usage:   1244    delta time:   3411


--- Output for post-patch, trigger time 50,000 ns:

Testrun with cpu 31, iterations 1000000, non-aborted futex-waits 
1000000, trigger time 50000 ns.

3554.147479 ns wakeup latency, 740.034589 ns wake call time.

idle state POLL   delta usage:       0    delta time:        0
idle state C1     delta usage:      16    delta time:      771
idle state C2     delta usage: 1000047    delta time: 51848427


----- Output for pre-patch, trigger time 50,000 ns: -----

Testrun with cpu 28, iterations 1000000, non-aborted futex-waits 
1000000, trigger time 50000 ns.

1154.504049 ns wakeup latency, 587.281359 ns wake call time.

idle state POLL   delta usage:       1    delta time:       50
idle state C1     delta usage: 1027007    delta time: 49536027
idle state C2     delta usage:    2408    delta time:   123819

------------------------------
Best, Norbert

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-24  5:09             ` Norbert
@ 2021-11-02  1:35               ` Norbert
  0 siblings, 0 replies; 13+ messages in thread
From: Norbert @ 2021-11-02  1:35 UTC (permalink / raw)
  To: Rafael J. Wysocki, Peter Zijlstra
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Yunfeng Ye, frederic


On 10/23/21 22:09, Norbert wrote:
> (The more important part being SCHED_FIFO.)

Please ignore this side-comment. The decisive factor is 
isolated/nohz_full vs normal.

There are indications that "normal" times have become worse lately 
(measured in 5.14.14), relative to earlier, but in that area the 
fluctuations I see in my current tests are too large to get a clear picture.

Best, Norbert



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-10-19  7:01             ` Norbert
@ 2021-11-07 15:56               ` Frederic Weisbecker
  2021-11-08 15:56                 ` Norbert
  0 siblings, 1 reply; 13+ messages in thread
From: Frederic Weisbecker @ 2021-11-07 15:56 UTC (permalink / raw)
  To: Norbert
  Cc: Peter Zijlstra, Steven Rostedt, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Yunfeng Ye, Rafael J. Wysocki

Hi Norbert,

On Tue, Oct 19, 2021 at 12:01:32AM -0700, Norbert wrote:
> 
> On the thought that it might enter deeper idle/wait/sleep states:
> 
> The benchmark executes this test in a quite tight loop, except that so far
> it waited 1000 ns (with a mix of pause and rdtsc) before calling futex-wake,
> to make sure the other thread fully enters the futex-wait without taking any
> shortcuts.
> 
> Except when this "prepare time" is reduced to less than even 350 ns or so,
> the timings remain the same (they go up before they start going down).
> Surely in this situation the thread is at least not supposed to enter deeper
> states for such short waiting times.

Is it possible for you to share this benchmark so that I can try to reproduce?

Thanks!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Performance regression: thread wakeup time (latency) increased up to 3x
  2021-11-07 15:56               ` Frederic Weisbecker
@ 2021-11-08 15:56                 ` Norbert
  0 siblings, 0 replies; 13+ messages in thread
From: Norbert @ 2021-11-08 15:56 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Steven Rostedt, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Yunfeng Ye, Rafael J. Wysocki

On 11/7/21 07:56, Frederic Weisbecker wrote:
> 
> Is it possible for you to share this benchmark so that I can try to reproduce?
> 
> Thanks!
> 

Emails were sent. It's currently 1000+ lines of largely experimental 
code, so a bit too much to post here on the list (and I'm in the middle 
of adding more).
--Norbert

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-11-08 15:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-15  7:43 Performance regression: thread wakeup time (latency) increased up to 3x Norbert
2021-10-15  8:52 ` Peter Zijlstra
2021-10-15  9:36   ` Norbert
2021-10-15 14:04     ` Steven Rostedt
2021-10-16  4:08       ` Norbert
2021-10-18 11:25         ` Peter Zijlstra
2021-10-19  1:56           ` Norbert
2021-10-19  7:01             ` Norbert
2021-11-07 15:56               ` Frederic Weisbecker
2021-11-08 15:56                 ` Norbert
2021-10-22 13:03           ` Rafael J. Wysocki
2021-10-24  5:09             ` Norbert
2021-11-02  1:35               ` Norbert
