* slowdown due to reader-owned rwsem time-based spinning
@ 2020-10-15 11:38 Julia Lawall
  2020-10-19 19:33 ` Waiman Long
  0 siblings, 1 reply; 5+ messages in thread
From: Julia Lawall @ 2020-10-15 11:38 UTC
  To: Waiman Long
  Cc: Peter Zijlstra, Will Deacon, Ingo Molnar, linux-kernel, Gilles Muller

[-- Attachment #1: Type: text/plain, Size: 1382 bytes --]

Hello,

Phoenix is an implementation of map reduce:

https://github.com/kozyraki/phoenix

The phoenix-2.0/tests subdirectory contains some benchmarks, including
word_count.

At the same time, on my server, since v5.8, the kernel has changed from
using the intel_pstate driver by default to using intel_cpufreq.
intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
while intel_pstate involves very few such stray processes.
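
To check which cpufreq driver a given machine is using, something like the following may help. It is only a sketch: it assumes the usual cpufreq sysfs layout for CPU 0, and the helper name scaling_driver is invented for illustration. On a system without cpufreq support the file may not exist, in which case the helper returns None.

```python
from pathlib import Path

# Assumed sysfs location of the cpufreq driver name for CPU 0.
DEFAULT_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver"


def scaling_driver(path=DEFAULT_PATH):
    """Return the driver name stored at `path` (for example 'intel_pstate'
    or 'intel_cpufreq'), or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


# Example: print(scaling_driver())
```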

Surprisingly, all those kworkers cause the word_count benchmark to run 2-3
times faster.  I bisected the problem back to the following commit, which
was introduced in v5.3:

commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
Author: Waiman Long <longman@redhat.com>
Date:   Mon May 20 16:59:13 2019 -0400

    locking/rwsem: Enable time-based spinning on reader-owned rwsem

Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
the one with the kworkers.

I don't know the Phoenix code in detail, but the problem seems to be in
the infrastructure, not the specific word count application, because most of
the benchmarks seem to suffer similarly.  Some of the other benchmarks
seem to take a variable and long amount of time to get started in the
active mode, so perhaps the problem could be in reading the initial
dataset.

Before I plunge into it, do you have any suggestions as to what could be
the problem?

thanks,
julia

[-- Attachment #2: Type: application/pdf, Size: 1511252 bytes --]

[-- Attachment #3: Type: application/pdf, Size: 1797989 bytes --]


* Re: slowdown due to reader-owned rwsem time-based spinning
  2020-10-15 11:38 slowdown due to reader-owned rwsem time-based spinning Julia Lawall
@ 2020-10-19 19:33 ` Waiman Long
  2020-10-19 19:48   ` Julia Lawall
  0 siblings, 1 reply; 5+ messages in thread
From: Waiman Long @ 2020-10-19 19:33 UTC
  To: Julia Lawall
  Cc: Peter Zijlstra, Will Deacon, Ingo Molnar, linux-kernel, Gilles Muller

On 10/15/20 7:38 AM, Julia Lawall wrote:
> Hello,
>
> Phoenix is an implementation of map reduce:
>
> https://github.com/kozyraki/phoenix
>
> The phoenix-2.0/tests subdirectory contains some benchmarks, including
> word_count.
>
> At the same time, on my server, since v5.8, the kernel has changed from
> using the governor intel_pstate by default to using intel_cpufreq.
> Intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
> while intel_pstate involves very few such stray processes.
>
> Suprisingly, all those kworkers cause the word_count benchmark to run 2-3
> times faster.  I bisected the problem back to the following commit, whcih
> was introduced in v5.3:
>
> commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
> Author: Waiman Long <longman@redhat.com>
> Date:   Mon May 20 16:59:13 2019 -0400
>
>      locking/rwsem: Enable time-based spinning on reader-owned rwsem
>
> Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
> the one with the kworkers.
>
> I don't know the Phoenix code in detail, but the problem seems to be in
> the infrastructure not the specific word count aplication, because most of
> the benchmarks seem to suffer similarly.  Some of the other benchmarks
> seem to take a variable and long amount of time to get started in the
> active mode, so perhaps the problem could be in reading the initial
> dataset.
>
> Before I plunge into it, do you have any suggestions as to what could be
> the problem?

I am a bit confused as to what you are looking for. So you said this
patch makes the benchmark run 2-3 times faster. Is this a problem? What
are you trying to achieve? Is it to make the passive case similar to the
active case?

What this patch does is to allow a writer waiting for a rwsem to spin for
a while, in the hope that the readers will release the lock soon so that
the writer can acquire it. Before that patch, the writer would go to sleep
immediately when the rwsem was owned by readers. Probably because of that,
the kworkers keep on running for a much longer time as long as there are
no other tasks competing for the CPUs.
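
The difference between the two behaviours described above can be illustrated with a toy model. This is a deliberately simplified sketch and not the kernel's rwsem code: the function name writer_wait and its parameters are hypothetical, time is treated as a plain number of microseconds, and all the other conditions that the real optimistic-spinning path checks are ignored. A spin budget of 0 stands in for the older behaviour of sleeping immediately.

```python
def writer_wait(reader_hold_us, spin_budget_us):
    """Toy model of a writer that finds the lock held by readers.

    reader_hold_us: how long the readers keep the lock from now on.
    spin_budget_us: how long the writer is willing to spin before sleeping.
    Returns (outcome, time_spent_spinning_us).
    """
    if reader_hold_us <= spin_budget_us:
        # The readers released the lock before the budget ran out.
        return ("acquired_while_spinning", reader_hold_us)
    # The budget ran out first, so the writer gives up the CPU.
    return ("slept_after_spinning", spin_budget_us)


print(writer_wait(10, 25))   # short reader hold, writer never sleeps
print(writer_wait(100, 25))  # long reader hold, writer spins, then sleeps
print(writer_wait(100, 0))   # zero budget: sleep immediately
```

In this model, spinning only pays off when the readers release the lock within the budget; otherwise the writer has spent the whole budget on the CPU and still goes to sleep.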

Cheers,
Longman




* Re: slowdown due to reader-owned rwsem time-based spinning
  2020-10-19 19:33 ` Waiman Long
@ 2020-10-19 19:48   ` Julia Lawall
  2020-10-20  3:09     ` Waiman Long
  0 siblings, 1 reply; 5+ messages in thread
From: Julia Lawall @ 2020-10-19 19:48 UTC
  To: Waiman Long
  Cc: Peter Zijlstra, Will Deacon, Ingo Molnar, linux-kernel, Gilles Muller



On Mon, 19 Oct 2020, Waiman Long wrote:

> On 10/15/20 7:38 AM, Julia Lawall wrote:
> > Hello,
> >
> > Phoenix is an implementation of map reduce:
> >
> > https://github.com/kozyraki/phoenix
> >
> > The phoenix-2.0/tests subdirectory contains some benchmarks, including
> > word_count.
> >
> > At the same time, on my server, since v5.8, the kernel has changed from
> > using the governor intel_pstate by default to using intel_cpufreq.
> > Intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
> > while intel_pstate involves very few such stray processes.
> >
> > Suprisingly, all those kworkers cause the word_count benchmark to run 2-3
> > times faster.  I bisected the problem back to the following commit, whcih
> > was introduced in v5.3:
> >
> > commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
> > Author: Waiman Long <longman@redhat.com>
> > Date:   Mon May 20 16:59:13 2019 -0400
> >
> >      locking/rwsem: Enable time-based spinning on reader-owned rwsem
> >
> > Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
> > the one with the kworkers.
> >
> > I don't know the Phoenix code in detail, but the problem seems to be in
> > the infrastructure not the specific word count aplication, because most of
> > the benchmarks seem to suffer similarly.  Some of the other benchmarks
> > seem to take a variable and long amount of time to get started in the
> > active mode, so perhaps the problem could be in reading the initial
> > dataset.
> >
> > Before I plunge into it, do you have any suggestions as to what could be
> > the problem?
>
> I am a bit confused as to what you are looking for. So you said this patch
> make the benchmark run 2-3 times faster. Is this a problem? What are you
> trying to achieve? Is it to make the passive case similar to the active case?

Sorry, it seems that I was not clear.  Prior to the commit above, the
active case had good performance.  The patch caused the active case to
slow down by 2-3 times.  Adding lots of kworkers that interrupt the
threads eliminated the slowdown.

>
> What this patch does is to allow writer waiting for a rwsem to spin for a
> while hoping the readers will release the lock soon to acquire the lock.
> Before that, the writer will go to sleep immediately when the rwsem is owned
> by readers. Probably because of that, the kworkers keep on running for a much
> longer time as long as there are no other tasks competing for the CPUs.

No, the kworkers don't run for a long time.  My hypothesis is that the
kworkers interrupt a thread that is spinning waiting for a lock and thus
allow the thread that is holding the lock to run.

julia


* Re: slowdown due to reader-owned rwsem time-based spinning
  2020-10-19 19:48   ` Julia Lawall
@ 2020-10-20  3:09     ` Waiman Long
  2020-10-20  6:16       ` Julia Lawall
  0 siblings, 1 reply; 5+ messages in thread
From: Waiman Long @ 2020-10-20  3:09 UTC
  To: Julia Lawall
  Cc: Peter Zijlstra, Will Deacon, Ingo Molnar, linux-kernel, Gilles Muller

On 10/19/20 3:48 PM, Julia Lawall wrote:
>
> On Mon, 19 Oct 2020, Waiman Long wrote:
>
>> On 10/15/20 7:38 AM, Julia Lawall wrote:
>>> Hello,
>>>
>>> Phoenix is an implementation of map reduce:
>>>
>>> https://github.com/kozyraki/phoenix
>>>
>>> The phoenix-2.0/tests subdirectory contains some benchmarks, including
>>> word_count.
>>>
>>> At the same time, on my server, since v5.8, the kernel has changed from
>>> using the governor intel_pstate by default to using intel_cpufreq.
>>> Intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
>>> while intel_pstate involves very few such stray processes.
>>>
>>> Suprisingly, all those kworkers cause the word_count benchmark to run 2-3
>>> times faster.  I bisected the problem back to the following commit, whcih
>>> was introduced in v5.3:
>>>
>>> commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
>>> Author: Waiman Long <longman@redhat.com>
>>> Date:   Mon May 20 16:59:13 2019 -0400
>>>
>>>       locking/rwsem: Enable time-based spinning on reader-owned rwsem
>>>
>>> Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
>>> the one with the kworkers.
>>>
>>> I don't know the Phoenix code in detail, but the problem seems to be in
>>> the infrastructure not the specific word count aplication, because most of
>>> the benchmarks seem to suffer similarly.  Some of the other benchmarks
>>> seem to take a variable and long amount of time to get started in the
>>> active mode, so perhaps the problem could be in reading the initial
>>> dataset.
>>>
>>> Before I plunge into it, do you have any suggestions as to what could be
>>> the problem?
>> I am a bit confused as to what you are looking for. So you said this patch
>> make the benchmark run 2-3 times faster. Is this a problem? What are you
>> trying to achieve? Is it to make the passive case similar to the active case?
> Sorry, it seems that I was not clear.  Prior to the commit above the
> active case had good performance,  The patch caused the active case to
> slow down by 2-3 times.  Adding lots of kworkers that interrupt the
> threads eliminated the slowdown.
>
>> What this patch does is to allow writer waiting for a rwsem to spin for a
>> while hoping the readers will release the lock soon to acquire the lock.
>> Before that, the writer will go to sleep immediately when the rwsem is owned
>> by readers. Probably because of that, the kworkers keep on running for a much
>> longer time as long as there are no other tasks competing for the CPUs.
> No, the kworkers don't run for a long time.  My hypothesis is that the
> kworkers interrupt a thread that is spinning waiting for a lock and thus
> allow the thread that is holding the lock to run.
>
Thanks for the clarification. Now I see why you think this is a
problem.

However, the reader spinning is about 25us max. So I am puzzled by the
long idle periods between busy periods in the active chart. I will need
to reproduce this condition myself to see what has gone wrong. What is
the configuration of your test machine, and what config options and boot
command line parameters did you use for the kernel?
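
To put the two time scales mentioned in this thread side by side, here is a small piece of arithmetic. The constants are taken from the messages above (kworkers every 0.004 seconds, spinning of about 25us at most); the variable names are only illustrative.

```python
KWORKER_PERIOD_US = 4000  # 0.004 seconds, as reported for intel_cpufreq
SPIN_MAX_US = 25          # approximate upper bound on reader-owned spinning

# How many maximal spin periods fit between two kworker wakeups.
ratio = KWORKER_PERIOD_US // SPIN_MAX_US
print(f"kworker period is {ratio}x the maximum spin time")
```

A single spin period is therefore far shorter than the interval between kworker wakeups, which is why spinning alone does not obviously account for the long idle periods.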

Thanks,
Longman



* Re: slowdown due to reader-owned rwsem time-based spinning
  2020-10-20  3:09     ` Waiman Long
@ 2020-10-20  6:16       ` Julia Lawall
  0 siblings, 0 replies; 5+ messages in thread
From: Julia Lawall @ 2020-10-20  6:16 UTC
  To: Waiman Long
  Cc: Peter Zijlstra, Will Deacon, Ingo Molnar, linux-kernel, Gilles Muller



On Mon, 19 Oct 2020, Waiman Long wrote:

> On 10/19/20 3:48 PM, Julia Lawall wrote:
> >
> > On Mon, 19 Oct 2020, Waiman Long wrote:
> >
> > > On 10/15/20 7:38 AM, Julia Lawall wrote:
> > > > Hello,
> > > >
> > > > Phoenix is an implementation of map reduce:
> > > >
> > > > https://github.com/kozyraki/phoenix
> > > >
> > > > The phoenix-2.0/tests subdirectory contains some benchmarks, including
> > > > word_count.
> > > >
> > > > At the same time, on my server, since v5.8, the kernel has changed from
> > > > using the governor intel_pstate by default to using intel_cpufreq.
> > > > Intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
> > > > while intel_pstate involves very few such stray processes.
> > > >
> > > > Suprisingly, all those kworkers cause the word_count benchmark to run
> > > > 2-3
> > > > times faster.  I bisected the problem back to the following commit,
> > > > whcih
> > > > was introduced in v5.3:
> > > >
> > > > commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
> > > > Author: Waiman Long <longman@redhat.com>
> > > > Date:   Mon May 20 16:59:13 2019 -0400
> > > >
> > > >       locking/rwsem: Enable time-based spinning on reader-owned rwsem
> > > >
> > > > Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
> > > > the one with the kworkers.
> > > >
> > > > I don't know the Phoenix code in detail, but the problem seems to be in
> > > > the infrastructure not the specific word count aplication, because most
> > > > of
> > > > the benchmarks seem to suffer similarly.  Some of the other benchmarks
> > > > seem to take a variable and long amount of time to get started in the
> > > > active mode, so perhaps the problem could be in reading the initial
> > > > dataset.
> > > >
> > > > Before I plunge into it, do you have any suggestions as to what could be
> > > > the problem?
> > > I am a bit confused as to what you are looking for. So you said this patch
> > > make the benchmark run 2-3 times faster. Is this a problem? What are you
> > > trying to achieve? Is it to make the passive case similar to the active
> > > case?
> > Sorry, it seems that I was not clear.  Prior to the commit above the
> > active case had good performance,  The patch caused the active case to
> > slow down by 2-3 times.  Adding lots of kworkers that interrupt the
> > threads eliminated the slowdown.
> >
> > > What this patch does is to allow writer waiting for a rwsem to spin for a
> > > while hoping the readers will release the lock soon to acquire the lock.
> > > Before that, the writer will go to sleep immediately when the rwsem is
> > > owned
> > > by readers. Probably because of that, the kworkers keep on running for a
> > > much
> > > longer time as long as there are no other tasks competing for the CPUs.
> > No, the kworkers don't run for a long time.  My hypothesis is that the
> > kworkers interrupt a thread that is spinning waiting for a lock and thus
> > allow the thread that is holding the lock to run.
> >
> Thanks for the clarification. Now I see what you mean by thinking this is a
> problem?
>
> However, the reader spinning is about 25us max. So I am puzzled by the long
> idle period in between busy period in the active chart. I will need to
> reproduce this condition myself to see what has gone wrong. What is
> configuration of your test machine as well as config option you used for the
> kernel and the boot command line parameters?

80 physical cores, 160 hardware threads.  4 sockets.  Intel(R) Xeon(R) CPU
E7-8870 v4 @ 2.10GHz

Boot options:  ro quiet intel_pstate=active
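
When reproducing this setup, it can be useful to confirm that the running kernel really was booted with these options. The sketch below parses a command line string of the kind found in /proc/cmdline. The helper names parse_cmdline and read_cmdline are hypothetical, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (such as 'ro' or 'quiet') map to None.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Return the command line of the running kernel as a string."""
    with open(path) as f:
        return f.read().strip()


# The boot options quoted in this message.
opts = parse_cmdline("ro quiet intel_pstate=active")
print(opts.get("intel_pstate"))
```

On a live system, parse_cmdline(read_cmdline()) would give the same kind of dictionary for the running kernel.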

Benchmark suite: https://github.com/kozyraki/phoenix.git

phoenix-2.0/tests/word_count/word_count datasets/word_count/word_count_datafiles/word_100MB.txt
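
For comparing kernels or boot options, the command above can be wrapped in a small timing harness. This is a sketch under assumptions: the helper time_command is invented for illustration, the relative paths are copied from this message and depend on the local checkout and working directory, and wall-clock time is only a rough proxy for what the traces show.

```python
import subprocess
import time


def time_command(argv, runs=3):
    """Run `argv` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command exits with a non-zero status.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


# Paths as given in this message; adjust them to the local setup.
WORD_COUNT = [
    "phoenix-2.0/tests/word_count/word_count",
    "datasets/word_count/word_count_datafiles/word_100MB.txt",
]

# Example: print(time_command(WORD_COUNT))
```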

Traces from Linux 5.9 of several of the benchmarks are available at
https://pages.lip6.fr/Julia.Lawall/px.pdf

julia

