linux-kernel.vger.kernel.org archive mirror
From: Julia Lawall <julia.lawall@inria.fr>
To: Waiman Long <longman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will.deacon@arm.com>, Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org,
	Gilles Muller <Gilles.Muller@inria.fr>
Subject: Re: slowdown due to reader-owned rwsem time-based spinning
Date: Tue, 20 Oct 2020 08:16:59 +0200 (CEST)
Message-ID: <alpine.DEB.2.22.394.2010200804220.2736@hadrien>
In-Reply-To: <31a92ff6-280a-12f6-9b5a-a904501ceb04@redhat.com>



On Mon, 19 Oct 2020, Waiman Long wrote:

> On 10/19/20 3:48 PM, Julia Lawall wrote:
> >
> > On Mon, 19 Oct 2020, Waiman Long wrote:
> >
> > > On 10/15/20 7:38 AM, Julia Lawall wrote:
> > > > Hello,
> > > >
> > > > Phoenix is an implementation of map reduce:
> > > >
> > > > https://github.com/kozyraki/phoenix
> > > >
> > > > The phoenix-2.0/tests subdirectory contains some benchmarks, including
> > > > word_count.
> > > >
> > > > At the same time, on my server, since v5.8, the kernel has changed from
> > > > using the governor intel_pstate by default to using intel_cpufreq.
> > > > Intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
> > > > while intel_pstate involves very few such stray processes.
> > > >
> > > > > Surprisingly, all those kworkers cause the word_count benchmark to run
> > > > > 2-3 times faster.  I bisected the problem back to the following commit,
> > > > > which was introduced in v5.3:
> > > >
> > > > commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
> > > > Author: Waiman Long <longman@redhat.com>
> > > > Date:   Mon May 20 16:59:13 2019 -0400
> > > >
> > > >       locking/rwsem: Enable time-based spinning on reader-owned rwsem
> > > >
> > > > Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
> > > > the one with the kworkers.
> > > >
> > > > > I don't know the Phoenix code in detail, but the problem seems to be in
> > > > > the infrastructure, not the specific word count application, because most
> > > > > of the benchmarks seem to suffer similarly.  Some of the other benchmarks
> > > > seem to take a variable and long amount of time to get started in the
> > > > active mode, so perhaps the problem could be in reading the initial
> > > > dataset.
> > > >
> > > > Before I plunge into it, do you have any suggestions as to what could be
> > > > the problem?
> > > > I am a bit confused as to what you are looking for. So you said this patch
> > > > makes the benchmark run 2-3 times faster. Is this a problem? What are you
> > > trying to achieve? Is it to make the passive case similar to the active
> > > case?
> > Sorry, it seems that I was not clear.  Prior to the commit above the
> > active case had good performance.  The patch caused the active case to
> > slow down by 2-3 times.  Adding lots of kworkers that interrupt the
> > threads eliminated the slowdown.
> >
> > > What this patch does is allow a writer waiting for a rwsem to spin for a
> > > while, hoping the readers will release the lock soon so that it can acquire
> > > it.  Before that, the writer would go to sleep immediately when the rwsem was
> > > owned by readers.  Probably because of that, the kworkers keep on running for
> > > a much longer time as long as there are no other tasks competing for the CPUs.
> > No, the kworkers don't run for a long time.  My hypothesis is that the
> > kworkers interrupt a thread that is spinning waiting for a lock and thus
> > allow the thread that is holding the lock to run.
> >
> Thanks for the clarification. Now I see why you think this is a problem.
>
> However, the reader spinning is about 25us max. So I am puzzled by the long
> idle periods between busy periods in the active chart. I will need to
> reproduce this condition myself to see what has gone wrong. What is the
> configuration of your test machine, and which kernel config options and
> boot command line parameters did you use?

80 physical cores, 160 hardware threads.  4 sockets.  Intel(R) Xeon(R) CPU
E7-8870 v4 @ 2.10GHz

Boot options:  ro quiet intel_pstate=active

Benchmark suite: https://github.com/kozyraki/phoenix.git

phoenix-2.0/tests/word_count/word_count datasets/word_count/word_count_datafiles/word_100MB.txt

Traces from Linux 5.9 of several of the benchmarks are available at
https://pages.lip6.fr/Julia.Lawall/px.pdf

julia
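
For readers following the thread, here is a minimal user-space sketch of
the behaviour discussed in the quoted text above: a writer that finds the
lock held by readers spins for a bounded time before giving up and
sleeping.  It is not the kernel's rwsem code.  The names toy_rwsem,
toy_write_trylock and toy_write_spin are hypothetical, as is the
count encoding; the 25us budget in the usage note is taken only from the
"about 25us max" figure mentioned above.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical toy lock, NOT the kernel's struct rw_semaphore:
 * count > 0 means that many readers hold it, -1 means one writer,
 * 0 means free. */
struct toy_rwsem {
    atomic_int count;
};

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Take the lock for writing only if it is completely free. */
static bool toy_write_trylock(struct toy_rwsem *sem)
{
    int expected = 0;

    assert(sem != NULL);
    return atomic_compare_exchange_strong(&sem->count, &expected, -1);
}

/* Time-based spinning: retry until the lock is acquired or budget_ns
 * has elapsed.  Returns true if the lock was acquired; false means the
 * caller should stop burning CPU and go to sleep instead. */
static bool toy_write_spin(struct toy_rwsem *sem, uint64_t budget_ns)
{
    uint64_t deadline = now_ns() + budget_ns;

    do {
        if (toy_write_trylock(sem))
            return true;
    } while (now_ns() < deadline);

    return false;
}
```

A caller would try toy_write_spin(&sem, 25000) first and block only when
it returns false.  While the writer spins it occupies its CPU, which is
the window in which, under the hypothesis stated above, a kworker
preempting the spinner could let the lock holder make progress.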
