From: Vincent Guittot <vincent.guittot@linaro.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Sterba <dsterba@suse.cz>,
David Howells <dhowells@redhat.com>,
Eric Biggers <ebiggers@kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH 0/2] pipe: Fixes [ver #2]
Date: Mon, 9 Dec 2019 10:53:46 +0100 [thread overview]
Message-ID: <CAKfTPtDBtPuvK0NzYC0VZgEhh31drCDN=o+3Hd3fUwoffQg0fw@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com>
On Sat, 7 Dec 2019 at 23:48, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Dec 6, 2019 at 7:50 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > The "make goes slow" problem bisects down to b667b8673443 ("pipe:
> > Advance tail pointer inside of wait spinlock in pipe_read()").
>
> I'm not entirely sure that ends up being 100% true. It did bisect to
> that, but the behavior wasn't entirely stable. There definitely is
> some nasty timing trigger.
>
> But I did finally figure out what seems to have been going on with at
> least the biggest part of the build performance regression. It's seems
> to be a nasty interaction with the scheduler and the GNU make
> jobserver, and in particular the pipe wakeups really _really_ do seem
> to want to be synchronous both for the readers and the writers.
>
> When a writer wakes up a reader, we want the reader to react quickly
> and vice versa. The most obvious case was for the GNU make jobserver,
> where sub-makes would do a single-byte write to the jobserver pipe,
> and we want to wake up the reader *immediatly*, because the reader is
> actually a lot more important than the writer. The reader is what gets
> the next job going, the writer just got done with the last one.
>
> And when a reader empties a full pipe, it's because the writer is
> generating data, and you want to just get the writer going again asap.
>
> Anyway, I've spent way too much time looking at this and wondering
> about odd performance patterns. It seems to be mostly back up to
> normal.
>
> I say "mostly", because I still see times of "not as many concurrent
> compiles going as I'd expect". It might be a kbuild problem, it might
> be an issue with GNU make (I've seen problems with the make jobserver
> wanting many more tokens than expected before and the kernel makefiles
> - it migth be about deep subdirectories etc), and it might be some
> remaining pipe issue. But my allmodconfig builds aren't _enormously_
> slower than they used to be.
>
> But there's definitely some unhappy interaction with the jobserver. I
> have 16 threads (8 cores with HT), and I generally use "make -j32" to
> keep them busy because the jobserver isn't great. The pipe rework made
> even that 2x slop not work all that well. Something held on to tokens
> too long, and there was definitely some interaction with the pipe
> wakeup code. Using "-j64" hid the problem, but it was a problem.
>
> It might be the new scheduler balancing changes that are interacting
> with the pipe thing. I'm adding PeterZ, Ingo and Vincent to the cc,
> because I hadn't realized just how important the sync wakeup seems to
> be for pipe performance even at a big level.
Which version of make should I use to reproduce the problem ?
My setup is not the same and my make is a bit old but I haven't been
able to reproduce the problem described above on my arm64 octa cores
system and v5.5-rc1.
All cores are busy with -j16. And even -j8 keeps the cores almost always busy
>
> I've pushed out my pipe changes. I really didn't want to do that kind
> of stuff at the end of the merge window, but I spent a lot more time
> than I wanted looking at this code, because I was getting to the point
> where the alternative was to just revert it all.
>
> DavidH, give these a look:
>
> 85190d15f4ea pipe: don't use 'pipe_wait() for basic pipe IO
> a28c8b9db8a1 pipe: remove 'waiting_writers' merging logic
> f467a6a66419 pipe: fix and clarify pipe read wakeup logic
> 1b6b26ae7053 pipe: fix and clarify pipe write wakeup logic
> ad910e36da4c pipe: fix poll/select race introduced by the pipe rework
>
> the top two of which are purely "I'm fed up looking at this code, this
> needs to go" kind of changes.
>
> In particular, that last change is because I think the GNU jobserver
> problem is partly a thundering herd issue: when a job token becomes
> free (ie somebody does a one-byte write to an empty jobserver pipe),
> it wakes up *everybody* who is waiting for a token. One of them will
> get it, and the others will go to sleep again. And then it repeats all
> over. I didn't fix it, but it _could_ be fixed with exclusive waits
> for readers/writers, but that means more smarts than pipe_wait() can
> do. And because the jobserver isn't great at keeping everybody happy,
> I'm using a much bigger "make -jX" value than the number of CPU's I
> have, which makes the herd bigger. And I suspect none of this helps
> the scheduler pick the _right_ process to run, which just makes
> scheduling an even bigger problem.
>
> Linus
next prev parent reply other threads:[~2019-12-09 9:54 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-05 22:30 [PATCH 0/2] pipe: Fixes [ver #2] David Howells
2019-12-05 22:30 ` [PATCH 1/2] pipe: Remove assertion from pipe_poll() " David Howells
2019-12-05 22:30 ` [PATCH 2/2] pipe: Fix missing mask update after pipe_wait() " David Howells
2019-12-05 23:58 ` Linus Torvalds
2019-12-06 13:56 ` [PATCH 0/2] pipe: Fixes " David Sterba
2019-12-06 17:09 ` Linus Torvalds
2019-12-06 17:42 ` Linus Torvalds
2019-12-06 18:59 ` Linus Torvalds
2019-12-07 21:31 ` Akemi Yagi
2019-12-08 16:45 ` Akemi Yagi
2019-12-08 18:04 ` Linus Torvalds
2019-12-09 3:07 ` Linus Torvalds
2019-12-06 20:28 ` Linus Torvalds
2019-12-06 21:04 ` Linus Torvalds
2019-12-07 3:50 ` Linus Torvalds
2019-12-07 4:01 ` Linus Torvalds
2019-12-07 22:47 ` Linus Torvalds
2019-12-09 9:53 ` Vincent Guittot [this message]
2019-12-09 17:48 ` Linus Torvalds
2019-12-09 17:57 ` Akemi Yagi
2019-12-09 18:18 ` Linus Torvalds
2019-12-09 18:24 ` Linus Torvalds
2019-12-18 20:59 ` Josh Triplett
2019-12-10 2:58 ` DJ Delorie
2019-12-10 14:38 ` Vincent Guittot
2019-12-10 17:39 ` Linus Torvalds
2019-12-11 18:09 ` DJ Delorie
2019-12-11 18:59 ` Linus Torvalds
2019-12-12 10:18 ` Konstantin Khlebnikov
2019-12-18 22:51 ` Linus Torvalds
2019-12-19 0:03 ` Josh Triplett
2019-12-19 0:14 ` Josh Triplett
2019-12-19 0:51 ` Linus Torvalds
2019-12-19 0:54 ` Linus Torvalds
2019-12-19 7:56 ` David Howells
2019-12-19 16:35 ` Linus Torvalds
2019-12-11 20:55 ` David Howells
2019-12-12 1:28 ` Linus Torvalds
2019-12-12 7:34 ` David Howells
2019-12-09 14:55 ` David Sterba
2019-12-06 21:26 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAKfTPtDBtPuvK0NzYC0VZgEhh31drCDN=o+3Hd3fUwoffQg0fw@mail.gmail.com' \
--to=vincent.guittot@linaro.org \
--cc=dhowells@redhat.com \
--cc=dsterba@suse.cz \
--cc=ebiggers@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).