From: Chris Mason <clm@fb.com>
To: Johannes Weiner <hannes@cmpxchg.org>, Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>,
"Darrick J. Wong" <djwong@kernel.org>,
xfs <linux-xfs@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
"dchinner@redhat.com" <dchinner@redhat.com>
Subject: Re: [PATCH RFC] iomap: invalidate pages past eof in iomap_do_writepage()
Date: Mon, 6 Jun 2022 11:13:18 -0400
Message-ID: <da9984a7-a3f1-8a62-f2ca-f8f6d4321e80@fb.com>
In-Reply-To: <Yp4TWwLrNM1Lhwq3@cmpxchg.org>
On 6/6/22 10:46 AM, Johannes Weiner wrote:
> Hello,
>
> On Mon, Jun 06, 2022 at 09:32:13AM +1000, Dave Chinner wrote:
>> Sure, but you've brought a problem we don't understand the root
>> cause of to my attention. I want to know what the root cause is so
>> that I can determine that there are no other unknown underlying
>> issues that are contributing to this issue.
>
> It seems to me we're just not on the same page on what the reported
> bug is. From my POV, there currently isn't a missing piece in this
> puzzle. But Chris worked closer with the prod folks on this, so I'll
> leave it to him :)
The basic description of the investigation:

* Multiple hits per hour per 100K machines, but almost impossible to
  catch on a single box.
* The debugging information from the long tail detector showed high IO
  and high CPU time. (High CPU is relative here; these machines tend to
  be IO bound.)
* Kernel stack analysis showed IO completion threads waiting for CPU.
* CPU profiling showed redirty_page_for_writepage() dominating.

From here we made a relatively simple reproduction of the
redirty_page_for_writepage() part of the problem. It's a good fix in
isolation, but we'll have to circle back to see how much of the long
tail latency issue it solves.

We can livepatch it quickly, but filtering out the long tail latency
hits for just this one bug is labor intensive, so it'll take a little
bit of time to get good data.

I've got a v2 of the patch that drops the invalidate. I'm doing a load
test with fsx this morning and then getting a second xfstests baseline
run to see if I've added new failures.

-chris