From: Chris Mason <clm@fb.com>
To: Dave Chinner <david@fromorbit.com>, linux-xfs@vger.kernel.org
Subject: [PATCH RFC] xfs: drop SYNC_WAIT from xfs_reclaim_inodes_ag during slab reclaim
Date: Fri, 14 Oct 2016 08:27:24 -0400
Message-ID: <06aade22-b29e-f55e-7f00-39154f220aa6@fb.com>


Hi Dave,

This is part of a series of patches we're growing to fix a perf
regression that is keeping a few straggler tiers on v3.10.  In this
case, hadoop had to switch back to v3.10 because v4.x is as much as 15%
slower.

Between v3.10 and v4.x, kswapd is less effective overall.  This leads
more and more procs to get bogged down in direct reclaim, using SYNC_WAIT
in xfs_reclaim_inodes_ag().

Since slab shrinking happens very early in direct reclaim, we've seen
systems with 130GB of ram where hundreds of procs are stuck on the xfs
slab shrinker fighting to walk a slab 900MB in size.  They'd have better
luck moving on to the page cache instead.

Also, we're going into direct reclaim much more often than we should
because kswapd is getting stuck on XFS inode locks and writeback.
Dropping the SYNC_WAIT means that kswapd can move on to other things and
let the async worker threads get kicked to work on the inodes.

We're still working on the series, and this is only compile tested on
current Linus git.  I'm working out some better simulations for the
hadoop workload to stuff into Mel's tests.  Numbers from prod take
roughly 3 days to stabilize, so I haven't isolated this patch from the rest
of the series.

On unpatched v4.x, our base allocation stall rate goes up to as much as
200-300/sec, averaging 70/sec.  The series I'm finalizing gets that
number down to < 1/sec.

Omar Sandoval did some digging and found you added the SYNC_WAIT in
response to a workload I sent ages ago.  I tried to make this OOM with
fsmark creating empty files, and it has been soaking in memory-constrained
workloads in production for almost two weeks.

Signed-off-by: Chris Mason <clm@fb.com>
---
 fs/xfs/xfs_icache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index bf2d607..63938fb 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1195,7 +1195,7 @@ xfs_reclaim_inodes_nr(
 	xfs_reclaim_work_queue(mp);
 	xfs_ail_push_all(mp->m_ail);
 
-	return xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr_to_scan);
+	return xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK, &nr_to_scan);
 }
 
 /*
-- 
2.9.3

