From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Michal Hocko <mhocko@suse.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: vmscan: fix IO/refault regression in cache workingset transition
Date: Tue, 4 Apr 2017 18:29:52 -0400
Message-ID: <20170404222952.GA28930@cmpxchg.org>
In-Reply-To: <20170404150703.742c49d73921df6369ed3dbd@linux-foundation.org>

On Tue, Apr 04, 2017 at 03:07:03PM -0700, Andrew Morton wrote:
> On Tue,  4 Apr 2017 18:00:52 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> 
> > Since 59dc76b0d4df ("mm: vmscan: reduce size of inactive file list")
> > we noticed bigger IO spikes during changes in cache access patterns.
> > 
> > The patch in question shrunk the inactive list size to leave more room
> > for the current workingset in the presence of streaming IO. However,
> > workingset transitions that previously happened on the inactive list
> > are now pushed out of memory and incur more refaults to complete.
> > 
> > This patch disables active list protection when refaults are being
> > observed. This accelerates workingset transitions, and allows more of
> > the new set to establish itself from memory, without eating into the
> > ability to protect the established workingset during stable periods.
> > 
> > Fixes: 59dc76b0d4df ("mm: vmscan: reduce size of inactive file list")
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: <stable@vger.kernel.org> # 4.7+
> 
> That's a pretty large patch and the problem has been there for a year. 
> I'm not sure that it's 4.11 material, let alone -stable.  Care to
> explain further?

The problem statement is a little terse, my apologies.

The workloads that were measurably affected for us were hit pretty
badly by it, with refault/majfault rates doubling and tripling during
cache transitions, and the machines sustaining half-hour periods of
100% IO utilization, where they'd previously have sub-minute peaks at
60-90%.

Stateful services that handle user data tend to be more conservative
with kernel upgrades. As a result we hit most page cache issues with
some delay, as was the case here.

The severity seemed to warrant a stable tag, but I agree that holding
out until 4.11.1 is probably better, given the invasiveness of this.
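
The mechanism in the quoted description ("disables active list
protection when refaults are being observed") can be illustrated with a
minimal userspace sketch. This is not the patch itself. The function
and parameter names are hypothetical, the 4 KiB page size is an
assumption, and detecting refaults by comparing a counter against an
earlier snapshot is likewise an assumption made for illustration. The
sqrt-based ratio stands in for the sizing heuristic of the commit named
in the Fixes: tag.

```c
#include <stdbool.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Simple integer square root (floor); adequate for small inputs. */
static unsigned long sketch_isqrt(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/*
 * Hypothetical helper: decide whether the inactive file list counts as
 * "low", meaning pages should be moved from the active list to refill it.
 *
 * inactive, active: list sizes in pages
 * refaults_now:     current refault counter (assumed input)
 * refaults_prev:    counter value at the previous snapshot (assumed input)
 */
static bool sketch_inactive_is_low(unsigned long inactive,
				   unsigned long active,
				   unsigned long refaults_now,
				   unsigned long refaults_prev)
{
	unsigned long ratio;

	if (refaults_now != refaults_prev) {
		/*
		 * Refaults were observed since the last snapshot: drop the
		 * active list protection so the new set can establish itself.
		 */
		ratio = 0;
	} else {
		/* Stable period: protect the active list, scaled by size. */
		unsigned long gb = (inactive + active) >> (30 - SKETCH_PAGE_SHIFT);

		ratio = gb ? sketch_isqrt(10 * gb) : 1;
	}

	return inactive * ratio < active;
}
```

With a ratio of zero the comparison is true whenever the active list is
non-empty, so the active list is no longer shielded from deactivation
while refaults are occurring. Once the counter stops moving, the
size-based ratio applies again and the established workingset regains
its protection.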
