From: Roman Gushchin <guro@fb.com>
To: <linux-mm@kvack.org>
Cc: <linux-kernel@vger.kernel.org>, <kernel-team@fb.com>,
	Shakeel Butt <shakeelb@google.com>,
	Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Rik van Riel <riel@surriel.com>,
	Konstantin Khlebnikov <koct9i@gmail.com>,
	Matthew Wilcox <willy@infradead.org>
Subject: [PATCH v3 3/3] mm: don't miss the last page because of round-off error
Date: Mon, 27 Aug 2018 09:26:21 -0700	[thread overview]
Message-ID: <20180827162621.30187-3-guro@fb.com> (raw)
In-Reply-To: <20180827162621.30187-1-guro@fb.com>

I've noticed that dying memory cgroups are often pinned
in memory by a single pagecache page. Even under moderate
memory pressure they sometimes stay in this state
for a long time. That looked strange.

My investigation showed that the problem is caused by
applying the LRU pressure balancing math:

  scan = div64_u64(scan * fraction[lru], denominator),

where

  denominator = fraction[anon] + fraction[file] + 1.

Because fraction[lru] is always less than denominator,
if the initial scan size is 1, the result is always 0.

This means the last page is not scanned and has
no chance of being reclaimed.

Fix this by rounding up the result of the division.
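
For illustration, here is a minimal userspace sketch of the
round-off and the fix (the fraction values are made up; in the
kernel they come from the recent scanned/rotated ratios, and
the helper name below is hypothetical, not the kernel macro):

  #include <stdio.h>
  #include <stdint.h>

  /* Same rounding trick as the DIV64_U64_ROUND_UP macro below. */
  #define DIV_ROUND_UP_U64(ll, d)	(((ll) + (d) - 1) / (d))

  int main(void)
  {
  	uint64_t scan = 1;		/* one page left on the LRU */
  	uint64_t fraction_file = 200;	/* hypothetical value */
  	uint64_t fraction_anon = 200;	/* hypothetical value */
  	uint64_t denominator = fraction_anon + fraction_file + 1;

  	/* Old math: 1 * 200 / 401 == 0, the page is never scanned. */
  	printf("round down: %llu\n",
  	       (unsigned long long)(scan * fraction_file / denominator));

  	/* New math: (200 + 400) / 401 == 1, the last page is scanned. */
  	printf("round up:   %llu\n",
  	       (unsigned long long)DIV_ROUND_UP_U64(scan * fraction_file,
  						    denominator));
  	return 0;
  }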

In practice this change significantly speeds up
the reclaim of dying cgroups.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
---
 include/linux/math64.h | 2 ++
 mm/vmscan.c            | 6 ++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/math64.h b/include/linux/math64.h
index 837f2f2d1d34..94af3d9c73e7 100644
--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -281,4 +281,6 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
 }
 #endif /* mul_u64_u32_div */
 
+#define DIV64_U64_ROUND_UP(ll, d)	div64_u64((ll) + (d) - 1, (d))
+
 #endif /* _LINUX_MATH64_H */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d649b242b989..2c67a0121c6d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2446,9 +2446,11 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 			/*
 			 * Scan types proportional to swappiness and
 			 * their relative recent reclaim efficiency.
+			 * Make sure we don't miss the last page
+			 * because of a round-off error.
 			 */
-			scan = div64_u64(scan * fraction[file],
-					 denominator);
+			scan = DIV64_U64_ROUND_UP(scan * fraction[file],
+						  denominator);
 			break;
 		case SCAN_FILE:
 		case SCAN_ANON:
-- 
2.17.1


  parent reply	other threads:[~2018-08-27 16:27 UTC|newest]

Thread overview: 18+ messages
2018-08-27 16:26 [PATCH v3 1/3] mm: rework memcg kernel stack accounting Roman Gushchin
2018-08-27 16:26 ` [PATCH v3 2/3] mm: drain memcg stocks on css offlining Roman Gushchin
2018-08-27 16:26 ` [PATCH v3 3/3] mm: don't miss the last page because of round-off error Roman Gushchin [this message]
2018-08-27 21:04   ` Andrew Morton
2018-08-27 23:24     ` Roman Gushchin
2018-08-29 21:33     ` Roman Gushchin
2018-09-05 21:08       ` Andrew Morton
2018-08-27 21:01 ` [PATCH v3 1/3] mm: rework memcg kernel stack accounting Andrew Morton
2018-08-27 23:19   ` Roman Gushchin
