linux-kernel.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>, Tejun Heo <tj@kernel.org>,
	Rik van Riel <riel@surriel.com>,
	Konstantin Khlebnikov <koct9i@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Sasha Levin <sashal@kernel.org>,
	linux-mm@kvack.org
Subject: [PATCH AUTOSEL 4.18 38/39] mm: don't miss the last page because of round-off error
Date: Tue, 13 Nov 2018 00:50:52 -0500
Message-ID: <20181113055053.78352-38-sashal@kernel.org>
In-Reply-To: <20181113055053.78352-1-sashal@kernel.org>

From: Roman Gushchin <guro@fb.com>

[ Upstream commit 68600f623d69da428c6163275f97ca126e1a8ec5 ]

I've noticed that dying memory cgroups are often pinned in memory by a
single pagecache page.  Even under moderate memory pressure they sometimes
stay in that state for a long time.  That looked strange.

My investigation showed that the problem is caused by applying the LRU
pressure balancing math:

  scan = div64_u64(scan * fraction[lru], denominator),

where

  denominator = fraction[anon] + fraction[file] + 1.

Because fraction[lru] is always less than denominator, if the initial scan
size is 1, the result is always 0.

This means the last page is not scanned and has no chance of being
reclaimed.

Fix this by rounding up the result of the division.
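
To make the arithmetic concrete, here is a minimal userspace sketch, not
the kernel code itself.  It assumes a plain-C stand-in for div64_u64(),
and the helpers scan_share_truncated() and scan_share_rounded_up() are
hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's div64_u64(): truncating division. */
static uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Function form of the round-up division; the divisor is evaluated once. */
static uint64_t div64_u64_round_up(uint64_t dividend, uint64_t divisor)
{
	return div64_u64(dividend + divisor - 1, divisor);
}

/* Hypothetical helper: scan share for one LRU list, truncating division. */
static uint64_t scan_share_truncated(uint64_t scan, uint64_t fraction_lru,
				     uint64_t fraction_other)
{
	uint64_t denominator = fraction_lru + fraction_other + 1;

	return div64_u64(scan * fraction_lru, denominator);
}

/* Hypothetical helper: same share, with the division rounded up. */
static uint64_t scan_share_rounded_up(uint64_t scan, uint64_t fraction_lru,
				      uint64_t fraction_other)
{
	uint64_t denominator = fraction_lru + fraction_other + 1;

	return div64_u64_round_up(scan * fraction_lru, denominator);
}
```

With made-up values scan = 1, fraction_lru = 140 and fraction_other = 60,
the truncated form gives 140 / 201 = 0, while the rounded-up form gives
(140 + 200) / 201 = 1.  A fraction of 0 still yields 0 in both forms, so
rounding up does not force scanning of a list that has no weight.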

In practice this change significantly speeds up the reclaim of dying
cgroups.

[guro@fb.com: prevent double calculation of DIV64_U64_ROUND_UP() arguments]
  Link: http://lkml.kernel.org/r/20180829213311.GA13501@castle
Link: http://lkml.kernel.org/r/20180827162621.30187-3-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/math64.h | 3 +++
 mm/vmscan.c            | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/math64.h b/include/linux/math64.h
index 837f2f2d1d34..bb2c84afb80c 100644
--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -281,4 +281,7 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
 }
 #endif /* mul_u64_u32_div */
 
+#define DIV64_U64_ROUND_UP(ll, d)	\
+	({ u64 _tmp = (d); div64_u64((ll) + _tmp - 1, _tmp); })
+
 #endif /* _LINUX_MATH64_H */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 03822f86f288..7b94e33823b5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2287,9 +2287,11 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 			/*
 			 * Scan types proportional to swappiness and
 			 * their relative recent reclaim efficiency.
+			 * Make sure we don't miss the last page
+			 * because of a round-off error.
 			 */
-			scan = div64_u64(scan * fraction[file],
-					 denominator);
+			scan = DIV64_U64_ROUND_UP(scan * fraction[file],
+						  denominator);
 			break;
 		case SCAN_FILE:
 		case SCAN_ANON:
-- 
2.17.1
