linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yang Shi <yang.shi@linux.alibaba.com>
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com,
	hannes@cmpxchg.org, akpm@linux-foundation.org,
	dave.hansen@intel.com, keith.busch@intel.com,
	dan.j.williams@intel.com, fengguang.wu@intel.com,
	fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [v2 PATCH 7/9] mm: vmscan: check if the demote target node is contended or not
Date: Thu, 11 Apr 2019 11:56:57 +0800	[thread overview]
Message-ID: <1554955019-29472-8-git-send-email-yang.shi@linux.alibaba.com> (raw)
In-Reply-To: <1554955019-29472-1-git-send-email-yang.shi@linux.alibaba.com>

When demoting to PMEM node, the target node may have memory pressure,
then the memory pressure may cause migrate_pages() fail.

If the failure is caused by memory pressure (i.e. returning -ENOMEM),
tag the node with PGDAT_CONTENDED.  The tag would be cleared once the
target node is balanced again.

Check if the target node is PGDAT_CONTENDED or not, if it is just skip
demotion.

Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 include/linux/mmzone.h |  3 +++
 mm/vmscan.c            | 28 ++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fba7741..de534db 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -520,6 +520,9 @@ enum pgdat_flags {
 					 * many pages under writeback
 					 */
 	PGDAT_RECLAIM_LOCKED,		/* prevents concurrent reclaim */
+	PGDAT_CONTENDED,		/* the node has not enough free memory
+					 * available
+					 */
 };
 
 enum zone_flags {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 80cd624..50cde53 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1048,6 +1048,9 @@ static void page_check_dirty_writeback(struct page *page,
 
 static inline bool is_demote_ok(int nid, struct scan_control *sc)
 {
+	int node;
+	nodemask_t used_mask;
+
 	/* It is pointless to do demotion in memcg reclaim */
 	if (!global_reclaim(sc))
 		return false;
@@ -1060,6 +1063,13 @@ static inline bool is_demote_ok(int nid, struct scan_control *sc)
 	if (!has_cpuless_node_online())
 		return false;
 
+	/* Check if the demote target node is contended or not */
+	nodes_clear(used_mask);
+	node = find_next_best_node(nid, &used_mask, true);
+
+	if (test_bit(PGDAT_CONTENDED, &NODE_DATA(node)->flags))
+		return false;
+
 	return true;
 }
 
@@ -1502,6 +1512,10 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		nr_reclaimed += nr_succeeded;
 
 		if (err) {
+			if (err == -ENOMEM)
+				set_bit(PGDAT_CONTENDED,
+					&NODE_DATA(target_nid)->flags);
+
 			putback_movable_pages(&demote_pages);
 
 			list_splice(&ret_pages, &demote_pages);
@@ -2596,6 +2610,19 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 		 * scan target and the percentage scanning already complete
 		 */
 		lru = (lru == LRU_FILE) ? LRU_BASE : LRU_FILE;
+
+		/*
+		 * The shrink_page_list() may find the demote target node is
+		 * contended, if so it doesn't make sense to scan anonymous
+		 * LRU again.
+		 *
+		 * Need check if swap is available or not too since demotion
+		 * may happen on swapless system.
+		 */
+		if (!is_demote_ok(pgdat->node_id, sc) &&
+		    (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0))
+			lru = LRU_FILE;
+
 		nr_scanned = targets[lru] - nr[lru];
 		nr[lru] = targets[lru] * (100 - percentage) / 100;
 		nr[lru] -= min(nr[lru], nr_scanned);
@@ -3458,6 +3485,7 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
 	clear_bit(PGDAT_CONGESTED, &pgdat->flags);
 	clear_bit(PGDAT_DIRTY, &pgdat->flags);
 	clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
+	clear_bit(PGDAT_CONTENDED, &pgdat->flags);
 }
 
 /*
-- 
1.8.3.1


  parent reply	other threads:[~2019-04-11  3:58 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-11  3:56 [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node Yang Shi
2019-04-11  3:56 ` [v2 PATCH 1/9] mm: define N_CPU_MEM node states Yang Shi
2019-04-11  3:56 ` [v2 PATCH 2/9] mm: page_alloc: make find_next_best_node find return cpuless node Yang Shi
2019-04-11  3:56 ` [v2 PATCH 3/9] mm: numa: promote pages to DRAM when it gets accessed twice Yang Shi
2019-04-11  3:56 ` [v2 PATCH 4/9] mm: migrate: make migrate_pages() return nr_succeeded Yang Shi
2019-04-11  3:56 ` [v2 PATCH 5/9] mm: vmscan: demote anon DRAM pages to PMEM node Yang Shi
2019-04-11 14:31   ` Dave Hansen
2019-04-15 22:10     ` Yang Shi
2019-04-15 22:14       ` Dave Hansen
2019-04-15 22:26         ` Yang Shi
2019-04-11  3:56 ` [v2 PATCH 6/9] mm: vmscan: don't demote for memcg reclaim Yang Shi
2019-04-11  3:56 ` Yang Shi [this message]
2019-04-11 16:06   ` [v2 PATCH 7/9] mm: vmscan: check if the demote target node is contended or not Dave Hansen
2019-04-15 22:06     ` Yang Shi
2019-04-15 22:13       ` Dave Hansen
2019-04-15 22:23         ` Yang Shi
2019-04-11  3:56 ` [v2 PATCH 8/9] mm: vmscan: add page demotion counter Yang Shi
2019-04-11  3:56 ` [v2 PATCH 9/9] mm: numa: add page promotion counter Yang Shi
2019-04-11 14:28 ` [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node Dave Hansen
2019-04-12  8:47 ` Michal Hocko
2019-04-16  0:09   ` Yang Shi
2019-04-16  7:47     ` Michal Hocko
2019-04-16 14:30       ` Dave Hansen
2019-04-16 14:39         ` Michal Hocko
2019-04-16 15:46           ` Dave Hansen
2019-04-16 18:34             ` Michal Hocko
2019-04-16 15:33         ` Zi Yan
2019-04-16 15:55           ` Dave Hansen
2019-04-16 16:12             ` Zi Yan
2019-04-16 19:19       ` Yang Shi
2019-04-16 21:22         ` Dave Hansen
2019-04-16 21:59           ` Yang Shi
2019-04-16 23:04             ` Dave Hansen
2019-04-16 23:17               ` Yang Shi
2019-04-17 15:13                 ` Keith Busch
2019-04-17  9:23           ` Michal Hocko
2019-04-17 15:23             ` Keith Busch
2019-04-17 15:39               ` Michal Hocko
2019-04-17 15:37                 ` Keith Busch
2019-04-17 16:39                   ` Michal Hocko
2019-04-17 17:26                     ` Yang Shi
2019-04-17 17:29                       ` Keith Busch
2019-04-17 17:51                       ` Michal Hocko
2019-04-18 16:24                         ` Yang Shi
2019-04-17 17:13             ` Dave Hansen
2019-04-17 17:57               ` Michal Hocko
2019-04-18 18:16               ` Keith Busch
2019-04-18 19:23                 ` Yang Shi
2019-04-18 21:07                   ` Zi Yan
2019-04-16 23:18         ` Yang Shi
2019-04-17  9:17         ` Michal Hocko
2019-05-01  6:43           ` Fengguang Wu
2019-04-17 20:43         ` Yang Shi
2019-04-18  9:02           ` Michal Hocko
2019-05-01  5:20             ` Fengguang Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1554955019-29472-8-git-send-email-yang.shi@linux.alibaba.com \
    --to=yang.shi@linux.alibaba.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=fan.du@intel.com \
    --cc=fengguang.wu@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=keith.busch@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=riel@surriel.com \
    --cc=ying.huang@intel.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).