linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Minchan Kim <minchan@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch v2] mm, vmscan: avoid thrashing anon lru when free + file is low
Date: Wed, 3 May 2017 10:49:52 +0200	[thread overview]
Message-ID: <20170503084952.GD8836@dhcp22.suse.cz> (raw)
In-Reply-To: <20170503070656.GA8836@dhcp22.suse.cz>

On Wed 03-05-17 09:06:56, Michal Hocko wrote:
[...]
> This is still untested but should be much closer to what I've had in
> mind.
> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 24efcc20af91..bcdad30f942d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2174,8 +2174,14 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
>  		}
>  
>  		if (unlikely(pgdatfile + pgdatfree <= total_high_wmark)) {
> -			scan_balance = SCAN_ANON;
> -			goto out;
> +			unsigned long pgdatanon;
> +
> +			pgdatanon = node_page_state(pgdat, NR_ACTIVE_ANON) +
> +				node_page_state(pgdat, NR_INACTIVE_ANON);
> +			if (pgdatanon + pgdatfree > total_high_wmark) {
> +				scan_balance = SCAN_ANON;
> +				goto out;
> +			}
>  		}
>  	}

I've realized that this just makes the situation more obscure than
necessary after thinking some more about it. It also doesn't achieve my
original intention to treat biased anon and file LRUs the same. Now that
I've digested the change more thoroughly I am willing to ack your patch.
So feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>
And sorry about the diversion here but I am always nervous when touching
g_s_c because this tends to lead to subtle issues.

Maybe we could make this aspect of the biased LRUs more explicit by
doing the following rather than duplicating the condition. What do you
think?
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 24efcc20af91..f3ec8760dc06 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2113,16 +2113,14 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	u64 denominator = 0;	/* gcc */
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	unsigned long anon_prio, file_prio;
-	enum scan_balance scan_balance;
+	enum scan_balance scan_balance = SCAN_FILE;
 	unsigned long anon, file;
 	unsigned long ap, fp;
 	enum lru_list lru;
 
 	/* If we have no swap space, do not bother scanning anon pages. */
-	if (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0) {
-		scan_balance = SCAN_FILE;
+	if (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0)
 		goto out;
-	}
 
 	/*
 	 * Global reclaim will swap to prevent OOM even with no
@@ -2131,10 +2129,8 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	 * using the memory controller's swap limit feature would be
 	 * too expensive.
 	 */
-	if (!global_reclaim(sc) && !swappiness) {
-		scan_balance = SCAN_FILE;
+	if (!global_reclaim(sc) && !swappiness)
 		goto out;
-	}
 
 	/*
 	 * Do not apply any pressure balancing cleverness when the
@@ -2147,8 +2143,9 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	}
 
 	/*
-	 * Prevent the reclaimer from falling into the cache trap: as
-	 * cache pages start out inactive, every cache fault will tip
+	 * We usually want to bias page cache reclaim over anonymous
+	 * memory. Prevent the reclaimer from falling into the cache trap:
+	 * as cache pages start out inactive, every cache fault will tip
 	 * the scan balance towards the file LRU.  And as the file LRU
 	 * shrinks, so does the window for rotation from references.
 	 * This means we have a runaway feedback loop where a tiny
@@ -2173,26 +2170,24 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 			total_high_wmark += high_wmark_pages(zone);
 		}
 
-		if (unlikely(pgdatfile + pgdatfree <= total_high_wmark)) {
+		if (unlikely(pgdatfile + pgdatfree <= total_high_wmark))
 			scan_balance = SCAN_ANON;
-			goto out;
-		}
 	}
 
 	/*
-	 * If there is enough inactive page cache, i.e. if the size of the
-	 * inactive list is greater than that of the active list *and* the
-	 * inactive list actually has some pages to scan on this priority, we
-	 * do not reclaim anything from the anonymous working set right now.
-	 * Without the second condition we could end up never scanning an
-	 * lruvec even if it has plenty of old anonymous pages unless the
-	 * system is under heavy pressure.
+	 * Make sure there are enough pages on the biased LRU before we go
+	 * and do an exclusive reclaim from that list, i.e. if the
+	 * size of the inactive list is greater than that of the active list
+	 * *and* the inactive list actually has some pages to scan on this
+	 * priority.
+	 * Without the second condition we could end up never scanning other
+	 * lruvecs even if they have plenty of old pages unless the system is
+	 * under heavy pressure.
 	 */
-	if (!inactive_list_is_low(lruvec, true, memcg, sc, false) &&
-	    lruvec_lru_size(lruvec, LRU_INACTIVE_FILE, sc->reclaim_idx) >> sc->priority) {
-		scan_balance = SCAN_FILE;
+	lru = LRU_INACTIVE_ANON + LRU_FILE * (scan_balance == SCAN_FILE);
+	if (!inactive_list_is_low(lruvec, is_file_lru(lru), memcg, sc, false) &&
+	    lruvec_lru_size(lruvec, lru, sc->reclaim_idx) >> sc->priority)
 		goto out;
-	}
 
 	scan_balance = SCAN_FRACT;
 
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2017-05-03  8:50 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-18  0:06 [patch] mm, vmscan: avoid thrashing anon lru when free + file is low David Rientjes
2017-04-18  1:36 ` Minchan Kim
2017-04-18 21:32   ` David Rientjes
2017-04-19  0:14     ` Minchan Kim
2017-04-19 23:24       ` David Rientjes
2017-04-20  6:09         ` Minchan Kim
2017-05-01 21:34           ` [patch v2] " David Rientjes
2017-05-02  8:02             ` Michal Hocko
2017-05-02 20:41               ` David Rientjes
2017-05-03  6:15                 ` Michal Hocko
2017-05-03  7:06                   ` Michal Hocko
2017-05-03  8:49                     ` Michal Hocko [this message]
2017-05-03 22:52                       ` David Rientjes
2017-05-04 11:43                         ` Michal Hocko
2017-05-31 15:20             ` Michal Hocko
2017-06-02 20:36             ` Andrew Morton
2017-06-04 22:27               ` David Rientjes
2017-04-19  7:04     ` [patch] " Michal Hocko
2017-04-18  7:11 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170503084952.GD8836@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=minchan@kernel.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).