* [PATCH v2] mm: vmscan: fix the page state calculation in too_many_isolated
From: Vinayak Menon @ 2015-01-14 11:36 UTC
  To: linux-mm, linux-kernel
  Cc: akpm, hannes, vdavydov, mhocko, mgorman, minchan, Vinayak Menon

It is observed that sometimes multiple tasks get blocked for a long
time in the congestion_wait loop below, in shrink_inactive_list. This
is because the per-cpu vm_stat deltas have not yet been folded into
the global counters.

(__schedule) from [<c0a03328>]
(schedule_timeout) from [<c0a04940>]
(io_schedule_timeout) from [<c01d585c>]
(congestion_wait) from [<c01cc9d8>]
(shrink_inactive_list) from [<c01cd034>]
(shrink_zone) from [<c01cdd08>]
(try_to_free_pages) from [<c01c442c>]
(__alloc_pages_nodemask) from [<c01f1884>]
(new_slab) from [<c09fcf60>]
(__slab_alloc) from [<c01f1a6c>]

In one such instance, zone_page_state(zone, NR_ISOLATED_FILE)
returned 14, zone_page_state(zone, NR_INACTIVE_FILE) returned 92,
and GFP_IOFS was set, which made too_many_isolated return true. But
one CPU's pageset vm_stat_diff had NR_ISOLATED_FILE at -14, so the
actual isolated count was zero. As there were no further updates to
NR_ISOLATED_FILE and the vmstat_update deferred work had not yet been
scheduled, 7 tasks spun in the congestion_wait loop for around
4 seconds in the direct reclaim path.
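
Plugging those numbers into the existing too_many_isolated() check
shows the miscalculation:

	inactive = zone_page_state(zone, NR_INACTIVE_FILE); /* 92 */
	isolated = zone_page_state(zone, NR_ISOLATED_FILE); /* 14 */
	inactive >>= 3;	/* GFP_IOFS was set: 92 >> 3 = 11 */
	return isolated > inactive; /* 14 > 11: true */

With the per-cpu diff of -14 folded in, isolated would have been 0,
and 0 > 11 is false, so none of those tasks needed to wait.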

This patch switches to zone_page_state_snapshot, but restricts its
use to the retry path, i.e. only after the cheap zone_page_state
based check has reported too many isolated pages once, so that the
common case does not pay for reading every CPU's pending deltas.
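
For reference, a rough sketch of the difference between the two
readers (paraphrased from include/linux/vmstat.h of this kernel
generation; simplified, details may differ across versions):

static inline unsigned long zone_page_state(struct zone *zone,
					enum zone_stat_item item)
{
	/* Reads only the global counter; pending per-cpu
	 * vm_stat_diff deltas are not included. */
	long x = atomic_long_read(&zone->vm_stat[item]);
#ifdef CONFIG_SMP
	if (x < 0)
		x = 0;
#endif
	return x;
}

static inline unsigned long zone_page_state_snapshot(struct zone *zone,
					enum zone_stat_item item)
{
	long x = atomic_long_read(&zone->vm_stat[item]);
#ifdef CONFIG_SMP
	int cpu;
	/* Fold in every online CPU's pending delta: accurate,
	 * but O(nr_online_cpus) on each call. */
	for_each_online_cpu(cpu)
		x += per_cpu_ptr(zone->pageset, cpu)->vm_stat_diff[item];
	if (x < 0)
		x = 0;
#endif
	return x;
}

The snapshot variant is the accurate one, and its loop over all
online CPUs is why the patch confines it to the path where the cheap
check has already failed.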

Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
---
 mm/vmscan.c | 56 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 37 insertions(+), 19 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5e8772b..266551f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1392,6 +1392,32 @@ int isolate_lru_page(struct page *page)
 	return ret;
 }
 
+static int __too_many_isolated(struct zone *zone, int file,
+	struct scan_control *sc, int safe)
+{
+	unsigned long inactive, isolated;
+
+	if (safe) {
+		inactive = zone_page_state_snapshot(zone,
+				NR_INACTIVE_ANON + 2 * file);
+		isolated = zone_page_state_snapshot(zone,
+				NR_ISOLATED_ANON + file);
+	} else {
+		inactive = zone_page_state(zone, NR_INACTIVE_ANON + 2 * file);
+		isolated = zone_page_state(zone, NR_ISOLATED_ANON + file);
+	}
+
+	/*
+	 * GFP_NOIO/GFP_NOFS callers are allowed to isolate more pages, so they
+	 * won't get blocked by normal direct-reclaimers, forming a circular
+	 * deadlock.
+	 */
+	if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS)
+		inactive >>= 3;
+
+	return isolated > inactive;
+}
+
 /*
  * A direct reclaimer may isolate SWAP_CLUSTER_MAX pages from the LRU list and
  * then get resheduled. When there are massive number of tasks doing page
@@ -1400,33 +1426,22 @@ int isolate_lru_page(struct page *page)
  * unnecessary swapping, thrashing and OOM.
  */
 static int too_many_isolated(struct zone *zone, int file,
-		struct scan_control *sc)
+		struct scan_control *sc, int safe)
 {
-	unsigned long inactive, isolated;
-
 	if (current_is_kswapd())
 		return 0;
 
 	if (!global_reclaim(sc))
 		return 0;
 
-	if (file) {
-		inactive = zone_page_state(zone, NR_INACTIVE_FILE);
-		isolated = zone_page_state(zone, NR_ISOLATED_FILE);
-	} else {
-		inactive = zone_page_state(zone, NR_INACTIVE_ANON);
-		isolated = zone_page_state(zone, NR_ISOLATED_ANON);
+	if (unlikely(__too_many_isolated(zone, file, sc, 0))) {
+		if (safe)
+			return __too_many_isolated(zone, file, sc, safe);
+		else
+			return 1;
 	}
 
-	/*
-	 * GFP_NOIO/GFP_NOFS callers are allowed to isolate more pages, so they
-	 * won't get blocked by normal direct-reclaimers, forming a circular
-	 * deadlock.
-	 */
-	if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS)
-		inactive >>= 3;
-
-	return isolated > inactive;
+	return 0;
 }
 
 static noinline_for_stack void
@@ -1516,15 +1531,18 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	unsigned long nr_immediate = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
+	int safe = 0;
 	struct zone *zone = lruvec_zone(lruvec);
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 
-	while (unlikely(too_many_isolated(zone, file, sc))) {
+	while (unlikely(too_many_isolated(zone, file, sc, safe))) {
 		congestion_wait(BLK_RW_ASYNC, HZ/10);
 
 		/* We are about to die and free our memory. Return now. */
 		if (fatal_signal_pending(current))
 			return SWAP_CLUSTER_MAX;
+
+		safe = 1;
 	}
 
 	lru_add_drain();
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation

