From: Vinayak Menon <vinmenon@codeaurora.org> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, vdavydov@parallels.com, mhocko@suse.cz, mgorman@suse.de, minchan@kernel.org, Vinayak Menon <vinmenon@codeaurora.org> Subject: [PATCH v2] mm: vmscan: fix the page state calculation in too_many_isolated Date: Wed, 14 Jan 2015 17:06:59 +0530 [thread overview] Message-ID: <1421235419-30736-1-git-send-email-vinmenon@codeaurora.org> (raw) It is observed that sometimes multiple tasks get blocked for long in the congestion_wait loop below, in shrink_inactive_list. This is because of vm_stat values not being synced. (__schedule) from [<c0a03328>] (schedule_timeout) from [<c0a04940>] (io_schedule_timeout) from [<c01d585c>] (congestion_wait) from [<c01cc9d8>] (shrink_inactive_list) from [<c01cd034>] (shrink_zone) from [<c01cdd08>] (try_to_free_pages) from [<c01c442c>] (__alloc_pages_nodemask) from [<c01f1884>] (new_slab) from [<c09fcf60>] (__slab_alloc) from [<c01f1a6c>] In one such instance, zone_page_state(zone, NR_ISOLATED_FILE) had returned 14, zone_page_state(zone, NR_INACTIVE_FILE) returned 92, and GFP_IOFS was set, and this resulted in too_many_isolated returning true. But one of the CPU's pageset vm_stat_diff had NR_ISOLATED_FILE as "-14". So the actual isolated count was zero. As there weren't any more updates to NR_ISOLATED_FILE and vmstat_update deffered work had not been scheduled yet, 7 tasks were spinning in the congestion wait loop for around 4 seconds, in the direct reclaim path. This patch uses zone_page_state_snapshot instead, but restricts its usage to avoid performance penalty. Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> --- mm/vmscan.c | 56 +++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 37 insertions(+), 19 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 5e8772b..266551f 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1392,6 +1392,32 @@ int isolate_lru_page(struct page *page) return ret; } +static int __too_many_isolated(struct zone *zone, int file, + struct scan_control *sc, int safe) +{ + unsigned long inactive, isolated; + + if (safe) { + inactive = zone_page_state_snapshot(zone, + NR_INACTIVE_ANON + 2 * file); + isolated = zone_page_state_snapshot(zone, + NR_ISOLATED_ANON + file); + } else { + inactive = zone_page_state(zone, NR_INACTIVE_ANON + 2 * file); + isolated = zone_page_state(zone, NR_ISOLATED_ANON + file); + } + + /* + * GFP_NOIO/GFP_NOFS callers are allowed to isolate more pages, so they + * won't get blocked by normal direct-reclaimers, forming a circular + * deadlock. + */ + if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS) + inactive >>= 3; + + return isolated > inactive; +} + /* * A direct reclaimer may isolate SWAP_CLUSTER_MAX pages from the LRU list and * then get resheduled. When there are massive number of tasks doing page @@ -1400,33 +1426,22 @@ int isolate_lru_page(struct page *page) * unnecessary swapping, thrashing and OOM. */ static int too_many_isolated(struct zone *zone, int file, - struct scan_control *sc) + struct scan_control *sc, int safe) { - unsigned long inactive, isolated; - if (current_is_kswapd()) return 0; if (!global_reclaim(sc)) return 0; - if (file) { - inactive = zone_page_state(zone, NR_INACTIVE_FILE); - isolated = zone_page_state(zone, NR_ISOLATED_FILE); - } else { - inactive = zone_page_state(zone, NR_INACTIVE_ANON); - isolated = zone_page_state(zone, NR_ISOLATED_ANON); + if (unlikely(__too_many_isolated(zone, file, sc, 0))) { + if (safe) + return __too_many_isolated(zone, file, sc, safe); + else + return 1; } - /* - * GFP_NOIO/GFP_NOFS callers are allowed to isolate more pages, so they - * won't get blocked by normal direct-reclaimers, forming a circular - * deadlock. - */ - if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS) - inactive >>= 3; - - return isolated > inactive; + return 0; } static noinline_for_stack void @@ -1516,15 +1531,18 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec, unsigned long nr_immediate = 0; isolate_mode_t isolate_mode = 0; int file = is_file_lru(lru); + int safe = 0; struct zone *zone = lruvec_zone(lruvec); struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat; - while (unlikely(too_many_isolated(zone, file, sc))) { + while (unlikely(too_many_isolated(zone, file, sc, safe))) { congestion_wait(BLK_RW_ASYNC, HZ/10); /* We are about to die and free our memory. Return now. */ if (fatal_signal_pending(current)) return SWAP_CLUSTER_MAX; + + safe = 1; } lru_add_drain(); -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation
WARNING: multiple messages have this Message-ID (diff)
From: Vinayak Menon <vinmenon@codeaurora.org> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, vdavydov@parallels.com, mhocko@suse.cz, mgorman@suse.de, minchan@kernel.org, Vinayak Menon <vinmenon@codeaurora.org> Subject: [PATCH v2] mm: vmscan: fix the page state calculation in too_many_isolated Date: Wed, 14 Jan 2015 17:06:59 +0530 [thread overview] Message-ID: <1421235419-30736-1-git-send-email-vinmenon@codeaurora.org> (raw) It is observed that sometimes multiple tasks get blocked for long in the congestion_wait loop below, in shrink_inactive_list. This is because of vm_stat values not being synced. (__schedule) from [<c0a03328>] (schedule_timeout) from [<c0a04940>] (io_schedule_timeout) from [<c01d585c>] (congestion_wait) from [<c01cc9d8>] (shrink_inactive_list) from [<c01cd034>] (shrink_zone) from [<c01cdd08>] (try_to_free_pages) from [<c01c442c>] (__alloc_pages_nodemask) from [<c01f1884>] (new_slab) from [<c09fcf60>] (__slab_alloc) from [<c01f1a6c>] In one such instance, zone_page_state(zone, NR_ISOLATED_FILE) had returned 14, zone_page_state(zone, NR_INACTIVE_FILE) returned 92, and GFP_IOFS was set, and this resulted in too_many_isolated returning true. But one of the CPU's pageset vm_stat_diff had NR_ISOLATED_FILE as "-14". So the actual isolated count was zero. As there weren't any more updates to NR_ISOLATED_FILE and vmstat_update deffered work had not been scheduled yet, 7 tasks were spinning in the congestion wait loop for around 4 seconds, in the direct reclaim path. This patch uses zone_page_state_snapshot instead, but restricts its usage to avoid performance penalty. Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> --- mm/vmscan.c | 56 +++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 37 insertions(+), 19 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 5e8772b..266551f 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1392,6 +1392,32 @@ int isolate_lru_page(struct page *page) return ret; } +static int __too_many_isolated(struct zone *zone, int file, + struct scan_control *sc, int safe) +{ + unsigned long inactive, isolated; + + if (safe) { + inactive = zone_page_state_snapshot(zone, + NR_INACTIVE_ANON + 2 * file); + isolated = zone_page_state_snapshot(zone, + NR_ISOLATED_ANON + file); + } else { + inactive = zone_page_state(zone, NR_INACTIVE_ANON + 2 * file); + isolated = zone_page_state(zone, NR_ISOLATED_ANON + file); + } + + /* + * GFP_NOIO/GFP_NOFS callers are allowed to isolate more pages, so they + * won't get blocked by normal direct-reclaimers, forming a circular + * deadlock. + */ + if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS) + inactive >>= 3; + + return isolated > inactive; +} + /* * A direct reclaimer may isolate SWAP_CLUSTER_MAX pages from the LRU list and * then get resheduled. When there are massive number of tasks doing page @@ -1400,33 +1426,22 @@ int isolate_lru_page(struct page *page) * unnecessary swapping, thrashing and OOM. */ static int too_many_isolated(struct zone *zone, int file, - struct scan_control *sc) + struct scan_control *sc, int safe) { - unsigned long inactive, isolated; - if (current_is_kswapd()) return 0; if (!global_reclaim(sc)) return 0; - if (file) { - inactive = zone_page_state(zone, NR_INACTIVE_FILE); - isolated = zone_page_state(zone, NR_ISOLATED_FILE); - } else { - inactive = zone_page_state(zone, NR_INACTIVE_ANON); - isolated = zone_page_state(zone, NR_ISOLATED_ANON); + if (unlikely(__too_many_isolated(zone, file, sc, 0))) { + if (safe) + return __too_many_isolated(zone, file, sc, safe); + else + return 1; } - /* - * GFP_NOIO/GFP_NOFS callers are allowed to isolate more pages, so they - * won't get blocked by normal direct-reclaimers, forming a circular - * deadlock. - */ - if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS) - inactive >>= 3; - - return isolated > inactive; + return 0; } static noinline_for_stack void @@ -1516,15 +1531,18 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec, unsigned long nr_immediate = 0; isolate_mode_t isolate_mode = 0; int file = is_file_lru(lru); + int safe = 0; struct zone *zone = lruvec_zone(lruvec); struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat; - while (unlikely(too_many_isolated(zone, file, sc))) { + while (unlikely(too_many_isolated(zone, file, sc, safe))) { congestion_wait(BLK_RW_ASYNC, HZ/10); /* We are about to die and free our memory. Return now. */ if (fatal_signal_pending(current)) return SWAP_CLUSTER_MAX; + + safe = 1; } lru_add_drain(); -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next reply other threads:[~2015-01-14 11:37 UTC|newest] Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top 2015-01-14 11:36 Vinayak Menon [this message] 2015-01-14 11:36 ` [PATCH v2] mm: vmscan: fix the page state calculation in too_many_isolated Vinayak Menon 2015-01-14 16:50 ` Michal Hocko 2015-01-14 16:50 ` Michal Hocko 2015-01-15 17:24 ` Vinayak Menon 2015-01-15 17:24 ` Vinayak Menon 2015-01-16 15:49 ` Michal Hocko 2015-01-16 15:49 ` Michal Hocko 2015-01-16 17:57 ` Michal Hocko 2015-01-16 17:57 ` Michal Hocko 2015-01-16 19:17 ` Christoph Lameter 2015-01-16 19:17 ` Christoph Lameter 2015-01-17 15:18 ` Vinayak Menon 2015-01-17 15:18 ` Vinayak Menon 2015-01-17 19:48 ` Christoph Lameter 2015-01-17 19:48 ` Christoph Lameter 2015-01-19 4:27 ` Vinayak Menon 2015-01-19 4:27 ` Vinayak Menon 2015-01-21 14:39 ` Michal Hocko 2015-01-21 14:39 ` Michal Hocko 2015-01-22 15:16 ` Vlastimil Babka 2015-01-22 15:16 ` Vlastimil Babka 2015-01-22 16:11 ` Christoph Lameter 2015-01-22 16:11 ` Christoph Lameter 2015-01-26 17:46 ` Michal Hocko 2015-01-26 17:46 ` Michal Hocko 2015-01-26 18:35 ` Christoph Lameter 2015-01-26 18:35 ` Christoph Lameter 2015-01-27 10:52 ` Michal Hocko 2015-01-27 10:52 ` Michal Hocko 2015-01-27 16:59 ` Christoph Lameter 2015-01-27 16:59 ` Christoph Lameter 2015-01-30 15:28 ` Michal Hocko 2015-01-30 15:28 ` Michal Hocko 2015-01-26 17:28 ` Michal Hocko 2015-01-26 17:28 ` Michal Hocko 2015-01-26 18:35 ` Christoph Lameter 2015-01-26 18:35 ` Christoph Lameter 2015-01-26 22:11 ` Andrew Morton 2015-01-26 22:11 ` Andrew Morton 2015-01-27 10:41 ` Michal Hocko 2015-01-27 10:41 ` Michal Hocko 2015-01-27 10:33 ` Vinayak Menon 2015-01-27 10:33 ` Vinayak Menon 2015-01-27 10:45 ` Michal Hocko 2015-01-27 10:45 ` Michal Hocko 2015-01-29 17:32 ` Christoph Lameter 2015-01-29 17:32 ` Christoph Lameter 2015-01-30 15:27 ` Michal Hocko 2015-01-30 15:27 ` Michal Hocko 2015-01-16 1:17 ` Andrew Morton 2015-01-16 1:17 ` Andrew Morton 2015-01-16 5:10 ` Vinayak Menon 2015-01-16 5:10 ` Vinayak Menon 2015-01-17 16:29 ` Vinayak Menon 2015-01-17 16:29 ` Vinayak Menon 2015-02-11 22:14 ` Andrew Morton 2015-02-11 22:14 ` Andrew Morton 2015-02-12 16:19 ` Vlastimil Babka 2015-02-12 16:19 ` Vlastimil Babka
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1421235419-30736-1-git-send-email-vinmenon@codeaurora.org \ --to=vinmenon@codeaurora.org \ --cc=akpm@linux-foundation.org \ --cc=hannes@cmpxchg.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mgorman@suse.de \ --cc=mhocko@suse.cz \ --cc=minchan@kernel.org \ --cc=vdavydov@parallels.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.