From: Johannes Weiner <hannes@cmpxchg.org>
To: linux-mm@kvack.org
Cc: Kaiyang Zhao <kaiyang2@cs.cmu.edu>,
	Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>,
	David Rientjes <rientjes@google.com>,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [RFC PATCH 02/26] mm: compaction: avoid GFP_NOFS deadlocks
Date: Tue, 18 Apr 2023 15:12:49 -0400
Message-ID: <20230418191313.268131-3-hannes@cmpxchg.org>
In-Reply-To: <20230418191313.268131-1-hannes@cmpxchg.org>

During stress testing, two deadlock scenarios were observed:

1. One GFP_NOFS allocation was sleeping on too_many_isolated(), and
   all CPUs were busy with compactors that appeared to be spinning on
   buffer locks.

   Give GFP_NOFS compactors additional isolation headroom, the same
   way we do during reclaim, to eliminate this deadlock scenario.

2. In a more pernicious scenario, the GFP_NOFS allocation was
   busy-spinning in compaction, but seemingly never making
   progress. Upon closer inspection, memory was dominated by file
   pages, which the GFP_NOFS compactor isn't allowed to touch. The remaining
   anon pages didn't have the contiguity to satisfy the request.

   Allow GFP_NOFS allocations to bypass watermarks when compaction
   failed at the highest priority.

While these deadlocks were encountered only in tests with the
subsequent patches (which put a lot more demand on compaction), in
theory these problems already exist in the code today. Fix them now.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
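Illustration only, not part of the patch: a minimal userspace sketch of
the threshold arithmetic behind fix 1, assuming the LRU and isolated
counts are passed in as plain numbers. MODEL_GFP_FS and
model_too_many_isolated() are hypothetical names that stand in for
__GFP_FS and the patched too_many_isolated(); the bit value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __GFP_FS bit (value is arbitrary). */
#define MODEL_GFP_FS 0x1u

/*
 * Model of the patched check. Callers that may enter the filesystem
 * have their LRU counts scaled down by 8, so they reach the limit
 * sooner. GFP_NOFS callers keep the full counts, which is the extra
 * isolation headroom described in the changelog.
 */
static bool model_too_many_isolated(unsigned int gfp_mask,
				    unsigned long active,
				    unsigned long inactive,
				    unsigned long isolated)
{
	if (gfp_mask & MODEL_GFP_FS) {
		inactive >>= 3;
		active >>= 3;
	}

	return isolated > (inactive + active) / 2;
}

/* Example: 8000 LRU pages in total, 1000 of them already isolated. */
static void model_isolation_example(void)
{
	/* FS-capable caller: limit is (500 + 500) / 2 = 500, so it throttles. */
	assert(model_too_many_isolated(MODEL_GFP_FS, 4000, 4000, 1000));

	/* GFP_NOFS caller: limit is (4000 + 4000) / 2 = 4000, so it proceeds. */
	assert(!model_too_many_isolated(0, 4000, 4000, 1000));
}
```

With the same node state, the regular compactor sleeps while the
GFP_NOFS compactor may keep isolating pages.
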
 mm/compaction.c | 15 +++++++++++++--
 mm/page_alloc.c | 10 +++++++++-
 2 files changed, 22 insertions(+), 3 deletions(-)
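
Illustration only, not part of the patch: a hedged userspace sketch of
the decision behind fix 2. All MODEL_* names and their values are
assumptions made for this example. The only property taken from the
kernel is that MIN_COMPACT_PRIORITY is the numerically lowest and
therefore most aggressive compaction priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Model constants: names mirror the kernel's, values are illustrative. */
enum model_compact_priority {
	MODEL_PRIO_SYNC_FULL,	/* most aggressive, lowest value */
	MODEL_MIN_COMPACT_PRIORITY = MODEL_PRIO_SYNC_FULL,
	MODEL_PRIO_SYNC_LIGHT,
	MODEL_PRIO_ASYNC,
};

#define MODEL_GFP_FS			0x1u
#define MODEL_ALLOC_NO_WATERMARKS	0x4u

/*
 * Model of the flag adjustment made after direct compaction failed to
 * capture a page: only a GFP_NOFS request that already ran at the
 * highest priority may dip into the reserves.
 */
static unsigned int model_fallback_alloc_flags(unsigned int gfp_mask,
					       enum model_compact_priority prio,
					       unsigned int alloc_flags)
{
	if (prio == MODEL_MIN_COMPACT_PRIORITY && !(gfp_mask & MODEL_GFP_FS))
		alloc_flags |= MODEL_ALLOC_NO_WATERMARKS;

	return alloc_flags;
}

static void model_fallback_example(void)
{
	/* GFP_NOFS at the highest priority: watermarks are bypassed. */
	assert(model_fallback_alloc_flags(0, MODEL_MIN_COMPACT_PRIORITY, 0) &
	       MODEL_ALLOC_NO_WATERMARKS);

	/* FS-capable request at the same priority: flags are unchanged. */
	assert(model_fallback_alloc_flags(MODEL_GFP_FS,
					  MODEL_MIN_COMPACT_PRIORITY, 0) == 0);

	/* GFP_NOFS at a lower priority: flags are unchanged. */
	assert(model_fallback_alloc_flags(0, MODEL_PRIO_ASYNC, 0) == 0);
}
```

Both conditions must hold, so requests that can still make progress by
migrating file pages or by retrying at a higher priority never touch
the reserves.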

diff --git a/mm/compaction.c b/mm/compaction.c
index 8238e83385a7..84db84e8fd3a 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -745,8 +745,9 @@ isolate_freepages_range(struct compact_control *cc,
 }
 
 /* Similar to reclaim, but different enough that they don't share logic */
-static bool too_many_isolated(pg_data_t *pgdat)
+static bool too_many_isolated(struct compact_control *cc)
 {
+	pg_data_t *pgdat = cc->zone->zone_pgdat;
 	bool too_many;
 
 	unsigned long active, inactive, isolated;
@@ -758,6 +759,16 @@ static bool too_many_isolated(pg_data_t *pgdat)
 	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
 			node_page_state(pgdat, NR_ISOLATED_ANON);
 
+	/*
+	 * GFP_NOFS callers are allowed to isolate more pages, so they
+	 * won't get blocked by normal direct-reclaimers, forming a
+	 * circular deadlock. GFP_NOIO won't get here.
+	 */
+	if (cc->gfp_mask & __GFP_FS) {
+		inactive >>= 3;
+		active >>= 3;
+	}
+
 	too_many = isolated > (inactive + active) / 2;
 	if (!too_many)
 		wake_throttle_isolated(pgdat);
@@ -806,7 +817,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	 * list by either parallel reclaimers or compaction. If there are,
 	 * delay for some time until fewer pages are isolated
 	 */
-	while (unlikely(too_many_isolated(pgdat))) {
+	while (unlikely(too_many_isolated(cc))) {
 		/* stop isolation if there are still pages not migrated */
 		if (cc->nr_migratepages)
 			return -EAGAIN;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3bb3484563ed..ac03571e0532 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4508,8 +4508,16 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 		prep_new_page(page, order, gfp_mask, alloc_flags);
 
 	/* Try get a page from the freelist if available */
-	if (!page)
+	if (!page) {
+		/*
+		 * It's possible that the only migration sources are
+		 * file pages, and the GFP_NOFS stack is holding up
+		 * other compactors. Use reserves to avoid deadlock.
+		 */
+		if (prio == MIN_COMPACT_PRIORITY && !(gfp_mask & __GFP_FS))
+			alloc_flags |= ALLOC_NO_WATERMARKS;
 		page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
+	}
 
 	if (page) {
 		struct zone *zone = page_zone(page);
-- 
2.39.2


