From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AA9DC433EF for ; Mon, 24 Jan 2022 03:51:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 863856B008A; Sun, 23 Jan 2022 22:51:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 813056B008C; Sun, 23 Jan 2022 22:51:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6DA5A6B0092; Sun, 23 Jan 2022 22:51:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0179.hostedemail.com [216.40.44.179]) by kanga.kvack.org (Postfix) with ESMTP id 5DA116B008A for ; Sun, 23 Jan 2022 22:51:10 -0500 (EST) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 19648181C49C6 for ; Mon, 24 Jan 2022 03:51:10 +0000 (UTC) X-FDA: 79063805100.16.F20CA76 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by imf08.hostedemail.com (Postfix) with ESMTP id 82E28160013 for ; Mon, 24 Jan 2022 03:51:09 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 82AA41F3A0; Mon, 24 Jan 2022 03:51:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1642996268; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7aGIJFrJoxVYTOarUMpcVxyhqOqRdBZGTPLRKjCZ04Y=; b=n+mu4ts5W6++NB/HdVWdE8/u2lXQLbHMlfwpoZTqDMnapsQ13f+u7f9HhjxBTWTO5iDyLQ D0Iy1Xqj9oNV7juUdF1ZhfwDMEWP8KHBfIA5kx9sPvOBTcgObD+IXl6Guud+j9drUFn32J kIgz1fZoFZeIUalBTmx+NouK2MsD3mw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1642996268; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7aGIJFrJoxVYTOarUMpcVxyhqOqRdBZGTPLRKjCZ04Y=; b=8DCw4T5/4ajAli4QaAAQISZwKGlcsbSrCUT2KT+Qt+OEuiQAoF11dpSgoz/pGYcsT324Nj PW28866cIoVuzYDg== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 2098E1331A; Mon, 24 Jan 2022 03:51:04 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id rbgdMygi7mH6RAAAMHmgww (envelope-from ); Mon, 24 Jan 2022 03:51:04 +0000 Subject: [PATCH 05/23] MM: reclaim mustn't enter FS for SWP_FS_OPS swap-space From: NeilBrown To: Trond Myklebust , Anna Schumaker , Chuck Lever , Andrew Morton , Mel Gorman , Christoph Hellwig , David Howells Cc: linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Date: Mon, 24 Jan 2022 14:48:32 +1100 Message-ID: <164299611276.26253.11555458501911153645.stgit@noble.brown> In-Reply-To: <164299573337.26253.7538614611220034049.stgit@noble.brown> References: <164299573337.26253.7538614611220034049.stgit@noble.brown> User-Agent: StGit/0.23 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=n+mu4ts5; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b="8DCw4T5/"; spf=pass (imf08.hostedemail.com: domain of neilb@suse.de designates 195.135.220.29 as permitted sender) smtp.mailfrom=neilb@suse.de; dmarc=pass (policy=none) header.from=suse.de X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 82E28160013 X-Stat-Signature: eitc5jnjckb7of9zzejbdfodwkzmasbw X-HE-Tag: 1642996269-597789 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: If swap-out is using filesystem operations (SWP_FS_OPS), then it is not safe to enter the FS for reclaim. So only down-grade the requirement for swap pages to __GFP_IO after checking that SWP_FS_OPS are not being used. This makes the calculation of "may_enter_fs" slightly more complex, so move it into a separate function. with that done, there is little value in maintaining the bool variable any more. So replace the may_enter_fs variable with a may_enter_fs() function. This removes any risk for the variable becoming out-of-date. Signed-off-by: NeilBrown --- mm/swap.h | 8 ++++++++ mm/vmscan.c | 29 ++++++++++++++++++++--------- 2 files changed, 28 insertions(+), 9 deletions(-) diff --git a/mm/swap.h b/mm/swap.h index 13e72a5023aa..5c676e55f288 100644 --- a/mm/swap.h +++ b/mm/swap.h @@ -47,6 +47,10 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t flag, struct page *swapin_readahead(swp_entry_t entry, gfp_t flag, struct vm_fault *vmf); +static inline unsigned int page_swap_flags(struct page *page) +{ + return page_swap_info(page)->flags; +} #else /* CONFIG_SWAP */ static inline int swap_readpage(struct page *page, bool do_poll) { @@ -126,4 +130,8 @@ static inline void clear_shadow_from_swap_cache(int type, unsigned long begin, { } +static inline unsigned int page_swap_flags(struct page *page) +{ + return 0; +} #endif /* CONFIG_SWAP */ diff --git a/mm/vmscan.c b/mm/vmscan.c index 5c734ffc6057..ad5026d06aa8 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1506,6 +1506,22 @@ static unsigned int demote_page_list(struct list_head *demote_pages, return nr_succeeded; } +static bool may_enter_fs(struct page *page, gfp_t gfp_mask) +{ + if (gfp_mask & __GFP_FS) + return true; + if (!PageSwapCache(page) || !(gfp_mask & __GFP_IO)) + return false; + /* + * We can "enter_fs" for swap-cache with only __GFP_IO + * providing this isn't SWP_FS_OPS. + * ->flags can be updated non-atomicially (scan_swap_map_slots), + * but that will never affect SWP_FS_OPS, so the data_race + * is safe. + */ + return !data_race(page_swap_flags(page) & SWP_FS_OPS); +} + /* * shrink_page_list() returns the number of reclaimed pages */ @@ -1531,7 +1547,7 @@ static unsigned int shrink_page_list(struct list_head *page_list, struct address_space *mapping; struct page *page; enum page_references references = PAGEREF_RECLAIM; - bool dirty, writeback, may_enter_fs; + bool dirty, writeback; unsigned int nr_pages; cond_resched(); @@ -1555,9 +1571,6 @@ static unsigned int shrink_page_list(struct list_head *page_list, if (!sc->may_unmap && page_mapped(page)) goto keep_locked; - may_enter_fs = (sc->gfp_mask & __GFP_FS) || - (PageSwapCache(page) && (sc->gfp_mask & __GFP_IO)); - /* * The number of dirty pages determines if a node is marked * reclaim_congested. kswapd will stall and start writing @@ -1602,7 +1615,7 @@ static unsigned int shrink_page_list(struct list_head *page_list, * not to fs). In this case mark the page for immediate * reclaim and continue scanning. * - * Require may_enter_fs because we would wait on fs, which + * Require may_enter_fs() because we would wait on fs, which * may not have submitted IO yet. And the loop driver might * enter reclaim, and deadlock if it waits on a page for * which it is needed to do the write (loop masks off @@ -1634,7 +1647,7 @@ static unsigned int shrink_page_list(struct list_head *page_list, /* Case 2 above */ } else if (writeback_throttling_sane(sc) || - !PageReclaim(page) || !may_enter_fs) { + !PageReclaim(page) || !may_enter_fs(page, sc->gfp_mask)) { /* * This is slightly racy - end_page_writeback() * might have just cleared PageReclaim, then @@ -1724,8 +1737,6 @@ static unsigned int shrink_page_list(struct list_head *page_list, goto activate_locked_split; } - may_enter_fs = true; - /* Adding to swap updated mapping */ mapping = page_mapping(page); } @@ -1795,7 +1806,7 @@ static unsigned int shrink_page_list(struct list_head *page_list, if (references == PAGEREF_RECLAIM_CLEAN) goto keep_locked; - if (!may_enter_fs) + if (!may_enter_fs(page, sc->gfp_mask)) goto keep_locked; if (!sc->may_writepage) goto keep_locked;