linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Yang Shi <shy828301@gmail.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	Michal Hocko <mhocko@kernel.org>, Nadav Amit <namit@vmware.com>,
	Rik van Riel <riel@surriel.com>, Roman Gushchin <guro@fb.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Peter Xu <peterx@redhat.com>, Donald Dutile <ddutile@redhat.com>,
	Christoph Hellwig <hch@lst.de>, Oleg Nesterov <oleg@redhat.com>,
	Jan Kara <jack@suse.cz>, Liang Zhang <zhangliang5@huawei.com>,
	linux-mm@kvack.org, David Hildenbrand <david@redhat.com>
Subject: [PATCH v3 2/9] mm: optimize do_wp_page() for fresh pages in local LRU pagevecs
Date: Mon, 31 Jan 2022 17:29:32 +0100	[thread overview]
Message-ID: <20220131162940.210846-3-david@redhat.com> (raw)
In-Reply-To: <20220131162940.210846-1-david@redhat.com>

For example, if a page just got swapped in via a read fault, the LRU
pagevecs might still hold a reference to the page. If we trigger a
write fault on such a page, the additional reference from the LRU
pagevecs will prohibit reusing the page.

Let's conditionally drain the local LRU pagevecs when we stumble over a
!PageLRU() page. We cannot easily drain remote LRU pagevecs and it might
not be desirable performance-wise. Consequently, this will only avoid
copying in some cases.

Add a simple "page_count(page) > 3" check first but keep the
"page_count(page) > 1 + PageSwapCache(page)" check in place, as
we want to minimize cases where we remove a page from the swapcache but
won't be able to reuse it, for example, because another process has it
mapped R/O, to not affect reclaim.

We cannot easily handle the following cases and we will always have to
copy:

(1) The page is referenced in the LRU pagevecs of other CPUs. We really
    would have to drain the LRU pagevecs of all CPUs -- most probably
    copying is much cheaper.

(2) The page is already PageLRU() but is getting moved between LRU
    lists, for example, for activation (e.g., mark_page_accessed()),
    deactivation (MADV_COLD), or lazyfree (MADV_FREE). We'd have to
    drain mostly unconditionally, which might be bad performance-wise.
    Most probably this won't happen too often in practice.

Note that there are other reasons why an anon page might temporarily not
be PageLRU(): for example, compaction and migration have to isolate LRU
pages from the LRU lists first (isolate_lru_page()), moving them to
temporary local lists and clearing PageLRU() and holding an additional
reference on the page. In that case, we'll always copy.

This change seems to be fairly effective with the reproducer [1] shared
by Nadav, as long as writeback is done synchronously, for example, using
zram. However, with asynchronous writeback, we'll usually fail to free the
swapcache because the page is still under writeback: something we cannot
easily optimize for, and maybe it's not really relevant in practice.

[1] https://lkml.kernel.org/r/0480D692-D9B2-429A-9A88-9BBA1331AC3A@gmail.com

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index bcd3b7c50891..923165b4c27e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3298,7 +3298,15 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 		 *
 		 * PageKsm() doesn't necessarily raise the page refcount.
 		 */
-		if (PageKsm(page) || page_count(page) > 1 + PageSwapCache(page))
+		if (PageKsm(page) || page_count(page) > 3)
+			goto copy;
+		if (!PageLRU(page))
+			/*
+			 * Note: We cannot easily detect+handle references from
+			 * remote LRU pagevecs or references to PageLRU() pages.
+			 */
+			lru_add_drain();
+		if (page_count(page) > 1 + PageSwapCache(page))
 			goto copy;
 		if (!trylock_page(page))
 			goto copy;
-- 
2.34.1



  parent reply	other threads:[~2022-01-31 16:32 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-31 16:29 [PATCH v3 0/9] mm: COW fixes part 1: fix the COW security issue for THP and swap David Hildenbrand
2022-01-31 16:29 ` [PATCH v3 1/9] mm: optimize do_wp_page() for exclusive pages in the swapcache David Hildenbrand
2022-01-31 16:29 ` David Hildenbrand [this message]
2022-03-09 17:53   ` [PATCH v3 2/9] mm: optimize do_wp_page() for fresh pages in local LRU pagevecs Vlastimil Babka
2022-01-31 16:29 ` [PATCH v3 3/9] mm: slightly clarify KSM logic in do_swap_page() David Hildenbrand
2022-03-09 18:03   ` Vlastimil Babka
2022-03-09 18:48   ` Yang Shi
2022-03-09 19:15     ` David Hildenbrand
2022-03-09 20:50       ` Andrew Morton
2022-01-31 16:29 ` [PATCH v3 4/9] mm: streamline COW " David Hildenbrand
2022-03-10  9:41   ` Vlastimil Babka
2022-01-31 16:29 ` [PATCH v3 5/9] mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page() David Hildenbrand
2022-03-10  9:52   ` Vlastimil Babka
2022-01-31 16:29 ` [PATCH v3 6/9] mm/khugepaged: remove reuse_swap_page() usage David Hildenbrand
2022-02-01 21:31   ` Yang Shi
2022-03-10 10:37   ` Vlastimil Babka
2022-01-31 16:29 ` [PATCH v3 7/9] mm/swapfile: remove stale reuse_swap_page() David Hildenbrand
2022-02-02 14:35   ` Christoph Hellwig
2022-02-02 17:01     ` David Hildenbrand
2022-03-10 10:44   ` Vlastimil Babka
2022-01-31 16:29 ` [PATCH v3 8/9] mm/huge_memory: remove stale page_trans_huge_mapcount() David Hildenbrand
2022-03-10 10:50   ` Vlastimil Babka
2022-01-31 16:29 ` [PATCH v3 9/9] mm/huge_memory: remove stale locking logic from __split_huge_pmd() David Hildenbrand
2022-03-10 11:02   ` Vlastimil Babka
2022-02-01 18:59 ` [PATCH v3 0/9] mm: COW fixes part 1: fix the COW security issue for THP and swap Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220131162940.210846-3-david@redhat.com \
    --to=david@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=ddutile@redhat.com \
    --cc=guro@fb.com \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=jannh@google.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=namit@vmware.com \
    --cc=oleg@redhat.com \
    --cc=peterx@redhat.com \
    --cc=riel@surriel.com \
    --cc=rientjes@google.com \
    --cc=rppt@linux.ibm.com \
    --cc=shakeelb@google.com \
    --cc=shy828301@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=zhangliang5@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).