[RFC PATCH 0/5] mm/mlock: new mlock_count tracking scheme

From: Yosry Ahmed <yosryahmed@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Yu Zhao <yuzhao@google.com>,
	"Jan Alexander Steffens (heftig)" <heftig@archlinux.org>,
	Steven Barrett <steven@liquorix.net>,
	Brian Geffon <bgeffon@google.com>,
	"T.J. Alumbaugh" <talumbau@google.com>,
	Gaosheng Cui <cuigaosheng1@huawei.com>,
	Suren Baghdasaryan <surenb@google.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	David Howells <dhowells@redhat.com>,
	Hugh Dickins <hughd@google.com>, Greg Thelen <gthelen@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Yosry Ahmed <yosryahmed@google.com>
Subject: [RFC PATCH 0/5] mm/mlock: new mlock_count tracking scheme
Date: Sun, 18 Jun 2023 06:57:19 +0000
Message-ID: <20230618065719.1363271-1-yosryahmed@google.com>

This series attempts to rework the mlock_count tracking scheme to avoid
overlaying page->lru. The main goal is to revive the unevictable LRU,
which would be useful for upcoming work on recharging offline memcgs
[1]. That work needs to find all the pages charged to a memcg, and
iterating the LRUs is the most efficient way to do so. With the current
mlock_count scheme, the unevictable LRU is imaginary, because
page->mlock_count overlays page->lru.

The proposed scheme overloads page->_mapcount to track mlock_count for
order-0 pages, slightly similar to how page->_refcount is overloaded
for pincount. More details in patch 1.
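
To illustrate the general idea of carrying two counts in one counter,
here is a minimal user-space sketch. It is not the code from patch 1:
the 20-bit split, the sketch_* names and the use of an unsigned C11
atomic are all assumptions made for illustration, and the real
_mapcount has its own conventions that this model ignores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical layout, NOT taken from the patch: the low 20 bits hold
 * the mapping count, the remaining high bits hold the mlock count.
 */
#define SKETCH_MLOCK_SHIFT	20
#define SKETCH_MLOCK_UNIT	(1u << SKETCH_MLOCK_SHIFT)
#define SKETCH_MAP_MASK		(SKETCH_MLOCK_UNIT - 1)

struct sketch_page {
	atomic_uint counter;	/* stands in for page->_mapcount */
};

static inline unsigned int sketch_mapcount(struct sketch_page *p)
{
	return atomic_load(&p->counter) & SKETCH_MAP_MASK;
}

static inline unsigned int sketch_mlock_count(struct sketch_page *p)
{
	return atomic_load(&p->counter) >> SKETCH_MLOCK_SHIFT;
}

static inline void sketch_map(struct sketch_page *p)
{
	/* One more mapping past the mask would bleed into the mlock bits. */
	assert(sketch_mapcount(p) < SKETCH_MAP_MASK);
	atomic_fetch_add(&p->counter, 1);
}

static inline void sketch_mlock(struct sketch_page *p)
{
	atomic_fetch_add(&p->counter, SKETCH_MLOCK_UNIT);
}

/* Returns true when this call dropped the last mlock reference. */
static inline bool sketch_munlock(struct sketch_page *p)
{
	unsigned int old = atomic_fetch_sub(&p->counter, SKETCH_MLOCK_UNIT);

	return (old >> SKETCH_MLOCK_SHIFT) == 1;
}
```

In this model the two counts never disturb each other as long as the
mapping count stays below the mask. The assert in sketch_map() marks
the boundary case where a very large number of mappings would spill
into the mlock bits.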

Another advantage of this series is that we do not have to reset the
mlock_count every time we isolate an mlocked page from the LRU. This
means we can track the mlock_count more reliably -- we are less likely
to prematurely munlock() a page. We also do not need to re-initialize
the mlock_count every time we add an mlocked page to the LRUs, or every
time we find that it was reset during mlock/munlock. The lack of
re-initialization slightly simplifies the mlock_count logic. The
complexity is also more contained within mm/mlock.c.

This series is based on v6.4-rc6, and has been tested with the mlock
selftests (though I had to rebase to v6.2 to get those selftests
working).

The series is broken up as follows:
- Patch 1 is the actual rework of the mlock_count scheme.
- Patch 2 handles the case where a page might be mistakenly stranded as
  mlocked indefinitely if it was mapped a very large number of times.
- Patch 3 adds a WARN_ON() for the case where a very large number of
  mappings could be mistakenly interpreted as an mlock_count.
- Patch 4 revives the unevictable LRU.
- Patch 5 reverts a patch from the original mlock_count series [2] that
  is no longer needed.

[1] https://lore.kernel.org/linux-mm/CAJD7tkb56gR0X5v3VHfmk3az3bOz=wF2jhEi+7Eek0J8XXBeWQ@mail.gmail.com/
[2] https://lore.kernel.org/linux-mm/55a49083-37f9-3766-1de9-9feea7428ac@google.com/

Yosry Ahmed (5):
  mm/mlock: rework mlock_count to use _mapcount for order-0 folios
  mm/mlock: fixup mlock_count during unmap
  mm/mlock: WARN_ON() if mapcount overflows into mlock_count
  mm/vmscan: revive the unevictable LRU
  Revert "mm/migrate: __unmap_and_move() push good newpage to LRU"

 include/linux/mm.h        |  31 ++++++--
 include/linux/mm_inline.h |  11 +--
 include/linux/mm_types.h  |  24 +-----
 mm/huge_memory.c          |   5 +-
 mm/migrate.c              |  24 +++---
 mm/mlock.c                | 150 +++++++++++++++++++++++++++++++++-----
 mm/mmzone.c               |   8 --
 mm/rmap.c                 |   3 +
 mm/swap.c                 |   8 --
 9 files changed, 174 insertions(+), 90 deletions(-)

-- 
2.41.0.162.gfafddb0af9-goog