All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Mike Rapoport <rppt@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Song Liu <song@kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-kernel@vger.kernel.org, x86@kernel.org
Subject: [RFC PATCH 0/5] Prototype for direct map awareness in page allocator
Date: Wed,  8 Mar 2023 11:41:01 +0200	[thread overview]
Message-ID: <20230308094106.227365-1-rppt@kernel.org> (raw)

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Hi,

This is a third attempt to make page allocator aware of the direct map
layout and allow grouping of the pages that must be unmapped from
the direct map.

This a new implementation of __GFP_UNMAPPED, kinda a follow up for this set:

https://lore.kernel.org/all/20220127085608.306306-1-rppt@kernel.org

but instead of using a migrate type to cache the unmapped pages, the
current implementation adds a dedicated cache to serve __GFP_UNMAPPED
allocations.

The last two patches in the series demonstrate how __GFP_UNMAPPED can be
used in two in-tree use cases.

First one is to switch secretmem to use the new mechanism, which is
straight forward optimization.

The second use-case is to enable __GFP_UNMAPPED in x86::module_alloc()
that is essentially used as a method to allocate code pages and thus
requires permission changes for basic pages in the direct map.

This set is x86 specific at the moment because other architectures either
do not support set_memory APIs that split the direct^w linear map (e.g.
PowerPC) or only enable set_memory APIs when the linear map uses basic page
size (like arm64).

The patches are only lightly tested.

== Motivation ==

There are use-cases that need to remove pages from the direct map or at
least map them with 4K granularity. Whenever this is done e.g. with
set_memory/set_direct_map APIs, the PUD and PMD sized mappings in the
direct map are split into smaller pages.

To reduce the performance hit caused by the fragmentation of the direct map
it makes sense to group and/or cache the pages removed from the direct
map so that the split large pages won't be all over the place. 

There were RFCs for grouped page allocations for vmalloc permissions [1]
and for using PKS to protect page tables [2] as well as an attempt to use a
pool of large pages in secretmtm [3], but these suggestions address each
use case separately, while having a common mechanism at the core mm level
could be used by all use cases.

== Implementation overview ==

The pages that need to be removed from the direct map are grouped in a
dedicated cache. When there is a page allocation request with
__GFP_UNMAPPED set, it is redirected from __alloc_pages() to that cache
using a new unmapped_alloc() function.

The cache is implemented as a buddy allocator and it can handle high order
requests.

The cache starts empty and whenever it does not have enough pages to
satisfy an allocation request the cache attempts to allocate PMD_SIZE page
to replenish the cache. If PMD_SIZE page cannot be allocated, the cache is
replenished with a page of the highest order available. That page is
removed from the direct map and added to the local buddy allocator.

There is also a shrinker that releases pages from the unmapped cache when
there us a memory pressure in the system. When shrinker releases a page it
is mapped back into the direct map.

[1] https://lore.kernel.org/lkml/20210405203711.1095940-1-rick.p.edgecombe@intel.com
[2] https://lore.kernel.org/lkml/20210505003032.489164-1-rick.p.edgecombe@intel.com
[3] https://lore.kernel.org/lkml/20210121122723.3446-8-rppt@kernel.org

Mike Rapoport (IBM) (5):
  mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
  mm/unmapped_alloc: add debugfs file similar to /proc/pagetypeinfo
  mm/unmapped_alloc: add shrinker
  EXPERIMENTAL: x86: use __GFP_UNMAPPED for modele_alloc()
  EXPERIMENTAL: mm/secretmem: use __GFP_UNMAPPED

 arch/x86/Kconfig                |   3 +
 arch/x86/kernel/module.c        |   2 +-
 include/linux/gfp_types.h       |  11 +-
 include/linux/page-flags.h      |   6 +
 include/linux/pageblock-flags.h |  28 +++
 include/trace/events/mmflags.h  |  10 +-
 mm/Kconfig                      |   4 +
 mm/Makefile                     |   1 +
 mm/internal.h                   |  24 +++
 mm/page_alloc.c                 |  39 +++-
 mm/secretmem.c                  |  26 +--
 mm/unmapped-alloc.c             | 334 ++++++++++++++++++++++++++++++++
 mm/vmalloc.c                    |   2 +-
 13 files changed, 459 insertions(+), 31 deletions(-)
 create mode 100644 mm/unmapped-alloc.c


base-commit: fe15c26ee26efa11741a7b632e9f23b01aca4cc6
-- 
2.35.1


             reply	other threads:[~2023-03-08  9:41 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-08  9:41 Mike Rapoport [this message]
2023-03-08  9:41 ` [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc() Mike Rapoport
2023-03-09  1:56   ` Edgecombe, Rick P
2023-03-09 14:39     ` Mike Rapoport
2023-03-09 15:34       ` Edgecombe, Rick P
2023-03-09  6:31   ` Hyeonggon Yoo
2023-03-09 15:27     ` Mike Rapoport
2023-03-24  8:37   ` Michal Hocko
2023-03-25  6:38     ` Mike Rapoport
2023-03-27 13:43       ` Michal Hocko
2023-03-27 14:31         ` Vlastimil Babka
2023-03-27 15:10           ` Michal Hocko
2023-03-28  6:25         ` Mike Rapoport
2023-03-28  7:39           ` Michal Hocko
2023-03-28 15:11             ` Mike Rapoport
2023-03-28 15:24               ` Michal Hocko
2023-03-29  7:28                 ` Mike Rapoport
2023-03-29  8:13                   ` Michal Hocko
2023-03-30  5:13                     ` Mike Rapoport
2023-03-30  8:11                       ` Michal Hocko
2023-03-28 17:18               ` Luis Chamberlain
2023-03-28 17:37                 ` Matthew Wilcox
2023-03-28 17:52                   ` Luis Chamberlain
2023-03-28 17:55                     ` Luis Chamberlain
2023-05-18  3:35   ` Kent Overstreet
2023-05-18 15:23     ` Mike Rapoport
2023-05-18 16:33       ` Song Liu
2023-05-18 16:48         ` Kent Overstreet
2023-05-18 17:00           ` Song Liu
2023-05-18 17:23             ` Kent Overstreet
2023-05-18 18:47               ` Song Liu
2023-05-18 19:03                 ` Song Liu
2023-05-18 19:15                   ` Kent Overstreet
2023-05-18 20:03                     ` Song Liu
2023-05-18 20:13                       ` Kent Overstreet
2023-05-18 20:51                     ` Song Liu
2023-05-19  1:24                       ` Kent Overstreet
2023-05-19 15:08                         ` Song Liu
2023-05-18 19:16                   ` Kent Overstreet
2023-05-19  8:29               ` Mike Rapoport
2023-05-19 15:42                 ` Song Liu
2023-05-22 22:05                   ` Thomas Gleixner
2023-05-19 15:47                 ` Kent Overstreet
2023-05-19 16:14                   ` Mike Rapoport
2023-05-19 16:21                     ` Kent Overstreet
2023-05-18 16:58         ` Kent Overstreet
2023-05-18 17:15           ` Song Liu
2023-05-18 17:25             ` Kent Overstreet
2023-05-18 18:54               ` Song Liu
2023-05-18 19:01           ` Song Liu
2023-05-18 19:10             ` Kent Overstreet
2023-03-08  9:41 ` [RFC PATCH 2/5] mm/unmapped_alloc: add debugfs file similar to /proc/pagetypeinfo Mike Rapoport
2023-03-08  9:41 ` [RFC PATCH 3/5] mm/unmapped_alloc: add shrinker Mike Rapoport
2023-03-08  9:41 ` [RFC PATCH 4/5] EXPERIMENTAL: x86: use __GFP_UNMAPPED for modele_alloc() Mike Rapoport
2023-03-09  1:54   ` Edgecombe, Rick P
2023-03-08  9:41 ` [RFC PATCH 5/5] EXPERIMENTAL: mm/secretmem: use __GFP_UNMAPPED Mike Rapoport
2023-03-09  1:59 ` [RFC PATCH 0/5] Prototype for direct map awareness in page allocator Edgecombe, Rick P
2023-03-09 15:14   ` Mike Rapoport
2023-05-19 15:40     ` Sean Christopherson
2023-05-19 16:24       ` Mike Rapoport
2023-05-19 18:25         ` Sean Christopherson
2023-05-25 20:37           ` Mike Rapoport
2023-03-10  7:27 ` Christoph Hellwig
2023-03-27 14:27 ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230308094106.227365-1-rppt@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=song@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.