From: Dave Hansen <dave.hansen@intel.com>
To: Mike Rapoport <rppt@kernel.org>, linux-kernel@vger.kernel.org
Cc: Alexey Dobriyan <adobriyan@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	James Bottomley <jejb@linux.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	linux-api@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	Mike Rapoport <rppt@linux.ibm.com>
Subject: Re: [PATCH RFC] mm: add MAP_EXCLUSIVE to create exclusive user mappings
Date: Mon, 28 Oct 2019 10:12:44 -0700
Message-ID: <d6ac08fe-23f3-c2d5-24c4-88e68f3fd4d0@intel.com>
In-Reply-To: <1572171452-7958-2-git-send-email-rppt@kernel.org>

On 10/27/19 3:17 AM, Mike Rapoport wrote:
> The pages in these mappings are removed from the kernel direct map and
> marked with PG_user_exclusive flag. When the exclusive area is unmapped,
> the pages are mapped back into the direct map.

This looks fun.  It's certainly simple.

But, the description is not really calling out the pros and cons very
well.  I'm also not sure that folks will use an interface like this that
requires up-front, special code to do an allocation instead of something
like madvise().  That's why protection keys ended up the way they did: if
you do this as a mmap() replacement, you need to modify all *allocators*
to be enabled for this.  If you do it with mprotect()-style, you can
apply it to existing allocations.
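
To make the contrast concrete, here is a minimal userspace sketch of the
two interface styles.  It is illustrative only: MADV_DONTDUMP is a real
madvise() advice value used purely as a stand-in for whatever "exclusive"
advice might be defined, and alloc_special() and mark_existing() are
hypothetical helper names, not anything from the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Style 1: the property is requested at allocation time.  Every
 * allocator that wants it must be taught to pass the extra flag.
 * extra_flags is where a hypothetical MAP_EXCLUSIVE would go.
 */
static void *alloc_special(size_t len, int extra_flags)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/*
 * Style 2: the property is applied to a range that already exists,
 * no matter which allocator produced it.  MADV_DONTDUMP is only a
 * stand-in for a hypothetical "exclusive" advice value.
 */
static int mark_existing(void *addr, size_t len)
{
	return madvise(addr, len, MADV_DONTDUMP);
}
```

With the second style the caller only needs a page-aligned range, e.g.
one page from alloc_special(len, 0) with len taken from
sysconf(_SC_PAGESIZE), and the allocator itself stays unmodified.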

Some other random thoughts:

 * The page flag is probably not a good idea.  It would probably be
   better to set _PAGE_SPECIAL on the PTE and force get_user_pages()
   into the slow path.
 * This really stops being "normal" memory.  You can't do futexes on it,
   can't splice it.  Probably need a more fleshed-out list of
   incompatible features.
 * As Kirill noted, each 4k page ends up with a potential 1GB "blast
   radius" of demoted pages in the direct map.  Not cool.  This is
   probably a non-starter as it stands.
 * The global TLB flushes are going to eat you alive.  They probably
   border on a DoS on larger systems.
 * Do we really want this user interface to dictate the kernel
   implementation?  In other words, do we really want MAP_EXCLUSIVE,
   or do we want MAP_SECRET?  One tells the kernel what to *do*, the
   other tells the kernel what the memory *IS*.
 * There's a lot of other stuff going on in this area: XPFO, SEV, MKTME,
   Persistent Memory, where the kernel direct map is a liability in some
   way.  We probably need some kind of overall, architected solution
   rather than five or ten things all poking at the direct map.
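
As a rough illustration of the "blast radius" point above, the arithmetic
can be sketched as follows.  The assumptions are mine, not from the patch:
x86-64 page sizes of 4K, 2M and 1G, a region that starts out covered by a
single 1G direct-map entry, and no later re-merging of the split entries.
struct split_cost and split_1g_for_one_4k() are hypothetical names.

```c
#include <assert.h>

#define SZ_4K (1UL << 12)
#define SZ_2M (1UL << 21)
#define SZ_1G (1UL << 30)

static_assert(SZ_1G % SZ_2M == 0 && SZ_2M % SZ_4K == 0,
	      "page sizes must nest evenly");

/* Leaf entries left mapping the 1G region once one 4K page is gone. */
struct split_cost {
	unsigned long entries_2m;
	unsigned long entries_4k;
};

static struct split_cost split_1g_for_one_4k(void)
{
	struct split_cost c;

	/* The 1G entry splits into 512 2M entries; one is split again. */
	c.entries_2m = SZ_1G / SZ_2M - 1;
	/* That 2M entry splits into 512 4K entries; one is unmapped. */
	c.entries_4k = SZ_2M / SZ_4K - 1;
	return c;
}
```

Under those assumptions, one region that needed a single 1G entry now
needs 511 2M entries plus 511 4K entries to map what is left.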


