linux-kernel.vger.kernel.org archive mirror
From: Andy Lutomirski <luto@kernel.org>
To: Mike Rapoport <rppt@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	James Bottomley <jejb@linux.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Linux API <linux-api@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, X86 ML <x86@kernel.org>,
	Mike Rapoport <rppt@linux.ibm.com>
Subject: Re: [PATCH RFC] mm: add MAP_EXCLUSIVE to create exclusive user mappings
Date: Wed, 30 Oct 2019 14:28:21 -0700
Message-ID: <CALCETrXajrY+0SmzkL7t++ndYwRoYLLE9VPKwSGSyW8HZx-TeA@mail.gmail.com>
In-Reply-To: <20191030084005.GC20624@rapoport-lnx>

On Wed, Oct 30, 2019 at 1:40 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Tue, Oct 29, 2019 at 10:00:55AM -0700, Andy Lutomirski wrote:
> > On Tue, Oct 29, 2019 at 2:33 AM Mike Rapoport <rppt@kernel.org> wrote:
> > >
> > > On Mon, Oct 28, 2019 at 02:44:23PM -0600, Andy Lutomirski wrote:
> > > >
> > > > > On Oct 27, 2019, at 4:17 AM, Mike Rapoport <rppt@kernel.org> wrote:
> > > > >
> > > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > >
> > > > > Hi,
> > > > >
> > > > > > The patch below aims to allow applications to create mappings that have
> > > > > > pages visible only to the owning process. Such mappings could be used to
> > > > > > store secrets so that these secrets are visible neither to other
> > > > > > processes nor to the kernel.
> > > > >
> > > > > I've only tested the basic functionality, the changes should be verified
> > > > > against THP/migration/compaction. Yet, I'd appreciate early feedback.
> > > >
> > > > I’ve contemplated the concept a fair amount, and I think you should
> > > > consider a change to the API. In particular, rather than having it be a
> > > > MAP_ flag, make it a chardev.  You can, at least at first, allow only
> > > > MAP_SHARED, and admins can decide who gets to use it.  It might also play
> > > > better with the VM overall, and you won’t need a VM_ flag for it — you
> > > > can just wire up .fault to do the right thing.
> > >
> > > I think mmap()/mprotect()/madvise() are the natural APIs for such
> > > interface.
> >
> > Then you have a whole bunch of questions to answer.  For example:
> >
> > What happens if you mprotect() or similar when the mapping is already
> > in use in a way that's incompatible with MAP_EXCLUSIVE?
>
> Then we refuse to mprotect()? Like in any other case when vm_flags are not
> compatible with required madvise()/mprotect() operation.
>

I'm not talking about flags.  I'm talking about the case where one
thread (or RDMA or whatever) has get_user_pages()'d a mapping and
another thread mprotect()s it MAP_EXCLUSIVE.

> > Is it actually reasonable to malloc() some memory and then make it exclusive?
> >
> > Are you permitted to map a file MAP_EXCLUSIVE?  What does it mean?
>
> I'd limit MAP_EXCLUSIVE only to anonymous memory.
>
> > What does MAP_PRIVATE | MAP_EXCLUSIVE do?
>
> My preference is to have only mmap() and then the semantics is more clear:
>
> MAP_PRIVATE | MAP_EXCLUSIVE creates a pre-populated region, marks it locked
> and drops the pages in this region from the direct map.
> The pages are returned back on munmap().
> Then there is no way to change an existing area to be exclusive or vice
> versa.

And what happens if you fork()?  Limiting it to MAP_SHARED |
MAP_EXCLUSIVE would avoid this particular nasty question.
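
The fork() concern can be illustrated with ordinary anonymous memory.
The sketch below does not use MAP_EXCLUSIVE (which exists only in the
RFC patch); it only shows how the existing MAP_PRIVATE and MAP_SHARED
flags differ across fork(). With MAP_PRIVATE the child gets a
copy-on-write view, so a write in the child ends up in a second set of
pages, and it is presumably unclear what "exclusive" should mean for
those. With MAP_SHARED both processes keep using the same pages. The
helper name byte_seen_by_parent_after_fork() is made up for this
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one anonymous page with the given sharing flag (MAP_PRIVATE or
 * MAP_SHARED), fork, let the child overwrite the first byte, and return
 * the byte the parent sees afterwards. Returns -1 on any failure.
 * Hypothetical helper, for illustration only.
 */
int byte_seen_by_parent_after_fork(int sharing_flag)
{
    assert(sharing_flag == MAP_PRIVATE || sharing_flag == MAP_SHARED);

    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 'P';                     /* parent writes before fork() */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        p[0] = 'C';                 /* child overwrites the byte */
        _exit(0);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = p[0];              /* what the parent sees now */

    munmap(p, (size_t)page);
    return result;
}
```

Calling the helper with MAP_PRIVATE returns 'P' (the child modified its
own copy), while MAP_SHARED returns 'C' (both processes wrote to the
same page).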

>
> > How does one pass exclusive memory via SCM_RIGHTS?  (If it's a
> > memfd-like or chardev interface, it's trivial.  mmap(), not so much.)
>
> Why would passing such memory via SCM_RIGHTS be useful?

Suppose I want to put a secret into exclusive memory and then send
that secret to some other process.  The obvious approach would be to
SCM_RIGHTS an fd over, but you can't do that with MAP_EXCLUSIVE as
you've defined it.  In general, there are lots of use cases for memfd
and other fd-backed memory.
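
As a rough sketch of what the fd-based flow looks like today, the code
below puts a secret into a memfd, passes the descriptor over an AF_UNIX
socket with SCM_RIGHTS, and maps it on the receiving side. Note the
assumptions: memfd_create() is only a stand-in for a hypothetical
fd-backed exclusive-memory interface (a plain memfd is NOT removed from
the direct map), both socket ends live in one process to keep the
example short, and the names fd_cmsg, send_fd(), recv_fd() and
secret_survives_fd_passing() are invented for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send fd over a connected AF_UNIX socket; 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 0;              /* one payload byte accompanies the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from sock; returns the new descriptor, -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/*
 * Write secret into a memfd, pass the fd through a socketpair, map the
 * received fd and compare. Returns 1 if the receiver sees the secret.
 */
int secret_survives_fd_passing(const char *secret)
{
    assert(secret != NULL);
    size_t len = strlen(secret) + 1;
    int sv[2], ok = 0;
    int fd = memfd_create("secret-demo", MFD_CLOEXEC);

    if (fd < 0)
        return 0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        close(fd);
        return 0;
    }
    if (write(fd, secret, len) == (ssize_t)len && send_fd(sv[0], fd) == 0) {
        int rfd = recv_fd(sv[1]);   /* a new descriptor, same memory */
        if (rfd >= 0) {
            char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, rfd, 0);
            if (p != MAP_FAILED) {
                ok = memcmp(p, secret, len) == 0;
                munmap(p, len);
            }
            close(rfd);
        }
    }
    close(sv[0]);
    close(sv[1]);
    close(fd);
    return ok;
}
```

In a real setup the two socket ends would belong to different
processes; the receiver would only ever hold the descriptor and its own
mapping. Nothing comparable is possible when the memory is identified
only by a VMA created with an mmap() flag.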

>
> > And finally, there's my personal giant pet peeve: a major use of this
> > will be for virtualization.  I suspect that a lot of people would like
> > the majority of KVM guest memory to be unmapped from the host
> > pagetables.  But people might also like for guest memory to be
> > unmapped in *QEMU's* pagetables, and mmap() is a basically worthless
> > interface for this.  Getting fd-backed memory into a guest will take
> > some possibly major work in the kernel, but getting vma-backed memory
> > into a guest without mapping it in the host user address space seems
> > much, much worse.
>
> Well, in my view, the MAP_EXCLUSIVE is intended to keep small secrets
> rather than use it for the entire guest memory. I even considered adding a
> limit for the mapping size, but then I decided that since RLIMIT_MEMLOCK is
> anyway enforced there is no need for a new one.
>
> I agree that getting fd-backed memory into a guest would be less pain than
> VMA, but KVM can already use memory outside the control of the kernel via
> /dev/map [1].

That series doesn't address the problem I'm talking about at all.  I'm
saying that there is a legitimate use case where QEMU should *not*
have a mapping of the memory.  So QEMU would create some exclusive
memory using /dev/exclusive_memory and would tell KVM to map it into
the guest without mapping it into QEMU's address space at all.

(In fact, the way that SEV currently works is *functionally* like
this, except that there's a bogus incoherent mapping in the QEMU
process that is a giant can of worms.)


IMO a major benefit of a chardev approach is that you don't need a new
VM_ flag and you don't need to worry about wiring it up everywhere in
the core mm code.
