All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: "Adalbert Lazăr" <alazar@bitdefender.com>,
	Linux-MM <linux-mm@kvack.org>,
	"Linux API" <linux-api@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Alexander Graf" <graf@amazon.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Jerome Glisse" <jglisse@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Mihai Donțu" <mdontu@bitdefender.com>,
	"Mircea Cirjaliu" <mcirjaliu@bitdefender.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Sargun Dhillon" <sargun@sargun.me>,
	"Aleksa Sarai" <cyphar@cyphar.com>,
	"Oleg Nesterov" <oleg@redhat.com>, "Jann Horn" <jannh@google.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Matthew Wilcox" <willy@infradead.org>
Subject: Re: [RESEND RFC PATCH 0/5] Remote mapping
Date: Mon, 7 Sep 2020 13:43:48 -0700	[thread overview]
Message-ID: <CALCETrUSUp_7svg8EHNTk3nQ0x9sdzMCU=h8G-Sy6=SODq5GHg@mail.gmail.com> (raw)
In-Reply-To: <20200907150547.hst4luvrpntdb3lr@wittgenstein>

On Mon, Sep 7, 2020 at 8:05 AM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Fri, Sep 04, 2020 at 02:31:11PM +0300, Adalbert Lazăr wrote:
> > This patchset adds support for the remote mapping feature.
> > Remote mapping, as its name suggests, is a means for transparent and
> > zero-copy access of a remote process' address space.
> > access of a remote process' address space.
> >
> > The feature was designed according to a specification suggested by
> > Paolo Bonzini:
> > >> The proposed API is a new pidfd system call, through which the parent
> > >> can map portions of its virtual address space into a file descriptor
> > >> and then pass that file descriptor to a child.
> > >>
> > >> This should be:
> > >>
> > >> - upstreamable, pidfd is the new cool thing and we could sell it as a
> > >> better way to do PTRACE_{PEEK,POKE}DATA
>
> In all honesty, that sentence made me a bit uneasy as it reads like this
> is implemented on top of pidfds because it makes it more likely to go
> upstream not because it is the right design. To be clear, I'm not
> implying any sort of malicious intent on your part but I would suggest
> to phrase this a little better. :)


I thought about this whole thing some more, and here are some thoughts.

First, I was nervous about two things.  One was faulting in pages from
the wrong context.  (When a normal page fault or KVM faults in a page,
the mm is loaded.  (In the KVM case, the mm is sort of not loaded when
the actual fault happens, but the mm is loaded when the fault is
handled, I think.  Maybe there are workqueues involved and I'm wrong.)
 When a remote mapping faults in a page, the mm is *not* loaded.)
This ought not to be a problem, though -- get_user_pages_remote() also
faults in pages from a non-current mm, and that's at least supposed to
work correctly, so maybe this is okay.

Second is recursion.  I think this is a genuine problem.

And I think that tying this to pidfds is the wrong approach.  In fact,
tying it to processes at all seems wrong.  There is a lot of demand
for various forms of memory isolation in which memory is mapped only
by its intended user.  Using something tied to a process mm gets in
the way of this in the same way that KVM's current mapping model gets
in the way.

All that being said, I think the whole idea of making fancy address
spaces composed from other mappable objects is neat and possibly quite
useful.  And, if you squint a bit, this is a lot like what KVM does
today.

So I suggest something that may be more generally useful as an
alternative.  This is a sketch and very subject to bikeshedding:

Create an empty address space:

int address_space_create(int flags, etc);

Map an fd into an address space:

int address_space_mmap(int asfd, int fd_to_map, offset, size, prot,
...);  /* might run out of args here */

Unmap from an address space:

int address_space_munmap(int asfd, unsigned long addr, unsigned long len);

Stick an address space into KVM:

ioctl(vmfd, KVM_MAP_ADDRESS_SPACE, asfd);  /* or similar */

Maybe some day allow mapping an address space into a process.

mmap(..., asfd, ...);


And at least for now, there's a rule that an address space that is
address_space_mmapped into an address space is disallowed.


Maybe some day we also allow mremap(), madvise(), etc.  And maybe some
day we allow creating a special address_space that represents a real
process's address space.


Under the hood, an address_space could own an mm_struct that is not
used by any tasks.  And we could have special memfds that are bound to
a VM such that all you can do with them is stick them into an
address_space and map that address_space into the VM in question.  For
this to work, we would want a special vm_operation for mapping into a
VM.


What do you all think?  Is this useful?  Does it solve your problems?
Is it a good approach going forward?

  reply	other threads:[~2020-09-07 20:44 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-04 11:31 [RESEND RFC PATCH 0/5] Remote mapping Adalbert Lazăr
2020-09-04 11:31 ` [RESEND RFC PATCH 1/5] mm: add atomic capability to zap_details Adalbert Lazăr
2020-09-04 11:31 ` [RESEND RFC PATCH 2/5] mm: let the VMA decide how zap_pte_range() acts on mapped pages Adalbert Lazăr
2020-09-04 11:31 ` [RESEND RFC PATCH 3/5] mm/mmu_notifier: remove lockdep map, allow mmu notifier to be used in nested scenarios Adalbert Lazăr
2020-09-04 12:03   ` Jason Gunthorpe
2020-09-04 11:31 ` [RESEND RFC PATCH 4/5] mm/remote_mapping: use a pidfd to access memory belonging to unrelated process Adalbert Lazăr
2020-09-04 17:55   ` Oleg Nesterov
2020-09-07 14:30   ` Oleg Nesterov
2020-09-07 15:16     ` Adalbert Lazăr
2020-09-09  8:32     ` Mircea CIRJALIU - MELIU
2020-09-10 16:43       ` Oleg Nesterov
2020-09-07 15:02   ` Christian Brauner
2020-09-07 16:04     ` Mircea CIRJALIU - MELIU
2020-09-04 11:31 ` [RESEND RFC PATCH 5/5] pidfd_mem: implemented remote memory mapping system call Adalbert Lazăr
2020-09-04 19:18   ` Florian Weimer
2020-09-07 14:55   ` Christian Brauner
2020-09-04 12:11 ` [RESEND RFC PATCH 0/5] Remote mapping Jason Gunthorpe
2020-09-04 13:24   ` Mircea CIRJALIU - MELIU
2020-09-04 13:39     ` Jason Gunthorpe
2020-09-04 14:18       ` Mircea CIRJALIU - MELIU
2020-09-04 14:39         ` Jason Gunthorpe
2020-09-04 15:40           ` Mircea CIRJALIU - MELIU
2020-09-04 16:11             ` Jason Gunthorpe
2020-09-04 19:41   ` Matthew Wilcox
2020-09-04 19:49     ` Jason Gunthorpe
2020-09-04 20:08     ` Paolo Bonzini
2020-12-01 18:01     ` Jason Gunthorpe
2020-09-04 19:19 ` Florian Weimer
2020-09-04 20:18   ` Paolo Bonzini
2020-09-07  8:33     ` Christian Brauner
2020-09-04 19:39 ` Andy Lutomirski
2020-09-04 20:09   ` Paolo Bonzini
2020-09-04 20:34     ` Andy Lutomirski
2020-09-04 21:58       ` Paolo Bonzini
2020-09-04 23:17         ` Andy Lutomirski
2020-09-05 18:27           ` Paolo Bonzini
2020-09-07  8:38             ` Christian Brauner
2020-09-07 12:41           ` Mircea CIRJALIU - MELIU
2020-09-07  7:05         ` Christoph Hellwig
2020-09-07  8:44           ` Paolo Bonzini
2020-09-07 10:25   ` Mircea CIRJALIU - MELIU
2020-09-07 15:05 ` Christian Brauner
2020-09-07 20:43   ` Andy Lutomirski [this message]
2020-09-09 11:38     ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrUSUp_7svg8EHNTk3nQ0x9sdzMCU=h8G-Sy6=SODq5GHg@mail.gmail.com' \
    --to=luto@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alazar@bitdefender.com \
    --cc=arnd@arndb.de \
    --cc=christian.brauner@ubuntu.com \
    --cc=cyphar@cyphar.com \
    --cc=graf@amazon.com \
    --cc=jannh@google.com \
    --cc=jglisse@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mcirjaliu@bitdefender.com \
    --cc=mdontu@bitdefender.com \
    --cc=oleg@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=sargun@sargun.me \
    --cc=stefanha@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.