LKML Archive on lore.kernel.org
From: Andrea Arcangeli <aarcange@redhat.com>
To: Daniel Colascione <dancol@google.com>
Cc: Peter Xu <peterx@redhat.com>, Jann Horn <jannh@google.com>,
	Kees Cook <keescook@chromium.org>,
	Tim Murray <timmurray@google.com>,
	Nosh Minwalla <nosh@google.com>, Nick Kralevich <nnk@google.com>,
	Lokesh Gidra <lokeshgidra@google.com>,
	kernel list <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	SElinux list <selinux@vger.kernel.org>,
	Mike Rapoport <rppt@linux.ibm.com>,
	linux-security-module <linux-security-module@vger.kernel.org>
Subject: Re: [PATCH v2 0/6] Harden userfaultfd
Date: Wed, 12 Feb 2020 18:41:19 -0500
Message-ID: <20200212234119.GB29809@redhat.com>
In-Reply-To: <CAKOZuevusieaKCt5r-jnQ5ArGfw5Otszq2CAcrqFi6MYxkKwtA@mail.gmail.com>

Hi Daniel,

On Wed, Feb 12, 2020 at 12:04:39PM -0800, Daniel Colascione wrote:
> We don't pass pointers to the heap into system calls. (Big primitive
> arrays, ByteBuffer, etc. are allocated off the regular heap.)

That sounds pretty restrictive. I wonder what you gain by enforcing
that invariant, or whether it just happened incidentally for some
other reason. Do you need to copy that memory every time you need to
do I/O on it? Are you sure this won't need to change any time soon to
increase performance?

> I don't understand what you mean. The purpose of preventing UFFD from
> handling kernel faults is exploit mitigation.

That part was clear. What wasn't clear is what exactly the new feature
does and what it blocks: is it all about blocking, or how else does it
make things more secure?

> The key requirement here is the ability to prevent unprivileged
> processes from using UFFD to widen kernel exploit windows by
> preventing UFFD from taking kernel faults. Forcing unprivileged
> processes to use UFFD only with UFFD_FEATURE_SIGBUS would satisfy this
> requirement, but it's much less flexible and unnecessarily couples two
> features.

I mentioned it in case you could use something like that model.

> > On the other end I'm also planning a mremap_vma_merge userland syscall
> > that will merge fragmented vmas.
> 
> This is probably a separate discussion, but does that operation really
> need to be a system call? Historically, userspace has operated mostly

mremap_vma_merge was not intended as a system call; if it were
implemented as a system call, it wouldn't use uffd.

> on page ranges and not VMAs per se, and the kernel has been free to

Userland doesn't need to know anything, unless it wants to optimize.

Userland can know full well when it does certain mremap operations
that leave its ranges virtually contiguous but in a non-linear order,
so that the kernel cannot merge the vmas.
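
To make the "virtually contiguous in a non-linear way" case concrete,
here is a toy model, not kernel code. It assumes 4 KiB pages, assumes
that an anonymous vma records a page offset derived from the address it
was created at, and assumes that a move keeps that offset. The names
Vma, can_merge, merge_all, anon and move are hypothetical and exist only
for this sketch; the real merge checks also compare flags and other
state that is left out here.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumption: 4 KiB pages


@dataclass
class Vma:
    start: int  # first byte of the range
    end: int    # one past the last byte of the range
    pgoff: int  # page offset recorded when the mapping was created


def can_merge(prev: Vma, nxt: Vma) -> bool:
    """Toy rule: the ranges must touch and the page offsets must be linear."""
    if prev.end != nxt.start:
        return False
    pages = (prev.end - prev.start) >> PAGE_SHIFT
    return prev.pgoff + pages == nxt.pgoff


def merge_all(vmas):
    """Merge every mergeable neighbour in an address-sorted list of vmas."""
    merged = []
    for vma in sorted(vmas, key=lambda v: v.start):
        if merged and can_merge(merged[-1], vma):
            merged[-1] = Vma(merged[-1].start, vma.end, merged[-1].pgoff)
        else:
            merged.append(vma)
    return merged


def anon(start: int, size: int) -> Vma:
    """Hypothetical helper: anonymous mapping whose pgoff is its creation address."""
    return Vma(start, start + size, start >> PAGE_SHIFT)


def move(vma: Vma, new_start: int) -> Vma:
    """Hypothetical helper: model a move that keeps the original pgoff."""
    return Vma(new_start, new_start + (vma.end - vma.start), vma.pgoff)


if __name__ == "__main__":
    a = anon(0x10000, 0x4000)
    b = anon(0x14000, 0x4000)
    print(len(merge_all([a, b])))  # created back to back, in order
    # Same two ranges, moved elsewhere and placed in swapped order.
    print(len(merge_all([move(b, 0x20000), move(a, 0x24000)])))
```

In this model the two ranges created back to back collapse into one
vma, while the same two ranges moved into swapped order stay separate
even though they are again virtually contiguous.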

> merge and split VMAs as needed for its internal purposes. This
> approach has serious negative side effects (like making munmap
> fallible: see [1]), but it is what it is.
> 
> [1] https://lore.kernel.org/linux-mm/CAKOZuetOD6MkGPVvYFLj5RXh200FaDyu3sQqZviVRhTFFS3fjA@mail.gmail.com/

The fact that it's fallible is a secondary concern here. Even if you
make the vma count unlimited, as it grows it slows everything down and
also prevents THP from being collapsed. Even the scalability of the
mmap_sem worsens.
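
As a rough illustration of why a pile of small vmas hurts THP, the
sketch below counts how many huge-page-sized, huge-page-aligned ranges
fit entirely inside a single vma. It is a simplified model, not the
kernel's logic: it assumes 2 MiB huge pages (x86-64 with 4 KiB base
pages) and assumes a collapse candidate must not cross a vma boundary.
The function name thp_candidates is hypothetical.

```python
HPAGE_SIZE = 2 * 1024 * 1024  # assumption: 2 MiB huge pages


def thp_candidates(vmas):
    """Count aligned huge-page-sized ranges that fit inside one vma.

    vmas is an iterable of (start, end) byte addresses, end exclusive.
    """
    total = 0
    for start, end in vmas:
        # Round the start up and the end down to huge-page boundaries.
        hstart = (start + HPAGE_SIZE - 1) // HPAGE_SIZE * HPAGE_SIZE
        hend = end // HPAGE_SIZE * HPAGE_SIZE
        if hend > hstart:
            total += (hend - hstart) // HPAGE_SIZE
    return total


if __name__ == "__main__":
    mib = 1024 * 1024
    one_vma = [(0, 8 * mib)]
    fragmented = [(i * mib, (i + 1) * mib) for i in range(8)]
    print(thp_candidates(one_vma), thp_candidates(fragmented))
```

Under these assumptions the same 8 MiB of address space offers several
candidates as one vma and none once it is cut into 1 MiB vmas, even
though the fragments are virtually contiguous.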

> > Currently once you have a nice heap all contiguous but with small
> > objects and you free the fragments you can't build THP anymore even if
> > you make the memory virtually contiguous again by calling mremap. That
> > just build up a ton of vmas slowing down the app forever and also
> > preventing THP collapsing ever again.
> 
> Shouldn't the THP background kthread take care of merging VMAs?

The solution can't depend on any THP feature, because the buildup of
vmas is a scalability issue and a performance regression over time
even if THP is not configured in the kernel. However, once that's
solved, THP also gets naturally optimized.

What should happen (in my view) is just the simplest solution that can
defrag and forcefully merge the vmas without having to stop or restart
the app.

> Presumably, those apps wouldn't issue the system call on address
> ranges managed with a non-kernel-fault UFFD.

So the new security feature won't have to block kernel faults on those
apps and they can run side by side with the blocked app?

> We shouldn't be fragmenting at all, either at the memory level or the
> VMA level. The GC is a moving collector, and we don't punch holes in
> the heap.

That sounds good.

Thanks,
Andrea


