Linux-Security-Module Archive on lore.kernel.org
 help / color / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Jann Horn <jannh@google.com>, Kees Cook <keescook@chromium.org>,
	Daniel Colascione <dancol@google.com>,
	Tim Murray <timmurray@google.com>,
	Nosh Minwalla <nosh@google.com>, Nick Kralevich <nnk@google.com>,
	Lokesh Gidra <lokeshgidra@google.com>,
	kernel list <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	SElinux list <selinux@vger.kernel.org>,
	Mike Rapoport <rppt@linux.ibm.com>,
	linux-security-module <linux-security-module@vger.kernel.org>
Subject: Re: [PATCH v2 0/6] Harden userfaultfd
Date: Wed, 12 Feb 2020 14:41:00 -0500
Message-ID: <20200212194100.GA29809@redhat.com> (raw)
In-Reply-To: <20200212171416.GD1083891@xz-x1>

Hello everyone,

On Wed, Feb 12, 2020 at 12:14:16PM -0500, Peter Xu wrote:
> Right. AFAICT QEMU uses it far more than disk IOs.  A guest page can
> be accessed by any kernel component on the destination host during a
> postcopy procedure.  It can be as simple as when a vcpu writes to a
> missing guest page which still resides on the source host, then KVM
> will get a page fault and trap into userfaultfd asking for that page.
> The same thing happens to other modules like vhost, etc., as long as a
> missing guest page is touched by a kernel module.

Correct.

How does the android garbage collection work to make sure there cannot
be kernel faults on the missing memory?

If I understood correctly (I didn't have much time to review sorry)
what's proposed with regard to limiting uffd events from kernel
faults, the only use case I know that could deal with it is the
UFFD_FEATURE_SIGBUS but that's not normal userfaultfd: that's also the
only feature required from uffd to implement a pure malloc library in
userland that never takes the mmap sem for writing to implement
userland mremap/mmap/munmap lib calls (as those will convert to
UFFDIO_ZEROPAGE and MADV_DONTNEED internally to the lib and there will
be always a single vma). We just need to extend UFFDIO_ZEROPAGE to map
the THP zeropage to make this future pure-uffd malloc lib perform
better.

On the other end I'm also planning a mremap_vma_merge userland syscall
that will merge fragmented vmas.

Currently once you have a nice heap all contiguous but with small
objects and you free the fragments you can't build THP anymore even if
you make the memory virtually contiguous again by calling mremap. That
just build up a ton of vmas slowing down the app forever and also
preventing THP collapsing ever again.

mremap_vma_merge will require no new kernel feature, but it
fundamentally must be able to handle kernel faults. If databases
starts to use that, how can you enable this feature without breaking
random apps then?

So it'd be a feature usable only by one user (Android) perhaps? And
only until you start defragging the vmas of small objects?

Thanks,
Andrea


  reply index

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200211225547.235083-1-dancol@google.com>
     [not found] ` <9ae20f6e-c5c0-4fd7-5b61-77218d19480b@schaufler-ca.com>
2020-02-11 23:27   ` Daniel Colascione
2020-02-12 16:09     ` Stephen Smalley
2020-02-21 17:56     ` James Morris
2020-02-12  7:50 ` Kees Cook
2020-02-12 16:54   ` Jann Horn
2020-02-12 17:14     ` Peter Xu
2020-02-12 19:41       ` Andrea Arcangeli [this message]
2020-02-12 20:04         ` Daniel Colascione
2020-02-12 23:41           ` Andrea Arcangeli
2020-02-12 17:12   ` Daniel Colascione
2020-02-14  3:26 ` [PATCH 0/3] SELinux support for anonymous inodes and UFFD Daniel Colascione
2020-02-14  3:26   ` [PATCH 1/3] Add a new LSM-supporting anonymous inode interface Daniel Colascione
2020-02-14  3:26   ` [PATCH 2/3] Teach SELinux about anonymous inodes Daniel Colascione
2020-02-14 16:39     ` Stephen Smalley
2020-02-14 17:21       ` Daniel Colascione
2020-02-14 18:02         ` Stephen Smalley
2020-02-14 18:08           ` Stephen Smalley
2020-02-14 20:24             ` Stephen Smalley
2020-02-14  3:26   ` [PATCH 3/3] Wire UFFD up to SELinux Daniel Colascione
2020-03-25 23:02   ` [PATCH v2 0/3] SELinux support for anonymous inodes and UFFD Daniel Colascione
2020-03-25 23:02   ` [PATCH v2 1/3] Add a new LSM-supporting anonymous inode interface Daniel Colascione
2020-03-26 13:53     ` Stephen Smalley
2020-03-25 23:02   ` [PATCH v2 2/3] Teach SELinux about anonymous inodes Daniel Colascione
2020-03-26 13:58     ` Stephen Smalley
2020-03-26 17:59       ` Daniel Colascione
2020-03-26 17:37     ` Stephen Smalley
2020-03-25 23:02   ` [PATCH v2 3/3] Wire UFFD up to SELinux Daniel Colascione
2020-03-25 23:49     ` Casey Schaufler
2020-03-26 18:14   ` [PATCH v3 0/3] SELinux support for anonymous inodes and UFFD Daniel Colascione
2020-03-26 18:14     ` [PATCH v3 1/3] Add a new LSM-supporting anonymous inode interface Daniel Colascione
2020-03-26 19:00       ` Stephen Smalley
2020-03-26 18:14     ` [PATCH v3 2/3] Teach SELinux about anonymous inodes Daniel Colascione
2020-03-26 19:02       ` Stephen Smalley
2020-03-26 18:14     ` [PATCH v3 3/3] Wire UFFD up to SELinux Daniel Colascione
2020-03-26 20:06     ` [PATCH v4 0/3] SELinux support for anonymous inodes and UFFD Daniel Colascione
2020-03-26 20:06       ` [PATCH v4 1/3] Add a new LSM-supporting anonymous inode interface Daniel Colascione
2020-03-27 13:40         ` Stephen Smalley
2020-03-26 20:06       ` [PATCH v4 2/3] Teach SELinux about anonymous inodes Daniel Colascione
2020-03-27 13:41         ` Stephen Smalley
2020-03-26 20:06       ` [PATCH v4 3/3] Wire UFFD up to SELinux Daniel Colascione
2020-04-01 21:39       ` [PATCH v5 0/3] SELinux support for anonymous inodes and UFFD Daniel Colascione
2020-04-01 21:39         ` [PATCH v5 1/3] Add a new LSM-supporting anonymous inode interface Daniel Colascione
2020-05-07 16:02           ` James Morris
2020-04-01 21:39         ` [PATCH v5 2/3] Teach SELinux about anonymous inodes Daniel Colascione
2020-04-01 21:39         ` [PATCH v5 3/3] Wire UFFD up to SELinux Daniel Colascione
2020-04-13 13:29         ` [PATCH v5 0/3] SELinux support for anonymous inodes and UFFD Daniel Colascione
2020-04-22 16:55           ` James Morris
2020-04-22 17:12             ` Casey Schaufler
2020-04-23 22:24               ` Casey Schaufler
2020-04-27 16:18                 ` Casey Schaufler
2020-04-27 16:48                   ` Stephen Smalley
2020-04-27 17:12                     ` Casey Schaufler
2020-04-29 17:02                     ` Stephen Smalley
2020-04-27 17:15             ` Casey Schaufler
2020-04-27 19:40               ` Stephen Smalley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200212194100.GA29809@redhat.com \
    --to=aarcange@redhat.com \
    --cc=dancol@google.com \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=lokeshgidra@google.com \
    --cc=nnk@google.com \
    --cc=nosh@google.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.ibm.com \
    --cc=selinux@vger.kernel.org \
    --cc=timmurray@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Security-Module Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-security-module/0 linux-security-module/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-security-module linux-security-module/ https://lore.kernel.org/linux-security-module \
		linux-security-module@vger.kernel.org
	public-inbox-index linux-security-module

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-security-module


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git