From: Daniel Colascione <firstname.lastname@example.org>
To: Andrea Arcangeli <email@example.com>
Cc: Peter Xu <firstname.lastname@example.org>,
    Jann Horn <email@example.com>,
    Kees Cook <firstname.lastname@example.org>,
    Tim Murray <email@example.com>,
    Nosh Minwalla <firstname.lastname@example.org>,
    Nick Kralevich <email@example.com>,
    Lokesh Gidra <firstname.lastname@example.org>,
    kernel list <email@example.com>,
    Linux API <firstname.lastname@example.org>,
    SElinux list <email@example.com>,
    Mike Rapoport <firstname.lastname@example.org>,
    linux-security-module <email@example.com>
Subject: Re: [PATCH v2 0/6] Harden userfaultfd
Date: Wed, 12 Feb 2020 12:04:39 -0800
Message-ID: <CAKOZuevusieaKCt5r-jnQ5ArGfw5Otszq2CAcrqFi6MYxkKwtA@mail.gmail.com>
In-Reply-To: <20200212194100.GA29809@redhat.com>

On Wed, Feb 12, 2020 at 11:41 AM Andrea Arcangeli <firstname.lastname@example.org> wrote:
>
> Hello everyone,
>
> On Wed, Feb 12, 2020 at 12:14:16PM -0500, Peter Xu wrote:
> > Right. AFAICT QEMU uses it far more than disk IOs. A guest page can
> > be accessed by any kernel component on the destination host during a
> > postcopy procedure. It can be as simple as when a vcpu writes to a
> > missing guest page which still resides on the source host, then KVM
> > will get a page fault and trap into userfaultfd asking for that page.
> > The same thing happens to other modules like vhost, etc., as long as a
> > missing guest page is touched by a kernel module.
>
> Correct.
>
> How does the android garbage collection work to make sure there cannot
> be kernel faults on the missing memory?

We don't pass pointers to the heap into system calls. (Big primitive
arrays, ByteBuffer, etc. are allocated off the regular heap.)

> If I understood correctly (I didn't have much time to review sorry)
> what's proposed with regard to limiting uffd events from kernel
> faults,

I don't understand what you mean.
The purpose of preventing UFFD from handling kernel faults is exploit
mitigation.

> the only use case I know that could deal with it is the
> UFFD_FEATURE_SIGBUS but that's not normal userfaultfd: that's also the
> only feature required from uffd to implement a pure malloc library in
> userland that never takes the mmap sem for writing to implement
> userland mremap/mmap/munmap lib calls (as those will convert to
> UFFDIO_ZEROPAGE and MADV_DONTNEED internally to the lib and there will
> be always a single vma). We just need to extend UFFDIO_ZEROPAGE to map
> the THP zeropage to make this future pure-uffd malloc lib perform
> better.

The key requirement here is the ability to prevent unprivileged
processes from using UFFD to widen kernel exploit windows by
preventing UFFD from taking kernel faults. Forcing unprivileged
processes to use UFFD only with UFFD_FEATURE_SIGBUS would satisfy this
requirement, but it's much less flexible and unnecessarily couples two
features.

> On the other end I'm also planning a mremap_vma_merge userland syscall
> that will merge fragmented vmas.

This is probably a separate discussion, but does that operation really
need to be a system call? Historically, userspace has operated mostly
on page ranges and not VMAs per se, and the kernel has been free to
merge and split VMAs as needed for its internal purposes. This
approach has serious negative side effects (like making munmap
fallible: see [1]), but it is what it is.

[1] https://lore.kernel.org/linux-mm/CAKOZuetOD6MkGPVvYFLj5RXh200FaDyu3sQqZviVRhTFFS3fjA@mail.gmail.com/

> Currently once you have a nice heap all contiguous but with small
> objects and you free the fragments you can't build THP anymore even if
> you make the memory virtually contiguous again by calling mremap. That
> just build up a ton of vmas slowing down the app forever and also
> preventing THP collapsing ever again.

Shouldn't the THP background kthread take care of merging VMAs?

> mremap_vma_merge will require no new kernel feature, but it
> fundamentally must be able to handle kernel faults. If databases
> starts to use that, how can you enable this feature without breaking
> random apps then?

Presumably, those apps wouldn't issue the system call on address
ranges managed with a non-kernel-fault UFFD.

> So it'd be a feature usable only by one user (Android) perhaps? And
> only until you start defragging the vmas of small objects?

We shouldn't be fragmenting at all, either at the memory level or the
VMA level. The GC is a moving collector, and we don't punch holes in
the heap.
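
The difference between refusing kernel-mode faults and forcing
UFFD_FEATURE_SIGBUS, as argued in the message above, can be sketched
with a toy Python model. This is not kernel code. The names
UffdContext, Outcome, and route_missing_fault are hypothetical and
exist only for illustration, and the model assumes that a refused
fault simply fails instead of being queued for the monitor; the
message itself does not spell out that detail.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Outcome(Enum):
    """What happens to a missing-page fault in this toy model."""
    DELIVER_TO_MONITOR = auto()  # queued on the uffd; the faulting thread waits
    FAIL_FAULT = auto()          # not queued; the access fails right away


@dataclass(frozen=True)
class UffdContext:
    """Hypothetical per-uffd configuration (illustrative names only)."""
    user_mode_only: bool = False   # assumed: uffd opted out of kernel-mode faults
    feature_sigbus: bool = False   # models UFFD_FEATURE_SIGBUS being enabled


def route_missing_fault(ctx: UffdContext, fault_from_user_mode: bool) -> Outcome:
    """Decide how a missing-page fault on a registered range is handled.

    Simplified assumptions:
    - With feature_sigbus set, no fault is ever delivered to the monitor.
    - With user_mode_only set, only faults taken in kernel mode are refused.
    """
    if ctx.feature_sigbus:
        return Outcome.FAIL_FAULT
    if ctx.user_mode_only and not fault_from_user_mode:
        return Outcome.FAIL_FAULT
    return Outcome.DELIVER_TO_MONITOR


if __name__ == "__main__":
    configs = {
        "default": UffdContext(),
        "user-mode-only": UffdContext(user_mode_only=True),
        "sigbus": UffdContext(feature_sigbus=True),
    }
    for name, ctx in configs.items():
        for from_user in (True, False):
            origin = "user" if from_user else "kernel"
            result = route_missing_fault(ctx, from_user).name
            print(f"{name:15s} {origin:6s} fault -> {result}")
```

In this model both restricted configurations refuse kernel-mode
faults, which is the exploit-mitigation property being asked for.
Only the user-mode-only configuration still delivers user-mode faults
to the monitor, which is one way to read the remark that tying the
restriction to UFFD_FEATURE_SIGBUS is less flexible and couples two
features.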