From: Lokesh Gidra
Date: Fri, 24 Jul 2020 07:46:02 -0700
Subject: Re: [PATCH 1/2] Add UFFD_USER_MODE_ONLY
To: "Michael S. Tsirkin"
Cc: Jonathan Corbet, Alexander Viro, Luis Chamberlain, Kees Cook, Iurii Zaikin, Mauro Carvalho Chehab, Andrew Morton, Andy Shevchenko, Vlastimil Babka, Mel Gorman, Sebastian Andrzej Siewior, Peter Xu, Andrea Arcangeli, Mike Rapoport, Jerome Glisse, Shaohua Li, linux-doc@vger.kernel.org, linux-kernel, Linux FS Devel, Tim Murray, Minchan Kim, Sandeep Patil, Daniel Colascione, Jeffrey Vander Stoep, Nick Kralevich, kernel@android.com, Kalesh Singh
In-Reply-To: <20200724100153-mutt-send-email-mst@kernel.org>
References: <20200423002632.224776-1-dancol@google.com> <20200423002632.224776-2-dancol@google.com> <20200724100153-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-doc@vger.kernel.org

On Fri, Jul 24, 2020 at 7:28 AM Michael S. Tsirkin wrote:
>
> On Wed, Apr 22, 2020 at 05:26:31PM -0700, Daniel Colascione wrote:
> > userfaultfd handles page faults from both user and kernel code. Add a
> > new UFFD_USER_MODE_ONLY flag for userfaultfd(2) that makes the
> > resulting userfaultfd object refuse to handle faults from kernel mode,
> > treating these faults as if SIGBUS were always raised, causing the
> > kernel code to fail with EFAULT.
> >
> > A future patch adds a knob allowing administrators to give some
> > processes the ability to create userfaultfd file objects only if they
> > pass UFFD_USER_MODE_ONLY, reducing the likelihood that these processes
> > will exploit userfaultfd's ability to delay kernel page faults to open
> > timing windows for future exploits.
> >
> > Signed-off-by: Daniel Colascione
>
> Something to add here is that there is separate work on SELinux to
> support limiting specific userspace programs to only this type of
> userfaultfd.
>
> I also think Kees' comment about documenting the threat being solved,
> including some links to external sources, still applies.
>
> Finally, a question:
>
> Is there any way at all to increase security without breaking
> the assumption that copy_from_user is the same as a userspace read?
>
> As an example of a drastic approach that might solve some issues, how
> about allocating some special memory and setting some VMA flag, then
> limiting copy from/to user to just this subset of virtual addresses?
> We can then do things like pin these pages in RAM, forbid
> madvise/userfaultfd for these addresses, etc.
>
> Affected userspace then needs to use a kind of bounce buffer for any
> calls into the kernel. This needs much more support from userspace and
> adds much more overhead, but on the flip side, it affects more ways
> userspace can slow down the kernel.
>
> Was this discussed in the past? Links would be appreciated.
>

Adding Nick and Jeff to the discussion.

> > ---
> >  fs/userfaultfd.c                 | 7 ++++++-
> >  include/uapi/linux/userfaultfd.h | 9 +++++++++
> >  2 files changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index e39fdec8a0b0..21378abe8f7b 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -418,6 +418,9 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
> >
> >          if (ctx->features & UFFD_FEATURE_SIGBUS)
> >                  goto out;
> > +        if ((vmf->flags & FAULT_FLAG_USER) == 0 &&
> > +            ctx->flags & UFFD_USER_MODE_ONLY)
> > +                goto out;
> >
> >          /*
> >           * If it's already released don't get it. This avoids to loop
> > @@ -2003,6 +2006,7 @@ static void init_once_userfaultfd_ctx(void *mem)
> >
> >  SYSCALL_DEFINE1(userfaultfd, int, flags)
> >  {
> > +        static const int uffd_flags = UFFD_USER_MODE_ONLY;
> >          struct userfaultfd_ctx *ctx;
> >          int fd;
> >
> > @@ -2012,10 +2016,11 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
> >          BUG_ON(!current->mm);
> >
> >          /* Check the UFFD_* constants for consistency. */
> > +        BUILD_BUG_ON(uffd_flags & UFFD_SHARED_FCNTL_FLAGS);
> >          BUILD_BUG_ON(UFFD_CLOEXEC != O_CLOEXEC);
> >          BUILD_BUG_ON(UFFD_NONBLOCK != O_NONBLOCK);
> >
> > -        if (flags & ~UFFD_SHARED_FCNTL_FLAGS)
> > +        if (flags & ~(UFFD_SHARED_FCNTL_FLAGS | uffd_flags))
> >                  return -EINVAL;
> >
> >          ctx = kmem_cache_alloc(userfaultfd_ctx_cachep, GFP_KERNEL);
> > diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> > index e7e98bde221f..5f2d88212f7c 100644
> > --- a/include/uapi/linux/userfaultfd.h
> > +++ b/include/uapi/linux/userfaultfd.h
> > @@ -257,4 +257,13 @@ struct uffdio_writeprotect {
> >          __u64 mode;
> >  };
> >
> > +/*
> > + * Flags for the userfaultfd(2) system call itself.
> > + */
> > +
> > +/*
> > + * Create a userfaultfd that can handle page faults only in user mode.
> > + */
> > +#define UFFD_USER_MODE_ONLY 1
> > +
> >  #endif /* _LINUX_USERFAULTFD_H */
> > --
> > 2.26.2.303.gf8c07b1a785-goog
> >
>