Linux-Doc Archive on lore.kernel.org
From: Nick Kralevich <nnk@google.com>
To: Lokesh Gidra <lokeshgidra@google.com>
Cc: Jeffrey Vander Stoep <jeffv@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Kees Cook <keescook@chromium.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Daniel Colascione <dancol@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Iurii Zaikin <yzaikin@google.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Shevchenko <andy.shevchenko@gmail.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Peter Xu <peterx@redhat.com>, Mike Rapoport <rppt@linux.ibm.com>,
	Jerome Glisse <jglisse@redhat.com>, Shaohua Li <shli@fb.com>,
	linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Tim Murray <timmurray@google.com>,
	Minchan Kim <minchan@google.com>,
	Sandeep Patil <sspatil@google.com>,
	kernel@android.com, Daniel Colascione <dancol@dancol.org>,
	Kalesh Singh <kaleshsingh@google.com>
Subject: Re: [PATCH 2/2] Add a new sysctl knob: unprivileged_userfaultfd_user_mode_only
Date: Thu, 23 Jul 2020 17:13:28 -0700
Message-ID: <CAFJ0LnGfrzvVgtyZQ+UqRM6F3M7iXOhTkUBTc+9sV+=RrFntyQ@mail.gmail.com>
In-Reply-To: <CA+EESO4kLaje0yTOyMSxHfSLC0n86zAF+M1DWB_XrwFDLOCawQ@mail.gmail.com>

On Thu, Jul 23, 2020 at 10:30 AM Lokesh Gidra <lokeshgidra@google.com> wrote:
> From the discussion so far it seems that there is a consensus that
> patch 1/2 in this series should be upstreamed in any case. Is there
> anything that is pending on that patch?

That's my reading of this thread too.

> > > Unless I'm mistaken that you can already enforce bit 1 of the second
> > > parameter of the userfaultfd syscall to be set with seccomp-bpf, this
> > > would be more a question to the Android userland team.
> > >
> > > The question would be: does it ever happen that a seccomp filter isn't
> > > already applied to unprivileged software running without
> > > > CAP_SYS_PTRACE capability?
> >
> > Yes.
> >
> > Android uses selinux as our primary sandboxing mechanism. We do use
> > seccomp on a few processes, but we have found that it has a
> > surprisingly high performance cost [1] on arm64 devices so turning it
> > on system wide is not a good option.
> >
> > [1] https://lore.kernel.org/linux-security-module/202006011116.3F7109A@keescook/T/#m82ace19539ac595682affabdf652c0ffa5d27dad

As Jeff mentioned, seccomp is used strategically on Android, but is
not applied to all processes. Applying it everywhere is too expensive,
and impractical when a simpler mechanism (such as this sysctl) can
exist. It's also significantly simpler to test a sysctl value for
correctness than to test a seccomp filter.
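
To illustrate the testing point, here is a minimal sketch of such a
check. It assumes the knob would be exposed at
/proc/sys/vm/unprivileged_userfaultfd_user_mode_only; that path is a
guess based on the patch subject and on where the existing
vm.unprivileged_userfaultfd sysctl lives, and the helper names are
hypothetical.

```python
from pathlib import Path

# Assumed path: inferred from the knob name in the patch subject and the
# location of the existing vm.unprivileged_userfaultfd sysctl.
KNOB = Path("/proc/sys/vm/unprivileged_userfaultfd_user_mode_only")


def read_sysctl_int(path):
    """Return the integer stored in a sysctl file, or None if the file
    is missing, unreadable, or does not contain an integer."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def knob_matches(path, expected):
    """True only when the knob exists and holds exactly `expected`."""
    return read_sysctl_int(path) == expected
```

A compliance test could then reduce to a single call such as
knob_matches(KNOB, 1), whereas verifying a seccomp filter generally
means exercising the filtered syscalls from inside the sandboxed
process.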

> > >
> > >
> > > If answer is "no" the behavior of the new sysctl in patch 2/2 (in
> > > subject) should be enforceable with minor changes to the BPF
> > > assembly. Otherwise it'd require more changes.

It would be good to understand what these changes are.

> > > Why exactly is it preferable to enlarge the surface of attack of the
> > > kernel and take the risk there is a real bug in userfaultfd code (not
> > > just a facilitation of exploiting some other kernel bug) that leads to
> > > a privilege escalation, when you still break 99% of userfaultfd users,
> > > if you set with option "2"?

I can see your point if you think about the feature as a whole.
However, distributions (such as Android) have specialized knowledge of
their security environments, and may not want to support the typical
usages of userfaultfd. For such distributions, providing a mechanism
to prevent userfaultfd from being useful as an exploit primitive,
while still allowing the very limited use of userfaultfd for userspace
faults only, is desirable. Distributions shouldn't be forced into
supporting 100% of the use cases envisioned by userfaultfd when their
needs may be more specialized, and this sysctl knob empowers
distributions to make this choice for themselves.

> > > Is the system owner really going to purely run on his systems CRIU
> > > postcopy live migration (which already runs with CAP_SYS_PTRACE) and
> > > nothing else that could break?

This is a great example of a capability which a distribution may not
want to support, due to distribution-specific security policies.

> > >
> > > Option "2" to me looks with a single possible user, and incidentally
> > > this single user can already enforce model "2" by only tweaking its
> > > seccomp-bpf filters without applying 2/2. It'd be a bug if android
> > > apps runs unprotected by seccomp regardless of 2/2.

Can you elaborate on what bug is present when processes run
unprotected by seccomp?

Seccomp cannot be universally applied on Android due to previously
mentioned performance concerns. Seccomp is used in Android primarily
as a tool to enforce the list of allowed syscalls, so that such
syscalls can be audited before being included as part of the Android
API.

-- Nick

-- 
Nick Kralevich | nnk@google.com


