From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from thejh.net ([37.221.195.125]:35229 "EHLO thejh.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755569AbcKBWrG (ORCPT );
	Wed, 2 Nov 2016 18:47:06 -0400
Date: Wed, 2 Nov 2016 23:47:01 +0100
From: Jann Horn
To: Oleg Nesterov
Cc: Alexander Viro , Roland McGrath , John Johansen , James Morris ,
	"Serge E. Hallyn" , Paul Moore , Stephen Smalley , Eric Paris ,
	Casey Schaufler , Kees Cook , Andrew Morton , Janis Danisevskis ,
	Seth Forshee , "Eric . Biederman" , Thomas Gleixner ,
	Benjamin LaHaise , Ben Hutchings , Andy Lutomirski , Linus Torvalds ,
	linux-fsdevel@vger.kernel.org, linux-security-module@vger.kernel.org,
	security@kernel.org
Subject: Re: [PATCH v2 4/8] futex: don't leak robust_list pointer
Message-ID: <20161102224701.GB13748@pc.thejh.net>
References: <1474663238-22134-1-git-send-email-jann@thejh.net>
	<1474663238-22134-5-git-send-email-jann@thejh.net>
	<20160930145256.GB12862@redhat.com>
	<20161030171650.GB2558@pc.thejh.net>
	<20161102213932.GA13748@pc.thejh.net>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="98e8jtXdkpgskNou"
Content-Disposition: inline
In-Reply-To: <20161102213932.GA13748@pc.thejh.net>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

--98e8jtXdkpgskNou
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Nov 02, 2016 at 10:39:32PM +0100, Jann Horn wrote:
> On Sun, Oct 30, 2016 at 06:16:50PM +0100, Jann Horn wrote:
> > On Fri, Sep 30, 2016 at 04:52:57PM +0200, Oleg Nesterov wrote:
> > > On 09/23, Jann Horn wrote:
> > > >
> > > > This prevents an attacker from determining the robust_list or
> > > > compat_robust_list userspace pointer of a process created by executing
> > > > a setuid binary. Such an attack could be performed by racing
> > > > get_robust_list() with a setuid execution.
> > > > The impact of this issue is that
> > > > an attacker could theoretically bypass ASLR when attacking setuid binaries.
> > >
> > > Well. I am not sure this actually needs a fix, but I won't argue.
> > >
> > > I can't really understand what this patch actually fixes,
> > >
> > > > @@ -3007,31 +3007,43 @@ SYSCALL_DEFINE3(get_robust_list, int, pid,
> > > >      if (!futex_cmpxchg_enabled)
> > > >          return -ENOSYS;
> > > >
> > > > -    rcu_read_lock();
> > > > -
> > > > -    ret = -ESRCH;
> > > > -    if (!pid)
> > > > +    if (!pid) {
> > > >          p = current;
> > > > -    else {
> > > > +        get_task_struct(p);
> > > > +    } else {
> > > > +        rcu_read_lock();
> > > >          p = find_task_by_vpid(pid);
> > > > +        /* pin the task to permit dropping the RCU read lock before
> > > > +         * acquiring the mutex
> > > > +         */
> > > > +        if (p)
> > > > +            get_task_struct(p);
> > > > +        rcu_read_unlock();
> > > >          if (!p)
> > > > -            goto err_unlock;
> > > > +            return -ESRCH;
> > > >      }
> > > >
> > > > +    ret = mutex_lock_killable(&p->signal->cred_guard_light);
> > > > +    if (ret)
> > > > +        goto err_put;
> > > > +
> > > >      ret = -EPERM;
> > > >      if (!ptrace_may_access(p, PTRACE_MODE_READ_REALCREDS))
> > > >          goto err_unlock;
> > > >
> > > >      head = p->robust_list;
> > > > -    rcu_read_unlock();
> > >
> > > OK, suppose it races with setuid exec, and mutex_lock_killable() +
> > > ptrace_may_access() comes after flush_old_exec() but before
> > > install_exec_creds(), in this case ptrace_may_access() can wrongly
> > > succeed.
> >
> > I take cred_guard_light in flush_old_exec() and release it in
> > install_exec_creds(), so that shouldn't work, I think.
> >
> >
> > > In theory, it is possible that the execing thread can complete exec,
> > > return to user-mode and call sys_set_robust_list() before we read
> > > head = p->robust_list. Yes, this is unlikely, but unless I am totally
> > > confused the race you are trying to fix is equally unlikely?
> > >
> > > perhaps we can make a much simpler change to prevent this, see below.
> > > We can rely on fact that both ptrace_may_access() and exec_mmap()
> > > takes the same task_lock(). Sure, this can "leak" robust_list too,
> > > a set-uid binary can exec and/or lower its credentials after we
> > > read p->robust_list, but personally I think we do not care.
> > >
> > > Or I missed something else?
> >
> > No - I think your patch would work, too, apart from the potential
> > leak you mentioned.
>
> Changing my opinion:
>
> This does not just affect setuid binaries. It also affects daemons like
> cron and atd that execute processes with dropped privileges.
>
> This is how atd runs jobs (strace output, with irrelevant stuff removed):
>
> [...]
> clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa81b1099d0) = 14915
> Process 14915 attached
> [...]
> [pid 14915] set_robust_list(0x7fa81b1099e0, 24) = 0
> [...]
> [pid 14915] setregid(0, 1) = 0
> [pid 14915] setreuid(0, 1) = 0
> [pid 14915] close(0) = 0
> [pid 14915] close(1) = 0
> [pid 14915] close(2) = 0
> [pid 14915] clone(Process 14916 attached
> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa81b1099d0) = 14916
> [pid 14916] set_robust_list(0x7fa81b1099e0, 24) = 0
> [pid 14915] wait4(14916,
> [pid 14916] lseek(6, 0, SEEK_SET) = 0
> [pid 14916] dup2(6, 0) = 0
> [pid 14916] dup2(5, 1) = 1
> [pid 14916] dup2(5, 2) = 2
> [pid 14916] close(6) = 0
> [pid 14916] close(5) = 0
> [pid 14916] setreuid(1, 0) = 0
> [pid 14916] setregid(1, 0) = 0
> [...]
> [pid 14916] setgroups(13, [1000, [...]]) = 0
> [pid 14916] setgid(1000) = 0
> [pid 14916] setuid(1000) = 0
> [pid 14916] chdir("/") = 0
> [pid 14916] execve("/bin/sh", ["sh"], [/* 0 vars */]) = 0
> [...]
>
> Basically, you can see that the pointer 0x7fa81b1099e0, which reveals
> information about the address space layout, is the robust list of pid 14916
> when it calls execve(), and after that execve() call, pid 14916 will be
> ptraceable for the user (modulo LSMs).
>
> So I think that my patch is a bit safer. Yes, there aren't many local
> daemons whose address space layout you can discover this way, but it's still
> not great.

I think my previous message wasn't very clear about what I think the issue is.

Basically, here, it would be plausible for uid 1000 to be able to determine
the pre-execve() robust_list pointer of pid 14916 by racing get_robust_list()
during the execve(). That itself isn't a big issue because the memory mappings
of pid 14916 are thrown away during the execve(), but what is potentially
interesting to an attacker is that before the execve(), pid 14916 shared its
address space layout with its parents, including the atd daemon. So if an
attacker has a vulnerability in atd but needs an address leak in order to
exploit it, this would be such a leak.
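
For readers who have not used the syscall under discussion, here is a
minimal userspace sketch (not part of the patch or of the thread) of how a
robust_list pointer is read. The helper names query_robust_list() and
print_robust_list() are made up for illustration; glibc ships no wrapper for
get_robust_list(), so the sketch assumes Linux and goes through syscall(2).
With pid 0 the kernel reports the calling thread's own list head; with any
other pid the request is subject to the ptrace_may_access() check visible in
the quoted diff.

```c
#define _GNU_SOURCE
#include <linux/futex.h>   /* struct robust_list_head */
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>   /* SYS_get_robust_list */
#include <sys/types.h>
#include <unistd.h>        /* syscall() */

/*
 * Hypothetical helper: thin wrapper around the raw get_robust_list syscall.
 * Returns 0 on success, -1 with errno set on failure.
 */
long query_robust_list(pid_t pid, struct robust_list_head **head, size_t *len)
{
    return syscall(SYS_get_robust_list, pid, head, len);
}

/*
 * Hypothetical helper: print the robust list head address of a task.
 * pid 0 means "the calling thread". Returns 0 on success, -1 on failure.
 */
int print_robust_list(pid_t pid)
{
    struct robust_list_head *head = NULL;
    size_t len = 0;

    if (query_robust_list(pid, &head, &len) != 0) {
        perror("get_robust_list");
        return -1;
    }

    /* The address is a userspace pointer inside the target's address space. */
    printf("pid %d: robust_list head=%p len=%zu\n",
           (int)pid, (void *)head, len);
    return 0;
}
```

Calling print_robust_list(0) from main() prints the caller's own pointer.
The printed address lies inside the task's own mappings, which is why
disclosing it for another task weakens ASLR.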
--98e8jtXdkpgskNou--