From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752903Ab1GZBQx (ORCPT <rfc822;w@1wt.eu>);
	Mon, 25 Jul 2011 21:16:53 -0400
Received: from mother.openwall.net ([195.42.179.200]:36724 "HELO
	mother.openwall.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with SMTP id S1752219Ab1GZBQq (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 25 Jul 2011 21:16:46 -0400
Date: Tue, 26 Jul 2011 05:16:29 +0400
From: Solar Designer <solar@openwall.com>
To: NeilBrown <neilb@suse.de>
Cc: Vasiliy Kulikov <segoon@openwall.com>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Stephen Smalley <sds@tycho.nsa.gov>,
        kernel-hardening@lists.openwall.com, James Morris <jmorris@namei.org>,
        linux-kernel@vger.kernel.org, Greg Kroah-Hartman <gregkh@suse.de>,
        Andrew Morton <akpm@linux-foundation.org>,
        "David S. Miller" <davem@davemloft.net>, Jiri Slaby <jslaby@suse.cz>,
        Alexander Viro <viro@zeniv.linux.org.uk>,
        linux-fsdevel@vger.kernel.org,
        KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
        Eric Paris <eparis@redhat.com>, Willy Tarreau <w@1wt.eu>,
        Sebastian Krahmer <krahmer@suse.de>
Subject: Re: [kernel-hardening] [PATCH] move RLIMIT_NPROC check from set_user() to do_execve_common()
Message-ID: <20110726011629.GA24610@openwall.com>
References: <1310738313.30257.27.camel@moss-pluto> <20110715152641.GA6286@albatros> <1310759683.30257.123.camel@moss-pluto> <20110721140936.632d2c8b@notabene.brown> <20110721124830.GA1325@openwall.com> <CA+55aFy4ybatb0McQr2f4A56ympZVLqK2sbp2=73idu58_4RxQ@mail.gmail.com> <20110721193939.GA3914@openwall.com> <20110725171423.GA3739@albatros> <20110725234013.GB24110@openwall.com> <20110726104713.37273143@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110726104713.37273143@notabene.brown>
User-Agent: Mutt/1.4.2.3i
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 26, 2011 at 10:47:13AM +1000, NeilBrown wrote:
> On Tue, 26 Jul 2011 03:40:13 +0400 Solar Designer <solar@openwall.com> wrote:
> > On Mon, Jul 25, 2011 at 09:14:23PM +0400, Vasiliy Kulikov wrote:
> > > @@ -1433,6 +1433,19 @@ static int do_execve_common(const char *filename,
> > >  	struct files_struct *displaced;
> > >  	bool clear_in_exec;
> > >  	int retval;
> > > +	const struct cred *cred = current_cred();
> > > +
> > > +	/*
> > > +	 * We move the actual failure in case of RLIMIT_NPROC excess from
> > > +	 * set*uid() to execve() because too many poorly written programs
> > > +	 * don't check setuid() return code.  Here we additionally recheck
> > > +	 * whether NPROC limit is still exceeded.
> > > +	 */
> > > +	if ((current->flags & PF_NPROC_EXCEEDED) &&
> > > +	    atomic_read(&cred->user->processes) > rlimit(RLIMIT_NPROC)) {
> > > +		retval = -EAGAIN;
> > > +		goto out_ret;
> > > +	}
> > 
> > Do you possibly need:
> > 
> > 	current->flags &= ~PF_NPROC_EXCEEDED;
> > 
> > somewhere after this point?
> > 
> > I think it's weird to have past set_user() failure affect other than the
> > very next execve().
> 
> So we are hoping that no program uses execvp() or similar...

Why?  No, we don't, unless I am missing something.

> Maybe that is
> reasonable but "in for a penny, in for a pound" - I'd fail them all.
> 
> I think the flag should only be cleared once we notice that the limit is no
> longer exceeded.  So clearing the flag can appear *after* the code you quote
> above, but not in the middle of it.

Definitely.  In case execve() fails because of the limit, the flag
remains set, so a second execve() by the process will fail too.
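
To make the intended flag lifetime concrete, here is a minimal userspace
model of the check.  It is only a sketch, not kernel code: struct
task_model and execve_nproc_check() are made-up names, and the fields
merely stand in for PF_NPROC_EXCEEDED, cred->user->processes and
rlimit(RLIMIT_NPROC).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
struct task_model {
	bool nproc_exceeded;	/* stands in for PF_NPROC_EXCEEDED */
	int user_processes;	/* stands in for cred->user->processes */
	int nproc_limit;	/* stands in for rlimit(RLIMIT_NPROC) */
};

/* Returns 0 if execve() may proceed, -EAGAIN if it has to fail. */
int execve_nproc_check(struct task_model *t)
{
	assert(t != NULL);

	/* Flag set and limit still exceeded: fail and keep the flag. */
	if (t->nproc_exceeded && t->user_processes > t->nproc_limit)
		return -EAGAIN;

	/* Limit is not (or no longer) exceeded: drop the flag. */
	t->nproc_exceeded = false;
	return 0;
}
```

In this model the flag is cleared only after the comparison, so repeated
execve() attempts keep failing for as long as the limit is exceeded.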

> > Perhaps also reset the flag on fork() because we have an RLIMIT_NPROC
> > check on fork() anyway.
> 
> I agree it should be cleared here too.

Great.  Just to clarify my own words: on fork(), clear the flag in the
child process only.
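
A rough userspace model of that fork() behaviour, again with made-up
names (fork_model(), struct task_model, struct user_model) and ignoring
the capability-based exceptions the real fork() path has:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
struct task_model {
	bool nproc_exceeded;	/* stands in for PF_NPROC_EXCEEDED */
};

struct user_model {
	int processes;		/* processes currently owned by the user */
	int nproc_limit;	/* stands in for RLIMIT_NPROC */
};

/* Returns 0 and fills in *child, or -EAGAIN if the limit is hit. */
int fork_model(const struct task_model *parent, struct user_model *user,
	       struct task_model *child)
{
	assert(parent != NULL && user != NULL && child != NULL);

	/* fork() has its own RLIMIT_NPROC check. */
	if (user->processes >= user->nproc_limit)
		return -EAGAIN;

	user->processes++;
	*child = *parent;		/* the child starts as a copy ... */
	child->nproc_exceeded = false;	/* ... with the flag cleared */
	return 0;			/* the parent's flag is untouched */
}
```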

> But there is still the issue of 'zygot' like services....

Here's my take on it:

1. It is not known (from the discussion so far) whether Android/Zygote
even cares about RLIMIT_NPROC specifically or not.  The code is very
generic, usable for any rlimits, and the rationale behind it might have
been to be able to apply certain other limits.  I don't know whether or
not there exists a system that actually sets RLIMIT_NPROC via that
mechanism and expects it to work.

2. If desired, Android/Zygote will be able to check the
PF_NPROC_EXCEEDED flag, via procfs or via a prctl() interface that we
might introduce.  Or it may simply go through an extra fork(), which has
its own RLIMIT_NPROC check.
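
One possible reading of the extra fork() option, as a userspace sketch:
after switching uid, the service forks once more and runs the
application in the new child, so that fork()'s RLIMIT_NPROC check is
applied to the new uid.  run_behind_extra_fork() and
hypothetical_app_main() are invented for illustration; this is not
taken from the Zygote code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the application's entry point. */
int hypothetical_app_main(void)
{
	return 7;
}

/*
 * Run payload() in a freshly forked child.  Returns the child's exit
 * status, or -1 if fork()/waitpid() failed or the child did not exit
 * normally.  When fork() fails, errno is left as set by fork(), e.g.
 * EAGAIN if a process limit was hit.
 */
int run_behind_extra_fork(int (*payload)(void))
{
	pid_t pid;
	int status;

	assert(payload != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(payload());	/* child: run the application */

	/* Parent: wait for the child, retrying if interrupted. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would treat a -1 return with errno == EAGAIN as "limit
exceeded" and refuse to start the application.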

> Let me try another suggestion.  Instead of catching the error in
> do_execve_common, how about we catch it in do_mmap_pgoff.
> i.e. if the flag is set and an attempt is made to create an executable
> mapping, we check the user->processes against the limit then - either failing
> or clearing the flag and succeeding.
> 
> This will stop an execve, and an attempt to load a shared library and call it.

This sounds too hackish to me, although if others are (unexpectedly) OK
with it, I don't mind.
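
For reference, my understanding of that proposal as a userspace model
(made-up names again; the -EAGAIN return value is my assumption, as the
proposal does not name an error code):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Userspace model only; none of these names exist in the kernel. */
struct task_model {
	bool nproc_exceeded;	/* stands in for PF_NPROC_EXCEEDED */
	int user_processes;	/* stands in for user->processes */
	int nproc_limit;	/* stands in for rlimit(RLIMIT_NPROC) */
};

/* Returns 0 if the mapping may be created, -EAGAIN otherwise. */
int exec_mapping_check(struct task_model *t, int prot)
{
	assert(t != NULL);

	/* Only executable mappings by flagged tasks are of interest. */
	if (!(prot & PROT_EXEC) || !t->nproc_exceeded)
		return 0;

	/* Limit still exceeded: refuse the mapping, keep the flag. */
	if (t->user_processes > t->nproc_limit)
		return -EAGAIN;	/* assumed error code */

	/* Limit no longer exceeded: clear the flag and succeed. */
	t->nproc_exceeded = false;
	return 0;
}
```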

Thanks,

Alexander