linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: <linux-api@vger.kernel.org>
Cc: "Etienne Dechamps" <etienne@edechamps.fr>,
	"Alexey Gladkov" <legion@kernel.org>,
	"Kees Cook" <keescook@chromium.org>,
	"Shuah Khan" <shuah@kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Solar Designer" <solar@openwall.com>,
	"Ran Xiaokai" <ran.xiaokai@zte.com.cn>,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	"Linux Containers" <containers@lists.linux-foundation.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	security@kernel.org, "Neil Brown" <neilb@cse.unsw.edu.au>,
	NeilBrown <neilb@suse.de>, "Serge E. Hallyn" <serge@hallyn.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Jann Horn" <jannh@google.com>
Subject: How should rlimits, suid exec, and capabilities interact?
Date: Wed, 23 Feb 2022 12:00:16 -0600	[thread overview]
Message-ID: <87fso91n0v.fsf_-_@email.froward.int.ebiederm.org> (raw)
In-Reply-To: <87zgmi5rhm.fsf@email.froward.int.ebiederm.org> (Eric W. Biederman's message of "Tue, 22 Feb 2022 18:57:57 -0600")


[CC'd the security list because I really don't know who the right people
 are to drag into this discussion]
 
While looking at some issues that have cropped up with making it so
that RLIMIT_NPROC cannot be escaped by creating a user namespace I have
stumbled upon a very old issue of how rlimits and suid exec interact
poorly.

This specific saga starts with commit 909cc4ae86f3 ("[PATCH] Fix two
bugs with process limits (RLIMIT_NPROC)") from
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git which
essentially replaced a capable() check with a an open-coded
implementation of suser(), for RLIMIT_NPROC.

The description from Neil Brown was:

   1/ If a setuid process swaps it's real and effective uids and then forks,
     the fork fails if the new realuid has more processes
     than the original process was limited to.
     This is particularly a problem if a user with a process limit
     (e.g. 256) runs a setuid-root program which does setuid() + fork()
     (e.g. lprng) while root already has more than 256 process (which
     is quite possible).
    
     The root problem here is that a limit which should be a per-user
     limit is being implemented as a per-process limit with
     per-process (e.g. CAP_SYS_RESOURCE) controls.
     Being a per-user limit, it should be that the root-user can over-ride
     it, not just some process with CAP_SYS_RESOURCE.
    
     This patch adds a test to ignore process limits if the real user is root.

The test to see if the real user is root was:
	if (p->real_cred->user != INIT_USER) ...
which persists to this day in fs/fork.c:copy_process().

The practical problem with this test is that it works like nothing else
in the kernel, and so does not look like what it is.  Saying:
	if (!uid_eq(p->real_cred->uid, GLOBAL_ROOT_USER)) ...

would at least be more recognizable.

Really this entire test should be if (!capable(CAP_SYS_RESOURCE) because
CAP_SYS_RESOURCE is the capability that controls if you are allowed to
exceed your rlimits.

Which brings us to the practical issues of how all of these things are
wired together today.

The per-user rlimits are accounted based upon a processes real user, not
the effective user.  All other permission checks are based upon the
effective user.  This has the practical effect that uids are swapped as
above that the processes are charged to root, but use the permissions of
an ordinary user.

The problems get worse when you realize that suid exec does not reset
any of the rlimits except for RLIMIT_STACK.

The rlimits that are particularly affected and are per-user are:
RLIMIT_NPROC, RLIMIT_MSGQUEUE, RLIMIT_SIGPENDING, RLIMIT_MEMLOCK.

But I think failing to reset rlimits during exec has the potential to
effect any suid exec.

Does anyone have any historical knowledge or sense of how this should
work?

Right now it feels like we have coded ourselves into a corner and will
have to risk breaking userspace to get out of it.  AKA I think we need
a policy of reseting rlimits on suid exec, and I think we need to store
global rlimits based upon the effective user not the real user.  Those
changes should allow making capable calls where they belong, and
removing the much too magic user == INIT_USER test for RLIMIT_NPROC.

Eric

  reply	other threads:[~2022-02-23 18:00 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-07 12:17 [RFC PATCH 0/6] RLIMIT_NPROC in ucounts fixups Michal Koutný
2022-02-07 12:17 ` [RFC PATCH 1/6] set_user: Perform RLIMIT_NPROC capability check against new user credentials Michal Koutný
2022-02-10  1:14   ` Solar Designer
2022-02-10  1:57     ` Eric W. Biederman
2022-02-11 20:32     ` Eric W. Biederman
2022-02-12 22:14       ` Solar Designer
2022-02-15 11:55     ` Michal Koutný
2022-02-07 12:17 ` [RFC PATCH 2/6] set*uid: Check RLIMIT_PROC against new credentials Michal Koutný
2022-02-07 12:17 ` [RFC PATCH 3/6] cred: Count tasks by their real uid into RLIMIT_NPROC Michal Koutný
2022-02-07 12:17 ` [RFC PATCH 4/6] ucounts: Allow root to override RLIMIT_NPROC Michal Koutný
2022-02-10  0:21   ` Eric W. Biederman
2022-02-07 12:17 ` [RFC PATCH 5/6] selftests: Challenge RLIMIT_NPROC in user namespaces Michal Koutný
2022-02-10  1:22   ` Shuah Khan
2022-02-15  9:45     ` Michal Koutný
2022-02-07 12:18 ` [RFC PATCH 6/6] selftests: Test RLIMIT_NPROC in clone-created " Michal Koutný
2022-02-10  1:25   ` Shuah Khan
2022-02-15  9:34     ` Michal Koutný
2022-02-08 13:54 ` [RFC PATCH 0/6] RLIMIT_NPROC in ucounts fixups Eric W. Biederman
2022-02-11  2:01 ` [PATCH 0/8] ucounts: RLIMIT_NPROC fixes Eric W. Biederman
2022-02-11  2:13   ` [PATCH 1/8] ucounts: Fix RLIMIT_NPROC regression Eric W. Biederman
2022-02-14 18:37     ` Michal Koutný
2022-02-16 15:22       ` Eric W. Biederman
2022-02-11  2:13   ` [PATCH 2/8] ucounts: Fix set_cred_ucounts Eric W. Biederman
2022-02-15 11:10     ` Michal Koutný
2022-02-11  2:13   ` [PATCH 3/8] ucounts: Fix and simplify RLIMIT_NPROC handling during setuid()+execve Eric W. Biederman
2022-02-12 23:17     ` Solar Designer
2022-02-14 15:10       ` Eric W. Biederman
2022-02-14 17:43         ` Eric W. Biederman
2022-02-15 10:25         ` Michal Koutný
2022-02-16 15:35           ` Eric W. Biederman
2022-02-11  2:13   ` [PATCH 4/8] ucounts: Only except the root user in init_user_ns from RLIMIT_NPROC Eric W. Biederman
2022-02-15 10:54     ` Michal Koutný
2022-02-16 15:41       ` Eric W. Biederman
2022-02-11  2:13   ` [PATCH 5/8] ucounts: Handle wrapping in is_ucounts_overlimit Eric W. Biederman
2022-02-12 22:36     ` Solar Designer
2022-02-14 15:23       ` Eric W. Biederman
2022-02-14 15:23         ` Eric W. Biederman
2022-02-15 11:25         ` Michal Koutný
2022-02-14 17:16       ` David Laight
2022-02-11  2:13   ` [PATCH 6/8] ucounts: Handle inc_rlimit_ucounts wrapping in fork Eric W. Biederman
2022-02-11 11:34     ` Alexey Gladkov
2022-02-11 17:50       ` Eric W. Biederman
2022-02-11 18:32         ` Shuah Khan
2022-02-11 18:40         ` Alexey Gladkov
2022-02-11 19:56           ` Eric W. Biederman
2022-02-11  2:13   ` [PATCH 7/8] rlimit: For RLIMIT_NPROC test the child not the parent for capabilites Eric W. Biederman
2022-02-11  2:13   ` [PATCH 8/8] ucounts: Use the same code to enforce RLIMIT_NPROC in fork and exec Eric W. Biederman
2022-02-11 18:22   ` [PATCH 0/8] ucounts: RLIMIT_NPROC fixes Shuah Khan
2022-02-11 19:23     ` Eric W. Biederman
2022-02-15 11:37     ` Michal Koutný
2022-02-16 15:56   ` [PATCH v2 0/5] " Eric W. Biederman
2022-02-16 15:58     ` [PATCH v2 1/5] rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user Eric W. Biederman
2022-02-16 17:42       ` Solar Designer
2022-02-16 15:58     ` [PATCH v2 2/5] ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1 Eric W. Biederman
2022-02-16 15:58     ` [PATCH v2 3/5] ucounts: Base set_cred_ucounts changes on the real user Eric W. Biederman
2022-02-16 15:58     ` [PATCH v2 4/5] ucounts: Move RLIMIT_NPROC handling after set_user Eric W. Biederman
2022-02-16 15:58     ` [PATCH v2 5/5] ucounts: Handle wrapping in is_ucounts_overlimit Eric W. Biederman
2022-02-16 17:28       ` Shuah Khan
2022-02-18 15:34     ` [GIT PULL] ucounts: RLIMIT_NPROC fixes for v5.17 Eric W. Biederman
2022-02-20 19:05       ` pr-tracker-bot
2022-03-03  0:12       ` [GIT PULL] ucounts: Regression fix " Eric W. Biederman
2022-03-03  0:30         ` pr-tracker-bot
2022-02-12 15:32 ` [RFC PATCH 0/6] RLIMIT_NPROC in ucounts fixups Etienne Dechamps
2022-02-15 10:11   ` Michal Koutný
2022-02-23  0:57     ` Eric W. Biederman
2022-02-23 18:00       ` Eric W. Biederman [this message]
2022-02-23 19:44         ` How should rlimits, suid exec, and capabilities interact? Andy Lutomirski
2022-02-23 21:28           ` Willy Tarreau
2022-02-23 19:50         ` Linus Torvalds
2022-02-24  1:24           ` Eric W. Biederman
2022-02-24  1:41             ` Linus Torvalds
2022-02-24  2:12               ` Eric W. Biederman
2022-02-24 15:41                 ` [PATCH] ucounts: Fix systemd LimigtNPROC with private users regression Eric W. Biederman
2022-02-24 16:28                   ` Kees Cook
2022-02-24 18:53                     ` Michal Koutný
2022-02-25  0:29                     ` Eric W. Biederman
2022-02-24  3:00               ` How should rlimits, suid exec, and capabilities interact? David Laight
2022-02-24  1:32           ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fso91n0v.fsf_-_@email.froward.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=brauner@kernel.org \
    --cc=containers@lists.linux-foundation.org \
    --cc=etienne@edechamps.fr \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=legion@kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mkoutny@suse.com \
    --cc=neilb@cse.unsw.edu.au \
    --cc=neilb@suse.de \
    --cc=ran.xiaokai@zte.com.cn \
    --cc=security@kernel.org \
    --cc=serge@hallyn.com \
    --cc=shuah@kernel.org \
    --cc=solar@openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).