From: ebiederm@xmission.com (Eric W. Biederman)
To: Kees Cook
Cc: Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski,
    Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi,
    Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    kernel-hardening@lists.openwall.com
Subject: Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled
Date: Fri, 22 Jan 2016 21:02:40 -0600
Message-ID: <8737tp0zhr.fsf@x220.int.ebiederm.org>
In-Reply-To: <1453502345-30416-1-git-send-email-keescook@chromium.org>
        (Kees Cook's message of "Fri, 22 Jan 2016 14:39:03 -0800")
References: <1453502345-30416-1-git-send-email-keescook@chromium.org>

Kees Cook writes:

> There continues to be unexpected side-effects and security exposures
> via CLONE_NEWUSER. For many end-users running distro kernels with
> CONFIG_USER_NS enabled, there is no way to disable this feature when
> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so
> admins not running containers or Chrome can avoid the risks of this
> feature.

I don't actually think there do continue to be unexpected side-effects
and security exposures with CLONE_NEWUSER. It takes a while for all of
the fixes to trickle out to distros. At most what I have seen recently
are problems with other kernel interfaces being amplified with user
namespaces. AKA the current mess with devpts, and the unexpected issues
with bind mounts in mount namespaces.

I have a couple of concerns with a sysctl.

1) As user namespaces settle out, this sysctl has the potential to
   decrease the security of the system overall, as sandboxing features
   of the kernel will not be available to unprivileged applications.
   Web browsing with Chrome will be less safe, for example.

2) I strongly suspect the granularity of a sysctl is wrong for access
   to user namespaces on a production system. In general I suspect
   what we want is something like seccomp. I believe all of the
   relevant bits are in registers. I actually thought that was enough
   for seccomp. Does seccomp not work for some reason?

3) A sysctl breeds a false sense of security in thinking that if a
   security issue is discovered you can just flip a switch, disable
   all new user namespaces, and you won't be vulnerable.

   In fact most of the issues in the past have only required being in
   a user namespace to trigger, which means any containers or user
   namespaces that already exist could be used to exploit any newly
   found issue. So I don't think a sysctl will give the desired level
   of protection.

   In my analysis of the issues to date I don't know of anything short
   of a reboot that would meaningfully remove the threat.

4) With applications like Docker coming on-line, I don't think a
   restriction to processes with capabilities is actually meaningful
   for restricting access to user namespaces.

So I have concerns about both efficacy and usability with the proposed
sysctl.

So, to keep this productive: please tell me about the threat model you
envision, and how you envision knobs in the kernel being used to
counter those threats.

Eric
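
To make the seccomp alternative in point 2 concrete, here is a minimal
sketch of a per-process filter that refuses CLONE_NEWUSER. It relies on
the flags word being the first syscall argument of clone() and
unshare(), which holds on x86_64 and aarch64 and is what "the relevant
bits are in registers" amounts to. The function name
install_no_userns_filter() and the SKETCH_AUDIT_ARCH macro are
hypothetical, chosen for illustration. The clone3() rule is an addition
for newer kernels, where the flags live in memory that a seccomp filter
cannot read.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>          /* CLONE_NEWUSER, unshare() */
#include <stddef.h>         /* offsetof */
#include <sys/prctl.h>      /* prctl(), PR_SET_NO_NEW_PRIVS, PR_SET_SECCOMP */
#include <sys/syscall.h>    /* __NR_clone, __NR_unshare */
#include <linux/audit.h>    /* AUDIT_ARCH_* */
#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, BPF_JUMP */
#include <linux/seccomp.h>  /* struct seccomp_data, SECCOMP_RET_* */

/* Assumption: only these two little-endian ABIs, where flags is args[0]. */
#if defined(__x86_64__)
#define SKETCH_AUDIT_ARCH AUDIT_ARCH_X86_64
#elif defined(__aarch64__)
#define SKETCH_AUDIT_ARCH AUDIT_ARCH_AARCH64
#else
#error "sketch covers x86_64 and aarch64 only"
#endif

#ifndef __NR_clone3
#define __NR_clone3 435     /* older headers lack it */
#endif

/* Hypothetical helper: returns 0 on success, -1 with errno set on failure. */
int install_no_userns_filter(void)
{
    struct sock_filter filter[] = {
        /* [0-2] kill anything that is not the expected syscall ABI */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SKETCH_AUDIT_ARCH, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),

        /* [3] load the syscall number */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),

        /* [4-5] clone3 passes flags in memory; report ENOSYS so callers
         * fall back to clone(), which can be inspected */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone3, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | ENOSYS),

        /* [6-8] only unshare and clone get their flags checked */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_unshare, 2, 0),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),

        /* [9-12] low 32 bits of args[0] hold the flags (little-endian) */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, args[0])),
        BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, CLONE_NEWUSER, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

After a successful call, unshare(CLONE_NEWUSER) in that process and its
descendants fails with EPERM, while ordinary fork() and thread creation
keep working. The granularity is per process tree rather than
system-wide, which is the contrast with a sysctl that point 2 draws. As
point 3 notes, this does nothing about user namespaces that already
exist. The sketch is also not hardened: on x86_64 it does not handle
x32 syscall numbers, for instance.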