From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds
Cc: Konstantin Khlebnikov, "H. Peter Anvin", Andy Lutomirski,
	security@debian.org, "security@kernel.org", Al Viro,
	"security@ubuntu.com >> security", Peter Hurley, Serge Hallyn,
	Willy Tarreau, Aurelien Jarno, One Thousand Gnomes, Jann Horn,
	Greg KH, Linux Kernel Mailing List, Jiri Slaby, Florian Weimer
References: <878u0s3orx.fsf_-_@x220.int.ebiederm.org>
	<20160409140909.42315e6d@lxorguk.ukuu.org.uk>
	<83FE8CD2-C0A2-4ADB-AEBD-8DD89AD4F88A@zytor.com>
	<87bn5ij0x1.fsf@x220.int.ebiederm.org>
	<78205895-E11D-417F-91DC-4BCA0B61A122@zytor.com>
	<570D4781.3070600@zytor.com>
	<877ffyzy1j.fsf_-_@x220.int.ebiederm.org>
	<87twixgsnq.fsf@x220.int.ebiederm.org>
	<87oa95gevf.fsf_-_@x220.int.ebiederm.org>
Date: Wed, 20 Apr 2016 09:55:50 -0500
In-Reply-To: (Linus Torvalds's message of "Tue, 19 Apr 2016 21:49:52 -0700")
Message-ID: <87mvoo8h3d.fsf@x220.int.ebiederm.org>
Subject: Re: [PATCH] devpts: Make each mount of devpts an independent filesystem.
X-Mailing-List: linux-kernel@vger.kernel.org

Linus Torvalds writes:

> On Tue, Apr 19, 2016 at 9:36 PM, Konstantin Khlebnikov wrote:
>> On Wed, Apr 20, 2016 at 6:04 AM, Eric W. Biederman
>>>
>>> The kernel.pty.reserve sysctl is neutered with no way currently
>>> implemented to be able to use the reserved ptys.
>>
>> I think we could convert this into reserve for init user namespace,
>> ssh in host will work even if containers eaten all ptys.
>
> Yes. That's basically how it effectively worked before (ie everything
> but the initial non-newinstance devpts mount would be limited to the
> non-reserved numbers).
>
> We required the non-init namespaces to do a newinstance mount, so the
> whole test for "newinstance" was effectively the same thing as just
> checking for the init namespace from a security standpoint.
>
> And in fact, rewriting it in that form (ie checking for init_ns) would
> just make it much more obvious what the intent it.

How does this sound?

When mounting a devpts filesystem, we look at the caller (aka current),
and if we are in the initial mount namespace we set a flag in fsi that
allows that instance of devpts to draw from the reserve pool.

That will still allow crazy pieces of code like xen-create-instance, run
by root, that mount a devpts filesystem in a chroot environment to draw
from the reserved pool, but any sane users that set up their own mount
namespace won't be able to use the reserve pool.

I believe that will give an almost identical policy to what we have
today, and it certainly makes a good default test for a container.  Just
for cleanliness, containers (of anyone's definition) almost always use
mount namespaces instead of chroots.

Sigh.  One last pass through all of the distros, to confirm that this
works in practice.

Eric
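
P.S. To make the mount-time check described above concrete, here is a
rough userspace model of the policy.  It is only a sketch, not the patch:
every name in it (mnt_ns_model, pts_fs_info_model, may_use_reserve,
devpts_mount_model, devpts_new_index_model) is made up for illustration,
and the limit and reserve values are arbitrary stand-ins for the
kernel.pty.max and kernel.pty.reserve sysctls.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mount namespace; not the kernel's type. */
struct mnt_ns_model { int id; };

/* Models the initial mount namespace. */
static struct mnt_ns_model init_mnt_ns_model = { 0 };

/* Hypothetical per-mount devpts state ("fsi" in the text above). */
struct pts_fs_info_model {
	bool may_use_reserve;	/* set once, at mount time */
};

static int pty_limit_model = 4;		/* arbitrary; models kernel.pty.max */
static int pty_reserve_model = 2;	/* arbitrary; models kernel.pty.reserve */
static int pty_count_model;		/* ptys allocated across all instances */

/* At mount time, record whether the caller is in the initial mount ns. */
static void devpts_mount_model(struct pts_fs_info_model *fsi,
			       const struct mnt_ns_model *caller_ns)
{
	assert(fsi != NULL && caller_ns != NULL);
	fsi->may_use_reserve = (caller_ns == &init_mnt_ns_model);
}

/*
 * Allocate one pty.  Instances without the flag must stop short of the
 * reserve.  Returns 0 on success or -ENOSPC when the pool is exhausted.
 */
static int devpts_new_index_model(const struct pts_fs_info_model *fsi)
{
	int limit = pty_limit_model -
		    (fsi->may_use_reserve ? 0 : pty_reserve_model);

	if (pty_count_model >= limit)
		return -ENOSPC;
	pty_count_model++;
	return 0;
}
```

With these numbers, a devpts instance mounted from a private mount
namespace gets -ENOSPC once two ptys are in use, while an instance
mounted from the initial mount namespace can still allocate the
remaining two.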