From: "Eric W. Biederman"
To: Michal Koutný
Cc: Alexey Gladkov, Kees Cook, Shuah Khan, Christian Brauner,
    Solar Designer, Ran Xiaokai, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org, Linux Containers
Subject: Re: [RFC PATCH 0/6] RLIMIT_NPROC in ucounts fixups
Date: Tue, 08 Feb 2022 07:54:00 -0600
References: <20220207121800.5079-1-mkoutny@suse.com>
In-Reply-To: <20220207121800.5079-1-mkoutny@suse.com> (Michal Koutný's
    message of "Mon, 7 Feb 2022 13:17:54 +0100")
Message-ID: <87ee4dihvr.fsf@email.froward.int.ebiederm.org>

Michal Koutný writes:

> This series is a result of looking deeper into breakage of
> tools/testing/selftests/rlimits/rlimits-per-userns.c after
> https://lore.kernel.org/r/20220204181144.24462-1-mkoutny@suse.com/
> is applied.
>
> The description of the original problem that led to RLIMIT_NPROC et al.
> ucounts rewrite could be ambiguously interpreted as supporting either
> the case of:
> - never-fork service or
> - fork (RLIMIT_NPROC-1) times service.
>
> The scenario is weird anyway given existence of pids controller.
>
> The realization of that scenario relies not only on tracking number of
> processes per user_ns but also newly allows the root to override limit through
> set*uid. The commit message didn't mention that, so it's unclear if it
> was the intention too.
>
> I also noticed that the RLIMIT_NPROC enforcing in fork seems subject to TOCTOU
> race (check(nr_tasks),...,nr_tasks++) so the limit is rather advisory (but
> that's not a new thing related to ucounts rewrite).
>
> This series is RFC to discuss relevance of the subtle changes RLIMIT_NPROC to
> ucounts rewrite introduced.

A quick reply (because I don't have a lot of time at the moment).

I agree with the issues your first patch before this series addresses
and the issues the first 3 patches address.

I have not looked at the tests.

I actually disagree with most of your fixes. Both because of
intrusiveness and because of awkwardness. My basic problem with
your fixes is I don't think they leave the code in a more maintainable
state.

Hopefully later today I can propose some alternative fixes and we can
continue the discussion.

One thing I think you misunderstood is the capability checks in set_user
have always been there. There is a very good argument they are badly
placed so are not exactly checking the correct credentials. Especially
now.

Your patch 4/6 I don't think makes sense. It has always been the
case that root without capabilities is subject to the rlimit. If you
are in a user namespace you are root without capabilities.

Eric