Date: Wed, 29 Nov 2017 22:33:43 -0200
From: Marcos Paulo de Souza
To: Michal Hocko
Cc: Andrew Morton, Ingo Molnar, Rik van Riel, Stephen Rothwell,
    "Kirill A. Shutemov", Jiri Olsa, Hari Bathini, Peter Zijlstra,
    Arnaldo Carvalho de Melo, "Eric W. Biederman",
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH -next] fork.c: Move check of clone NEWIPC and SYSVSEM to copy_process
Message-ID: <20171130003341.GA14339@marcos-builder>
References: <20171126160717.14727-1-marcos.souza.org@gmail.com> <20171130100406.qnn2zofbfaviorgs@dhcp22.suse.cz>
In-Reply-To: <20171130100406.qnn2zofbfaviorgs@dhcp22.suse.cz>

On Thu, Nov 30, 2017 at 11:04:06AM +0100, Michal Hocko wrote:
> CC Eric
>
> On Sun 26-11-17 14:06:52, Marcos Paulo de Souza wrote:
> > Currently this check for CLONE_NEWIPC with CLONE_SYSVSEM is done inside
> > copy_namespaces, resulting in a handful of error paths being executed if
> > these flags were used together. So, move this check to the beginning of
> > copy_process, exiting earlier if the condition is true.
> >
> > This move is safe because copy_namespaces is called just from
> > copy_process function.

This change is placed right below the point where copy_process already
checks clone_flags for inconsistent namespace flags[1] and returns EINVAL
when conflicting flags are passed together.

Rejecting the conflict at the beginning is simpler, so moving this namespace
check to where namespace flags are already being sanitized makes sense. If
the check stays where it is now and a user calls clone with
CLONE_NEWIPC | CLONE_SYSVSEM, the kernel has to undo a lot of work before
returning the same EINVAL[2].

[1] https://elixir.free-electrons.com/linux/latest/source/kernel/fork.c#L1552
[2] https://elixir.free-electrons.com/linux/latest/source/kernel/fork.c#L1953

>
> I am not familiar with the code all that much but the justification is
> not clear to me. These are namespace related flags so why should we pull
> them out of copy_namespaces. I do not see any simplifications in the
> error code paths or something like that.
>
> > Signed-off-by: Marcos Paulo de Souza
> > ---
> >  kernel/fork.c    | 11 +++++++++++
> >  kernel/nsproxy.c | 11 -----------
> >  2 files changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 2113e252cb9d..691f9ba135fc 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -1600,6 +1600,17 @@ static __latent_entropy struct task_struct *copy_process(
> >  		return ERR_PTR(-EINVAL);
> >
> >  	/*
> > +	 * CLONE_NEWIPC must detach from the undolist: after switching
> > +	 * to a new ipc namespace, the semaphore arrays from the old
> > +	 * namespace are unreachable. In clone parlance, CLONE_SYSVSEM
> > +	 * means share undolist with parent, so we must forbid using
> > +	 * it along with CLONE_NEWIPC.
> > +	 */
> > +	if ((clone_flags & (CLONE_NEWIPC | CLONE_SYSVSEM)) ==
> > +		(CLONE_NEWIPC | CLONE_SYSVSEM))
> > +		return ERR_PTR(-EINVAL);
> > +
> > +	/*
> >  	 * Thread groups must share signals as well, and detached threads
> >  	 * can only be started up within the thread group.
> >  	 */
> > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> > index f6c5d330059a..30882727dff5 100644
> > --- a/kernel/nsproxy.c
> > +++ b/kernel/nsproxy.c
> > @@ -151,17 +151,6 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
> >  	if (!ns_capable(user_ns, CAP_SYS_ADMIN))
> >  		return -EPERM;
> >
> > -	/*
> > -	 * CLONE_NEWIPC must detach from the undolist: after switching
> > -	 * to a new ipc namespace, the semaphore arrays from the old
> > -	 * namespace are unreachable. In clone parlance, CLONE_SYSVSEM
> > -	 * means share undolist with parent, so we must forbid using
> > -	 * it along with CLONE_NEWIPC.
> > -	 */
> > -	if ((flags & (CLONE_NEWIPC | CLONE_SYSVSEM)) ==
> > -		(CLONE_NEWIPC | CLONE_SYSVSEM))
> > -		return -EINVAL;
> > -
> >  	new_ns = create_new_namespaces(flags, tsk, user_ns, tsk->fs);
> >  	if (IS_ERR(new_ns))
> >  		return PTR_ERR(new_ns);
> > --
> > 2.13.6
> >
>
> --
> Michal Hocko
> SUSE Labs

--
Thanks,
	Marcos
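
For reference, the mask test that the patch moves can be sketched in
userspace C. This is only an illustration, not kernel code: the EX_ constants
and both ex_ functions are hypothetical names introduced here, and the
constant values are assumed to mirror the CLONE_SYSVSEM and CLONE_NEWIPC
definitions in the Linux UAPI headers. The point is that the test must
compare against the full mask, so it fires only when both flags are present,
and that it can run before any resource has been allocated.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Assumption: these values mirror CLONE_SYSVSEM and CLONE_NEWIPC from
 * the Linux UAPI headers. They are redefined under hypothetical EX_
 * names so the sketch builds without kernel or glibc-specific headers.
 */
#define EX_CLONE_SYSVSEM 0x00040000UL
#define EX_CLONE_NEWIPC  0x08000000UL

/*
 * True only when BOTH flags are set. A plain (flags & mask) != 0 test
 * would also fire when just one of the two flags is present.
 */
static bool ex_ipc_flags_conflict(unsigned long clone_flags)
{
	const unsigned long mask = EX_CLONE_NEWIPC | EX_CLONE_SYSVSEM;

	return (clone_flags & mask) == mask;
}

/*
 * Hypothetical stand-in for an early flag validation step: returns 0
 * when the flags are acceptable and -EINVAL on the conflicting
 * combination, before anything has been set up that would need undoing.
 */
static int ex_validate_clone_flags(unsigned long clone_flags)
{
	if (ex_ipc_flags_conflict(clone_flags))
		return -EINVAL;
	return 0;
}
```

Calling ex_validate_clone_flags() with either flag alone returns 0; only the
combination of the two is rejected.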