From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from out03.mta.xmission.com ([166.70.13.233]:56636 "EHLO
    out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1726949AbeIPWNQ (ORCPT ); Sun, 16 Sep 2018 18:13:16 -0400
From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov
Cc: Jeff Layton, viro@zeniv.linux.org.uk, berrange@redhat.com,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    Andrew Morton
References: <20180914105310.6454-1-jlayton@kernel.org>
    <20180914105310.6454-4-jlayton@kernel.org>
    <20180915163704.GA31693@redhat.com>
Date: Sun, 16 Sep 2018 18:49:33 +0200
In-Reply-To: <20180915163704.GA31693@redhat.com> (Oleg Nesterov's message of
    "Sat, 15 Sep 2018 18:37:04 +0200")
Message-ID: <87efdttmjm.fsf@xmission.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [PATCH v3 3/3] exec: do unshare_files after de_thread
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Oleg Nesterov writes:

> On 09/14, Jeff Layton wrote:
>>
>> POSIX mandates that open fds and their associated file locks should be
>> preserved across an execve. This works, unless the process is
>> multithreaded at the time that execve is called.
>>
>> In that case, we'll end up unsharing the files_struct but the locks will
>> still have their fl_owner set to the address of the old one. Eventually,
>> when the other threads die and the last reference to the old
>> files_struct is put, any POSIX locks get torn down since it looks like
>> a close occurred on them.
>>
>> The result is that all of your open files will be intact with none of
>> the locks you held before execve. The simple answer to this is "use OFD
>> locks", but this is a nasty surprise and it violates the spec.
>>
>> Fix this by doing unshare_files later during exec,
>
> See my reply to 1/3... if we can forget about the races with get_files_struct()
> we can probably make a much simpler patch, plus we do not need 2/2, afaics.
>
> What I really can't understand is why we need to _change_ current->files
> early in do_execve().
>
> IOW. Lets ignore do_close_on_exec(), lets ignore the fact that unshare_fd()
> can fail and thus it makes sense to call it before point-of-no-return.
>
> Any other reason why we can't simply call unshare_files() at the end of
> __do_execve_file() on success?

The reason we call unshare_files is in case the files are shared with
another process.  AKA old style linux threads, or someone being clever.
In that case we need a private copy of files for close on exec, because
we should not close the files of the other process that has not called
exec.

The only reason for calling unshare_files before the point of no return
is so that we can get a good error message to the calling process if
unshare_files fails.

Given that "files->count > 1" should only exist in rare and crazy cases,
I expect we can legitimately have exec fail hard if we get -ENOMEM in
that case and kill the calling process.

AKA it would be reasonable to move unshare_files to just above
do_close_on_exec in flush_old_exec.  We could further make unshare_files
not return displaced and just drop it.

Thinking about it, Jeff's version already by necessity places
unshare_files after de_thread.  So it is already after the point of no
return.  So there really is no point in trying hard with displaced
files.

Eric
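
For illustration only, here is a minimal userspace sketch of the ordering
suggested above: unshare the table immediately before the close-on-exec
pass, so the pass only ever touches a private copy.  This is a toy model,
not kernel code.  The toy_* names and the struct layout are assumptions
made up for this example, and the model ignores per-file reference counts
and locking entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_MAX_FDS 8

/* Toy stand-in for a shared descriptor table; NOT the real files_struct. */
struct toy_files {
    int count;                  /* number of tasks sharing this table */
    bool open[TOY_MAX_FDS];     /* is the descriptor open? */
    bool cloexec[TOY_MAX_FDS];  /* close-on-exec flag */
};

/*
 * Give the caller a private copy if the table is shared.
 * Returns 0 on success or -ENOMEM; on failure *cur is left untouched.
 */
static int toy_unshare_files(struct toy_files **cur)
{
    struct toy_files *old = *cur;
    struct toy_files *copy;

    assert(old->count > 0);
    if (old->count == 1)
        return 0;               /* already private, nothing to do */

    copy = malloc(sizeof(*copy));
    if (!copy)
        return -ENOMEM;

    *copy = *old;
    copy->count = 1;
    old->count--;               /* drop our reference to the shared table */
    *cur = copy;
    return 0;
}

/* Close every descriptor marked close-on-exec in the given table. */
static void toy_close_on_exec(struct toy_files *files)
{
    for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
        if (files->open[fd] && files->cloexec[fd]) {
            files->open[fd] = false;
            files->cloexec[fd] = false;
        }
    }
}

/*
 * Hypothetical exec step: unshare right before the close-on-exec pass.
 * In the scheme discussed above this runs after the point of no return,
 * so a real caller would be killed on failure instead of seeing an error.
 */
static int toy_exec_flush(struct toy_files **cur)
{
    int err = toy_unshare_files(cur);

    if (err)
        return err;
    toy_close_on_exec(*cur);
    return 0;
}
```

With a table whose count is 2, calling toy_exec_flush() leaves the other
sharer's close-on-exec descriptors open and closes them only in the
caller's new private copy.  No "displaced" pointer is kept in this model,
since nothing ever switches back to the old table.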