From: ebiederm@xmission.com (Eric W. Biederman)
To: David Hildenbrand
Cc: Andy Lutomirski, Linus Torvalds, David Laight, Linux Kernel Mailing List, Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Al Viro, Alexey Dobriyan, Steven Rostedt, "Peter Zijlstra (Intel)", Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Petr Mladek, Sergey Senozhatsky, Andy Shevchenko, Rasmus Villemoes, Kees Cook, Greg Ungerer, Geert Uytterhoeven, Mike Rapoport, Vlastimil Babka, Vincenzo Frascino, Chinwen Chang, Michel Lespinasse, Catalin Marinas, "Matthew Wilcox (Oracle)", Huang Ying, Jann Horn, Feng Tang, Kevin Brodsky, Michael Ellerman, Shawn Anastasio, Steven Price, Nicholas Piggin, Christian Brauner, Jens Axboe, Gabriel Krisman Bertazi, Peter Xu, Suren Baghdasaryan, Shakeel Butt, Marco Elver, Daniel Jordan, Nicolas Viennot, Thomas Cedeno, Collin Fijalkovich, Michal Hocko, Miklos Szeredi, Chengguang Xu, Christian König, "linux-unionfs@vger.kernel.org", Linux API, the arch/x86 maintainers, linux-fsdevel@vger.kernel.org, Linux-MM, Florian Weimer, Michael Kerrisk
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
Date: Thu, 26 Aug 2021 17:13:52 -0500
Message-ID: <87mtp3g8gv.fsf@disp2133>
In-Reply-To: <0ed69079-9e13-a0f4-776c-1f24faa9daec@redhat.com> (David Hildenbrand's message of "Thu, 26 Aug 2021 23:47:07 +0200")
X-Mailing-List: linux-unionfs@vger.kernel.org

David Hildenbrand writes:

> On 26.08.21 19:48, Andy Lutomirski wrote:
>> On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
>>> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski wrote:
>>>>
>>>> I’ll bite. How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>>>
>>> I think that would be ok, except I can see somebody relying on it.
>>>
>>> It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.
>>
>> Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY. Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
>>
>> Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write. So, in a multithreaded program, one thread does:
>>
>>     fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
>>     write(fd, some stuff);
>>
>>     <--- problem is here
>>
>>     close(fd);
>>     execve("some exefile");
>>
>> Another thread does:
>>
>>     fork();
>>     execve("something else");
>>
>> In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails. Whoops. See, for example:
>>
>> https://github.com/golang/go/issues/22315
>>
>> I propose we get rid of deny_write_access() completely to solve this.
>>
>> Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
>>
>> (OFD locks seem like they might have the same problem. Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
>>
>
> It's not like this issue is new (^2017) or relevant in practice. So no
> need to hurry IMHO. One step at a time: it might make perfect sense to
> remove ETXTBSY, but we have to be careful to not break other user
> space that actually cares about the current behavior in practice.

It is an old enough issue that I agree there is no need to hurry.

I also ran into this issue not too long ago when I refactored the
usermode_driver code.  My challenge was that, because the code was not
running in userspace, the delayed fput was not happening in my kernel
thread.  That meant that writing the file, then closing the file, then
execing the file consistently reported -ETXTBSY.

The kernel code wound up doing:

	/* Flush delayed fput so exec can open the file read-only */
	flush_delayed_fput();
	task_work_run();

As I read the code, the delay for userspace file descriptors is always
done with task_work_add, so userspace should not hit that kind of
silliness, and should be able to actually close the file descriptor
before the exec.

On the flip side, I don't know how anything can depend upon getting an
-ETXTBSY.  So I don't think there is any real risk of breaking userspace
if we remove it.

Eric
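
For reference, a minimal userspace sketch of the behavior discussed in
this thread.  It assumes a kernel whose exec path still calls
deny_write_access(), a /tmp that allows execution, and a /bin/sh.  The
helper name exec_fresh_script() and the temporary file name are
hypothetical, made up for illustration only.  The helper writes a tiny
script, optionally keeps the writable descriptor open, and execs the
script in a child.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the kernel tree.
 * Returns 0 if the script ran, the child's execve() errno if the exec
 * failed, or -1 on a setup error.
 */
static int exec_fresh_script(int keep_open)
{
	char path[] = "/tmp/etxtbsy-demo-XXXXXX";
	static const char body[] = "#!/bin/sh\nexit 0\n";
	int status = 0, result = -1;
	pid_t pid;

	/* mkstemp() opens the file O_RDWR, so a write reference is held. */
	int fd = mkstemp(path);
	if (fd < 0)
		return -1;
	if (write(fd, body, sizeof(body) - 1) != (ssize_t)(sizeof(body) - 1) ||
	    fchmod(fd, 0700) != 0)
		goto out;

	if (!keep_open) {
		/* Drop the only writable reference before the exec. */
		close(fd);
		fd = -1;
	}

	pid = fork();
	if (pid < 0)
		goto out;
	if (pid == 0) {
		char *const argv[] = { path, NULL };
		char *const envp[] = { NULL };

		execve(path, argv, envp);
		_exit(errno);	/* only reached if execve() failed */
	}
	if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
		result = WEXITSTATUS(status);
out:
	if (fd >= 0)
		close(fd);
	unlink(path);
	return result;
}
```

With keep_open set, the child's execve() would be expected to fail with
ETXTBSY on kernels that enforce the check, because the file is still
open for write (in both the parent and the forked child).  With
keep_open clear, the close() completes before the fork and the script
should run and exit 0.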