References: <20210812084348.6521-1-david@redhat.com> <87o8a2d0wf.fsf@disp2133>
 <60db2e61-6b00-44fa-b718-e4361fcc238c@www.fastmail.com>
 <87lf56bllc.fsf@disp2133> <87eeay8pqx.fsf@disp2133>
 <5b0d7c1e73ca43ef9ce6665fec6c4d7e@AcuMS.aculab.com>
 <87h7ft2j68.fsf@disp2133>
Date: Thu, 26 Aug 2021 10:48:55 -0700
From: "Andy Lutomirski"
To: "Linus Torvalds"
Cc: "Eric W. Biederman", "David Laight", "David Hildenbrand",
 "Linux Kernel Mailing List", "Andrew Morton", "Thomas Gleixner",
 "Ingo Molnar", "Borislav Petkov", "H.
 Peter Anvin", "Al Viro", "Alexey Dobriyan", "Steven Rostedt",
 "Peter Zijlstra (Intel)", "Arnaldo Carvalho de Melo", "Mark Rutland",
 "Alexander Shishkin", "Jiri Olsa", "Namhyung Kim", "Petr Mladek",
 "Sergey Senozhatsky", "Andy Shevchenko", "Rasmus Villemoes", "Kees Cook",
 "Greg Ungerer", "Geert Uytterhoeven", "Mike Rapoport", "Vlastimil Babka",
 "Vincenzo Frascino", "Chinwen Chang", "Michel Lespinasse",
 "Catalin Marinas", "Matthew Wilcox (Oracle)", "Huang Ying", "Jann Horn",
 "Feng Tang", "Kevin Brodsky", "Michael Ellerman", "Shawn Anastasio",
 "Steven Price", "Nicholas Piggin", "Christian Brauner", "Jens Axboe",
 "Gabriel Krisman Bertazi", "Peter Xu", "Suren Baghdasaryan",
 "Shakeel Butt", "Marco Elver", "Daniel Jordan", "Nicolas Viennot",
 "Thomas Cedeno", "Collin Fijalkovich", "Michal Hocko", "Miklos Szeredi",
 "Chengguang Xu", "Christian König", "linux-unionfs@vger.kernel.org",
 "Linux API", "the arch/x86 maintainers", Linux-MM, "Florian Weimer",
 "Michael Kerrisk"
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
X-Mailing-List: linux-unionfs@vger.kernel.org

On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski wrote:
> >
> > I’ll bite. How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>
> I think that would be ok, except I can see somebody relying on it.
>
> It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.

Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY. Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.

Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write. So, in a multithreaded program, one thread does:

fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
write(fd, some stuff);

<--- problem is here

close(fd);
execve("some exefile");

Another thread does:

fork();
execve("something else");

In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails. Whoops. See, for example:

https://github.com/golang/go/issues/22315

I propose we get rid of deny_write_access() completely to solve this.

Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.

(OFD locks seem like they might have the same problem. Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)

>
> But you are right that we have removed parts of it over time (no more
> MAP_DENYWRITE, no more uselib()) so that what we have today is a
> fairly weak form of what we used to do.
>
> And nobody really complained when we weakened it, so maybe removing it
> entirely might be acceptable.
>
> Linus
>
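
For anyone who wants to see the fork/execve window described above without depending on thread timing, here is a minimal C sketch. It makes the race deterministic by parking a forked child that still holds the writable descriptor, then attempting the exec. The helper names (try_exec, demo_etxtbsy), the /tmp path, and the tiny shell script standing in for "some exefile" are all assumptions made for illustration. The ETXTBSY result is only expected on a Linux kernel that still performs the deny_write_access() check at exec time.

```c
/* Sketch only: helper names are hypothetical, and the ETXTBSY result
 * assumes a Linux kernel that still calls deny_write_access() on exec. */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls execve(path).  Returns 0 if the exec succeeded,
 * the child's errno if execve() failed, or -1 if pipe()/fork() failed. */
int try_exec(const char *path)
{
	int pfd[2], err = 0;
	ssize_t n;
	pid_t pid;

	if (pipe(pfd) < 0)
		return -1;
	/* A successful exec closes the write end, so the parent reads EOF. */
	fcntl(pfd[1], F_SETFD, FD_CLOEXEC);

	pid = fork();
	if (pid < 0) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	if (pid == 0) {
		char *const argv[] = { (char *)path, NULL };
		char *const envp[] = { NULL };

		execve(path, argv, envp);
		err = errno;		/* only reached if execve() failed */
		n = write(pfd[1], &err, sizeof(err));
		(void)n;
		_exit(127);
	}
	close(pfd[1]);
	n = read(pfd[0], &err, sizeof(err));
	close(pfd[0]);
	waitpid(pid, NULL, 0);
	return n == (ssize_t)sizeof(err) ? err : 0;
}

/* Stores the try_exec() result while a forked child still holds the
 * writable fd (*busy) and after that child is gone (*idle).
 * Returns 0 on success, -1 if the setup itself failed. */
int demo_etxtbsy(int *busy, int *idle)
{
	char path[] = "/tmp/etxtbsy-demo-XXXXXX";	/* hypothetical location */
	static const char script[] = "#!/bin/sh\nexit 0\n";
	pid_t lingering;
	int fd;

	assert(busy && idle);
	fd = mkstemp(path);		/* opened read-write, like O_RDWR | O_CREAT */
	if (fd < 0)
		return -1;
	fcntl(fd, F_SETFD, FD_CLOEXEC);
	if (write(fd, script, sizeof(script) - 1) < 0 || fchmod(fd, 0700) < 0) {
		close(fd);
		unlink(path);
		return -1;
	}

	/* Stand-in for the other thread's child between fork() and execve():
	 * it inherits fd, so the file still counts as open for write. */
	lingering = fork();
	if (lingering < 0) {
		close(fd);
		unlink(path);
		return -1;
	}
	if (lingering == 0) {
		sleep(60);		/* bounded wait in case nobody kills us */
		_exit(0);
	}

	close(fd);			/* the writer believes it is done */
	*busy = try_exec(path);		/* ETXTBSY where the check is enforced */

	kill(lingering, SIGKILL);	/* the inherited descriptor goes away */
	waitpid(lingering, NULL, 0);
	*idle = try_exec(path);

	unlink(path);
	return 0;
}
```

Calling demo_etxtbsy(&busy, &idle) from a small main() and printing strerror() of each result should show "Text file busy" for the first attempt on kernels that enforce the check. The second attempt should not report ETXTBSY, since no writer is left. Note that O_CLOEXEC does not help here: the parked child never reaches execve(), which matches the window described in the message.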