linux-unionfs.vger.kernel.org archive mirror
From: ebiederm@xmission.com (Eric W. Biederman)
To: David Hildenbrand <david@redhat.com>
Cc: "Andy Lutomirski" <luto@kernel.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"David Laight" <David.Laight@aculab.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Alexey Dobriyan" <adobriyan@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	"Arnaldo Carvalho de Melo" <acme@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
	"Jiri Olsa" <jolsa@redhat.com>,
	"Namhyung Kim" <namhyung@kernel.org>,
	"Petr Mladek" <pmladek@suse.com>,
	"Sergey Senozhatsky" <sergey.senozhatsky@gmail.com>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	"Rasmus Villemoes" <linux@rasmusvillemoes.dk>,
	"Kees Cook" <keescook@chromium.org>,
	"Greg Ungerer" <gerg@linux-m68k.org>,
	"Geert Uytterhoeven" <geert@linux-m68k.org>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Vincenzo Frascino" <vincenzo.frascino@arm.com>,
	"Chinwen Chang" <chinwen.chang@mediatek.com>,
	"Michel Lespinasse" <walken@google.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Huang Ying" <ying.huang@intel.com>,
	"Jann Horn" <jannh@google.com>, "Feng Tang" <feng.tang@intel.com>,
	"Kevin Brodsky" <Kevin.Brodsky@arm.com>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Shawn Anastasio" <shawn@anastas.io>,
	"Steven Price" <steven.price@arm.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Christian Brauner" <christian.brauner@ubuntu.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Gabriel Krisman Bertazi" <krisman@collabora.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Marco Elver" <elver@google.com>,
	"Daniel Jordan" <daniel.m.jordan@oracle.com>,
	"Nicolas Viennot" <Nicolas.Viennot@twosigma.com>,
	"Thomas Cedeno" <thomascedeno@google.com>,
	"Collin Fijalkovich" <cfijalkovich@google.com>,
	"Michal Hocko" <mhocko@suse.com>,
	"Miklos Szeredi" <miklos@szeredi.hu>,
	"Chengguang Xu" <cgxu519@mykernel.net>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"linux-unionfs@vger.kernel.org" <linux-unionfs@vger.kernel.org>,
	"Linux API" <linux-api@vger.kernel.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	linux-fsdevel@vger.kernel.org, Linux-MM <linux-mm@kvack.org>,
	"Florian Weimer" <fweimer@redhat.com>,
	"Michael Kerrisk" <mtk.manpages@gmail.com>
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
Date: Thu, 26 Aug 2021 17:13:52 -0500
Message-ID: <87mtp3g8gv.fsf@disp2133>
In-Reply-To: <0ed69079-9e13-a0f4-776c-1f24faa9daec@redhat.com> (David Hildenbrand's message of "Thu, 26 Aug 2021 23:47:07 +0200")

David Hildenbrand <david@redhat.com> writes:

> On 26.08.21 19:48, Andy Lutomirski wrote:
>> On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
>>> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski <luto@kernel.org> wrote:
>>>>
>>>> I’ll bite.  How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>>>
>>> I think that would be ok, except I can see somebody relying on it.
>>>
>>> It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.
>>
>> Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY.  Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
>>
>> Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write.  So, in a multithreaded program, one thread does:
>>
>> fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
>> write(fd, some stuff);
>>
>> <--- problem is here
>>
>> close(fd);
>> execve("some exefile");
>>
>> Another thread does:
>>
>> fork();
>> execve("something else");
>>
>> In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails.  Whoops.  See, for example:
>>
>> https://github.com/golang/go/issues/22315
>>
>> I propose we get rid of deny_write_access() completely to solve this.
>>
>> Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
>>
>> (OFD locks seem like they might have the same problem.  Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
>>
>
> It's not like this issue is new (^2017) or relevant in practice. So no
> need to hurry IMHO. One step at a time: it might make perfect sense to
> remove ETXTBSY, but we have to be careful to not break other user
> space that actually cares about the current behavior in practice.

It is an old enough issue that I agree there is no need to hurry.

I also ran into this issue not too long ago when I refactored the
usermode_driver code.  My challenge was that, because I was not in
userspace, the delayed fput was not happening in my kernel thread.  That
meant that writing the file, then closing the file, then execing the file
consistently reported -ETXTBSY.

The kernel code wound up doing:
	/* Flush delayed fput so exec can open the file read-only */
	flush_delayed_fput();
	task_work_run();

As I read the code, the delay for userspace file descriptors is
always done with task_work_add, so userspace should not hit
that kind of silliness, and should be able to actually close
the file descriptor before the exec.


On the flip side, I don't know how anything can depend upon getting an
-ETXTBSY.  So I don't think there is any real risk of breaking userspace
if we remove it.

Eric

