linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: david@redhat.com, tglx@linutronix.de, hch@lst.de,
	patches@lists.linux.dev, linux-modules@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	pmladek@suse.com, petr.pavlu@suse.com, prarit@redhat.com,
	lennart@poettering.net, gregkh@linuxfoundation.org,
	rafael@kernel.org, song@kernel.org, lucas.de.marchi@gmail.com,
	lucas.demarchi@intel.com, christophe.leroy@csgroup.eu,
	peterz@infradead.org, rppt@kernel.org, dave@stgolabs.net,
	willy@infradead.org, vbabka@suse.cz, mhocko@suse.com,
	dave.hansen@linux.intel.com, colin.i.king@gmail.com,
	jim.cromie@gmail.com, catalin.marinas@arm.com, jbaron@akamai.com,
	rick.p.edgecombe@intel.com, yujie.liu@intel.com
Subject: Re: [PATCH 1/2] fs/kernel_read_file: add support for duplicate detection
Date: Wed, 24 May 2023 21:00:02 -0700	[thread overview]
Message-ID: <CAHk-=wgKu=tJf1bm_dtme4Hde4zTB=_7EdgR8avsDRK4_jD+uA@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wjahcAqLYm0ijcAVcPcQAz-UUuJ3Ubx4GzP_SJAupf=qQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2033 bytes --]

On Wed, May 24, 2023 at 2:52 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This is all disgusting.

Bringing back the original thread, because I just sent an alternate
patch to Luis to test.

That one is also disgusting, but for different reasons: it needs some
polish if it works. It's a very simple patch, in that it just extends
our existing i_writecount and ETXTBSY logic to also have a "exclusive"
mode, and says that we do the module file reading in that exclusive
mode (so if/when udev in its incompetence tries to load the same
module X number of times at the same time, only one will read at a
time).

The disgusting part is mainly the hacky test for "id ==
READING_MODULE", and it would probably be better with some kind of
"exclusive flag" field for general use, but right now READING_MODULE
is basically that one user.

Luis having explained _why_ we'd want this (and honestly, it took a
couple of tries), I can only say that udev is horribly broken, and
this most definitely should be fixed in user mode. udev randomly
loading the same module multiple times just because it gets confused
is not ok.

Any udev developer that goes "we can't fix it in user space" should be
ashamed of themselves. Really? Just randomly doing the same thing in
parallel and expecting the kernel to sort out your mess? What a crock.

But this *might* mitigate that udev horror. And not introduce any new
kernel-side horror, just a slight extension of our existing writer
exclusion logic to allow "full exclusive access".

(Note: it's not actually excluding other purely regular readers - but
it *is* excluding not just writers, but also other "special readers"
that want to exclude writers)

I'd like to point out that this patch really is completely untested.
It built for me, but that's all the testing it has gotten. It's
_small_. Tiny, even. But that "id == READING_MODULE" thing really is
pretty disgusting and I feel this needs more thought.

                         Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 1304 bytes --]

 fs/kernel_read_file.c | 6 +++++-
 include/linux/fs.h    | 6 ++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/kernel_read_file.c b/fs/kernel_read_file.c
index 5d826274570c..ff3e894f8cd4 100644
--- a/fs/kernel_read_file.c
+++ b/fs/kernel_read_file.c
@@ -48,7 +48,11 @@ ssize_t kernel_read_file(struct file *file, loff_t offset, void **buf,
 	if (!S_ISREG(file_inode(file)->i_mode))
 		return -EINVAL;
 
-	ret = deny_write_access(file);
+	/* Module reading wants *exclusive* access to the file */
+	if (id == READING_MODULE)
+		ret = exclusive_deny_write_access(file);
+	else
+		ret = deny_write_access(file);
 	if (ret)
 		return ret;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..722b42a77d51 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2566,6 +2566,12 @@ static inline int deny_write_access(struct file *file)
 	struct inode *inode = file_inode(file);
 	return atomic_dec_unless_positive(&inode->i_writecount) ? 0 : -ETXTBSY;
 }
+static inline int exclusive_deny_write_access(struct file *file)
+{
+	int old = 0;
+	struct inode *inode = file_inode(file);
+	return atomic_try_cmpxchg(&inode->i_writecount, &old, -1) ? 0 : -ETXTBSY;
+}
 static inline void put_write_access(struct inode * inode)
 {
 	atomic_dec(&inode->i_writecount);

  parent reply	other threads:[~2023-05-25  4:00 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-24 21:36 [PATCH 0/2] module: avoid all memory pressure due to duplicates Luis Chamberlain
2023-05-24 21:36 ` [PATCH 1/2] fs/kernel_read_file: add support for duplicate detection Luis Chamberlain
2023-05-24 21:52   ` Linus Torvalds
2023-05-24 21:56     ` Linus Torvalds
2023-05-24 22:07       ` Luis Chamberlain
2023-05-25  4:00     ` Linus Torvalds [this message]
2023-05-25 18:08       ` Luis Chamberlain
2023-05-25 18:35         ` Luis Chamberlain
2023-05-25 18:50         ` Linus Torvalds
2023-05-25 19:32           ` Luis Chamberlain
2023-05-25  7:01     ` Christian Brauner
2023-05-24 21:36 ` [PATCH 2/2] module: add support to avoid duplicates early on load Luis Chamberlain
2023-05-25 11:40   ` Petr Pavlu
2023-05-25 16:07     ` Linus Torvalds
2023-05-25 16:42       ` Greg KH
2023-05-25 18:22         ` Luis Chamberlain
2023-05-25 17:52       ` Linus Torvalds
2023-05-25 18:45       ` Lucas De Marchi
2023-05-25 21:12         ` Linus Torvalds
2023-05-25 22:02           ` Luis Chamberlain
2023-05-26  1:39             ` Linus Torvalds
2023-05-29  8:58               ` Johan Hovold
2023-05-29 11:00                 ` Linus Torvalds
2023-05-29 12:44                   ` Johan Hovold
2023-05-29 15:18                     ` Johan Hovold
2023-05-30  1:55                       ` Linus Torvalds
2023-05-30  9:40                         ` Johan Hovold
2023-06-05 12:25                           ` Johan Hovold
2023-05-30 16:22                         ` Luis Chamberlain
2023-05-30 17:16                           ` Lucas De Marchi
2023-05-30 19:41                             ` Luis Chamberlain
2023-05-30 22:17                               ` Linus Torvalds
2023-05-31  5:30                                 ` Lucas De Marchi
2023-05-31  0:31                           ` Luis Chamberlain
2023-05-31  7:51                           ` David Hildenbrand
2023-05-31 16:57                             ` Luis Chamberlain
2023-06-02 15:19                               ` David Hildenbrand
2023-06-02 16:04                                 ` Luis Chamberlain
2023-06-05 11:26                                   ` David Hildenbrand
2023-06-05 15:17                                     ` Luis Chamberlain
2023-06-05 15:28                                       ` Luis Chamberlain
2023-06-28 18:52                                         ` Luis Chamberlain
2023-06-28 20:14                                           ` Linus Torvalds
2023-06-28 22:07                                             ` Linus Torvalds
2023-06-28 23:17                                               ` Linus Torvalds
2023-06-29  0:18                                                 ` Luis Chamberlain
2023-06-02 16:06                                 ` Linus Torvalds
2023-06-02 16:37                                   ` David Hildenbrand
2023-05-30 22:45                         ` Dan Williams
2023-06-04 14:26                         ` Rudi Heitbaum
2023-05-29 17:47                     ` Linus Torvalds
2023-05-30 10:01                       ` Johan Hovold
2023-05-25 16:54     ` Lucas De Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wgKu=tJf1bm_dtme4Hde4zTB=_7EdgR8avsDRK4_jD+uA@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=colin.i.king@gmail.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@lst.de \
    --cc=jbaron@akamai.com \
    --cc=jim.cromie@gmail.com \
    --cc=lennart@poettering.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=lucas.de.marchi@gmail.com \
    --cc=lucas.demarchi@intel.com \
    --cc=mcgrof@kernel.org \
    --cc=mhocko@suse.com \
    --cc=patches@lists.linux.dev \
    --cc=peterz@infradead.org \
    --cc=petr.pavlu@suse.com \
    --cc=pmladek@suse.com \
    --cc=prarit@redhat.com \
    --cc=rafael@kernel.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rppt@kernel.org \
    --cc=song@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=yujie.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).