From: Christian Brauner <brauner@kernel.org>
To: Christoph Hellwig <hch@lst.de>, Jan Kara <jack@suse.cz>,
	Jens Axboe <axboe@kernel.dk>
Cc: Christian Brauner <brauner@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	linux-block@vger.kernel.org
Subject: [PATCH 1/2] block: handle BLK_OPEN_RESTRICT_WRITES correctly
Date: Sat, 23 Mar 2024 17:11:19 +0100
Message-ID: <20240323-zielbereich-mittragen-6fdf14876c3e@brauner>
In-Reply-To: <20240323-seide-erbrachten-5c60873fadc1@brauner>

Last kernel release we introduced CONFIG_BLK_DEV_WRITE_MOUNTED. By
default this option is set. When it is set the long-standing behavior
of being able to write to mounted block devices is enabled.

But in order to guard against unintended corruption by writing to the
block device buffer cache CONFIG_BLK_DEV_WRITE_MOUNTED can be turned
off. In that case it isn't possible to write to mounted block devices
anymore.

A filesystem may open its block devices with BLK_OPEN_RESTRICT_WRITES
which disallows concurrent BLK_OPEN_WRITE access. When we still had the
bdev handle around we could recognize BLK_OPEN_RESTRICT_WRITES because
the mode was passed around. Since we managed to get rid of the bdev
handle we changed that logic to recognize BLK_OPEN_RESTRICT_WRITES based
on whether the file was opened writable and writes to that block device
are blocked. That logic doesn't work because we do allow
BLK_OPEN_RESTRICT_WRITES to be specified without BLK_OPEN_WRITE.

So fix the detection logic. Use O_EXCL as an indicator that
BLK_OPEN_RESTRICT_WRITES has been requested. We do the exact same thing
for pidfds where O_EXCL means that this is a pidfd that refers to a
thread. For userspace open paths O_EXCL will never be retained but for
internal opens where we open files that are never installed into a file
descriptor table this is fine.

Note that BLK_OPEN_RESTRICT_WRITES is an internal only flag that cannot
directly be raised by userspace. It is implicitly raised during
mounting.

Passes xfstests and blktests with CONFIG_BLK_DEV_WRITE_MOUNTED set and
unset.

Fixes: 321de651fa56 ("block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access")
Reported-by: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/ZfyyEwu9Uq5Pgb94@casper.infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 7a5f611c3d2e..f819f3086905 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -821,13 +821,12 @@ static void bdev_yield_write_access(struct file *bdev_file)
 		return;
 
 	bdev = file_bdev(bdev_file);
-	/* Yield exclusive or shared write access. */
-	if (bdev_file->f_mode & FMODE_WRITE) {
-		if (bdev_writes_blocked(bdev))
-			bdev_unblock_writes(bdev);
-		else
-			bdev->bd_writers--;
-	}
+
+	/* O_EXCL is only set for internal BLK_OPEN_RESTRICT_WRITES. */
+	if (bdev_file->f_flags & O_EXCL)
+		bdev_unblock_writes(bdev);
+	else if (bdev_file->f_mode & FMODE_WRITE)
+		bdev->bd_writers--;
 }
 
 /**
@@ -946,6 +945,13 @@ static unsigned blk_to_file_flags(blk_mode_t mode)
 	else
 		WARN_ON_ONCE(true);
 
+	/*
+	 * BLK_OPEN_RESTRICT_WRITES is never set from userspace and
+	 * O_EXCL is stripped from userspace.
+	 */
+	if (mode & BLK_OPEN_RESTRICT_WRITES)
+		flags |= O_EXCL;
+
 	if (mode & BLK_OPEN_NDELAY)
 		flags |= O_NDELAY;
 
-- 
2.43.0

