All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Christian Brauner <brauner@kernel.org>,
	Josef Bacik <josef@toxicpanda.com>,
	David Howells <dhowells@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Miklos Szeredi <miklos@szeredi.hu>,
	linux-fsdevel@vger.kernel.org
Subject: [PATCH] fs: allow calling kiocb_end_write() unmatched with kiocb_start_write()
Date: Tue, 21 Nov 2023 15:25:51 +0200	[thread overview]
Message-ID: <20231121132551.2337431-1-amir73il@gmail.com> (raw)

We want to move kiocb_start_write() into vfs_iocb_iter_write(), after
the permission hook, but leave kiocb_end_write() in the write completion
handler of the callers of vfs_iocb_iter_write().

After this change, there will be no way of knowing in completion handler,
if write has failed before or after calling kiocb_start_write().

Add a flag IOCB_WRITE_STARTED, which is set and cleared internally by
kiocb_{start,end}_write(), so that kiocb_end_write() could be called for
cleanup of async write, whether it was successful or whether it failed
before or after calling kiocb_start_write().

This flag must not be copied by stacked filesystems (e.g. overlayfs)
that clone the iocb to another iocb for io request on a backing file.

Link: https://lore.kernel.org/r/CAOQ4uxihfJJRxxUhAmOwtD97Lg8PL8RgXw88rH1UfEeP8AtP+w@mail.gmail.com/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Jens,

The kiocb_{start,end}_write() helpers were originally taken out of
an early version of the permission hooks cleanup patches [1].

This early version of the helpers had the IOCB_WRITE_STARTED flag, but
when I posted the helpers independently from the cleanup series, you had
correctly pointed out [2] that IOCB_WRITE_STARTED is not needed for any
of the existing callers of kiocb_{start,end}_write(), so I removed it for
the final version of the helpers that got merged.

When coming back to the permission hook cleanup, I see that the
IOCB_WRITE_STARTED flag is needed to allow the new semantics of calling
kiocb_start_write() inside vfs_iocb_iter_write() and kiocb_end_write()
in completion callback.

I realize these semantics are not great, but the alternative of moving
the permission hook from vfs_iocb_iter_write() into all the callers is
worse IMO.

Can you accept the solution with IOCB_WRITE_STARTED state flag?
Have a better idea for cleaner iocb issue semantics that will allow
calling the permission hook with freeze protection held?

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/20231114153321.1716028-1-amir73il@gmail.com/
[2] https://lore.kernel.org/linux-fsdevel/e6948836-1d9d-4219-9a21-a2ab442a9a34@kernel.dk/

 fs/overlayfs/file.c |  2 +-
 include/linux/fs.h  | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 131621daeb13..e4baa4ea5c95 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -455,7 +455,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
 
 		aio_req->orig_iocb = iocb;
 		kiocb_clone(&aio_req->iocb, iocb, get_file(real.file));
-		aio_req->iocb.ki_flags = ifl;
+		aio_req->iocb.ki_flags &= ifl;
 		aio_req->iocb.ki_complete = ovl_aio_queue_completion;
 		refcount_set(&aio_req->ref, 2);
 		kiocb_start_write(&aio_req->iocb);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 98b7a7a8c42e..64414e146e1e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -352,6 +352,8 @@ enum rw_hint {
  * unrelated IO (like cache flushing, new IO generation, etc).
  */
 #define IOCB_DIO_CALLER_COMP	(1 << 22)
+/* file_start_write() was called */
+#define IOCB_WRITE_STARTED	(1 << 23)
 
 /* for use in trace events */
 #define TRACE_IOCB_STRINGS \
@@ -366,7 +368,8 @@ enum rw_hint {
 	{ IOCB_WAITQ,		"WAITQ" }, \
 	{ IOCB_NOIO,		"NOIO" }, \
 	{ IOCB_ALLOC_CACHE,	"ALLOC_CACHE" }, \
-	{ IOCB_DIO_CALLER_COMP,	"CALLER_COMP" }
+	{ IOCB_DIO_CALLER_COMP,	"CALLER_COMP" }, \
+	{ IOCB_WRITE_STARTED,	"WRITE_STARTED" }
 
 struct kiocb {
 	struct file		*ki_filp;
@@ -2183,12 +2186,15 @@ static inline void init_sync_kiocb(struct kiocb *kiocb, struct file *filp)
 	};
 }
 
+/* IOCB flags associated with ki_filp state must not be cloned */
+#define IOCB_CLONE_FLAGS_MASK	(~IOCB_WRITE_STARTED)
+
 static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
 			       struct file *filp)
 {
 	*kiocb = (struct kiocb) {
 		.ki_filp = filp,
-		.ki_flags = kiocb_src->ki_flags,
+		.ki_flags = kiocb_src->ki_flags & IOCB_CLONE_FLAGS_MASK,
 		.ki_ioprio = kiocb_src->ki_ioprio,
 		.ki_pos = kiocb_src->ki_pos,
 	};
@@ -2744,12 +2750,16 @@ static inline void kiocb_start_write(struct kiocb *iocb)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
 
+	if (WARN_ON_ONCE(iocb->ki_flags & IOCB_WRITE_STARTED))
+		return;
+
 	sb_start_write(inode->i_sb);
 	/*
 	 * Fool lockdep by telling it the lock got released so that it
 	 * doesn't complain about the held lock when we return to userspace.
 	 */
 	__sb_writers_release(inode->i_sb, SB_FREEZE_WRITE);
+	iocb->ki_flags |= IOCB_WRITE_STARTED;
 }
 
 /**
@@ -2762,11 +2772,15 @@ static inline void kiocb_end_write(struct kiocb *iocb)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
 
+	if (!(iocb->ki_flags & IOCB_WRITE_STARTED))
+		return;
+
 	/*
 	 * Tell lockdep we inherited freeze protection from submission thread.
 	 */
 	__sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE);
 	sb_end_write(inode->i_sb);
+	iocb->ki_flags &= ~IOCB_WRITE_STARTED;
 }
 
 /*
-- 
2.34.1


             reply	other threads:[~2023-11-21 13:25 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-21 13:25 Amir Goldstein [this message]
2023-11-21 21:00 ` [PATCH] fs: allow calling kiocb_end_write() unmatched with kiocb_start_write() Josef Bacik
2023-11-21 21:30   ` Dave Chinner
2023-11-22  5:38     ` Amir Goldstein
2023-11-22  6:50     ` Christoph Hellwig
2023-11-22 10:20       ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231121132551.2337431-1-amir73il@gmail.com \
    --to=amir73il@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=dhowells@redhat.com \
    --cc=jack@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.