[RFC] gfs2: Prevent endless loops in gfs2_file_buffered_write

From: Andreas Gruenbacher <agruenba@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-fsdevel@vger.kernel.org, cluster-devel@redhat.com
Subject: [RFC] gfs2: Prevent endless loops in gfs2_file_buffered_write
Date: Wed, 10 Nov 2021 18:44:57 +0100	[thread overview]
Message-ID: <20211110174457.533866-1-agruenba@redhat.com> (raw)

Hi Linus,

in commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for
buffered I/O"), I've managed to introduce a hang in gfs2 due to the
following check in iomap_write_iter:

  if (unlikely(fault_in_iov_iter_readable(i, bytes))) {

which fails if any of the iov iterator cannot be faulted in for reading.
At the filesystem level, we're retrying the rest of the write if any of
the iov iterator can be faulted in, so we can end up in a loop without
ever making progress.  The fix in iomap_write_iter would be as follows:

  if (unlikely(fault_in_iov_iter_readable(i, bytes) == bytes)) {

The same bug exists in generic_perform_write, but I'm not aware of any
callers of generic_perform_write that have page faults turned off.

Is this fix still appropriate for 5.16, or should we work around it in
the filesystem as below for now?

A related post-5.16 option would be to turn the pre-faulting in
iomap_write_iter and generic_perform_write into post-faulting, but at
the very least, that still needs a bit of performance analysis:

  https://lore.kernel.org/linux-fsdevel/20211026094430.3669156-1-agruenba@redhat.com/
  https://lore.kernel.org/linux-fsdevel/20211027212138.3722977-1-agruenba@redhat.com/

Thanks,
Andreas

---
 fs/gfs2/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index c486b702e00f..3e718cfc19a7 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1013,6 +1013,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct gfs2_holder *statfs_gh = NULL;
 	size_t prev_count = 0, window_size = 0;
+	size_t orig_count = iov_iter_count(from);
 	size_t read = 0;
 	ssize_t ret;
 
@@ -1057,6 +1058,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 	if (inode == sdp->sd_rindex)
 		gfs2_glock_dq_uninit(statfs_gh);
 
+	from->count = orig_count - read;
 	if (should_fault_in_pages(ret, from, &prev_count, &window_size)) {
 		size_t leftover;
 
@@ -1064,6 +1066,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 		leftover = fault_in_iov_iter_readable(from, window_size);
 		gfs2_holder_disallow_demote(gh);
 		if (leftover != window_size) {
+			from->count = min(from->count, window_size - leftover);
 			if (!gfs2_holder_queued(gh)) {
 				if (read)
 					goto out_uninit;
-- 
2.31.1