All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andreas Gruenbacher <agruenba@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-fsdevel@vger.kernel.org, cluster-devel@redhat.com
Subject: [RFC] gfs2: Prevent endless loops in gfs2_file_buffered_write
Date: Wed, 10 Nov 2021 18:44:57 +0100	[thread overview]
Message-ID: <20211110174457.533866-1-agruenba@redhat.com> (raw)

Hi Linus,

in commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for
buffered I/O"), I've managed to introduce a hang in gfs2 due to the
following check in iomap_write_iter:

  if (unlikely(fault_in_iov_iter_readable(i, bytes))) {

which fails if any of the iov iterator cannot be faulted in for reading.
At the filesystem level, we're retrying the rest of the write if any of
the iov iterator can be faulted in, so we can end up in a loop without
ever making progress.  The fix in iomap_write_iter would be as follows:

  if (unlikely(fault_in_iov_iter_readable(i, bytes) == bytes)) {

The same bug exists in generic_perform_write, but I'm not aware of any
callers of generic_perform_write that have page faults turned off.

Is this fix still appropriate for 5.16, or should we work around it in
the filesystem as below for now?

A related post-5.16 option would be to turn the pre-faulting in
iomap_write_iter and generic_perform_write into post-faulting, but at
the very least, that still needs a bit of performance analysis:

  https://lore.kernel.org/linux-fsdevel/20211026094430.3669156-1-agruenba@redhat.com/
  https://lore.kernel.org/linux-fsdevel/20211027212138.3722977-1-agruenba@redhat.com/

Thanks,
Andreas

---
 fs/gfs2/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index c486b702e00f..3e718cfc19a7 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1013,6 +1013,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct gfs2_holder *statfs_gh = NULL;
 	size_t prev_count = 0, window_size = 0;
+	size_t orig_count = iov_iter_count(from);
 	size_t read = 0;
 	ssize_t ret;
 
@@ -1057,6 +1058,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 	if (inode == sdp->sd_rindex)
 		gfs2_glock_dq_uninit(statfs_gh);
 
+	from->count = orig_count - read;
 	if (should_fault_in_pages(ret, from, &prev_count, &window_size)) {
 		size_t leftover;
 
@@ -1064,6 +1066,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 		leftover = fault_in_iov_iter_readable(from, window_size);
 		gfs2_holder_disallow_demote(gh);
 		if (leftover != window_size) {
+			from->count = min(from->count, window_size - leftover);
 			if (!gfs2_holder_queued(gh)) {
 				if (read)
 					goto out_uninit;
-- 
2.31.1


WARNING: multiple messages have this Message-ID (diff)
From: Andreas Gruenbacher <agruenba@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [RFC] gfs2: Prevent endless loops in gfs2_file_buffered_write
Date: Wed, 10 Nov 2021 18:44:57 +0100	[thread overview]
Message-ID: <20211110174457.533866-1-agruenba@redhat.com> (raw)

Hi Linus,

in commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for
buffered I/O"), I've managed to introduce a hang in gfs2 due to the
following check in iomap_write_iter:

  if (unlikely(fault_in_iov_iter_readable(i, bytes))) {

which fails if any of the iov iterator cannot be faulted in for reading.
At the filesystem level, we're retrying the rest of the write if any of
the iov iterator can be faulted in, so we can end up in a loop without
ever making progress.  The fix in iomap_write_iter would be as follows:

  if (unlikely(fault_in_iov_iter_readable(i, bytes) == bytes)) {

The same bug exists in generic_perform_write, but I'm not aware of any
callers of generic_perform_write that have page faults turned off.

Is this fix still appropriate for 5.16, or should we work around it in
the filesystem as below for now?

A related post-5.16 option would be to turn the pre-faulting in
iomap_write_iter and generic_perform_write into post-faulting, but at
the very least, that still needs a bit of performance analysis:

  https://lore.kernel.org/linux-fsdevel/20211026094430.3669156-1-agruenba at redhat.com/
  https://lore.kernel.org/linux-fsdevel/20211027212138.3722977-1-agruenba at redhat.com/

Thanks,
Andreas

---
 fs/gfs2/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index c486b702e00f..3e718cfc19a7 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1013,6 +1013,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct gfs2_holder *statfs_gh = NULL;
 	size_t prev_count = 0, window_size = 0;
+	size_t orig_count = iov_iter_count(from);
 	size_t read = 0;
 	ssize_t ret;
 
@@ -1057,6 +1058,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 	if (inode == sdp->sd_rindex)
 		gfs2_glock_dq_uninit(statfs_gh);
 
+	from->count = orig_count - read;
 	if (should_fault_in_pages(ret, from, &prev_count, &window_size)) {
 		size_t leftover;
 
@@ -1064,6 +1066,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
 		leftover = fault_in_iov_iter_readable(from, window_size);
 		gfs2_holder_disallow_demote(gh);
 		if (leftover != window_size) {
+			from->count = min(from->count, window_size - leftover);
 			if (!gfs2_holder_queued(gh)) {
 				if (read)
 					goto out_uninit;
-- 
2.31.1



             reply	other threads:[~2021-11-10 17:45 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-10 17:44 Andreas Gruenbacher [this message]
2021-11-10 17:44 ` [Cluster-devel] [RFC] gfs2: Prevent endless loops in gfs2_file_buffered_write Andreas Gruenbacher
2021-11-12 19:15 ` Catalin Marinas
2021-11-12 19:15   ` [Cluster-devel] " Catalin Marinas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211110174457.533866-1-agruenba@redhat.com \
    --to=agruenba@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=cluster-devel@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.