All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH] GFS2: Take inode off order_write list when setting jdata flag
Date: Fri, 6 Oct 2017 09:39:08 -0400 (EDT)	[thread overview]
Message-ID: <2136838051.17705949.1507297148540.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <840948171.17705947.1507297147763.JavaMail.zimbra@redhat.com>

Hi,

This patch fixes a deadlock caused when the jdata flag is set for
inodes that are already on the ordered write list. Since it is
on the ordered write list, log_flush calls gfs2_ordered_write which
calls filemap_fdatawrite. But since the inode had the jdata flag
set, that calls gfs2_jdata_writepages, which tries to start a new
transaction. A new transaction cannot be started because it tries
to acquire the log_flush rwsem which is already locked by the log
flush operation.

The bottom line is: We cannot switch an inode from ordered to jdata
until we eliminate any ordered data pages (via log flush) or any
log_flush operation afterward will create the circular dependency
above. So we need to flush the log before setting the diskflags to
switch the file mode, then we need to remove the inode from the
ordered writes list.

Before this patch, the log flush was done for jdata->ordered, but
that's wrong. If we're going from jdata to ordered, we don't need
to call gfs2_log_flush because the call to filemap_fdatawrite will
do it for us:

   filemap_fdatawrite() -> __filemap_fdatawrite_range()
      __filemap_fdatawrite_range() -> do_writepages()
         do_writepages() -> gfs2_jdata_writepages()
            gfs2_jdata_writepages() -> gfs2_log_flush()

This patch modifies function do_gfs2_set_flags so that if a file
has its jdata flag set, and it's already on the ordered write list,
the log will be flushed and it will be removed from the list
before setting the flag.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index c7a904a8fbb4..0e9b81acf191 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -256,7 +256,7 @@ static int do_gfs2_set_flags(struct file *filp, u32 reqflags, u32 mask)
 			goto out;
 	}
 	if ((flags ^ new_flags) & GFS2_DIF_JDATA) {
-		if (flags & GFS2_DIF_JDATA)
+		if (new_flags & GFS2_DIF_JDATA)
 			gfs2_log_flush(sdp, ip->i_gl, NORMAL_FLUSH);
 		error = filemap_fdatawrite(inode->i_mapping);
 		if (error)
@@ -264,6 +264,8 @@ static int do_gfs2_set_flags(struct file *filp, u32 reqflags, u32 mask)
 		error = filemap_fdatawait(inode->i_mapping);
 		if (error)
 			goto out;
+		if (new_flags & GFS2_DIF_JDATA)
+			gfs2_ordered_del_inode(ip);
 	}
 	error = gfs2_trans_begin(sdp, RES_DINODE, 0);
 	if (error)



       reply	other threads:[~2017-10-06 13:39 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <840948171.17705947.1507297147763.JavaMail.zimbra@redhat.com>
2017-10-06 13:39 ` Bob Peterson [this message]
2017-10-16 18:46   ` [Cluster-devel] [GFS2 PATCH] GFS2: Take inode off order_write list when setting jdata flag Abhijith Das

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2136838051.17705949.1507297148540.JavaMail.zimbra@redhat.com \
    --to=rpeterso@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.