From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH 3/3 v3] GFS2: Add retry loop to delete_work_func
Date: Thu, 28 Apr 2016 10:16:09 -0400 (EDT)
Message-ID: <1675787118.58792048.1461852969312.JavaMail.zimbra@redhat.com>
In-Reply-To: <5721DB89.7040906@redhat.com>

Hi Steve,

Comments below.

----- Original Message -----
> This generally looks like a much better solution, however I'm always
> worried about arbitrary delays being added to the code. Is this just a
> wait for an inode in I_FREEING to go away, along with a timeout? It
> would be good to at least document why 30 loops here is the right
> amount. In other words will this still be sure to work on different
> machines with different timing characteristics?
> 
> I wonder whether it might not be better to reschedule the work function
> for later, rather than loop in the work function itself?
> 
> Steve.

1. I chose 30 iterations arbitrarily, based on instrumentation on the
   virt cluster I was using. I never saw it go over 29 retries.
   In my experience, virt clusters are faster and have tighter
   timing than bare metal, so I considered this the worst case. But
   it is still arbitrary.
2. Your idea of re-queuing the delete work is an excellent one, and
   I've reworked the patch to do this. I'm testing the implementation now
   and it seems to be working well so far. This time my retry value
   is 30 seconds, with a 10ms delay between tries, but this is still
   arbitrary, so I'm open to suggestions.
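
To make the re-queue idea in point 2 concrete, here is a rough
userspace sketch of the policy. It is not the kernel code. The names
retry_state, tick_before, delete_work and simulate are hypothetical,
time is counted in abstract ticks, and "busy" stands in for a lookup
that returns -EAGAIN. It assumes two phases that each get the same
timeout: a cheap lookup first, then a fallback lookup, with the work
re-queued after a fixed delay while the inode is still busy.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not kernel names. */
enum action { ACT_DONE, ACT_REQUEUE, ACT_GIVE_UP };

struct retry_state {
	unsigned long start;	/* tick at which the current phase began */
	bool try_ilookup;	/* true while phase 1 (cheap lookup) is active */
};

/*
 * Wraparound-safe "a is before b", the same idea as the kernel's
 * time_before(). Relies on the usual two's-complement conversion.
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* One invocation of the work function at tick "now". */
static enum action delete_work(struct retry_state *st, unsigned long now,
			       unsigned long timeout, bool busy)
{
	if (st->try_ilookup) {
		if (!busy)
			return ACT_DONE;
		if (tick_before(now, st->start + timeout))
			return ACT_REQUEUE;
		/* Phase 1 timed out: switch lookups, restart the clock. */
		st->try_ilookup = false;
		st->start = now;
	}
	/* Phase 2: fallback lookup, tried in the same invocation. */
	if (!busy)
		return ACT_DONE;
	if (tick_before(now, st->start + timeout))
		return ACT_REQUEUE;
	return ACT_GIVE_UP;
}

/*
 * The inode stays busy for busy_ticks after start. Returns how many
 * times the work function ran before it succeeded, or -1 if it gave up.
 */
static int simulate(unsigned long start, unsigned long busy_ticks,
		    unsigned long timeout, unsigned long delay)
{
	struct retry_state st = { .start = start, .try_ilookup = true };
	unsigned long now = start;
	int runs = 0;

	for (;;) {
		bool busy = tick_before(now, start + busy_ticks);

		runs++;
		switch (delete_work(&st, now, timeout, busy)) {
		case ACT_DONE:
			return runs;
		case ACT_GIVE_UP:
			return -1;
		case ACT_REQUEUE:
			now += delay;	/* models the delayed re-queue */
			break;
		}
	}
}
```

With a timeout of 50 ticks and a delay of 10 ticks, an inode that
stays busy for 25 ticks is found on the fourth run, and one that never
becomes free is abandoned once both phases have timed out. The real
timeout and delay values are still open, as noted above.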

Here is the replacement patch:

Patch description:

The delete work function, delete_work_func, often doesn't find
the inode it needs when freeing one that's been marked unlinked.
It only tries gfs2_ilookup once, and that lookup often fails for
two reasons: gfs2_lookup_by_inum is only called in an else
branch, and the non-blocking lookup often encounters inodes that
the VFS is in the middle of freeing (I_FREEING). This patch lets
delete_work_func retry the lookup in that case and otherwise
fall back to gfs2_lookup_by_inum. If the inode is in I_FREEING,
-EAGAIN is returned to the caller, which then re-queues the
delete work for later. After a certain timeout, delete_work_func
stops using gfs2_ilookup and uses gfs2_lookup_by_inum instead.
If that also keeps failing, it eventually gives up.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 672de35..59efb8b 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -571,12 +571,14 @@ out_unlock:
 	return;
 }
 
+#define LOOKUP_TIMEOUT (HZ >> 1)
+
 static void delete_work_func(struct work_struct *work)
 {
-	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_delete);
+	struct gfs2_glock *gl = container_of(work, struct gfs2_glock,
+					     gl_delete.work);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
-	struct gfs2_inode *ip;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	u64 no_addr = gl->gl_name.ln_number;
 
 	/* If someone's using this glock to create a new dinode, the block must
@@ -585,13 +587,43 @@ static void delete_work_func(struct work_struct *work)
 	if (test_bit(GLF_INODE_CREATING, &gl->gl_flags))
 		goto out;
 
-	ip = gl->gl_object;
-	/* Note: Unsafe to dereference ip as we don't hold right refs/locks */
-
-	if (ip)
+	if (test_bit(GLF_TRY_ILOOKUP, &gl->gl_flags)) {
 		inode = gfs2_ilookup(sdp->sd_vfs, no_addr, 1);
-	else
-		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL, GFS2_BLKST_UNLINKED);
+
+		if (inode == ERR_PTR(-EAGAIN)) {
+			if (time_before(jiffies, gl->gl_tchange +
+					LOOKUP_TIMEOUT)) {
+				gfs2_glock_hold(gl);
+				if (queue_delayed_work(gfs2_delete_workqueue,
+						       &gl->gl_delete, 10) == 0)
+					gfs2_glock_put(gl);
+				goto out;
+			} else {
+				clear_bit(GLF_TRY_ILOOKUP, &gl->gl_flags);
+				gl->gl_tchange = jiffies;
+			}
+		}
+	}
+
+	if (inode == NULL || IS_ERR(inode)) {
+		/* Note: This function uses the iopen glock only. It relies on
+		   the fact that gfs2_inode_lookup (called by lookup_by_inum)
+		   will return -EAGAIN before it does any manipulation of the
+		   iopen glock that might change gl_tchange. */
+		inode = gfs2_lookup_by_inum(sdp, no_addr, NULL,
+					    GFS2_BLKST_UNLINKED);
+		if (inode == ERR_PTR(-EAGAIN)) {
+			if (time_before(jiffies, gl->gl_tchange +
+					LOOKUP_TIMEOUT)) {
+				gfs2_glock_hold(gl);
+				if (queue_delayed_work(gfs2_delete_workqueue,
+						       &gl->gl_delete, 10) == 0) {
+					gfs2_glock_put(gl);
+				}
+			}
+			goto out;
+		}
+	}
 	if (inode && !IS_ERR(inode)) {
 		d_prune_aliases(inode);
 		iput(inode);
@@ -713,7 +745,7 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
 	gl->gl_object = NULL;
 	gl->gl_hold_time = GL_GLOCK_DFT_HOLD;
 	INIT_DELAYED_WORK(&gl->gl_work, glock_work_func);
-	INIT_WORK(&gl->gl_delete, delete_work_func);
+	INIT_DELAYED_WORK(&gl->gl_delete, delete_work_func);
 
 	mapping = gfs2_glock2aspace(gl);
 	if (mapping) {
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 5db59d4..c0a45ed 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -550,7 +550,10 @@ static void iopen_go_callback(struct gfs2_glock *gl, bool remote)
 	if (gl->gl_demote_state == LM_ST_UNLOCKED &&
 	    gl->gl_state == LM_ST_SHARED && ip) {
 		gl->gl_lockref.count++;
-		if (queue_work(gfs2_delete_workqueue, &gl->gl_delete) == 0)
+		gl->gl_tchange = jiffies;
+		set_bit(GLF_TRY_ILOOKUP, &gl->gl_flags);
+		if (queue_delayed_work(gfs2_delete_workqueue, &gl->gl_delete,
+			    0) == 0)
 			gl->gl_lockref.count--;
 	}
 }
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index a6a3389..8c79fe4 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -329,6 +329,7 @@ enum {
 	GLF_OBJECT			= 14, /* Used only for tracing */
 	GLF_BLOCKING			= 15,
 	GLF_INODE_CREATING		= 16, /* Inode creation occurring */
+	GLF_TRY_ILOOKUP			= 17, /* Try gfs2_ilookup for del */
 };
 
 struct gfs2_glock {
@@ -363,7 +364,7 @@ struct gfs2_glock {
 	struct delayed_work gl_work;
 	union {
 		/* For inode and iopen glocks only */
-		struct work_struct gl_delete;
+		struct delayed_work gl_delete;
 		/* For rgrp glocks only */
 		struct {
 			loff_t start;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 48c1418..3cd06c7 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -37,6 +37,11 @@
 #include "super.h"
 #include "glops.h"
 
+enum {
+	LOOKUP_UNLOCKED = 0,
+	LOOKUP_LOCKED = 1,
+};
+
 struct gfs2_skip_data {
 	u64 no_addr;
 	int skipped;
@@ -71,27 +76,34 @@ static int iget_set(struct inode *inode, void *opaque)
 	return 0;
 }
 
-struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+static struct inode *inode_lookup_common(struct super_block *sb, u64 no_addr,
+					 int non_block, int locked)
 {
 	unsigned long hash = (unsigned long)no_addr;
-	struct gfs2_skip_data data;
+	struct gfs2_skip_data data = {.no_addr = no_addr, .skipped = 0,
+				      .non_block = non_block};
+	struct inode *inode;
 
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = non_block;
-	return ilookup5(sb, hash, iget_test, &data);
+	if (locked == LOOKUP_LOCKED)
+		inode = iget5_locked(sb, hash, iget_test, iget_set, &data);
+	else
+		inode = ilookup5(sb, hash, iget_test, &data);
+
+	if (non_block && data.skipped)
+		return ERR_PTR(-EAGAIN);
+
+	return inode;
+}
+
+struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr, int non_block)
+{
+	return inode_lookup_common(sb, no_addr, non_block, LOOKUP_UNLOCKED);
 }
 
 static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
 			       int non_block)
 {
-	struct gfs2_skip_data data;
-	unsigned long hash = (unsigned long)no_addr;
-
-	data.no_addr = no_addr;
-	data.skipped = 0;
-	data.non_block = non_block;
-	return iget5_locked(sb, hash, iget_test, iget_set, &data);
+	return inode_lookup_common(sb, no_addr, non_block, LOOKUP_LOCKED);
 }
 
 /**
@@ -148,6 +160,8 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
 	inode = gfs2_iget(sb, no_addr, non_block);
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
+	if (inode == ERR_PTR(-EAGAIN))
+		return inode;
 
 	ip = GFS2_I(inode);
 	ip->i_no_addr = no_addr;
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 07c0265..58c74ca 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1801,8 +1801,10 @@ static void try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked, u64 skip
 		 * answer to whether it is NULL or not.
 		 */
 		ip = gl->gl_object;
-
-		if (ip || queue_work(gfs2_delete_workqueue, &gl->gl_delete) == 0)
+		gl->gl_tchange = jiffies;
+		set_bit(GLF_TRY_ILOOKUP, &gl->gl_flags);
+		if (ip || queue_delayed_work(gfs2_delete_workqueue,
+					     &gl->gl_delete, 0) == 0)
 			gfs2_glock_put(gl);
 		else
 			found++;



Thread overview: 7+ messages
2016-04-27 15:35 [Cluster-devel] [GFS2 PATCH 0/3] Fix inode transition from unlinked to free Bob Peterson
2016-04-27 15:35 ` [Cluster-devel] [GFS2 PATCH 1/3] Revert "GFS2: Eliminate parameter non_block on gfs2_inode_lookup" Bob Peterson
2016-04-27 15:35 ` [Cluster-devel] [GFS2 PATCH 2/3] Revert "GFS2: Don't filter out I_FREEING inodes anymore" Bob Peterson
2016-04-27 15:35 ` [Cluster-devel] [GFS2 PATCH 3/3] GFS2: Add retry loop to delete_work_func Bob Peterson
2016-04-27 17:10   ` [Cluster-devel] [GFS2 PATCH 3/3 v2] " Bob Peterson
2016-04-28  9:44     ` Steven Whitehouse
2016-04-28 14:16       ` Bob Peterson [this message]
