* [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2)
@ 2018-05-08 20:04 Bob Peterson
  2018-05-08 20:04 ` [Cluster-devel] [PATCH 1/2] GFS2: Introduce GLF_EX_SHARING bit: local EX sharing Bob Peterson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Bob Peterson @ 2018-05-08 20:04 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 18 April, I posted v1 of this patch set. The idea is to allow multiple
processes on a node to share a glock that's held exclusively in order to
improve performance. Sharing rgrps allows for better throughput by
reducing contention.

Version 1 implemented this by introducing a new glock mode for sharing
glocks. Steve Whitehouse suggested we didn't need a new mode: we can
accomplish the same thing just by having a new glock flag, which also
makes the patch simpler.

This version 2 patch set implements Steve's suggestion.

The first patch introduces the new glock flag. The second patch puts
it into use for rgrp sharing. Exclusive access to the rgrp is implemented
through an rwsem.
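
As a rough userspace illustration of that second layer (not the kernel code
itself), the sketch below models an owner-tracked exclusive lock similar in
spirit to the rgrp_lockex()/rgrp_unlockex() helpers in patch 2. It is an
assumption-laden model: a C11 atomic owner word with a yielding spin stands
in for the kernel rwsem plus pid field, and every name (rgrp_model,
rgrp_model_lockex, and so on) is hypothetical.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-rgrp exclusivity state. In the patch
 * this is an rwsem (rd_sem) plus an owner pid (rd_expid); here both are
 * collapsed into one atomic owner word, where NULL means "unowned". */
struct rgrp_model {
	_Atomic(void *) owner;
};

/* The address of a thread-local object is unique per live thread, so it
 * can serve as a portable thread identity. */
static _Thread_local char model_self_token;

static void *model_self(void)
{
	return &model_self_token;
}

static void rgrp_model_init(struct rgrp_model *rgd)
{
	atomic_init(&rgd->owner, (void *)0);
}

static bool rgrp_model_held_by_me(struct rgrp_model *rgd)
{
	return atomic_load(&rgd->owner) == model_self();
}

/* Take local exclusivity. If the caller already owns it, return at once
 * instead of deadlocking on itself. Note: no nesting count is kept. */
static void rgrp_model_lockex(struct rgrp_model *rgd)
{
	void *expected = (void *)0;

	if (rgrp_model_held_by_me(rgd))
		return;
	while (!atomic_compare_exchange_weak(&rgd->owner, &expected,
					     model_self())) {
		expected = (void *)0;	/* CAS overwrote it on failure */
		sched_yield();		/* the kernel rwsem would sleep */
	}
}

/* Drop local exclusivity. Only the current owner may call this. */
static void rgrp_model_unlockex(struct rgrp_model *rgd)
{
	assert(rgrp_model_held_by_me(rgd));
	atomic_store(&rgd->owner, (void *)0);
}
```

Because no nesting count is kept, a nested lock call is a no-op and the
first unlock releases the lock completely. The helpers in patch 2 behave
the same way, since rgrp_unlockex() clears rd_expid unconditionally.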

In iozone performance testing, this version looks even better than version 1.
---
Bob Peterson (2):
  GFS2: Introduce GLF_EX_SHARING bit: local EX sharing
  GFS2: Take advantage of new EXSH glock mode for rgrps

 fs/gfs2/bmap.c   |   2 +-
 fs/gfs2/dir.c    |   2 +-
 fs/gfs2/glock.c  |  23 ++++++++++---
 fs/gfs2/glock.h  |   4 +++
 fs/gfs2/incore.h |   2 ++
 fs/gfs2/inode.c  |   7 ++--
 fs/gfs2/rgrp.c   | 103 ++++++++++++++++++++++++++++++++++++++++++++++---------
 fs/gfs2/rgrp.h   |   2 +-
 fs/gfs2/super.c  |   8 +++--
 fs/gfs2/xattr.c  |   8 +++--
 10 files changed, 129 insertions(+), 32 deletions(-)

-- 
2.14.3




* [Cluster-devel] [PATCH 1/2] GFS2: Introduce GLF_EX_SHARING bit: local EX sharing
  2018-05-08 20:04 [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2) Bob Peterson
@ 2018-05-08 20:04 ` Bob Peterson
  2018-05-08 20:04 ` [Cluster-devel] [PATCH 2/2] GFS2: Take advantage of new EXSH glock mode for rgrps Bob Peterson
  2018-05-09  8:06 ` [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2) Steven Whitehouse
  2 siblings, 0 replies; 4+ messages in thread
From: Bob Peterson @ 2018-05-08 20:04 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch is a first step in rgrp sharing. It allows for glocks
locked in EX mode to be shared amongst processes on that node.
Like a Shared glock, multiple processes may hold the lock in
EX mode at the same time, provided they're all on the same
node. All other nodes will see this as an EX lock.
For now, there are no users of the new flag. A future patch
will use it to improve performance with rgrp sharing.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/glock.c | 23 +++++++++++++++++++----
 fs/gfs2/glock.h |  4 ++++
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 097bd3c0f270..a463de5fad2e 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -279,10 +279,20 @@ void gfs2_glock_put(struct gfs2_glock *gl)
 
 static inline int may_grant(const struct gfs2_glock *gl, const struct gfs2_holder *gh)
 {
-	const struct gfs2_holder *gh_head = list_entry(gl->gl_holders.next, const struct gfs2_holder, gh_list);
-	if ((gh->gh_state == LM_ST_EXCLUSIVE ||
-	     gh_head->gh_state == LM_ST_EXCLUSIVE) && gh != gh_head)
-		return 0;
+	const struct gfs2_holder *gh_head = list_entry(gl->gl_holders.next,
+						       const struct gfs2_holder,
+						       gh_list);
+
+	if (gh != gh_head) {
+		if (gh_head->gh_state == LM_ST_EXCLUSIVE &&
+		    (gh_head->gh_flags & LM_FLAG_SHAREDEX) &&
+		    gh->gh_state == LM_ST_EXCLUSIVE &&
+		    (gh->gh_flags & LM_FLAG_SHAREDEX))
+			return 1;
+		if ((gh->gh_state == LM_ST_EXCLUSIVE ||
+		     gh_head->gh_state == LM_ST_EXCLUSIVE))
+			return 0;
+	}
 	if (gl->gl_state == gh->gh_state)
 		return 1;
 	if (gh->gh_flags & GL_EXACT)
@@ -292,6 +302,9 @@ static inline int may_grant(const struct gfs2_glock *gl, const struct gfs2_holde
 			return 1;
 		if (gh->gh_state == LM_ST_DEFERRED && gh_head->gh_state == LM_ST_DEFERRED)
 			return 1;
+		if (gh_head->gh_flags & LM_FLAG_SHAREDEX &&
+		    gh->gh_flags & LM_FLAG_SHAREDEX)
+			return 1;
 	}
 	if (gl->gl_state != LM_ST_UNLOCKED && (gh->gh_flags & LM_FLAG_ANY))
 		return 1;
@@ -1680,6 +1693,8 @@ static const char *hflags2str(char *buf, u16 flags, unsigned long iflags)
 		*p++ = 'A';
 	if (flags & LM_FLAG_PRIORITY)
 		*p++ = 'p';
+	if (flags & LM_FLAG_SHAREDEX)
+		*p++ = 's';
 	if (flags & GL_ASYNC)
 		*p++ = 'a';
 	if (flags & GL_EXACT)
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index 5e12220cc0c2..3ab7d7f8d986 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -78,6 +78,9 @@ enum {
  * request and directly join the other shared lock.  A shared lock request
  * without the priority flag might be forced to wait until the deferred
  * requested had acquired and released the lock.
+ *
+ * LM_FLAG_SHAREDEX
+ * The glock may be held in EXCLUSIVE mode, but it's still shared
  */
 
 #define LM_FLAG_TRY		0x0001
@@ -85,6 +88,7 @@ enum {
 #define LM_FLAG_NOEXP		0x0004
 #define LM_FLAG_ANY		0x0008
 #define LM_FLAG_PRIORITY	0x0010
+#define LM_FLAG_SHAREDEX	0x0020
 #define GL_ASYNC		0x0040
 #define GL_EXACT		0x0080
 #define GL_SKIP			0x0100
-- 
2.14.3
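
For readers who want to see the new rule in isolation, here is a minimal
userspace sketch of the exclusivity check that the patch adds at the top of
may_grant(). It models only that first block, not the rest of the function,
and all names (model_holder, model_ex_check, the MODEL_* constants) are
hypothetical stand-ins rather than kernel identifiers.

```c
#include <assert.h>

/* Hypothetical stand-ins for the LM_ST_* states and the new flag bit. */
enum model_state {
	MODEL_ST_UNLOCKED,
	MODEL_ST_EXCLUSIVE,
	MODEL_ST_DEFERRED,
	MODEL_ST_SHARED
};
#define MODEL_FLAG_SHAREDEX 0x0020

struct model_holder {
	enum model_state state;
	unsigned int flags;
};

/* GRANT and DENY correspond to "return 1" and "return 0" in the patch;
 * CONTINUE means the real function falls through to its later checks. */
enum model_verdict { MODEL_GRANT, MODEL_DENY, MODEL_CONTINUE };

static enum model_verdict model_ex_check(const struct model_holder *head,
					 const struct model_holder *gh)
{
	/* The first holder in the queue is not checked against itself. */
	if (gh == head)
		return MODEL_CONTINUE;

	/* Both want EX and both opted in to sharing: grant immediately. */
	if (head->state == MODEL_ST_EXCLUSIVE &&
	    (head->flags & MODEL_FLAG_SHAREDEX) &&
	    gh->state == MODEL_ST_EXCLUSIVE &&
	    (gh->flags & MODEL_FLAG_SHAREDEX))
		return MODEL_GRANT;

	/* Any other combination involving EX stays mutually exclusive. */
	if (gh->state == MODEL_ST_EXCLUSIVE ||
	    head->state == MODEL_ST_EXCLUSIVE)
		return MODEL_DENY;

	return MODEL_CONTINUE;
}
```

Note that sharing requires the flag on both holders: an EX request without
the flag is still denied while a sharing EX holder is at the head of the
queue, and vice versa.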




* [Cluster-devel] [PATCH 2/2] GFS2: Take advantage of new EXSH glock mode for rgrps
  2018-05-08 20:04 [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2) Bob Peterson
  2018-05-08 20:04 ` [Cluster-devel] [PATCH 1/2] GFS2: Introduce GLF_EX_SHARING bit: local EX sharing Bob Peterson
@ 2018-05-08 20:04 ` Bob Peterson
  2018-05-09  8:06 ` [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2) Steven Whitehouse
  2 siblings, 0 replies; 4+ messages in thread
From: Bob Peterson @ 2018-05-08 20:04 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch switches rgrp glocks to shared-exclusive (EXSH) locking: the
glock is requested in EX mode with LM_FLAG_SHAREDEX, and local
exclusivity (the new rd_sem rwsem) is only taken when the rgrp is added
to an active transaction, or when blocks are being reserved. As soon as
the transaction is ended, the rgrp exclusivity is released. This allows
for rgrp sharing.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/bmap.c   |   2 +-
 fs/gfs2/dir.c    |   2 +-
 fs/gfs2/glock.h  |   2 +-
 fs/gfs2/incore.h |   2 ++
 fs/gfs2/inode.c  |   7 ++--
 fs/gfs2/rgrp.c   | 103 ++++++++++++++++++++++++++++++++++++++++++++++---------
 fs/gfs2/rgrp.h   |   2 +-
 fs/gfs2/super.c  |   8 +++--
 fs/gfs2/xattr.c  |   8 +++--
 9 files changed, 107 insertions(+), 29 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 0590e93494f7..e899cf6b4365 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1108,7 +1108,7 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 				goto out;
 			}
 			ret = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
-						 0, rd_gh);
+						 LM_FLAG_SHAREDEX, rd_gh);
 			if (ret)
 				goto out;
 
diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index d9fb0ad6cc30..7327a9d43692 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -2020,7 +2020,7 @@ static int leaf_dealloc(struct gfs2_inode *dip, u32 index, u32 len,
 		l_blocks++;
 	}
 
-	gfs2_rlist_alloc(&rlist, LM_ST_EXCLUSIVE);
+	gfs2_rlist_alloc(&rlist);
 
 	for (x = 0; x < rlist.rl_rgrps; x++) {
 		struct gfs2_rgrpd *rgd = gfs2_glock2rgrp(rlist.rl_ghs[x].gh_gl);
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index 3ab7d7f8d986..cb893d3a75c4 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -80,7 +80,7 @@ enum {
  * requested had acquired and released the lock.
  *
  * LM_FLAG_SHAREDEX
- * The glock may be held in EXCLUSIVE mode, but it's still shared
+ * The glock may be held in EXCLUSIVE mode, but it's still shared.
  */
 
 #define LM_FLAG_TRY		0x0001
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 0bbbaa9b05cb..67140f726c68 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -100,6 +100,8 @@ struct gfs2_rgrpd {
 #define GFS2_RDF_PREFERRED	0x80000000 /* This rgrp is preferred */
 #define GFS2_RDF_MASK		0xf0000000 /* mask for internal flags */
 	spinlock_t rd_rsspin;           /* protects reservation related vars */
+	struct rw_semaphore rd_sem;	/* ensures local rgrp exclusivity */
+	pid_t rd_expid;			/* rgrp locked EX by this pid */
 	struct rb_root rd_rstree;       /* multi-block reservation tree */
 };
 
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 8700eb815638..a47c3ff5b3b9 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -1121,8 +1121,8 @@ static int gfs2_unlink(struct inode *dir, struct dentry *dentry)
 	if (!rgd)
 		goto out_inodes;
 
-	gfs2_holder_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, ghs + 2);
-
+	gfs2_holder_init(rgd->rd_gl, LM_ST_EXCLUSIVE, LM_FLAG_SHAREDEX,
+			 ghs + 2);
 
 	error = gfs2_glock_nq(ghs); /* parent */
 	if (error)
@@ -1409,7 +1409,8 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 		 */
 		nrgd = gfs2_blk2rgrpd(sdp, nip->i_no_addr, 1);
 		if (nrgd)
-			gfs2_holder_init(nrgd->rd_gl, LM_ST_EXCLUSIVE, 0, ghs + num_gh++);
+			gfs2_holder_init(nrgd->rd_gl, LM_ST_EXCLUSIVE,
+					 LM_FLAG_SHAREDEX, ghs + num_gh++);
 	}
 
 	for (x = 0; x < num_gh; x++) {
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 8b683917a27e..437b88e20ad6 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -93,6 +93,7 @@ static inline void gfs2_setbit(const struct gfs2_rbm *rbm, bool do_clone,
 	unsigned int buflen = bi->bi_len;
 	const unsigned int bit = (rbm->offset % GFS2_NBBY) * GFS2_BIT_SIZE;
 
+	BUG_ON(rbm->rgd->rd_expid != pid_nr(task_pid(current)));
 	byte1 = bi->bi_bh->b_data + bi->bi_offset + (rbm->offset / GFS2_NBBY);
 	end = bi->bi_bh->b_data + bi->bi_offset + buflen;
 
@@ -904,6 +905,7 @@ static int read_rindex_entry(struct gfs2_inode *ip)
 	rgd->rd_data = be32_to_cpu(buf.ri_data);
 	rgd->rd_bitbytes = be32_to_cpu(buf.ri_bitbytes);
 	spin_lock_init(&rgd->rd_rsspin);
+	init_rwsem(&rgd->rd_sem);
 
 	error = compute_bitstructs(rgd);
 	if (error)
@@ -1344,6 +1346,37 @@ int gfs2_rgrp_send_discards(struct gfs2_sbd *sdp, u64 offset,
 	return -EIO;
 }
 
+/**
+ * rgrp_lockex - gain exclusive access to an rgrp locked in EXSH
+ * @rgd: the resource group descriptor
+ *
+ * This function waits for exclusive access to an rgrp's glock that is held in
+ * EXSH, by way of its rwsem.
+ *
+ * Locking rule: You can't start a transaction with this held.
+ *
+ * If the rwsem is already held, we don't need to wait, but we may need to
+ * queue the rgrp to a transaction if we have one, on the assumption that the
+ * rwsem may have been locked prior to starting the transaction.
+ */
+static void rgrp_lockex(struct gfs2_rgrpd *rgd)
+{
+	BUG_ON(!gfs2_glock_is_held_excl(rgd->rd_gl));
+	if (rgd->rd_expid != pid_nr(task_pid(current))) {
+		down_write(&rgd->rd_sem);
+		BUG_ON(rgd->rd_expid != 0);
+		rgd->rd_expid = pid_nr(task_pid(current));
+	}
+}
+
+static void rgrp_unlockex(struct gfs2_rgrpd *rgd)
+{
+	BUG_ON(!gfs2_glock_is_held_excl(rgd->rd_gl));
+	BUG_ON(rgd->rd_expid != pid_nr(task_pid(current)));
+	rgd->rd_expid = 0;
+	up_write(&rgd->rd_sem);
+}
+
 /**
  * gfs2_fitrim - Generate discard requests for unused bits of the filesystem
  * @filp: Any file on the filesystem
@@ -1399,7 +1432,8 @@ int gfs2_fitrim(struct file *filp, void __user *argp)
 
 	while (1) {
 
-		ret = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &gh);
+		ret = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
+					 LM_FLAG_SHAREDEX, &gh);
 		if (ret)
 			goto out;
 
@@ -1420,11 +1454,13 @@ int gfs2_fitrim(struct file *filp, void __user *argp)
 			/* Mark rgrp as having been trimmed */
 			ret = gfs2_trans_begin(sdp, RES_RG_HDR, 0);
 			if (ret == 0) {
+				rgrp_lockex(rgd);
 				bh = rgd->rd_bits[0].bi_bh;
 				rgd->rd_flags |= GFS2_RGF_TRIMMED;
 				gfs2_trans_add_meta(rgd->rd_gl, bh);
 				gfs2_rgrp_out(rgd, bh->b_data);
 				gfs2_rgrp_ondisk2lvb(rgd->rd_rgl, bh->b_data);
+				rgrp_unlockex(rgd);
 				gfs2_trans_end(sdp);
 			}
 		}
@@ -1768,6 +1804,16 @@ static int gfs2_rbm_find(struct gfs2_rbm *rbm, u8 state, u32 *minext,
  *
  * Returns: 0 if no error
  *          The inode, if one has been found, in inode.
+ * We must be careful to avoid deadlock here:
+ *
+ * All transactions expect: sd_log_flush_lock followed by rgrp ex (if needed),
+ * but try_rgrp_unlink takes sd_log_flush_lock outside a transaction and
+ * therefore must not have the rgrp ex already held. To avoid deadlock, we
+ * drop the rgrp ex lock before taking the log_flush_lock, then reacquire it
+ * to protect our call to gfs2_rbm_find.
+ *
+ * Also note that rgrp_unlockex must come AFTER the caller does gfs2_rs_deltree
+ * because rgrp ex needs to be held before making reservations.
  */
 
 static void try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked, u64 skip)
@@ -1781,7 +1827,12 @@ static void try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked, u64 skip
 	struct gfs2_rbm rbm = { .rgd = rgd, .bii = 0, .offset = 0 };
 
 	while (1) {
+		/* As explained above, we need to drop the rgrp ex lock and
+		 * reacquire it after we get sd_log_flush_lock.
+		 */
+		rgrp_unlockex(rgd);
 		down_write(&sdp->sd_log_flush_lock);
+		rgrp_lockex(rgd);
 		error = gfs2_rbm_find(&rbm, GFS2_BLKST_UNLINKED, NULL, NULL,
 				      true);
 		up_write(&sdp->sd_log_flush_lock);
@@ -1980,7 +2031,7 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
 	struct gfs2_rgrpd *begin = NULL;
 	struct gfs2_blkreserv *rs = &ip->i_res;
-	int error = 0, rg_locked, flags = 0;
+	int error = 0, rg_locked, flags = LM_FLAG_SHAREDEX;
 	u64 last_unlinked = NO_BLOCK;
 	int loops = 0;
 	u32 skip = 0;
@@ -1991,6 +2042,7 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
 		return -EINVAL;
 	if (gfs2_rs_active(rs)) {
 		begin = rs->rs_rbm.rgd;
+		flags = LM_FLAG_SHAREDEX;
 	} else if (ip->i_rgd && rgrp_contains_block(ip->i_rgd, ip->i_goal)) {
 		rs->rs_rbm.rgd = begin = ip->i_rgd;
 	} else {
@@ -2023,16 +2075,20 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
 						   &rs->rs_rgd_gh);
 			if (unlikely(error))
 				return error;
+			rgrp_lockex(rs->rs_rbm.rgd);
 			if (!gfs2_rs_active(rs) && (loops < 2) &&
 			    gfs2_rgrp_congested(rs->rs_rbm.rgd, loops))
 				goto skip_rgrp;
 			if (sdp->sd_args.ar_rgrplvb) {
 				error = update_rgrp_lvb(rs->rs_rbm.rgd);
 				if (unlikely(error)) {
+					rgrp_unlockex(rs->rs_rbm.rgd);
 					gfs2_glock_dq_uninit(&rs->rs_rgd_gh);
 					return error;
 				}
 			}
+		} else {
+			rgrp_lockex(rs->rs_rbm.rgd);
 		}
 
 		/* Skip unuseable resource groups */
@@ -2058,14 +2114,21 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
 		     rs->rs_rbm.rgd->rd_free_clone >= ap->min_target)) {
 			ip->i_rgd = rs->rs_rbm.rgd;
 			ap->allowed = ip->i_rgd->rd_free_clone;
+			rgrp_unlockex(rs->rs_rbm.rgd);
 			return 0;
 		}
 check_rgrp:
 		/* Check for unlinked inodes which can be reclaimed */
-		if (rs->rs_rbm.rgd->rd_flags & GFS2_RDF_CHECK)
+		if (rs->rs_rbm.rgd->rd_flags & GFS2_RDF_CHECK) {
+			/* Drop reservation if we couldn't use reserved rgrp */
+			if (gfs2_rs_active(rs))
+				gfs2_rs_deltree(rs);
 			try_rgrp_unlink(rs->rs_rbm.rgd, &last_unlinked,
 					ip->i_no_addr);
+		}
 skip_rgrp:
+		rgrp_unlockex(rs->rs_rbm.rgd);
+
 		/* Drop reservation, if we couldn't use reserved rgrp */
 		if (gfs2_rs_active(rs))
 			gfs2_rs_deltree(rs);
@@ -2169,7 +2232,7 @@ static void gfs2_alloc_extent(const struct gfs2_rbm *rbm, bool dinode,
 }
 
 /**
- * rgblk_free - Change alloc state of given block(s)
+ * rgblk_free_and_lock - Change alloc state of given block(s) and lock rgrp ex
  * @sdp: the filesystem
  * @bstart: the start of a run of blocks to free
  * @blen: the length of the block run (all must lie within ONE RG!)
@@ -2178,8 +2241,8 @@ static void gfs2_alloc_extent(const struct gfs2_rbm *rbm, bool dinode,
  * Returns:  Resource group containing the block(s)
  */
 
-static struct gfs2_rgrpd *rgblk_free(struct gfs2_sbd *sdp, u64 bstart,
-				     u32 blen, unsigned char new_state)
+static struct gfs2_rgrpd *rgblk_free_and_lock(struct gfs2_sbd *sdp, u64 bstart,
+					      u32 blen, unsigned char new_state)
 {
 	struct gfs2_rbm rbm;
 	struct gfs2_bitmap *bi, *bi_prev = NULL;
@@ -2192,6 +2255,7 @@ static struct gfs2_rgrpd *rgblk_free(struct gfs2_sbd *sdp, u64 bstart,
 	}
 
 	gfs2_rbm_from_block(&rbm, bstart);
+	rgrp_lockex(rbm.rgd);
 	while (blen--) {
 		bi = rbm_bi(&rbm);
 		if (bi != bi_prev) {
@@ -2227,10 +2291,10 @@ void gfs2_rgrp_dump(struct seq_file *seq, const struct gfs2_glock *gl)
 
 	if (rgd == NULL)
 		return;
-	gfs2_print_dbg(seq, " R: n:%llu f:%02x b:%u/%u i:%u r:%u e:%u\n",
+	gfs2_print_dbg(seq, " R: n:%llu f:%02x b:%u/%u i:%u r:%u e:%u t:%d\n",
 		       (unsigned long long)rgd->rd_addr, rgd->rd_flags,
 		       rgd->rd_free, rgd->rd_free_clone, rgd->rd_dinodes,
-		       rgd->rd_reserved, rgd->rd_extfail_pt);
+		       rgd->rd_reserved, rgd->rd_extfail_pt, rgd->rd_expid);
 	spin_lock(&rgd->rd_rsspin);
 	for (n = rb_first(&rgd->rd_rstree); n; n = rb_next(&trs->rs_node)) {
 		trs = rb_entry(n, struct gfs2_blkreserv, rs_node);
@@ -2341,6 +2405,7 @@ int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn, unsigned int *nblocks,
 	int error;
 
 	gfs2_set_alloc_start(&rbm, ip, dinode);
+	rgrp_lockex(rbm.rgd);
 	error = gfs2_rbm_find(&rbm, GFS2_BLKST_FREE, NULL, ip, false);
 
 	if (error == -ENOSPC) {
@@ -2395,6 +2460,8 @@ int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn, unsigned int *nblocks,
 	gfs2_rgrp_out(rbm.rgd, rbm.rgd->rd_bits[0].bi_bh->b_data);
 	gfs2_rgrp_ondisk2lvb(rbm.rgd->rd_rgl, rbm.rgd->rd_bits[0].bi_bh->b_data);
 
+	rgrp_unlockex(rbm.rgd);
+
 	gfs2_statfs_change(sdp, 0, -(s64)*nblocks, dinode ? 1 : 0);
 	if (dinode)
 		gfs2_trans_add_unrevoke(sdp, block, *nblocks);
@@ -2408,6 +2475,7 @@ int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn, unsigned int *nblocks,
 	return 0;
 
 rgrp_error:
+	rgrp_unlockex(rbm.rgd);
 	gfs2_rgrp_error(rbm.rgd);
 	return -EIO;
 }
@@ -2426,7 +2494,7 @@ void __gfs2_free_blocks(struct gfs2_inode *ip, u64 bstart, u32 blen, int meta)
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
 	struct gfs2_rgrpd *rgd;
 
-	rgd = rgblk_free(sdp, bstart, blen, GFS2_BLKST_FREE);
+	rgd = rgblk_free_and_lock(sdp, bstart, blen, GFS2_BLKST_FREE);
 	if (!rgd)
 		return;
 	trace_gfs2_block_alloc(ip, rgd, bstart, blen, GFS2_BLKST_FREE);
@@ -2435,6 +2503,7 @@ void __gfs2_free_blocks(struct gfs2_inode *ip, u64 bstart, u32 blen, int meta)
 	gfs2_trans_add_meta(rgd->rd_gl, rgd->rd_bits[0].bi_bh);
 	gfs2_rgrp_out(rgd, rgd->rd_bits[0].bi_bh->b_data);
 	gfs2_rgrp_ondisk2lvb(rgd->rd_rgl, rgd->rd_bits[0].bi_bh->b_data);
+	rgrp_unlockex(rgd);
 
 	/* Directories keep their data in the metadata address space */
 	if (meta || ip->i_depth)
@@ -2465,7 +2534,7 @@ void gfs2_unlink_di(struct inode *inode)
 	struct gfs2_rgrpd *rgd;
 	u64 blkno = ip->i_no_addr;
 
-	rgd = rgblk_free(sdp, blkno, 1, GFS2_BLKST_UNLINKED);
+	rgd = rgblk_free_and_lock(sdp, blkno, 1, GFS2_BLKST_UNLINKED);
 	if (!rgd)
 		return;
 	trace_gfs2_block_alloc(ip, rgd, blkno, 1, GFS2_BLKST_UNLINKED);
@@ -2473,6 +2542,7 @@ void gfs2_unlink_di(struct inode *inode)
 	gfs2_rgrp_out(rgd, rgd->rd_bits[0].bi_bh->b_data);
 	gfs2_rgrp_ondisk2lvb(rgd->rd_rgl, rgd->rd_bits[0].bi_bh->b_data);
 	update_rgrp_lvb_unlinked(rgd, 1);
+	rgrp_unlockex(rgd);
 }
 
 void gfs2_free_di(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip)
@@ -2480,7 +2550,7 @@ void gfs2_free_di(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip)
 	struct gfs2_sbd *sdp = rgd->rd_sbd;
 	struct gfs2_rgrpd *tmp_rgd;
 
-	tmp_rgd = rgblk_free(sdp, ip->i_no_addr, 1, GFS2_BLKST_FREE);
+	tmp_rgd = rgblk_free_and_lock(sdp, ip->i_no_addr, 1, GFS2_BLKST_FREE);
 	if (!tmp_rgd)
 		return;
 	gfs2_assert_withdraw(sdp, rgd == tmp_rgd);
@@ -2494,6 +2564,7 @@ void gfs2_free_di(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip)
 	gfs2_rgrp_out(rgd, rgd->rd_bits[0].bi_bh->b_data);
 	gfs2_rgrp_ondisk2lvb(rgd->rd_rgl, rgd->rd_bits[0].bi_bh->b_data);
 	update_rgrp_lvb_unlinked(rgd, -1);
+	rgrp_unlockex(rgd);
 
 	gfs2_statfs_change(sdp, 0, +1, -1);
 	trace_gfs2_block_alloc(ip, rgd, ip->i_no_addr, 1, GFS2_BLKST_FREE);
@@ -2522,7 +2593,8 @@ int gfs2_check_blk_type(struct gfs2_sbd *sdp, u64 no_addr, unsigned int type)
 	if (!rgd)
 		goto fail;
 
-	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_SHARED, 0, &rgd_gh);
+	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
+				   LM_FLAG_SHAREDEX, &rgd_gh);
 	if (error)
 		goto fail;
 
@@ -2601,16 +2673,15 @@ void gfs2_rlist_add(struct gfs2_inode *ip, struct gfs2_rgrp_list *rlist,
  *
  */
 
-void gfs2_rlist_alloc(struct gfs2_rgrp_list *rlist, unsigned int state)
+void gfs2_rlist_alloc(struct gfs2_rgrp_list *rlist)
 {
 	unsigned int x;
 
 	rlist->rl_ghs = kmalloc(rlist->rl_rgrps * sizeof(struct gfs2_holder),
 				GFP_NOFS | __GFP_NOFAIL);
 	for (x = 0; x < rlist->rl_rgrps; x++)
-		gfs2_holder_init(rlist->rl_rgd[x]->rd_gl,
-				state, 0,
-				&rlist->rl_ghs[x]);
+		gfs2_holder_init(rlist->rl_rgd[x]->rd_gl, LM_ST_EXCLUSIVE,
+				 LM_FLAG_SHAREDEX, &rlist->rl_ghs[x]);
 }
 
 /**
diff --git a/fs/gfs2/rgrp.h b/fs/gfs2/rgrp.h
index e90478e2f545..ca40cdcc4b7e 100644
--- a/fs/gfs2/rgrp.h
+++ b/fs/gfs2/rgrp.h
@@ -68,7 +68,7 @@ struct gfs2_rgrp_list {
 
 extern void gfs2_rlist_add(struct gfs2_inode *ip, struct gfs2_rgrp_list *rlist,
 			   u64 block);
-extern void gfs2_rlist_alloc(struct gfs2_rgrp_list *rlist, unsigned int state);
+extern void gfs2_rlist_alloc(struct gfs2_rgrp_list *rlist);
 extern void gfs2_rlist_free(struct gfs2_rgrp_list *rlist);
 extern u64 gfs2_ri_total(struct gfs2_sbd *sdp);
 extern void gfs2_rgrp_dump(struct seq_file *seq, const struct gfs2_glock *gl);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index cf5c7f3080d2..fe37b69d6d3c 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1131,8 +1131,9 @@ static int gfs2_statfs_slow(struct gfs2_sbd *sdp, struct gfs2_statfs_change_host
 				done = 0;
 			else if (rgd_next && !error) {
 				error = gfs2_glock_nq_init(rgd_next->rd_gl,
-							   LM_ST_SHARED,
-							   GL_ASYNC,
+							   LM_ST_EXCLUSIVE,
+							   GL_ASYNC|
+							   LM_FLAG_SHAREDEX,
 							   gh);
 				rgd_next = gfs2_rgrpd_get_next(rgd_next);
 				done = 0;
@@ -1507,7 +1508,8 @@ static int gfs2_dinode_dealloc(struct gfs2_inode *ip)
 		goto out_qs;
 	}
 
-	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &gh);
+	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
+				   LM_FLAG_SHAREDEX, &gh);
 	if (error)
 		goto out_qs;
 
diff --git a/fs/gfs2/xattr.c b/fs/gfs2/xattr.c
index f2bce1e0f6fb..5766d44959d7 100644
--- a/fs/gfs2/xattr.c
+++ b/fs/gfs2/xattr.c
@@ -262,7 +262,8 @@ static int ea_dealloc_unstuffed(struct gfs2_inode *ip, struct buffer_head *bh,
 		return -EIO;
 	}
 
-	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &rg_gh);
+	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
+				   LM_FLAG_SHAREDEX, &rg_gh);
 	if (error)
 		return error;
 
@@ -1314,7 +1315,7 @@ static int ea_dealloc_indirect(struct gfs2_inode *ip)
 	else
 		goto out;
 
-	gfs2_rlist_alloc(&rlist, LM_ST_EXCLUSIVE);
+	gfs2_rlist_alloc(&rlist);
 
 	for (x = 0; x < rlist.rl_rgrps; x++) {
 		struct gfs2_rgrpd *rgd = gfs2_glock2rgrp(rlist.rl_ghs[x].gh_gl);
@@ -1397,7 +1398,8 @@ static int ea_dealloc_block(struct gfs2_inode *ip)
 		return -EIO;
 	}
 
-	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &gh);
+	error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
+				   LM_FLAG_SHAREDEX, &gh);
 	if (error)
 		return error;
 
-- 
2.14.3
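
The lock-ordering comment added above try_rgrp_unlink() can be illustrated
with a small userspace sketch. It does no real locking. It only tracks
which "locks" are held and reports when an acquisition would invert the
documented order (sd_log_flush_lock first, then rgrp ex). The rank values
and all names here (order_tracker, order_acquire, order_release) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: a lower rank must be taken before a higher one. */
enum lock_rank {
	RANK_LOG_FLUSH = 1,	/* models sd_log_flush_lock */
	RANK_RGRP_EX = 2	/* models the rgrp ex lock */
};

struct order_tracker {
	unsigned int held;	/* bitmask of the ranks currently held */
};

/* Record an acquisition. Returns false, without recording anything, if a
 * higher-ranked lock is already held, since that would invert the order. */
static bool order_acquire(struct order_tracker *t, enum lock_rank rank)
{
	unsigned int higher = ~((1u << (rank + 1)) - 1u);

	if (t->held & higher)
		return false;
	t->held |= 1u << rank;
	return true;
}

static void order_release(struct order_tracker *t, enum lock_rank rank)
{
	t->held &= ~(1u << rank);
}
```

Walking through the try_rgrp_unlink() sequence with this tracker: the
caller arrives holding rgrp ex, so taking the log flush lock directly would
be flagged as an inversion. Releasing rgrp ex, taking the log flush lock,
and then retaking rgrp ex passes the check, which is the order the patch
uses.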




* [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2)
  2018-05-08 20:04 [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2) Bob Peterson
  2018-05-08 20:04 ` [Cluster-devel] [PATCH 1/2] GFS2: Introduce GLF_EX_SHARING bit: local EX sharing Bob Peterson
  2018-05-08 20:04 ` [Cluster-devel] [PATCH 2/2] GFS2: Take advantage of new EXSH glock mode for rgrps Bob Peterson
@ 2018-05-09  8:06 ` Steven Whitehouse
  2 siblings, 0 replies; 4+ messages in thread
From: Steven Whitehouse @ 2018-05-09  8:06 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,


On 08/05/18 21:04, Bob Peterson wrote:
> Hi,
>
> On 18 April, I posted v1 of this patch set. The idea is to allow multiple
> processes on a node to share a glock that's held exclusively in order to
> improve performance. Sharing rgrps allows for better throughput by
> reducing contention.
>
> Version 1 implemented this by introducing a new glock mode for sharing
> glocks. Steve Whitehouse suggested we didn't need a new mode: we can
> accomplish the same thing just by having a new glock flag, which also
> makes the patch more simple.
>
> This version 2 patch set implements Steve's suggestion.
>
> The first patch introduces the new glock flag. The second patch puts
> it into use for rgrp sharing. Exclusive access to the rgrp is implemented
> through an rwsem.
>
> Performance testing using iozone looks even better than version 1.
Sounds really good! We should make sure that we give this a really good 
round of testing and it would be nice to see some details of the 
performance improvements. Overall though, that's an excellent result :-)

Steve.

> ---
> Bob Peterson (2):
>    GFS2: Introduce GLF_EX_SHARING bit: local EX sharing
>    GFS2: Take advantage of new EXSH glock mode for rgrps
>
>   fs/gfs2/bmap.c   |   2 +-
>   fs/gfs2/dir.c    |   2 +-
>   fs/gfs2/glock.c  |  23 ++++++++++---
>   fs/gfs2/glock.h  |   4 +++
>   fs/gfs2/incore.h |   2 ++
>   fs/gfs2/inode.c  |   7 ++--
>   fs/gfs2/rgrp.c   | 103 ++++++++++++++++++++++++++++++++++++++++++++++---------
>   fs/gfs2/rgrp.h   |   2 +-
>   fs/gfs2/super.c  |   8 +++--
>   fs/gfs2/xattr.c  |   8 +++--
>   10 files changed, 129 insertions(+), 32 deletions(-)
>


