All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups
@ 2020-12-14  8:54 Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hello,

here's a bit of fallout in the log space management code that resulted
from looking into a revokes accounting problem in conjunction with the
resource group glock sharing patches.  I'm still working on some
follow-up changes, but this patch set should be reasonably
self-contained.  Please review.

Thanks,
Andreas

Andreas Gruenbacher (12):
  gfs2: Deobfuscate function jdesc_find_i
  gfs2: Simplify the buf_limit and databuf_limit definitions
  gfs2: Minor gfs2_write_revokes cleanups
  gfs2: Some documentation clarifications
  gfs2: A minor debugging improvement
  gfs2: Clean up ail2_empty
  gfs2: Get rid of on-stack transactions
  gfs2: Get rid of sd_reserving_log
  gfs2: Move lock flush locking to gfs2_trans_{begin,end}
  gfs2: Don't wait for journal flush in clean_journal
  gfs2: Clean up gfs2_log_reserve
  gfs2: Use a tighter bound in gfs2_trans_begin

 fs/gfs2/glops.c      |  29 ++------
 fs/gfs2/incore.h     |   4 -
 fs/gfs2/log.c        | 174 ++++++++++++++++++-------------------------
 fs/gfs2/log.h        |   3 +-
 fs/gfs2/lops.h       |  17 +----
 fs/gfs2/ops_fstype.c |   2 -
 fs/gfs2/super.c      |  25 +++----
 fs/gfs2/trans.c      |  59 ++++++++++-----
 fs/gfs2/trans.h      |   2 +
 9 files changed, 135 insertions(+), 180 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Clean up this function to show that it is trivial.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/super.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index b3d951ab8068..08aef061b29d 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -81,19 +81,12 @@ void gfs2_jindex_free(struct gfs2_sbd *sdp)
 static struct gfs2_jdesc *jdesc_find_i(struct list_head *head, unsigned int jid)
 {
 	struct gfs2_jdesc *jd;
-	int found = 0;
 
 	list_for_each_entry(jd, head, jd_list) {
-		if (jd->jd_jid == jid) {
-			found = 1;
-			break;
-		}
+		if (jd->jd_jid == jid)
+			return jd;
 	}
-
-	if (!found)
-		jd = NULL;
-
-	return jd;
+	return NULL;
 }
 
 struct gfs2_jdesc *gfs2_jdesc_find(struct gfs2_sbd *sdp, unsigned int jid)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The BUF_OFFSET and DATABUF_OFFSET definitions are only used in buf_limit
and databuf_limit, respectively, and the rounding done in those
definitions is immediately wiped out by dividing by the element size.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/lops.h | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index fbdbb08dcec6..3fca5bf239d3 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -10,13 +10,6 @@
 #include <linux/list.h>
 #include "incore.h"
 
-#define BUF_OFFSET \
-	((sizeof(struct gfs2_log_descriptor) + sizeof(__be64) - 1) & \
-	 ~(sizeof(__be64) - 1))
-#define DATABUF_OFFSET \
-	((sizeof(struct gfs2_log_descriptor) + (2 * sizeof(__be64) - 1)) & \
-	 ~(2 * sizeof(__be64) - 1))
-
 extern const struct gfs2_log_operations *gfs2_log_ops[];
 extern void gfs2_log_incr_head(struct gfs2_sbd *sdp);
 extern u64 gfs2_log_bmap(struct gfs2_jdesc *jd, unsigned int lbn);
@@ -29,18 +22,12 @@ extern int gfs2_find_jhead(struct gfs2_jdesc *jd,
 			   struct gfs2_log_header_host *head, bool keep_cache);
 static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
 {
-	unsigned int limit;
-
-	limit = (sdp->sd_sb.sb_bsize - BUF_OFFSET) / sizeof(__be64);
-	return limit;
+	return sdp->sd_ldptrs;
 }
 
 static inline unsigned int databuf_limit(struct gfs2_sbd *sdp)
 {
-	unsigned int limit;
-
-	limit = (sdp->sd_sb.sb_bsize - DATABUF_OFFSET) / (2 * sizeof(__be64));
-	return limit;
+	return sdp->sd_ldptrs / 2;
 }
 
 static inline void lops_before_commit(struct gfs2_sbd *sdp,
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications Andreas Gruenbacher
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Clean up the computations in gfs2_write_revokes (no change in functionality).

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 2e9314091c81..c65fdb1a30a0 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -712,11 +712,12 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
 void gfs2_write_revokes(struct gfs2_sbd *sdp)
 {
 	/* number of revokes we still have room for */
-	int max_revokes = (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_log_descriptor)) / sizeof(u64);
+	unsigned int max_revokes = sdp->sd_ldptrs;
 
 	gfs2_log_lock(sdp);
-	while (sdp->sd_log_num_revoke > max_revokes)
-		max_revokes += (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_meta_header)) / sizeof(u64);
+	if (sdp->sd_log_num_revoke > sdp->sd_ldptrs)
+		max_revokes += roundup(sdp->sd_log_num_revoke - sdp->sd_ldptrs,
+				       sdp->sd_inptrs);
 	max_revokes -= sdp->sd_log_num_revoke;
 	if (!sdp->sd_log_num_revoke) {
 		atomic_dec(&sdp->sd_log_blks_free);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (2 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement Andreas Gruenbacher
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The calc_reserved description claims that buf_limit is 502 (on 4k
filesystems), but it is actually 503.  Fix / clarify the entire
description.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index c65fdb1a30a0..f7c225520c38 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -50,10 +50,12 @@ unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct)
 	unsigned int blks;
 	unsigned int first, second;
 
+	/* The initial struct gfs2_log_descriptor block */
 	blks = 1;
 	first = sdp->sd_ldptrs;
 
 	if (nstruct > first) {
+		/* Subsequent struct gfs2_meta_header blocks */
 		second = sdp->sd_inptrs;
 		blks += DIV_ROUND_UP(nstruct - first, second);
 	}
@@ -507,24 +509,20 @@ static inline unsigned int log_distance(struct gfs2_sbd *sdp, unsigned int newer
 }
 
 /**
- * calc_reserved - Calculate the number of blocks to reserve when
- *                 refunding a transaction's unused buffers.
+ * calc_reserved - Calculate the number of blocks to keep reserved
  * @sdp: The GFS2 superblock
  *
  * This is complex.  We need to reserve room for all our currently used
- * metadata buffers (e.g. normal file I/O rewriting file time stamps) and 
- * all our journaled data buffers for journaled files (e.g. files in the 
+ * metadata blocks (e.g. normal file I/O rewriting file time stamps) and
+ * all our journaled data blocks for journaled files (e.g. files in the
  * meta_fs like rindex, or files for which chattr +j was done.)
- * If we don't reserve enough space, gfs2_log_refund and gfs2_log_flush
- * will count it as free space (sd_log_blks_free) and corruption will follow.
+ * If we don't reserve enough space, corruption will follow.
  *
- * We can have metadata bufs and jdata bufs in the same journal.  So each
- * type gets its own log header, for which we need to reserve a block.
- * In fact, each type has the potential for needing more than one header 
- * in cases where we have more buffers than will fit on a journal page.
+ * We can have metadata blocks and jdata blocks in the same journal.  Each
+ * type gets its own log descriptor, for which we need to reserve a block.
+ * In fact, each type has the potential for needing more than one log descriptor
+ * in cases where we have more blocks than will fit in a log descriptor.
  * Metadata journal entries take up half the space of journaled buffer entries.
- * Thus, metadata entries have buf_limit (502) and journaled buffers have
- * databuf_limit (251) before they cause a wrap around.
  *
  * Also, we need to reserve blocks for revoke journal entries and one for an
  * overall header for the lot.
@@ -1007,7 +1005,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	if (sdp->sd_log_head != sdp->sd_log_flush_head) {
 		log_flush_wait(sdp);
 		log_write_header(sdp, flags);
-	} else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle){
+	} else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle) {
 		atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
 		trace_gfs2_log_blocks(sdp, -1);
 		log_write_header(sdp, flags);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (3 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty Andreas Gruenbacher
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Split the assert in gfs2_trans_end into two parts.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/trans.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 6d4bf7ea7b3b..7705f04621f4 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -109,8 +109,8 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 	nbuf -= tr->tr_num_buf_rm;
 	nbuf -= tr->tr_num_databuf_rm;
 
-	if (gfs2_assert_withdraw(sdp, (nbuf <= tr->tr_blocks) &&
-				       (tr->tr_num_revoke <= tr->tr_revokes)))
+	if (gfs2_assert_withdraw(sdp, nbuf <= tr->tr_blocks) ||
+	    gfs2_assert_withdraw(sdp, tr->tr_num_revoke <= tr->tr_revokes))
 		gfs2_print_trans(sdp, tr);
 
 	gfs2_log_commit(sdp, tr);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (4 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Clean up the logic in ail2_empty (no functional change).

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index f7c225520c38..33f7b1544958 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -367,29 +367,33 @@ static void gfs2_ail_empty_tr(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
 	}
 }
 
+static void __ail2_empty(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
+{
+	gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
+	list_del(&tr->tr_list);
+	gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));
+	gfs2_assert_warn(sdp, list_empty(&tr->tr_ail2_list));
+	gfs2_trans_free(sdp, tr);
+}
+
 static void ail2_empty(struct gfs2_sbd *sdp, unsigned int new_tail)
 {
-	struct gfs2_trans *tr, *safe;
+	struct list_head *ail2_list = &sdp->sd_ail2_list;
 	unsigned int old_tail = sdp->sd_log_tail;
-	int wrap = (new_tail < old_tail);
-	int a, b, rm;
+	struct gfs2_trans *tr, *safe;
 
 	spin_lock(&sdp->sd_ail_lock);
-
-	list_for_each_entry_safe(tr, safe, &sdp->sd_ail2_list, tr_list) {
-		a = (old_tail <= tr->tr_first);
-		b = (tr->tr_first < new_tail);
-		rm = (wrap) ? (a || b) : (a && b);
-		if (!rm)
-			continue;
-
-		gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
-		list_del(&tr->tr_list);
-		gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));
-		gfs2_assert_warn(sdp, list_empty(&tr->tr_ail2_list));
-		gfs2_trans_free(sdp, tr);
+	if (old_tail <= new_tail) {
+		list_for_each_entry_safe(tr, safe, ail2_list, tr_list) {
+			if (old_tail <= tr->tr_first && tr->tr_first < new_tail)
+				__ail2_empty(sdp, tr);
+		}
+	} else {
+		list_for_each_entry_safe(tr, safe, ail2_list, tr_list) {
+			if (old_tail <= tr->tr_first || tr->tr_first < new_tail)
+				__ail2_empty(sdp, tr);
+		}
 	}
-
 	spin_unlock(&sdp->sd_ail_lock);
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (5 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14 14:02   ` Bob Peterson
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On-stack transactions were introduced to work around a transaction glock
deadlock in gfs2_trans_begin in commit d8348de06f70 ("GFS2: Fix deadlock
on journal flush").  Subsequently, transaction glocks were eliminated in
favor of the more efficient freeze glocks in commit 24972557b12c ("GFS2:
remove transaction glock") without also removing the on-stack
transactions.

It has now turned out that committing on-stack transactions
significantly complicates journal free space accounting when no system
transaction (sdp->sd_log_tr) is active at the time.  It doesn't seem
that on-stack transactions provide a significant benefit beyond their
original purpose (as an optimization), so remove them to allow fixing
the journal free space accounting in a reasonable way in a subsequent
patch.

FIXME: Can we better handle a gfs2_trans_begin failure in gfs2_ail_empty_gl?
If we skip the __gfs2_ail_flush, we'll just end up with leftover items on
gl_ail_list.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/glops.c  | 29 +++++++----------------------
 fs/gfs2/incore.h |  1 -
 fs/gfs2/log.c    |  1 -
 fs/gfs2/trans.c  | 25 +++++++++++++------------
 fs/gfs2/trans.h  |  2 ++
 5 files changed, 22 insertions(+), 36 deletions(-)

diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 3faa421568b0..853e590ccc15 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -84,18 +84,11 @@ static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync,
 
 static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
 {
+	unsigned int revokes = atomic_read(&gl->gl_ail_count);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
-	struct gfs2_trans tr;
 	int ret;
 
-	memset(&tr, 0, sizeof(tr));
-	INIT_LIST_HEAD(&tr.tr_buf);
-	INIT_LIST_HEAD(&tr.tr_databuf);
-	INIT_LIST_HEAD(&tr.tr_ail1_list);
-	INIT_LIST_HEAD(&tr.tr_ail2_list);
-	tr.tr_revokes = atomic_read(&gl->gl_ail_count);
-
-	if (!tr.tr_revokes) {
+	if (!revokes) {
 		bool have_revokes;
 		bool log_in_flight;
 
@@ -122,20 +115,12 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
 		return 0;
 	}
 
-	/* A shortened, inline version of gfs2_trans_begin()
-         * tr->alloced is not set since the transaction structure is
-         * on the stack */
-	tr.tr_reserved = 1 + gfs2_struct2blk(sdp, tr.tr_revokes);
-	tr.tr_ip = _RET_IP_;
-	ret = gfs2_log_reserve(sdp, tr.tr_reserved);
-	if (ret < 0)
-		return ret;
-	WARN_ON_ONCE(current->journal_info);
-	current->journal_info = &tr;
-
-	__gfs2_ail_flush(gl, 0, tr.tr_revokes);
-
+	ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
+	if (ret)
+		goto flush;
+	__gfs2_ail_flush(gl, 0, revokes);
 	gfs2_trans_end(sdp);
+
 flush:
 	gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 		       GFS2_LFC_AIL_EMPTY_GL);
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index f8858d995b24..adea010124aa 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -491,7 +491,6 @@ struct gfs2_quota_data {
 enum {
 	TR_TOUCHED = 1,
 	TR_ATTACHED = 2,
-	TR_ALLOCED = 3,
 };
 
 struct gfs2_trans {
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 33f7b1544958..4b090f8afb37 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1113,7 +1113,6 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	if (sdp->sd_log_tr) {
 		gfs2_merge_trans(sdp, tr);
 	} else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
-		gfs2_assert_withdraw(sdp, test_bit(TR_ALLOCED, &tr->tr_flags));
 		sdp->sd_log_tr = tr;
 		set_bit(TR_ATTACHED, &tr->tr_flags);
 	}
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 7705f04621f4..4f461ab37ced 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -37,8 +37,8 @@ static void gfs2_print_trans(struct gfs2_sbd *sdp, const struct gfs2_trans *tr)
 		tr->tr_num_revoke, tr->tr_num_revoke_rm);
 }
 
-int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
-		     unsigned int revokes)
+int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+		       unsigned int revokes, gfp_t gfp_mask)
 {
 	struct gfs2_trans *tr;
 	int error;
@@ -52,7 +52,7 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	if (!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
 		return -EROFS;
 
-	tr = kmem_cache_zalloc(gfs2_trans_cachep, GFP_NOFS);
+	tr = kmem_cache_zalloc(gfs2_trans_cachep, gfp_mask);
 	if (!tr)
 		return -ENOMEM;
 
@@ -60,7 +60,6 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	tr->tr_blocks = blocks;
 	tr->tr_revokes = revokes;
 	tr->tr_reserved = 1;
-	set_bit(TR_ALLOCED, &tr->tr_flags);
 	if (blocks)
 		tr->tr_reserved += 6 + blocks;
 	if (revokes)
@@ -88,20 +87,23 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	return error;
 }
 
+int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+		     unsigned int revokes)
+{
+	return __gfs2_trans_begin(sdp, blocks, revokes, GFP_NOFS);
+}
+
 void gfs2_trans_end(struct gfs2_sbd *sdp)
 {
 	struct gfs2_trans *tr = current->journal_info;
 	s64 nbuf;
-	int alloced = test_bit(TR_ALLOCED, &tr->tr_flags);
 
 	current->journal_info = NULL;
 
 	if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
 		gfs2_log_release(sdp, tr->tr_reserved);
-		if (alloced) {
-			gfs2_trans_free(sdp, tr);
-			sb_end_intwrite(sdp->sd_vfs);
-		}
+		gfs2_trans_free(sdp, tr);
+		sb_end_intwrite(sdp->sd_vfs);
 		return;
 	}
 
@@ -114,15 +116,14 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 		gfs2_print_trans(sdp, tr);
 
 	gfs2_log_commit(sdp, tr);
-	if (alloced && !test_bit(TR_ATTACHED, &tr->tr_flags))
+	if (!test_bit(TR_ATTACHED, &tr->tr_flags))
 		gfs2_trans_free(sdp, tr);
 	up_read(&sdp->sd_log_flush_lock);
 
 	if (sdp->sd_vfs->s_flags & SB_SYNCHRONOUS)
 		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 			       GFS2_LFC_TRANS_END);
-	if (alloced)
-		sb_end_intwrite(sdp->sd_vfs);
+	sb_end_intwrite(sdp->sd_vfs);
 }
 
 static struct gfs2_bufdata *gfs2_alloc_bufdata(struct gfs2_glock *gl,
diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
index 83199ce5a5c5..9c732a5f28bf 100644
--- a/fs/gfs2/trans.h
+++ b/fs/gfs2/trans.h
@@ -34,6 +34,8 @@ static inline unsigned int gfs2_rg_blocks(const struct gfs2_inode *ip, unsigned
 	return rgd->rd_length;
 }
 
+extern int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+			      unsigned int revokes, gfp_t gfp_mask);
 extern int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 			    unsigned int revokes);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (6 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This counter and the associated wait queue are only used so that
gfs2_make_fs_ro can efficiently wait for all pending log space
allocations to fail after setting the filesystem to read-only.  This
comes at the cost of waking up that wait queue very frequently.

Instead, when gfs2_log_reserve fails because the filesystem has become
read-only, Wake up sd_log_waitq.  In gfs2_make_fs_ro, set the file
system read-only and then wait until all the log space has been
released.  Give up and report the problem after a while.  With that,
sd_reserving_log and sd_reserving_log_wait can be removed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/incore.h     |  3 ---
 fs/gfs2/log.c        | 17 ++++++++++-------
 fs/gfs2/log.h        |  1 +
 fs/gfs2/ops_fstype.c |  2 --
 fs/gfs2/super.c      | 12 ++++++------
 fs/gfs2/trans.c      |  5 ++++-
 6 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index adea010124aa..ecc3f596319c 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -849,9 +849,6 @@ struct gfs2_sbd {
 	int sd_log_error; /* First log error */
 	wait_queue_head_t sd_withdraw_wait;
 
-	atomic_t sd_reserving_log;
-	wait_queue_head_t sd_reserving_log_wait;
-
 	unsigned int sd_log_flush_head;
 
 	spinlock_t sd_ail_lock;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 4b090f8afb37..8c46f8f64c9e 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -414,6 +414,15 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
 	up_read(&sdp->sd_log_flush_lock);
 }
 
+/**
+ * gfs2_log_is_empty - Check if the log is empty
+ * @sdp: The GFS2 superblock
+ */
+
+bool gfs2_log_is_empty(struct gfs2_sbd *sdp) {
+	return atomic_read(&sdp->sd_log_blks_free) == sdp->sd_jdesc->jd_blocks;
+}
+
 /**
  * gfs2_log_reserve - Make a log reservation
  * @sdp: The GFS2 superblock
@@ -461,13 +470,9 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 		} while(free_blocks <= wanted);
 		finish_wait(&sdp->sd_log_waitq, &wait);
 	}
-	atomic_inc(&sdp->sd_reserving_log);
 	if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
-				free_blocks - blks) != free_blocks) {
-		if (atomic_dec_and_test(&sdp->sd_reserving_log))
-			wake_up(&sdp->sd_reserving_log_wait);
+				free_blocks - blks) != free_blocks)
 		goto retry;
-	}
 	atomic_sub(blks, &sdp->sd_log_blks_needed);
 	trace_gfs2_log_blocks(sdp, -blks);
 
@@ -483,8 +488,6 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 		gfs2_log_release(sdp, blks);
 		ret = -EROFS;
 	}
-	if (atomic_dec_and_test(&sdp->sd_reserving_log))
-		wake_up(&sdp->sd_reserving_log_wait);
 	return ret;
 }
 
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 79f97290146e..f483e706db3a 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -66,6 +66,7 @@ extern unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct);
 extern void gfs2_remove_from_ail(struct gfs2_bufdata *bd);
 extern void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
 extern int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
+extern bool gfs2_log_is_empty(struct gfs2_sbd *sdp);
 extern void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 				  u64 seq, u32 tail, u32 lblock, u32 flags,
 				  int op_flags);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 61fce59cb4d3..986dc2ebebf0 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -136,8 +136,6 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
 
 	init_rwsem(&sdp->sd_log_flush_lock);
 	atomic_set(&sdp->sd_log_in_flight, 0);
-	atomic_set(&sdp->sd_reserving_log, 0);
-	init_waitqueue_head(&sdp->sd_reserving_log_wait);
 	init_waitqueue_head(&sdp->sd_log_flush_wait);
 	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
 	mutex_init(&sdp->sd_freeze_mutex);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 08aef061b29d..7da55d9a9792 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -647,13 +647,13 @@ int gfs2_make_fs_ro(struct gfs2_sbd *sdp)
 
 		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
 			       GFS2_LFC_MAKE_FS_RO);
-		wait_event(sdp->sd_reserving_log_wait,
-			   atomic_read(&sdp->sd_reserving_log) == 0);
-		gfs2_assert_warn(sdp, atomic_read(&sdp->sd_log_blks_free) ==
-				 sdp->sd_jdesc->jd_blocks);
+		wait_event_timeout(sdp->sd_log_waitq,
+				   gfs2_log_is_empty(sdp),
+				   HZ * 5);
+		gfs2_assert_warn(sdp, gfs2_log_is_empty(sdp));
 	} else {
-		wait_event_timeout(sdp->sd_reserving_log_wait,
-				   atomic_read(&sdp->sd_reserving_log) == 0,
+		wait_event_timeout(sdp->sd_log_waitq,
+				   gfs2_log_is_empty(sdp),
 				   HZ * 5);
 	}
 	if (gfs2_holder_initialized(&freeze_gh))
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 4f461ab37ced..e33cb8da056a 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -73,8 +73,11 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	sb_start_intwrite(sdp->sd_vfs);
 
 	error = gfs2_log_reserve(sdp, tr->tr_reserved);
-	if (error)
+	if (error) {
+		if (error == -EROFS)
+			wake_up(&sdp->sd_log_waitq);
 		goto fail;
+	}
 
 	current->journal_info = tr;
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end}
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (7 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Move the read locking of sd_log_flush_lock from gfs2_log_reserve to
gfs2_trans_begin, and its unlocking from gfs2_log_release to
gfs2_trans_end.  Use gfs2_log_release in two places in which it was open
coded before.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c   | 28 +++-------------------------
 fs/gfs2/log.h   |  2 +-
 fs/gfs2/trans.c | 23 ++++++++++++++++-------
 3 files changed, 20 insertions(+), 33 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 8c46f8f64c9e..b0d4ce4992ed 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -411,7 +411,6 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
 	trace_gfs2_log_blocks(sdp, blks);
 	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 				  sdp->sd_jdesc->jd_blocks);
-	up_read(&sdp->sd_log_flush_lock);
 }
 
 /**
@@ -439,22 +438,16 @@ bool gfs2_log_is_empty(struct gfs2_sbd *sdp) {
  * with queued waiters, we use an exclusive wait. This means that when we
  * get woken with enough journal space to get our reservation, we need to
  * wake the next waiter on the list.
- *
- * Returns: errno
  */
 
-int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
+void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 {
-	int ret = 0;
 	unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
 	unsigned wanted = blks + reserved_blks;
 	DEFINE_WAIT(wait);
 	int did_wait = 0;
 	unsigned int free_blocks;
 
-	if (gfs2_assert_warn(sdp, blks) ||
-	    gfs2_assert_warn(sdp, blks <= sdp->sd_jdesc->jd_blocks))
-		return -EINVAL;
 	atomic_add(blks, &sdp->sd_log_blks_needed);
 retry:
 	free_blocks = atomic_read(&sdp->sd_log_blks_free);
@@ -482,13 +475,6 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 	 */
 	if (unlikely(did_wait))
 		wake_up(&sdp->sd_log_waitq);
-
-	down_read(&sdp->sd_log_flush_lock);
-	if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
-		gfs2_log_release(sdp, blks);
-		ret = -EROFS;
-	}
-	return ret;
 }
 
 /**
@@ -585,12 +571,7 @@ static void log_pull_tail(struct gfs2_sbd *sdp, unsigned int new_tail)
 	unsigned int dist = log_distance(sdp, new_tail, sdp->sd_log_tail);
 
 	ail2_empty(sdp, new_tail);
-
-	atomic_add(dist, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, dist);
-	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
-			     sdp->sd_jdesc->jd_blocks);
-
+	gfs2_log_release(sdp, dist);
 	sdp->sd_log_tail = new_tail;
 }
 
@@ -1125,10 +1106,7 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	maxres = sdp->sd_log_blks_reserved + tr->tr_reserved;
 	gfs2_assert_withdraw(sdp, maxres >= reserved);
 	unused = maxres - reserved;
-	atomic_add(unused, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, unused);
-	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
-			     sdp->sd_jdesc->jd_blocks);
+	gfs2_log_release(sdp, unused);
 	sdp->sd_log_blks_reserved = reserved;
 
 	gfs2_log_unlock(sdp);
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index f483e706db3a..db3bb3564ea0 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -65,7 +65,7 @@ extern void gfs2_ordered_del_inode(struct gfs2_inode *ip);
 extern unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct);
 extern void gfs2_remove_from_ail(struct gfs2_bufdata *bd);
 extern void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
-extern int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
+extern void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
 extern bool gfs2_log_is_empty(struct gfs2_sbd *sdp);
 extern void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 				  u64 seq, u32 tail, u32 lblock, u32 flags,
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index e33cb8da056a..c9d8247ffa19 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -70,12 +70,22 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	INIT_LIST_HEAD(&tr->tr_ail1_list);
 	INIT_LIST_HEAD(&tr->tr_ail2_list);
 
+	if (gfs2_assert_warn(sdp, tr->tr_reserved <= sdp->sd_jdesc->jd_blocks)) {
+		error = -EINVAL;
+		goto fail;
+	}
+
 	sb_start_intwrite(sdp->sd_vfs);
 
-	error = gfs2_log_reserve(sdp, tr->tr_reserved);
-	if (error) {
-		if (error == -EROFS)
-			wake_up(&sdp->sd_log_waitq);
+	gfs2_log_reserve(sdp, tr->tr_reserved);
+
+	down_read(&sdp->sd_log_flush_lock);
+	if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
+		gfs2_log_release(sdp, tr->tr_reserved);
+		up_read(&sdp->sd_log_flush_lock);
+		sb_end_intwrite(sdp->sd_vfs);
+		wake_up(&sdp->sd_log_waitq);
+		error = -EROFS;
 		goto fail;
 	}
 
@@ -84,9 +94,7 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	return 0;
 
 fail:
-	sb_end_intwrite(sdp->sd_vfs);
 	kmem_cache_free(gfs2_trans_cachep, tr);
-
 	return error;
 }
 
@@ -105,8 +113,9 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 
 	if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
 		gfs2_log_release(sdp, tr->tr_reserved);
-		gfs2_trans_free(sdp, tr);
+		up_read(&sdp->sd_log_flush_lock);
 		sb_end_intwrite(sdp->sd_vfs);
+		gfs2_trans_free(sdp, tr);
 		return;
 	}
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (8 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Commit 588bff95c94e added gfs2_write_log_header() and started using it in
clean_journal(), with an additional call to log_flush_wait() at the end of
gfs2_write_log_header() which is unnecessary for clean_journal().  Move
that call out of gfs2_write_log_header() to restore the previous behavior.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index b0d4ce4992ed..a8cc71ba4852 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -756,7 +756,7 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 	u64 dblock;
 
 	if (gfs2_withdrawn(sdp))
-		goto out;
+		return;
 
 	page = mempool_alloc(gfs2_page_pool, GFP_NOIO);
 	lh = page_address(page);
@@ -811,8 +811,6 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 
 	gfs2_log_write(sdp, page, sb->s_blocksize, 0, dblock);
 	gfs2_log_submit_bio(&sdp->sd_log_bio, REQ_OP_WRITE | op_flags);
-out:
-	log_flush_wait(sdp);
 }
 
 /**
@@ -841,6 +839,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags)
 	gfs2_write_log_header(sdp, sdp->sd_jdesc, sdp->sd_log_sequence++, tail,
 			      sdp->sd_log_flush_head, flags, op_flags);
 	gfs2_log_incr_head(sdp);
+	log_flush_wait(sdp);
 
 	if (sdp->sd_log_tail != tail)
 		log_pull_tail(sdp, tail);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (9 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Wake up log waiters in gfs2_log_release when log space has actually become
available.  This is a much better place for the wakeup than gfs2_logd.

Check if enough log space is immeditely available before anything else.  If
there isn't, use io_wait_event to wait instead of open coding it.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 58 ++++++++++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 33 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index a8cc71ba4852..36dbec4f9a54 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -411,6 +411,8 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
 	trace_gfs2_log_blocks(sdp, blks);
 	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 				  sdp->sd_jdesc->jd_blocks);
+	if (atomic_read(&sdp->sd_log_blks_needed))
+		wake_up(&sdp->sd_log_waitq);
 }
 
 /**
@@ -444,37 +446,34 @@ void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 {
 	unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
 	unsigned wanted = blks + reserved_blks;
-	DEFINE_WAIT(wait);
-	int did_wait = 0;
 	unsigned int free_blocks;
 
-	atomic_add(blks, &sdp->sd_log_blks_needed);
-retry:
 	free_blocks = atomic_read(&sdp->sd_log_blks_free);
-	if (unlikely(free_blocks <= wanted)) {
-		do {
-			prepare_to_wait_exclusive(&sdp->sd_log_waitq, &wait,
-					TASK_UNINTERRUPTIBLE);
-			wake_up(&sdp->sd_logd_waitq);
-			did_wait = 1;
-			if (atomic_read(&sdp->sd_log_blks_free) <= wanted)
-				io_schedule();
-			free_blocks = atomic_read(&sdp->sd_log_blks_free);
-		} while(free_blocks <= wanted);
-		finish_wait(&sdp->sd_log_waitq, &wait);
+	while (free_blocks >= wanted) {
+	       if (atomic_try_cmpxchg(&sdp->sd_log_blks_free, &free_blocks,
+				      free_blocks - blks))
+		       return;
 	}
-	if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
-				free_blocks - blks) != free_blocks)
-		goto retry;
-	atomic_sub(blks, &sdp->sd_log_blks_needed);
-	trace_gfs2_log_blocks(sdp, -blks);
 
-	/*
-	 * If we waited, then so might others, wake them up _after_ we get
-	 * our share of the log.
-	 */
-	if (unlikely(did_wait))
-		wake_up(&sdp->sd_log_waitq);
+	atomic_add(blks, &sdp->sd_log_blks_needed);
+	for (;;) {
+		if (current != sdp->sd_logd_process)
+			wake_up(&sdp->sd_logd_waitq);
+		io_wait_event(sdp->sd_log_waitq,
+			(free_blocks = atomic_read(&sdp->sd_log_blks_free),
+			 free_blocks >= wanted));
+		do {
+		       if (atomic_try_cmpxchg(&sdp->sd_log_blks_free,
+					      &free_blocks,
+					      free_blocks - blks)) {
+				trace_gfs2_log_blocks(sdp, -blks);
+				if (atomic_sub_return(blks,
+						&sdp->sd_log_blks_needed))
+					wake_up(&sdp->sd_log_waitq);
+				return;
+		       }
+		} while (free_blocks >= wanted);
+	}
 }
 
 /**
@@ -1190,7 +1189,6 @@ int gfs2_logd(void *data)
 	struct gfs2_sbd *sdp = data;
 	unsigned long t = 1;
 	DEFINE_WAIT(wait);
-	bool did_flush;
 
 	while (!kthread_should_stop()) {
 
@@ -1209,12 +1207,10 @@ int gfs2_logd(void *data)
 			continue;
 		}
 
-		did_flush = false;
 		if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
 			gfs2_ail1_empty(sdp, 0);
 			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 				       GFS2_LFC_LOGD_JFLUSH_REQD);
-			did_flush = true;
 		}
 
 		if (gfs2_ail_flush_reqd(sdp)) {
@@ -1223,12 +1219,8 @@ int gfs2_logd(void *data)
 			gfs2_ail1_empty(sdp, 0);
 			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 				       GFS2_LFC_LOGD_AIL_FLUSH_REQD);
-			did_flush = true;
 		}
 
-		if (!gfs2_ail_flush_reqd(sdp) || did_flush)
-			wake_up(&sdp->sd_log_waitq);
-
 		t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
 
 		try_to_freeze();
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin
  2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
                   ` (10 preceding siblings ...)
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
@ 2020-12-14  8:54 ` Andreas Gruenbacher
  11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14  8:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Use a tighter bound for the number of blocks required by transactions in
gfs2_trans_begin: in the worst case, we'll have mixed data and metadata,
so we'll need a log desciptor for each type.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/trans.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index c9d8247ffa19..6a6b07499d07 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -60,8 +60,14 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	tr->tr_blocks = blocks;
 	tr->tr_revokes = revokes;
 	tr->tr_reserved = 1;
-	if (blocks)
-		tr->tr_reserved += 6 + blocks;
+	if (blocks) {
+		/*
+		 * The reserved blocks are either used for data or metadata.
+		 * We can have mixed data and metadata, each with its own log
+		 * descriptor block; see calc_reserved().
+		 */
+		tr->tr_reserved += blocks + 1 + DIV_ROUND_UP(blocks - 1, databuf_limit(sdp));
+	}
 	if (revokes)
 		tr->tr_reserved += gfs2_struct2blk(sdp, revokes);
 	INIT_LIST_HEAD(&tr->tr_databuf);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
  2020-12-14  8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
@ 2020-12-14 14:02   ` Bob Peterson
  2020-12-14 14:05     ` Steven Whitehouse
  0 siblings, 1 reply; 16+ messages in thread
From: Bob Peterson @ 2020-12-14 14:02 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

----- Original Message -----
> +	ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);

The addition of __GFP_NOFAIL means that this operation can now block.
Looking at the code, I don't think it will be a problem because it can
already block in the log_flush operations that precede it, but it
makes me nervous. Obviously, we need to test this really well.

Bob



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
  2020-12-14 14:02   ` Bob Peterson
@ 2020-12-14 14:05     ` Steven Whitehouse
  2020-12-14 17:08       ` Andreas Gruenbacher
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Whitehouse @ 2020-12-14 14:05 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 14/12/2020 14:02, Bob Peterson wrote:
> Hi,
>
> ----- Original Message -----
>> +	ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
> The addition of __GFP_NOFAIL means that this operation can now block.
> Looking at the code, I don't think it will be a problem because it can
> already block in the log_flush operations that precede it, but it
> makes me nervous. Obviously, we need to test this really well.
>
> Bob
>
Not sure of the context here exactly, but why are we adding an instance 
of __GFP_NOFAIL? There is already a return code there so that we can 
fail in that case if required,

Steve.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
  2020-12-14 14:05     ` Steven Whitehouse
@ 2020-12-14 17:08       ` Andreas Gruenbacher
  0 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 17:08 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, Dec 14, 2020 at 3:06 PM Steven Whitehouse <swhiteho@redhat.com> wrote:
> On 14/12/2020 14:02, Bob Peterson wrote:
> > Hi,
> >
> > ----- Original Message -----
> >> +    ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
> > The addition of __GFP_NOFAIL means that this operation can now block.
> > Looking at the code, I don't think it will be a problem because it can
> > already block in the log_flush operations that precede it, but it
> > makes me nervous. Obviously, we need to test this really well.
> >
> > Bob
> >
> Not sure of the context here exactly, but why are we adding an instance
> of __GFP_NOFAIL? There is already a return code there so that we can
> fail in that case if required,

__GFS_NOFAIL is used in {inode,rgrp}_go_sync -> gfs2_ail_empty_gl ->
__gfs2_trans_begin, and there isn't any reasonable way to react an
-ENOMEM error but to repeat the allocation.

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-12-14 17:08 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-14  8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
2020-12-14 14:02   ` Bob Peterson
2020-12-14 14:05     ` Steven Whitehouse
2020-12-14 17:08       ` Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
2020-12-14  8:54 ` [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.