All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups
@ 2020-12-19 20:48 Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 01/20] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
                   ` (19 more replies)
  0 siblings, 20 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hello,

here's an update on the log log space management cleanup.  The
notable additions are:

* gfs2: Get rid of current_tail()

  Replace current_tail() with a new sd_log_flush_tail variable in
  the super block.

* gfs2: No revokes for transactions at the tail of the log

  We no longer write out revokes for transactions at the tail
  of the log that have completed; the log header with its
  updated lh_tail will implicitly take care of revoking those.

* gfs2: Rework the log space allocation logic

  Clean up the entire log space allocation code to make it
  more manageable.

* gfs2: Hand out revokes to transactions one by one

  Based on the previous patch, clean up the revoke allocation
  code.

Thanks,
Andreas

Andreas Gruenbacher (20):
  gfs2: Deobfuscate function jdesc_find_i
  gfs2: Simplify the buf_limit and databuf_limit definitions
  gfs2: Minor gfs2_write_revokes cleanups
  gfs2: Some documentation updates
  gfs2: A minor debugging improvement
  gfs2: Rename gfs2_{write => add_aux}_revokes
  gfs2: Clean up ail2_empty
  gfs2: Get rid of on-stack transactions
  gfs2: Get rid of sd_reserving_log
  gfs2: Move lock flush locking to gfs2_trans_{begin,end}
  gfs2: Don't wait for journal flush in clean_journal
  gfs2: Clean up gfs2_log_reserve
  gfs2: Use a tighter bound in gfs2_trans_begin
  gfs2: Get rid of current_tail()
  gfs2: Move function gfs2_ail_empty_tr
  gfs2: No revokes for transactions at the tail of the log
  gfs2: Remove sd_log_committed_revoke
  gfs2: Remove sd_log_blks_reserved
  gfs2: Rework the log space allocation logic
  gfs2: Hand out revokes to transactions one by one

 fs/gfs2/glops.c      |  29 +--
 fs/gfs2/incore.h     |  13 +-
 fs/gfs2/log.c        | 558 +++++++++++++++++++++++++------------------
 fs/gfs2/log.h        |  20 +-
 fs/gfs2/lops.c       |  10 +-
 fs/gfs2/lops.h       |  17 +-
 fs/gfs2/ops_fstype.c |   9 +-
 fs/gfs2/super.c      |  25 +-
 fs/gfs2/trans.c      |  91 ++++---
 fs/gfs2/trans.h      |   2 +
 10 files changed, 444 insertions(+), 330 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 01/20] gfs2: Deobfuscate function jdesc_find_i
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 02/20] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Clean up this function to show that it is trivial.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/super.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index b3d951ab8068..08aef061b29d 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -81,19 +81,12 @@ void gfs2_jindex_free(struct gfs2_sbd *sdp)
 static struct gfs2_jdesc *jdesc_find_i(struct list_head *head, unsigned int jid)
 {
 	struct gfs2_jdesc *jd;
-	int found = 0;
 
 	list_for_each_entry(jd, head, jd_list) {
-		if (jd->jd_jid == jid) {
-			found = 1;
-			break;
-		}
+		if (jd->jd_jid == jid)
+			return jd;
 	}
-
-	if (!found)
-		jd = NULL;
-
-	return jd;
+	return NULL;
 }
 
 struct gfs2_jdesc *gfs2_jdesc_find(struct gfs2_sbd *sdp, unsigned int jid)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 02/20] gfs2: Simplify the buf_limit and databuf_limit definitions
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 01/20] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 03/20] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The BUF_OFFSET and DATABUF_OFFSET definitions are only used in buf_limit
and databuf_limit, respectively, and the rounding done in those
definitions is immediately wiped out by dividing by the element size.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/lops.h | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index fbdbb08dcec6..3fca5bf239d3 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -10,13 +10,6 @@
 #include <linux/list.h>
 #include "incore.h"
 
-#define BUF_OFFSET \
-	((sizeof(struct gfs2_log_descriptor) + sizeof(__be64) - 1) & \
-	 ~(sizeof(__be64) - 1))
-#define DATABUF_OFFSET \
-	((sizeof(struct gfs2_log_descriptor) + (2 * sizeof(__be64) - 1)) & \
-	 ~(2 * sizeof(__be64) - 1))
-
 extern const struct gfs2_log_operations *gfs2_log_ops[];
 extern void gfs2_log_incr_head(struct gfs2_sbd *sdp);
 extern u64 gfs2_log_bmap(struct gfs2_jdesc *jd, unsigned int lbn);
@@ -29,18 +22,12 @@ extern int gfs2_find_jhead(struct gfs2_jdesc *jd,
 			   struct gfs2_log_header_host *head, bool keep_cache);
 static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
 {
-	unsigned int limit;
-
-	limit = (sdp->sd_sb.sb_bsize - BUF_OFFSET) / sizeof(__be64);
-	return limit;
+	return sdp->sd_ldptrs;
 }
 
 static inline unsigned int databuf_limit(struct gfs2_sbd *sdp)
 {
-	unsigned int limit;
-
-	limit = (sdp->sd_sb.sb_bsize - DATABUF_OFFSET) / (2 * sizeof(__be64));
-	return limit;
+	return sdp->sd_ldptrs / 2;
 }
 
 static inline void lops_before_commit(struct gfs2_sbd *sdp,
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 03/20] gfs2: Minor gfs2_write_revokes cleanups
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 01/20] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 02/20] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 04/20] gfs2: Some documentation updates Andreas Gruenbacher
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Clean up the computations in gfs2_write_revokes (no change in functionality).

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 41d13f19d1b4..6778473b2a7f 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -712,11 +712,13 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
 void gfs2_write_revokes(struct gfs2_sbd *sdp)
 {
 	/* number of revokes we still have room for */
-	int max_revokes = (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_log_descriptor)) / sizeof(u64);
+	unsigned int max_revokes;
 
 	gfs2_log_lock(sdp);
-	while (sdp->sd_log_num_revoke > max_revokes)
-		max_revokes += (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_meta_header)) / sizeof(u64);
+	max_revokes = sdp->sd_ldptrs;
+	if (sdp->sd_log_num_revoke > sdp->sd_ldptrs)
+		max_revokes += roundup(sdp->sd_log_num_revoke - sdp->sd_ldptrs,
+				       sdp->sd_inptrs);
 	max_revokes -= sdp->sd_log_num_revoke;
 	if (!sdp->sd_log_num_revoke) {
 		atomic_dec(&sdp->sd_log_blks_free);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 04/20] gfs2: Some documentation updates
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (2 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 03/20] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 05/20] gfs2: A minor debugging improvement Andreas Gruenbacher
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The calc_reserved description claims that buf_limit is 502 (on 4k
filesystems), but it is actually 503.  Fix / clarify the entire
description.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 6778473b2a7f..5bc3cc8ef7f4 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -50,10 +50,12 @@ unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct)
 	unsigned int blks;
 	unsigned int first, second;
 
+	/* The initial struct gfs2_log_descriptor block */
 	blks = 1;
 	first = sdp->sd_ldptrs;
 
 	if (nstruct > first) {
+		/* Subsequent struct gfs2_meta_header blocks */
 		second = sdp->sd_inptrs;
 		blks += DIV_ROUND_UP(nstruct - first, second);
 	}
@@ -507,24 +509,20 @@ static inline unsigned int log_distance(struct gfs2_sbd *sdp, unsigned int newer
 }
 
 /**
- * calc_reserved - Calculate the number of blocks to reserve when
- *                 refunding a transaction's unused buffers.
+ * calc_reserved - Calculate the number of blocks to keep reserved
  * @sdp: The GFS2 superblock
  *
  * This is complex.  We need to reserve room for all our currently used
- * metadata buffers (e.g. normal file I/O rewriting file time stamps) and 
- * all our journaled data buffers for journaled files (e.g. files in the 
+ * metadata blocks (e.g. normal file I/O rewriting file time stamps) and
+ * all our journaled data blocks for journaled files (e.g. files in the
  * meta_fs like rindex, or files for which chattr +j was done.)
- * If we don't reserve enough space, gfs2_log_refund and gfs2_log_flush
- * will count it as free space (sd_log_blks_free) and corruption will follow.
+ * If we don't reserve enough space, corruption will follow.
  *
- * We can have metadata bufs and jdata bufs in the same journal.  So each
- * type gets its own log header, for which we need to reserve a block.
- * In fact, each type has the potential for needing more than one header 
- * in cases where we have more buffers than will fit on a journal page.
+ * We can have metadata blocks and jdata blocks in the same journal.  Each
+ * type gets its own log descriptor, for which we need to reserve a block.
+ * In fact, each type has the potential for needing more than one log descriptor
+ * in cases where we have more blocks than will fit in a log descriptor.
  * Metadata journal entries take up half the space of journaled buffer entries.
- * Thus, metadata entries have buf_limit (502) and journaled buffers have
- * databuf_limit (251) before they cause a wrap around.
  *
  * Also, we need to reserve blocks for revoke journal entries and one for an
  * overall header for the lot.
@@ -1007,7 +1005,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 
 	if (sdp->sd_log_head != sdp->sd_log_flush_head) {
 		log_write_header(sdp, flags);
-	} else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle){
+	} else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle) {
 		atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
 		trace_gfs2_log_blocks(sdp, -1);
 		log_write_header(sdp, flags);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 05/20] gfs2: A minor debugging improvement
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (3 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 04/20] gfs2: Some documentation updates Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 06/20] gfs2: Rename gfs2_{write => add_aux}_revokes Andreas Gruenbacher
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Split the assert in gfs2_trans_end into two parts.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/trans.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 6d4bf7ea7b3b..7705f04621f4 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -109,8 +109,8 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 	nbuf -= tr->tr_num_buf_rm;
 	nbuf -= tr->tr_num_databuf_rm;
 
-	if (gfs2_assert_withdraw(sdp, (nbuf <= tr->tr_blocks) &&
-				       (tr->tr_num_revoke <= tr->tr_revokes)))
+	if (gfs2_assert_withdraw(sdp, nbuf <= tr->tr_blocks) ||
+	    gfs2_assert_withdraw(sdp, tr->tr_num_revoke <= tr->tr_revokes))
 		gfs2_print_trans(sdp, tr);
 
 	gfs2_log_commit(sdp, tr);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 06/20] gfs2: Rename gfs2_{write => add_aux}_revokes
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (4 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 05/20] gfs2: A minor debugging improvement Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 07/20] gfs2: Clean up ail2_empty Andreas Gruenbacher
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Function gfs2_write_revokes doesn't actually write any revokes; instead, it
adds auxiliary revokes to the system transaction.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c  | 4 ++--
 fs/gfs2/log.h  | 2 +-
 fs/gfs2/lops.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 5bc3cc8ef7f4..8efc0c1c5c87 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -696,7 +696,7 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
 }
 
 /**
- * gfs2_write_revokes - Add as many revokes to the system transaction as we can
+ * gfs2_add_aux_revokes - Add as many revokes to the system transaction as we can
  * @sdp: The GFS2 superblock
  *
  * Our usual strategy is to defer writing revokes as much as we can in the hope
@@ -707,7 +707,7 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
  * been written back.  This will basically come at no cost now, and will save
  * us from having to keep track of those blocks on the AIL2 list later.
  */
-void gfs2_write_revokes(struct gfs2_sbd *sdp)
+void gfs2_add_aux_revokes(struct gfs2_sbd *sdp)
 {
 	/* number of revokes we still have room for */
 	unsigned int max_revokes;
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 79f97290146e..0a0058f449fc 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -78,6 +78,6 @@ extern void log_flush_wait(struct gfs2_sbd *sdp);
 extern int gfs2_logd(void *data);
 extern void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd);
 extern void gfs2_glock_remove_revoke(struct gfs2_glock *gl);
-extern void gfs2_write_revokes(struct gfs2_sbd *sdp);
+extern void gfs2_add_aux_revokes(struct gfs2_sbd *sdp);
 
 #endif /* __LOG_DOT_H__ */
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 3922b26264f5..3b4e51089bbd 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -845,7 +845,7 @@ static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	struct page *page;
 	unsigned int length;
 
-	gfs2_write_revokes(sdp);
+	gfs2_add_aux_revokes(sdp);
 	if (!sdp->sd_log_num_revoke)
 		return;
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 07/20] gfs2: Clean up ail2_empty
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (5 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 06/20] gfs2: Rename gfs2_{write => add_aux}_revokes Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 08/20] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Clean up the logic in ail2_empty (no functional change).

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 8efc0c1c5c87..d7b0dc94fc8c 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -367,29 +367,33 @@ static void gfs2_ail_empty_tr(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
 	}
 }
 
+static void __ail2_empty(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
+{
+	gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
+	list_del(&tr->tr_list);
+	gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));
+	gfs2_assert_warn(sdp, list_empty(&tr->tr_ail2_list));
+	gfs2_trans_free(sdp, tr);
+}
+
 static void ail2_empty(struct gfs2_sbd *sdp, unsigned int new_tail)
 {
-	struct gfs2_trans *tr, *safe;
+	struct list_head *ail2_list = &sdp->sd_ail2_list;
 	unsigned int old_tail = sdp->sd_log_tail;
-	int wrap = (new_tail < old_tail);
-	int a, b, rm;
+	struct gfs2_trans *tr, *safe;
 
 	spin_lock(&sdp->sd_ail_lock);
-
-	list_for_each_entry_safe(tr, safe, &sdp->sd_ail2_list, tr_list) {
-		a = (old_tail <= tr->tr_first);
-		b = (tr->tr_first < new_tail);
-		rm = (wrap) ? (a || b) : (a && b);
-		if (!rm)
-			continue;
-
-		gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
-		list_del(&tr->tr_list);
-		gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));
-		gfs2_assert_warn(sdp, list_empty(&tr->tr_ail2_list));
-		gfs2_trans_free(sdp, tr);
+	if (old_tail <= new_tail) {
+		list_for_each_entry_safe(tr, safe, ail2_list, tr_list) {
+			if (old_tail <= tr->tr_first && tr->tr_first < new_tail)
+				__ail2_empty(sdp, tr);
+		}
+	} else {
+		list_for_each_entry_safe(tr, safe, ail2_list, tr_list) {
+			if (old_tail <= tr->tr_first || tr->tr_first < new_tail)
+				__ail2_empty(sdp, tr);
+		}
 	}
-
 	spin_unlock(&sdp->sd_ail_lock);
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 08/20] gfs2: Get rid of on-stack transactions
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (6 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 07/20] gfs2: Clean up ail2_empty Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 09/20] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On-stack transactions were introduced to work around a transaction glock
deadlock in gfs2_trans_begin in commit d8348de06f70 ("GFS2: Fix deadlock
on journal flush").  Subsequently, transaction glocks were eliminated in
favor of the more efficient freeze glocks in commit 24972557b12c ("GFS2:
remove transaction glock") without also removing the on-stack
transactions.

It has now turned out that committing on-stack transactions
significantly complicates journal free space accounting when no system
transaction (sdp->sd_log_tr) is active at the time.  It doesn't seem
that on-stack transactions provide a significant benefit beyond their
original purpose (as an optimization), so remove them to allow fixing
the journal free space accounting in a reasonable way in a subsequent
patch.

FIXME: Can we better handle a gfs2_trans_begin failure in gfs2_ail_empty_gl?
If we skip the __gfs2_ail_flush, we'll just end up with leftover items on
gl_ail_list.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/glops.c  | 29 +++++++----------------------
 fs/gfs2/incore.h |  1 -
 fs/gfs2/log.c    |  1 -
 fs/gfs2/trans.c  | 25 +++++++++++++------------
 fs/gfs2/trans.h  |  2 ++
 5 files changed, 22 insertions(+), 36 deletions(-)

diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 3faa421568b0..853e590ccc15 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -84,18 +84,11 @@ static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync,
 
 static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
 {
+	unsigned int revokes = atomic_read(&gl->gl_ail_count);
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
-	struct gfs2_trans tr;
 	int ret;
 
-	memset(&tr, 0, sizeof(tr));
-	INIT_LIST_HEAD(&tr.tr_buf);
-	INIT_LIST_HEAD(&tr.tr_databuf);
-	INIT_LIST_HEAD(&tr.tr_ail1_list);
-	INIT_LIST_HEAD(&tr.tr_ail2_list);
-	tr.tr_revokes = atomic_read(&gl->gl_ail_count);
-
-	if (!tr.tr_revokes) {
+	if (!revokes) {
 		bool have_revokes;
 		bool log_in_flight;
 
@@ -122,20 +115,12 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
 		return 0;
 	}
 
-	/* A shortened, inline version of gfs2_trans_begin()
-         * tr->alloced is not set since the transaction structure is
-         * on the stack */
-	tr.tr_reserved = 1 + gfs2_struct2blk(sdp, tr.tr_revokes);
-	tr.tr_ip = _RET_IP_;
-	ret = gfs2_log_reserve(sdp, tr.tr_reserved);
-	if (ret < 0)
-		return ret;
-	WARN_ON_ONCE(current->journal_info);
-	current->journal_info = &tr;
-
-	__gfs2_ail_flush(gl, 0, tr.tr_revokes);
-
+	ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
+	if (ret)
+		goto flush;
+	__gfs2_ail_flush(gl, 0, revokes);
 	gfs2_trans_end(sdp);
+
 flush:
 	gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 		       GFS2_LFC_AIL_EMPTY_GL);
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index f8858d995b24..adea010124aa 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -491,7 +491,6 @@ struct gfs2_quota_data {
 enum {
 	TR_TOUCHED = 1,
 	TR_ATTACHED = 2,
-	TR_ALLOCED = 3,
 };
 
 struct gfs2_trans {
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index d7b0dc94fc8c..1c3c4b8ffec8 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1113,7 +1113,6 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	if (sdp->sd_log_tr) {
 		gfs2_merge_trans(sdp, tr);
 	} else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
-		gfs2_assert_withdraw(sdp, test_bit(TR_ALLOCED, &tr->tr_flags));
 		sdp->sd_log_tr = tr;
 		set_bit(TR_ATTACHED, &tr->tr_flags);
 	}
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 7705f04621f4..4f461ab37ced 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -37,8 +37,8 @@ static void gfs2_print_trans(struct gfs2_sbd *sdp, const struct gfs2_trans *tr)
 		tr->tr_num_revoke, tr->tr_num_revoke_rm);
 }
 
-int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
-		     unsigned int revokes)
+int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+		       unsigned int revokes, gfp_t gfp_mask)
 {
 	struct gfs2_trans *tr;
 	int error;
@@ -52,7 +52,7 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	if (!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
 		return -EROFS;
 
-	tr = kmem_cache_zalloc(gfs2_trans_cachep, GFP_NOFS);
+	tr = kmem_cache_zalloc(gfs2_trans_cachep, gfp_mask);
 	if (!tr)
 		return -ENOMEM;
 
@@ -60,7 +60,6 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	tr->tr_blocks = blocks;
 	tr->tr_revokes = revokes;
 	tr->tr_reserved = 1;
-	set_bit(TR_ALLOCED, &tr->tr_flags);
 	if (blocks)
 		tr->tr_reserved += 6 + blocks;
 	if (revokes)
@@ -88,20 +87,23 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	return error;
 }
 
+int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+		     unsigned int revokes)
+{
+	return __gfs2_trans_begin(sdp, blocks, revokes, GFP_NOFS);
+}
+
 void gfs2_trans_end(struct gfs2_sbd *sdp)
 {
 	struct gfs2_trans *tr = current->journal_info;
 	s64 nbuf;
-	int alloced = test_bit(TR_ALLOCED, &tr->tr_flags);
 
 	current->journal_info = NULL;
 
 	if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
 		gfs2_log_release(sdp, tr->tr_reserved);
-		if (alloced) {
-			gfs2_trans_free(sdp, tr);
-			sb_end_intwrite(sdp->sd_vfs);
-		}
+		gfs2_trans_free(sdp, tr);
+		sb_end_intwrite(sdp->sd_vfs);
 		return;
 	}
 
@@ -114,15 +116,14 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 		gfs2_print_trans(sdp, tr);
 
 	gfs2_log_commit(sdp, tr);
-	if (alloced && !test_bit(TR_ATTACHED, &tr->tr_flags))
+	if (!test_bit(TR_ATTACHED, &tr->tr_flags))
 		gfs2_trans_free(sdp, tr);
 	up_read(&sdp->sd_log_flush_lock);
 
 	if (sdp->sd_vfs->s_flags & SB_SYNCHRONOUS)
 		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 			       GFS2_LFC_TRANS_END);
-	if (alloced)
-		sb_end_intwrite(sdp->sd_vfs);
+	sb_end_intwrite(sdp->sd_vfs);
 }
 
 static struct gfs2_bufdata *gfs2_alloc_bufdata(struct gfs2_glock *gl,
diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
index 83199ce5a5c5..9c732a5f28bf 100644
--- a/fs/gfs2/trans.h
+++ b/fs/gfs2/trans.h
@@ -34,6 +34,8 @@ static inline unsigned int gfs2_rg_blocks(const struct gfs2_inode *ip, unsigned
 	return rgd->rd_length;
 }
 
+extern int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+			      unsigned int revokes, gfp_t gfp_mask);
 extern int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 			    unsigned int revokes);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 09/20] gfs2: Get rid of sd_reserving_log
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (7 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 08/20] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 10/20] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This counter and the associated wait queue are only used so that
gfs2_make_fs_ro can efficiently wait for all pending log space
allocations to fail after setting the filesystem to read-only.  This
comes at the cost of waking up that wait queue very frequently.

Instead, when gfs2_log_reserve fails because the filesystem has become
read-only, Wake up sd_log_waitq.  In gfs2_make_fs_ro, set the file
system read-only and then wait until all the log space has been
released.  Give up and report the problem after a while.  With that,
sd_reserving_log and sd_reserving_log_wait can be removed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/incore.h     |  3 ---
 fs/gfs2/log.c        | 17 ++++++++++-------
 fs/gfs2/log.h        |  1 +
 fs/gfs2/ops_fstype.c |  2 --
 fs/gfs2/super.c      | 12 ++++++------
 fs/gfs2/trans.c      |  5 ++++-
 6 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index adea010124aa..ecc3f596319c 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -849,9 +849,6 @@ struct gfs2_sbd {
 	int sd_log_error; /* First log error */
 	wait_queue_head_t sd_withdraw_wait;
 
-	atomic_t sd_reserving_log;
-	wait_queue_head_t sd_reserving_log_wait;
-
 	unsigned int sd_log_flush_head;
 
 	spinlock_t sd_ail_lock;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 1c3c4b8ffec8..8fd93077a029 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -397,6 +397,15 @@ static void ail2_empty(struct gfs2_sbd *sdp, unsigned int new_tail)
 	spin_unlock(&sdp->sd_ail_lock);
 }
 
+/**
+ * gfs2_log_is_empty - Check if the log is empty
+ * @sdp: The GFS2 superblock
+ */
+
+bool gfs2_log_is_empty(struct gfs2_sbd *sdp) {
+	return atomic_read(&sdp->sd_log_blks_free) == sdp->sd_jdesc->jd_blocks;
+}
+
 /**
  * gfs2_log_release - Release a given number of log blocks
  * @sdp: The GFS2 superblock
@@ -461,13 +470,9 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 		} while(free_blocks <= wanted);
 		finish_wait(&sdp->sd_log_waitq, &wait);
 	}
-	atomic_inc(&sdp->sd_reserving_log);
 	if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
-				free_blocks - blks) != free_blocks) {
-		if (atomic_dec_and_test(&sdp->sd_reserving_log))
-			wake_up(&sdp->sd_reserving_log_wait);
+				free_blocks - blks) != free_blocks)
 		goto retry;
-	}
 	atomic_sub(blks, &sdp->sd_log_blks_needed);
 	trace_gfs2_log_blocks(sdp, -blks);
 
@@ -483,8 +488,6 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 		gfs2_log_release(sdp, blks);
 		ret = -EROFS;
 	}
-	if (atomic_dec_and_test(&sdp->sd_reserving_log))
-		wake_up(&sdp->sd_reserving_log_wait);
 	return ret;
 }
 
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 0a0058f449fc..ed802b26fadc 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -64,6 +64,7 @@ static inline void gfs2_ordered_add_inode(struct gfs2_inode *ip)
 extern void gfs2_ordered_del_inode(struct gfs2_inode *ip);
 extern unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct);
 extern void gfs2_remove_from_ail(struct gfs2_bufdata *bd);
+extern bool gfs2_log_is_empty(struct gfs2_sbd *sdp);
 extern void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
 extern int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
 extern void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 61fce59cb4d3..986dc2ebebf0 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -136,8 +136,6 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
 
 	init_rwsem(&sdp->sd_log_flush_lock);
 	atomic_set(&sdp->sd_log_in_flight, 0);
-	atomic_set(&sdp->sd_reserving_log, 0);
-	init_waitqueue_head(&sdp->sd_reserving_log_wait);
 	init_waitqueue_head(&sdp->sd_log_flush_wait);
 	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
 	mutex_init(&sdp->sd_freeze_mutex);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 08aef061b29d..7da55d9a9792 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -647,13 +647,13 @@ int gfs2_make_fs_ro(struct gfs2_sbd *sdp)
 
 		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
 			       GFS2_LFC_MAKE_FS_RO);
-		wait_event(sdp->sd_reserving_log_wait,
-			   atomic_read(&sdp->sd_reserving_log) == 0);
-		gfs2_assert_warn(sdp, atomic_read(&sdp->sd_log_blks_free) ==
-				 sdp->sd_jdesc->jd_blocks);
+		wait_event_timeout(sdp->sd_log_waitq,
+				   gfs2_log_is_empty(sdp),
+				   HZ * 5);
+		gfs2_assert_warn(sdp, gfs2_log_is_empty(sdp));
 	} else {
-		wait_event_timeout(sdp->sd_reserving_log_wait,
-				   atomic_read(&sdp->sd_reserving_log) == 0,
+		wait_event_timeout(sdp->sd_log_waitq,
+				   gfs2_log_is_empty(sdp),
 				   HZ * 5);
 	}
 	if (gfs2_holder_initialized(&freeze_gh))
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 4f461ab37ced..e33cb8da056a 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -73,8 +73,11 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	sb_start_intwrite(sdp->sd_vfs);
 
 	error = gfs2_log_reserve(sdp, tr->tr_reserved);
-	if (error)
+	if (error) {
+		if (error == -EROFS)
+			wake_up(&sdp->sd_log_waitq);
 		goto fail;
+	}
 
 	current->journal_info = tr;
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 10/20] gfs2: Move lock flush locking to gfs2_trans_{begin, end}
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (8 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 09/20] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 11/20] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Move the read locking of sd_log_flush_lock from gfs2_log_reserve to
gfs2_trans_begin, and its unlocking from gfs2_log_release to
gfs2_trans_end.  Use gfs2_log_release in two places in which it was open
coded before.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c   | 28 +++-------------------------
 fs/gfs2/log.h   |  2 +-
 fs/gfs2/trans.c | 23 ++++++++++++++++-------
 3 files changed, 20 insertions(+), 33 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 8fd93077a029..88f9d630fb96 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -420,7 +420,6 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
 	trace_gfs2_log_blocks(sdp, blks);
 	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 				  sdp->sd_jdesc->jd_blocks);
-	up_read(&sdp->sd_log_flush_lock);
 }
 
 /**
@@ -439,22 +438,16 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
  * with queued waiters, we use an exclusive wait. This means that when we
  * get woken with enough journal space to get our reservation, we need to
  * wake the next waiter on the list.
- *
- * Returns: errno
  */
 
-int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
+void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 {
-	int ret = 0;
 	unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
 	unsigned wanted = blks + reserved_blks;
 	DEFINE_WAIT(wait);
 	int did_wait = 0;
 	unsigned int free_blocks;
 
-	if (gfs2_assert_warn(sdp, blks) ||
-	    gfs2_assert_warn(sdp, blks <= sdp->sd_jdesc->jd_blocks))
-		return -EINVAL;
 	atomic_add(blks, &sdp->sd_log_blks_needed);
 retry:
 	free_blocks = atomic_read(&sdp->sd_log_blks_free);
@@ -482,13 +475,6 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 	 */
 	if (unlikely(did_wait))
 		wake_up(&sdp->sd_log_waitq);
-
-	down_read(&sdp->sd_log_flush_lock);
-	if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
-		gfs2_log_release(sdp, blks);
-		ret = -EROFS;
-	}
-	return ret;
 }
 
 /**
@@ -585,12 +571,7 @@ static void log_pull_tail(struct gfs2_sbd *sdp, unsigned int new_tail)
 	unsigned int dist = log_distance(sdp, new_tail, sdp->sd_log_tail);
 
 	ail2_empty(sdp, new_tail);
-
-	atomic_add(dist, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, dist);
-	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
-			     sdp->sd_jdesc->jd_blocks);
-
+	gfs2_log_release(sdp, dist);
 	sdp->sd_log_tail = new_tail;
 }
 
@@ -1125,10 +1106,7 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	maxres = sdp->sd_log_blks_reserved + tr->tr_reserved;
 	gfs2_assert_withdraw(sdp, maxres >= reserved);
 	unused = maxres - reserved;
-	atomic_add(unused, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, unused);
-	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
-			     sdp->sd_jdesc->jd_blocks);
+	gfs2_log_release(sdp, unused);
 	sdp->sd_log_blks_reserved = reserved;
 
 	gfs2_log_unlock(sdp);
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index ed802b26fadc..fd44b107602b 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -66,7 +66,7 @@ extern unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct);
 extern void gfs2_remove_from_ail(struct gfs2_bufdata *bd);
 extern bool gfs2_log_is_empty(struct gfs2_sbd *sdp);
 extern void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
-extern int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
+extern void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
 extern void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 				  u64 seq, u32 tail, u32 lblock, u32 flags,
 				  int op_flags);
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index e33cb8da056a..c9d8247ffa19 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -70,12 +70,22 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	INIT_LIST_HEAD(&tr->tr_ail1_list);
 	INIT_LIST_HEAD(&tr->tr_ail2_list);
 
+	if (gfs2_assert_warn(sdp, tr->tr_reserved <= sdp->sd_jdesc->jd_blocks)) {
+		error = -EINVAL;
+		goto fail;
+	}
+
 	sb_start_intwrite(sdp->sd_vfs);
 
-	error = gfs2_log_reserve(sdp, tr->tr_reserved);
-	if (error) {
-		if (error == -EROFS)
-			wake_up(&sdp->sd_log_waitq);
+	gfs2_log_reserve(sdp, tr->tr_reserved);
+
+	down_read(&sdp->sd_log_flush_lock);
+	if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
+		gfs2_log_release(sdp, tr->tr_reserved);
+		up_read(&sdp->sd_log_flush_lock);
+		sb_end_intwrite(sdp->sd_vfs);
+		wake_up(&sdp->sd_log_waitq);
+		error = -EROFS;
 		goto fail;
 	}
 
@@ -84,9 +94,7 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	return 0;
 
 fail:
-	sb_end_intwrite(sdp->sd_vfs);
 	kmem_cache_free(gfs2_trans_cachep, tr);
-
 	return error;
 }
 
@@ -105,8 +113,9 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 
 	if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
 		gfs2_log_release(sdp, tr->tr_reserved);
-		gfs2_trans_free(sdp, tr);
+		up_read(&sdp->sd_log_flush_lock);
 		sb_end_intwrite(sdp->sd_vfs);
+		gfs2_trans_free(sdp, tr);
 		return;
 	}
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 11/20] gfs2: Don't wait for journal flush in clean_journal
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (9 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 10/20] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 12/20] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Commit 588bff95c94e added gfs2_write_log_header() and started using it in
clean_journal(), with an additional call to log_flush_wait() at the end of
gfs2_write_log_header() which is unnecessary for clean_journal().  Move
that call out of gfs2_write_log_header() to restore the previous behavior.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 88f9d630fb96..95ad444bd3dc 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -757,7 +757,7 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 	u64 dblock;
 
 	if (gfs2_withdrawn(sdp))
-		goto out;
+		return;
 
 	page = mempool_alloc(gfs2_page_pool, GFP_NOIO);
 	lh = page_address(page);
@@ -812,8 +812,6 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 
 	gfs2_log_write(sdp, page, sb->s_blocksize, 0, dblock);
 	gfs2_log_submit_bio(&sdp->sd_log_bio, REQ_OP_WRITE | op_flags);
-out:
-	log_flush_wait(sdp);
 }
 
 /**
@@ -842,6 +840,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags)
 	gfs2_write_log_header(sdp, sdp->sd_jdesc, sdp->sd_log_sequence++, tail,
 			      sdp->sd_log_flush_head, flags, op_flags);
 	gfs2_log_incr_head(sdp);
+	log_flush_wait(sdp);
 
 	if (sdp->sd_log_tail != tail)
 		log_pull_tail(sdp, tail);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 12/20] gfs2: Clean up gfs2_log_reserve
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (10 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 11/20] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-22 14:19   ` Bob Peterson
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 13/20] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Wake up log waiters in gfs2_log_release when log space has actually become
available.  This is a much better place for the wakeup than gfs2_logd.

Check if enough log space is immeditely available before anything else.  If
there isn't, use io_wait_event to wait instead of open-coding it.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 54 ++++++++++++++++++++++-----------------------------
 1 file changed, 23 insertions(+), 31 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 95ad444bd3dc..7a65823ad1f3 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -420,6 +420,8 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
 	trace_gfs2_log_blocks(sdp, blks);
 	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 				  sdp->sd_jdesc->jd_blocks);
+	if (atomic_read(&sdp->sd_log_blks_needed))
+		wake_up(&sdp->sd_log_waitq);
 }
 
 /**
@@ -444,36 +446,33 @@ void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 {
 	unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
 	unsigned wanted = blks + reserved_blks;
-	DEFINE_WAIT(wait);
-	int did_wait = 0;
 	unsigned int free_blocks;
 
-	atomic_add(blks, &sdp->sd_log_blks_needed);
-retry:
 	free_blocks = atomic_read(&sdp->sd_log_blks_free);
-	if (unlikely(free_blocks <= wanted)) {
-		do {
-			prepare_to_wait_exclusive(&sdp->sd_log_waitq, &wait,
-					TASK_UNINTERRUPTIBLE);
+	while (free_blocks >= wanted) {
+		if (atomic_try_cmpxchg(&sdp->sd_log_blks_free, &free_blocks,
+				       free_blocks - blks))
+			return;
+	}
+
+	atomic_add(blks, &sdp->sd_log_blks_needed);
+	for (;;) {
+		if (current != sdp->sd_logd_process)
 			wake_up(&sdp->sd_logd_waitq);
-			did_wait = 1;
-			if (atomic_read(&sdp->sd_log_blks_free) <= wanted)
-				io_schedule();
-			free_blocks = atomic_read(&sdp->sd_log_blks_free);
-		} while(free_blocks <= wanted);
-		finish_wait(&sdp->sd_log_waitq, &wait);
+		io_wait_event(sdp->sd_log_waitq,
+			(free_blocks = atomic_read(&sdp->sd_log_blks_free),
+			 free_blocks >= wanted));
+		do {
+			if (atomic_try_cmpxchg(&sdp->sd_log_blks_free,
+					       &free_blocks,
+					       free_blocks - blks))
+				goto reserved;
+		} while (free_blocks >= wanted);
 	}
-	if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
-				free_blocks - blks) != free_blocks)
-		goto retry;
-	atomic_sub(blks, &sdp->sd_log_blks_needed);
-	trace_gfs2_log_blocks(sdp, -blks);
 
-	/*
-	 * If we waited, then so might others, wake them up _after_ we get
-	 * our share of the log.
-	 */
-	if (unlikely(did_wait))
+reserved:
+	trace_gfs2_log_blocks(sdp, -blks);
+	if (atomic_sub_return(blks, &sdp->sd_log_blks_needed))
 		wake_up(&sdp->sd_log_waitq);
 }
 
@@ -1190,7 +1189,6 @@ int gfs2_logd(void *data)
 	struct gfs2_sbd *sdp = data;
 	unsigned long t = 1;
 	DEFINE_WAIT(wait);
-	bool did_flush;
 
 	while (!kthread_should_stop()) {
 
@@ -1209,12 +1207,10 @@ int gfs2_logd(void *data)
 			continue;
 		}
 
-		did_flush = false;
 		if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
 			gfs2_ail1_empty(sdp, 0);
 			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 				       GFS2_LFC_LOGD_JFLUSH_REQD);
-			did_flush = true;
 		}
 
 		if (gfs2_ail_flush_reqd(sdp)) {
@@ -1223,12 +1219,8 @@ int gfs2_logd(void *data)
 			gfs2_ail1_empty(sdp, 0);
 			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
 				       GFS2_LFC_LOGD_AIL_FLUSH_REQD);
-			did_flush = true;
 		}
 
-		if (!gfs2_ail_flush_reqd(sdp) || did_flush)
-			wake_up(&sdp->sd_log_waitq);
-
 		t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
 
 		try_to_freeze();
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 13/20] gfs2: Use a tighter bound in gfs2_trans_begin
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (11 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 12/20] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 14/20] gfs2: Get rid of current_tail() Andreas Gruenbacher
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Use a tighter bound for the number of blocks required by transactions in
gfs2_trans_begin: in the worst case, we'll have mixed data and metadata,
so we'll need a log desciptor for each type.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/trans.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index c9d8247ffa19..6a6b07499d07 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -60,8 +60,14 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	tr->tr_blocks = blocks;
 	tr->tr_revokes = revokes;
 	tr->tr_reserved = 1;
-	if (blocks)
-		tr->tr_reserved += 6 + blocks;
+	if (blocks) {
+		/*
+		 * The reserved blocks are either used for data or metadata.
+		 * We can have mixed data and metadata, each with its own log
+		 * descriptor block; see calc_reserved().
+		 */
+		tr->tr_reserved += blocks + 1 + DIV_ROUND_UP(blocks - 1, databuf_limit(sdp));
+	}
 	if (revokes)
 		tr->tr_reserved += gfs2_struct2blk(sdp, revokes);
 	INIT_LIST_HEAD(&tr->tr_databuf);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 14/20] gfs2: Get rid of current_tail()
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (12 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 13/20] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 15/20] gfs2: Move function gfs2_ail_empty_tr Andreas Gruenbacher
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Keep the current value of the updated log tail in the super block as
sb_log_flush_tail instead of computing it on the fly.  This avoids
unnecessary sd_ail_lock taking and cleans up the code.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/incore.h |  5 ++--
 fs/gfs2/log.c    | 72 +++++++++++++++++++++++++-----------------------
 fs/gfs2/log.h    |  4 ++-
 3 files changed, 44 insertions(+), 37 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index ecc3f596319c..f1ad78dbd782 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -838,8 +838,6 @@ struct gfs2_sbd {
 	wait_queue_head_t sd_logd_waitq;
 
 	u64 sd_log_sequence;
-	unsigned int sd_log_head;
-	unsigned int sd_log_tail;
 	int sd_log_idle;
 
 	struct rw_semaphore sd_log_flush_lock;
@@ -849,6 +847,9 @@ struct gfs2_sbd {
 	int sd_log_error; /* First log error */
 	wait_queue_head_t sd_withdraw_wait;
 
+	unsigned int sd_log_tail;
+	unsigned int sd_log_flush_tail;
+	unsigned int sd_log_head;
 	unsigned int sd_log_flush_head;
 
 	spinlock_t sd_ail_lock;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 7a65823ad1f3..47a25833acf3 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -242,6 +242,28 @@ static void gfs2_ail1_start(struct gfs2_sbd *sdp)
 	return gfs2_ail1_flush(sdp, &wbc);
 }
 
+static void gfs2_log_update_flush_tail(struct gfs2_sbd *sdp)
+{
+	unsigned int new_flush_tail = sdp->sd_log_head;
+	struct gfs2_trans *tr;
+
+	if (!list_empty(&sdp->sd_ail1_list)) {
+		tr = list_last_entry(&sdp->sd_ail1_list,
+				     struct gfs2_trans, tr_list);
+		new_flush_tail = tr->tr_first;
+	}
+	sdp->sd_log_flush_tail = new_flush_tail;
+}
+
+static void gfs2_log_update_head(struct gfs2_sbd *sdp)
+{
+	unsigned int new_head = sdp->sd_log_flush_head;
+
+	if (sdp->sd_log_flush_tail == sdp->sd_log_head)
+		sdp->sd_log_flush_tail = new_head;
+	sdp->sd_log_head = new_head;
+}
+
 /**
  * gfs2_ail1_empty_one - Check whether or not a trans in the AIL has been synced
  * @sdp: the filesystem
@@ -317,6 +339,7 @@ static int gfs2_ail1_empty(struct gfs2_sbd *sdp, int max_revokes)
 		else
 			oldest_tr = 0;
 	}
+	gfs2_log_update_flush_tail(sdp);
 	ret = list_empty(&sdp->sd_ail1_list);
 	spin_unlock(&sdp->sd_ail_lock);
 
@@ -545,30 +568,14 @@ static unsigned int calc_reserved(struct gfs2_sbd *sdp)
 	return reserved;
 }
 
-static unsigned int current_tail(struct gfs2_sbd *sdp)
-{
-	struct gfs2_trans *tr;
-	unsigned int tail;
-
-	spin_lock(&sdp->sd_ail_lock);
-
-	if (list_empty(&sdp->sd_ail1_list)) {
-		tail = sdp->sd_log_head;
-	} else {
-		tr = list_last_entry(&sdp->sd_ail1_list, struct gfs2_trans,
-				tr_list);
-		tail = tr->tr_first;
-	}
-
-	spin_unlock(&sdp->sd_ail_lock);
-
-	return tail;
-}
-
-static void log_pull_tail(struct gfs2_sbd *sdp, unsigned int new_tail)
+static void log_pull_tail(struct gfs2_sbd *sdp)
 {
-	unsigned int dist = log_distance(sdp, new_tail, sdp->sd_log_tail);
+	unsigned int new_tail = sdp->sd_log_flush_tail;
+	unsigned int dist;
 
+	if (new_tail == sdp->sd_log_tail)
+		return;
+	dist = log_distance(sdp, new_tail, sdp->sd_log_tail);
 	ail2_empty(sdp, new_tail);
 	gfs2_log_release(sdp, dist);
 	sdp->sd_log_tail = new_tail;
@@ -823,26 +830,23 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 
 static void log_write_header(struct gfs2_sbd *sdp, u32 flags)
 {
-	unsigned int tail;
 	int op_flags = REQ_PREFLUSH | REQ_FUA | REQ_META | REQ_SYNC;
 	enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
 
 	gfs2_assert_withdraw(sdp, (state != SFS_FROZEN));
-	tail = current_tail(sdp);
 
 	if (test_bit(SDF_NOBARRIERS, &sdp->sd_flags)) {
 		gfs2_ordered_wait(sdp);
 		log_flush_wait(sdp);
 		op_flags = REQ_SYNC | REQ_META | REQ_PRIO;
 	}
-	sdp->sd_log_idle = (tail == sdp->sd_log_flush_head);
-	gfs2_write_log_header(sdp, sdp->sd_jdesc, sdp->sd_log_sequence++, tail,
-			      sdp->sd_log_flush_head, flags, op_flags);
+	sdp->sd_log_idle = (sdp->sd_log_flush_tail == sdp->sd_log_flush_head);
+	gfs2_write_log_header(sdp, sdp->sd_jdesc, sdp->sd_log_sequence++,
+			      sdp->sd_log_flush_tail, sdp->sd_log_flush_head,
+			      flags, op_flags);
 	gfs2_log_incr_head(sdp);
 	log_flush_wait(sdp);
-
-	if (sdp->sd_log_tail != tail)
-		log_pull_tail(sdp, tail);
+	log_pull_tail(sdp);
 }
 
 /**
@@ -991,7 +995,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 
 	if (sdp->sd_log_head != sdp->sd_log_flush_head) {
 		log_write_header(sdp, flags);
-	} else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle) {
+	} else if (sdp->sd_log_tail != sdp->sd_log_flush_tail && !sdp->sd_log_idle) {
 		atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
 		trace_gfs2_log_blocks(sdp, -1);
 		log_write_header(sdp, flags);
@@ -1001,7 +1005,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	lops_after_commit(sdp, tr);
 
 	gfs2_log_lock(sdp);
-	sdp->sd_log_head = sdp->sd_log_flush_head;
+	gfs2_log_update_head(sdp);
 	sdp->sd_log_blks_reserved = 0;
 	sdp->sd_log_committed_revoke = 0;
 
@@ -1021,7 +1025,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 			atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
 			trace_gfs2_log_blocks(sdp, -1);
 			log_write_header(sdp, flags);
-			sdp->sd_log_head = sdp->sd_log_flush_head;
+			gfs2_log_update_head(sdp);
 		}
 		if (flags & (GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
 			     GFS2_LOG_HEAD_FLUSH_FREEZE))
@@ -1154,7 +1158,7 @@ static void gfs2_log_shutdown(struct gfs2_sbd *sdp)
 	gfs2_assert_warn(sdp, sdp->sd_log_head == sdp->sd_log_tail);
 	gfs2_assert_warn(sdp, list_empty(&sdp->sd_ail2_list));
 
-	sdp->sd_log_head = sdp->sd_log_flush_head;
+	gfs2_log_update_head(sdp);
 	sdp->sd_log_tail = sdp->sd_log_head;
 }
 
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index fd44b107602b..c7f894168600 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -43,7 +43,9 @@ static inline void gfs2_log_pointers_init(struct gfs2_sbd *sdp,
 	if (++value == sdp->sd_jdesc->jd_blocks) {
 		value = 0;
 	}
-	sdp->sd_log_head = sdp->sd_log_tail = value;
+	sdp->sd_log_tail = value;
+	sdp->sd_log_flush_tail = value;
+	sdp->sd_log_head = value;
 }
 
 static inline void gfs2_ordered_add_inode(struct gfs2_inode *ip)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 15/20] gfs2: Move function gfs2_ail_empty_tr
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (13 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 14/20] gfs2: Get rid of current_tail() Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 16/20] gfs2: No revokes for transactions at the tail of the log Andreas Gruenbacher
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Move this function further up in log.c so that we can use it in the next patch.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 47a25833acf3..b771575bce11 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -264,6 +264,23 @@ static void gfs2_log_update_head(struct gfs2_sbd *sdp)
 	sdp->sd_log_head = new_head;
 }
 
+/**
+ * gfs2_ail_empty_tr - empty one of the ail lists of a transaction
+ */
+
+static void gfs2_ail_empty_tr(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
+			      struct list_head *head)
+{
+	struct gfs2_bufdata *bd;
+
+	while (!list_empty(head)) {
+		bd = list_first_entry(head, struct gfs2_bufdata,
+				      bd_ail_st_list);
+		gfs2_assert(sdp, bd->bd_tr == tr);
+		gfs2_remove_from_ail(bd);
+	}
+}
+
 /**
  * gfs2_ail1_empty_one - Check whether or not a trans in the AIL has been synced
  * @sdp: the filesystem
@@ -373,23 +390,6 @@ static void gfs2_ail1_wait(struct gfs2_sbd *sdp)
 	spin_unlock(&sdp->sd_ail_lock);
 }
 
-/**
- * gfs2_ail_empty_tr - empty one of the ail lists for a transaction
- */
-
-static void gfs2_ail_empty_tr(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
-			      struct list_head *head)
-{
-	struct gfs2_bufdata *bd;
-
-	while (!list_empty(head)) {
-		bd = list_first_entry(head, struct gfs2_bufdata,
-				      bd_ail_st_list);
-		gfs2_assert(sdp, bd->bd_tr == tr);
-		gfs2_remove_from_ail(bd);
-	}
-}
-
 static void __ail2_empty(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 {
 	gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 16/20] gfs2: No revokes for transactions at the tail of the log
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (14 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 15/20] gfs2: Move function gfs2_ail_empty_tr Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 17/20] gfs2: Remove sd_log_committed_revoke Andreas Gruenbacher
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In gfs2_log_flush, we're going through all active transactions.  For
each of the buffers in those transactions that has completed, we either
add a revoke to the active transaction immediately, or we move the
buffer to the transaction's ail2 list, which may result in a revoke
later.

However, whenever a transaction at the tail of the log completes, the
current tail of the log advances.  gfs2_log_flush writes out the log
header for the system transaction, with lh_tail set to that current tail
(sd_log_flush_head).  This implicitly revokes all previous blocks in the
log, so the revokes we've just written immediately become obsolete.
(This is not the case for transactions that haven't completed or aren't
at the tail of the log.)

Fix this by skipping completed transactions at the tail of the log
instead of writing revokes for them.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index b771575bce11..14c5edf2bf51 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -287,15 +287,31 @@ static void gfs2_ail_empty_tr(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
  * @tr: the transaction
  * @max_revokes: If nonzero, issue revokes for the bd items for written buffers
  *
- * returns: the transaction's count of remaining active items
+ * Returns: true if the transaction has completed
  */
 
-static int gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
+static bool gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
 				int *max_revokes)
 {
 	struct gfs2_bufdata *bd, *s;
 	struct buffer_head *bh;
-	int active_count = 0;
+	bool empty = true;
+
+	if (!sdp->sd_log_error) {
+		list_for_each_entry_reverse(bd, &tr->tr_ail1_list, bd_ail_st_list) {
+			bh = bd->bd_bh;
+
+			if (buffer_busy(bh)) {
+				empty = false;
+				break;
+			}
+		}
+		if (empty) {
+			gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail1_list);
+			gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
+			return empty;
+		}
+	}
 
 	list_for_each_entry_safe_reverse(bd, s, &tr->tr_ail1_list,
 					 bd_ail_st_list) {
@@ -311,7 +327,7 @@ static int gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
 		 * for others.
 		 */
 		if (!sdp->sd_log_error && buffer_busy(bh)) {
-			active_count++;
+			empty = false;
 			continue;
 		}
 		if (!buffer_uptodate(bh) &&
@@ -332,7 +348,7 @@ static int gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
 		}
 		list_move(&bd->bd_ail_st_list, &tr->tr_ail2_list);
 	}
-	return active_count;
+	return empty;
 }
 
 /**
@@ -351,7 +367,7 @@ static int gfs2_ail1_empty(struct gfs2_sbd *sdp, int max_revokes)
 
 	spin_lock(&sdp->sd_ail_lock);
 	list_for_each_entry_safe_reverse(tr, s, &sdp->sd_ail1_list, tr_list) {
-		if (!gfs2_ail1_empty_one(sdp, tr, &max_revokes) && oldest_tr)
+		if (gfs2_ail1_empty_one(sdp, tr, &max_revokes) && oldest_tr)
 			list_move(&tr->tr_list, &sdp->sd_ail2_list);
 		else
 			oldest_tr = 0;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 17/20] gfs2: Remove sd_log_committed_revoke
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (15 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 16/20] gfs2: No revokes for transactions at the tail of the log Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 18/20] gfs2: Remove sd_log_blks_reserved Andreas Gruenbacher
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In gfs2_trans_end -> gfs2_log_commit -> gfs2_log_refund, if we don't have a
system transaction, always attach the new transaction even if it only accounts
for revokes.  That way, tr_num_revoke - tr_num_revoke_rm in the system transaction
(sdp->sd_log_tr) will be the number of revokes committed so far, and we can use
that in calc_reserved instead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/incore.h |  1 -
 fs/gfs2/log.c    | 27 ++++++++++-----------------
 2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index f1ad78dbd782..5489140a52cf 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -821,7 +821,6 @@ struct gfs2_sbd {
 
 	struct gfs2_trans *sd_log_tr;
 	unsigned int sd_log_blks_reserved;
-	int sd_log_committed_revoke;
 
 	atomic_t sd_log_pinned;
 	unsigned int sd_log_num_revoke;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 14c5edf2bf51..2d4defc9129d 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -563,21 +563,15 @@ static inline unsigned int log_distance(struct gfs2_sbd *sdp, unsigned int newer
 static unsigned int calc_reserved(struct gfs2_sbd *sdp)
 {
 	unsigned int reserved = 0;
-	unsigned int mbuf;
-	unsigned int dbuf;
+	unsigned int blocks;
 	struct gfs2_trans *tr = sdp->sd_log_tr;
 
-	if (tr) {
-		mbuf = tr->tr_num_buf_new - tr->tr_num_buf_rm;
-		dbuf = tr->tr_num_databuf_new - tr->tr_num_databuf_rm;
-		reserved = mbuf + dbuf;
-		/* Account for header blocks */
-		reserved += DIV_ROUND_UP(mbuf, buf_limit(sdp));
-		reserved += DIV_ROUND_UP(dbuf, databuf_limit(sdp));
-	}
-
-	if (sdp->sd_log_committed_revoke > 0)
-		reserved += gfs2_struct2blk(sdp, sdp->sd_log_committed_revoke);
+	blocks = tr->tr_num_buf_new - tr->tr_num_buf_rm;
+	reserved += blocks + DIV_ROUND_UP(blocks, buf_limit(sdp));
+	blocks = tr->tr_num_databuf_new - tr->tr_num_databuf_rm;
+	reserved += blocks + DIV_ROUND_UP(blocks, databuf_limit(sdp));
+	if (tr->tr_num_revoke - tr->tr_num_revoke_rm)
+		reserved += gfs2_struct2blk(sdp, tr->tr_num_revoke - tr->tr_num_revoke_rm);
 	/* One for the overall header */
 	if (reserved)
 		reserved++;
@@ -996,7 +990,8 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 		if (gfs2_assert_withdraw_delayed(sdp, !sdp->sd_log_num_revoke))
 			goto out_withdraw;
 	if (gfs2_assert_withdraw_delayed(sdp,
-			sdp->sd_log_num_revoke == sdp->sd_log_committed_revoke))
+			sdp->sd_log_num_revoke ==
+			(tr ? tr->tr_num_revoke - tr->tr_num_revoke_rm : 0)))
 		goto out_withdraw;
 
 	gfs2_ordered_write(sdp);
@@ -1023,7 +1018,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	gfs2_log_lock(sdp);
 	gfs2_log_update_head(sdp);
 	sdp->sd_log_blks_reserved = 0;
-	sdp->sd_log_committed_revoke = 0;
 
 	spin_lock(&sdp->sd_ail_lock);
 	if (tr && !list_empty(&tr->tr_ail1_list)) {
@@ -1114,12 +1108,11 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 
 	if (sdp->sd_log_tr) {
 		gfs2_merge_trans(sdp, tr);
-	} else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
+	} else {
 		sdp->sd_log_tr = tr;
 		set_bit(TR_ATTACHED, &tr->tr_flags);
 	}
 
-	sdp->sd_log_committed_revoke += tr->tr_num_revoke - tr->tr_num_revoke_rm;
 	reserved = calc_reserved(sdp);
 	maxres = sdp->sd_log_blks_reserved + tr->tr_reserved;
 	gfs2_assert_withdraw(sdp, maxres >= reserved);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 18/20] gfs2: Remove sd_log_blks_reserved
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (16 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 17/20] gfs2: Remove sd_log_committed_revoke Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 19/20] gfs2: Rework the log space allocation logic Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 20/20] gfs2: Hand out revokes to transactions one by one Andreas Gruenbacher
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Now that sdp->sd_log_tr is defined even when the transaction only
indicates revokes, tr_reserved is equivalent to sd_log_blks_reserved,
so we can remove it.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/incore.h |  1 -
 fs/gfs2/log.c    | 21 ++++++++++-----------
 fs/gfs2/log.h    |  2 +-
 fs/gfs2/lops.c   |  2 +-
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 5489140a52cf..4298df94768a 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -820,7 +820,6 @@ struct gfs2_sbd {
 	spinlock_t sd_log_lock;
 
 	struct gfs2_trans *sd_log_tr;
-	unsigned int sd_log_blks_reserved;
 
 	atomic_t sd_log_pinned;
 	unsigned int sd_log_num_revoke;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 2d4defc9129d..2fed4b4a7a02 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -702,6 +702,7 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
 /**
  * gfs2_add_aux_revokes - Add as many revokes to the system transaction as we can
  * @sdp: The GFS2 superblock
+ * @tr: The transaction
  *
  * Our usual strategy is to defer writing revokes as much as we can in the hope
  * that we'll eventually overwrite the journal, which will make those revokes
@@ -711,7 +712,7 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
  * been written back.  This will basically come at no cost now, and will save
  * us from having to keep track of those blocks on the AIL2 list later.
  */
-void gfs2_add_aux_revokes(struct gfs2_sbd *sdp)
+void gfs2_add_aux_revokes(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 {
 	/* number of revokes we still have room for */
 	unsigned int max_revokes;
@@ -726,7 +727,7 @@ void gfs2_add_aux_revokes(struct gfs2_sbd *sdp)
 		atomic_dec(&sdp->sd_log_blks_free);
 		/* If no blocks have been reserved, we need to also
 		 * reserve a block for the header */
-		if (!sdp->sd_log_blks_reserved) {
+		if (!tr) {
 			atomic_dec(&sdp->sd_log_blks_free);
 			trace_gfs2_log_blocks(sdp, -2);
 		} else {
@@ -738,7 +739,7 @@ void gfs2_add_aux_revokes(struct gfs2_sbd *sdp)
 
 	if (!sdp->sd_log_num_revoke) {
 		atomic_inc(&sdp->sd_log_blks_free);
-		if (!sdp->sd_log_blks_reserved) {
+		if (!tr) {
 			atomic_inc(&sdp->sd_log_blks_free);
 			trace_gfs2_log_blocks(sdp, 2);
 		} else {
@@ -1017,7 +1018,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 
 	gfs2_log_lock(sdp);
 	gfs2_log_update_head(sdp);
-	sdp->sd_log_blks_reserved = 0;
 
 	spin_lock(&sdp->sd_ail_lock);
 	if (tr && !list_empty(&tr->tr_ail1_list)) {
@@ -1082,6 +1082,7 @@ static void gfs2_merge_trans(struct gfs2_sbd *sdp, struct gfs2_trans *new)
 
 	WARN_ON_ONCE(!test_bit(TR_ATTACHED, &old->tr_flags));
 
+	old->tr_reserved	+= new->tr_reserved;
 	old->tr_num_buf_new	+= new->tr_num_buf_new;
 	old->tr_num_databuf_new	+= new->tr_num_databuf_new;
 	old->tr_num_buf_rm	+= new->tr_num_buf_rm;
@@ -1102,23 +1103,22 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 {
 	unsigned int reserved;
 	unsigned int unused;
-	unsigned int maxres;
 
 	gfs2_log_lock(sdp);
 
 	if (sdp->sd_log_tr) {
 		gfs2_merge_trans(sdp, tr);
+		tr = sdp->sd_log_tr;
 	} else {
 		sdp->sd_log_tr = tr;
 		set_bit(TR_ATTACHED, &tr->tr_flags);
 	}
 
-	reserved = calc_reserved(sdp);
-	maxres = sdp->sd_log_blks_reserved + tr->tr_reserved;
-	gfs2_assert_withdraw(sdp, maxres >= reserved);
-	unused = maxres - reserved;
+	reserved = tr->tr_reserved;
+	tr->tr_reserved = calc_reserved(sdp);
+	gfs2_assert_withdraw(sdp, reserved >= tr->tr_reserved);
+	unused = reserved - tr->tr_reserved;
 	gfs2_log_release(sdp, unused);
-	sdp->sd_log_blks_reserved = reserved;
 
 	gfs2_log_unlock(sdp);
 }
@@ -1156,7 +1156,6 @@ void gfs2_log_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 
 static void gfs2_log_shutdown(struct gfs2_sbd *sdp)
 {
-	gfs2_assert_withdraw(sdp, !sdp->sd_log_blks_reserved);
 	gfs2_assert_withdraw(sdp, !sdp->sd_log_num_revoke);
 	gfs2_assert_withdraw(sdp, list_empty(&sdp->sd_ail1_list));
 
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index c7f894168600..911276be0e01 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -81,6 +81,6 @@ extern void log_flush_wait(struct gfs2_sbd *sdp);
 extern int gfs2_logd(void *data);
 extern void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd);
 extern void gfs2_glock_remove_revoke(struct gfs2_glock *gl);
-extern void gfs2_add_aux_revokes(struct gfs2_sbd *sdp);
+extern void gfs2_add_aux_revokes(struct gfs2_sbd *sdp, struct gfs2_trans *tr);
 
 #endif /* __LOG_DOT_H__ */
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 3b4e51089bbd..4266195d1f1e 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -845,7 +845,7 @@ static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	struct page *page;
 	unsigned int length;
 
-	gfs2_add_aux_revokes(sdp);
+	gfs2_add_aux_revokes(sdp, tr);
 	if (!sdp->sd_log_num_revoke)
 		return;
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 19/20] gfs2: Rework the log space allocation logic
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (17 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 18/20] gfs2: Remove sd_log_blks_reserved Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 20/20] gfs2: Hand out revokes to transactions one by one Andreas Gruenbacher
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The current log space allocation logic is hard to understand or extend.
The principle it that when the log is flushed, we may or may not have a
transaction active that has space allocated in the log.  To deal with
that, we set aside a magical number of blocks to be used in case we
don't have an active transaction.  This pool of blocks is managed in a
haphazard way, and it's not clear that the pool will always be big
enough, though.  We can't easily return unused log space at the end of a
transaction; instead, the number of blocks allocated must match the
number of blocks used.

Simplify this as follows:
 * When transactions are allocated or merged, always reserve enough
   blocks to flush the transaction (err on the safe side).
 * In gfs2_log_flush, return any allocated blocks that haven't been used.
 * Maintain a pool of spare of blocks big enough to do one log flush, as
   before.
 * In gfs2_log_flush, when we have no active transaction, allocate a
   suitable number of blocks locally.  For that, use the spare pool when
   called from logd, and leave the pool alone otherwise.  This means
   that when the log is almost full, logd will still be able to do one
   more log flush, which will result in more log space becoming
   available.

This makes the log space allocator code much easier to work with in the
future.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c   | 163 ++++++++++++++++++++++++++++--------------------
 fs/gfs2/log.h   |   9 ++-
 fs/gfs2/lops.c  |   2 +-
 fs/gfs2/trans.c |   4 +-
 4 files changed, 108 insertions(+), 70 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 2fed4b4a7a02..b0980769890e 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -464,15 +464,42 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
 }
 
 /**
- * gfs2_log_reserve - Make a log reservation
+ * __gfs2_log_try_reserve - Try to make a log reservation
+ * @sdp: The GFS2 superblock
+ * @blks: The number of blocks to reserve
+ * @taboo_blks: The number of blocks to leave free
+ *
+ * Try to do the same as __gfs2_log_reserve(), but fail if no more log
+ * space is immediately available.
+ */
+static bool __gfs2_log_try_reserve(struct gfs2_sbd *sdp, unsigned int blks,
+				   unsigned int taboo_blks)
+{
+	unsigned wanted = blks + taboo_blks;
+	unsigned int free_blocks;
+
+	free_blocks = atomic_read(&sdp->sd_log_blks_free);
+	while (free_blocks >= wanted) {
+		if (atomic_try_cmpxchg(&sdp->sd_log_blks_free, &free_blocks,
+				       free_blocks - blks)) {
+			trace_gfs2_log_blocks(sdp, -blks);
+			return true;
+		}
+	}
+	return false;
+}
+
+/**
+ * __gfs2_log_reserve - Make a log reservation
  * @sdp: The GFS2 superblock
  * @blks: The number of blocks to reserve
+ * @taboo_blks: The number of blocks to leave free
  *
- * Note that we never give out the last few blocks of the journal. Thats
- * due to the fact that there is a small number of header blocks
- * associated with each log flush. The exact number can't be known until
- * flush time, so we ensure that we have just enough free blocks at all
- * times to avoid running out during a log flush.
+ * @taboo_blks is set to 0 for logd, and to GFS2_LOG_FLUSH_MIN_BLOCKS
+ * for all other processes.  This ensures that when the log is almost full,
+ * logd will still be able to call gfs2_log_flush one more time  without
+ * blocking, which will advance the tail and make some more log space
+ * available.
  *
  * We no longer flush the log here, instead we wake up logd to do that
  * for us. To avoid the thundering herd and to ensure that we deal fairly
@@ -481,19 +508,12 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
  * wake the next waiter on the list.
  */
 
-void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
+static void __gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks,
+			       unsigned int taboo_blks)
 {
-	unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
-	unsigned wanted = blks + reserved_blks;
+	unsigned wanted = blks + taboo_blks;
 	unsigned int free_blocks;
 
-	free_blocks = atomic_read(&sdp->sd_log_blks_free);
-	while (free_blocks >= wanted) {
-		if (atomic_try_cmpxchg(&sdp->sd_log_blks_free, &free_blocks,
-				       free_blocks - blks))
-			return;
-	}
-
 	atomic_add(blks, &sdp->sd_log_blks_needed);
 	for (;;) {
 		if (current != sdp->sd_logd_process)
@@ -515,6 +535,19 @@ void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 		wake_up(&sdp->sd_log_waitq);
 }
 
+/**
+ * gfs2_log_reserve - Make a log reservation
+ * @sdp: The GFS2 superblock
+ * @blks: The number of blocks to reserve
+ */
+
+void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
+{
+	if (__gfs2_log_try_reserve(sdp, blks, GFS2_LOG_FLUSH_MIN_BLOCKS))
+		return;
+	__gfs2_log_reserve(sdp, blks, GFS2_LOG_FLUSH_MIN_BLOCKS);
+}
+
 /**
  * log_distance - Compute distance between two journal blocks
  * @sdp: The GFS2 superblock
@@ -562,7 +595,7 @@ static inline unsigned int log_distance(struct gfs2_sbd *sdp, unsigned int newer
  */
 static unsigned int calc_reserved(struct gfs2_sbd *sdp)
 {
-	unsigned int reserved = 0;
+	unsigned int reserved = GFS2_LOG_FLUSH_MIN_BLOCKS;
 	unsigned int blocks;
 	struct gfs2_trans *tr = sdp->sd_log_tr;
 
@@ -571,10 +604,7 @@ static unsigned int calc_reserved(struct gfs2_sbd *sdp)
 	blocks = tr->tr_num_databuf_new - tr->tr_num_databuf_rm;
 	reserved += blocks + DIV_ROUND_UP(blocks, databuf_limit(sdp));
 	if (tr->tr_num_revoke - tr->tr_num_revoke_rm)
-		reserved += gfs2_struct2blk(sdp, tr->tr_num_revoke - tr->tr_num_revoke_rm);
-	/* One for the overall header */
-	if (reserved)
-		reserved++;
+		reserved += gfs2_struct2blk(sdp, tr->tr_num_revoke - tr->tr_num_revoke_rm) - 1;
 	return reserved;
 }
 
@@ -702,7 +732,6 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
 /**
  * gfs2_add_aux_revokes - Add as many revokes to the system transaction as we can
  * @sdp: The GFS2 superblock
- * @tr: The transaction
  *
  * Our usual strategy is to defer writing revokes as much as we can in the hope
  * that we'll eventually overwrite the journal, which will make those revokes
@@ -712,7 +741,7 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
  * been written back.  This will basically come at no cost now, and will save
  * us from having to keep track of those blocks on the AIL2 list later.
  */
-void gfs2_add_aux_revokes(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
+void gfs2_add_aux_revokes(struct gfs2_sbd *sdp)
 {
 	/* number of revokes we still have room for */
 	unsigned int max_revokes;
@@ -723,29 +752,8 @@ void gfs2_add_aux_revokes(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 		max_revokes += roundup(sdp->sd_log_num_revoke - sdp->sd_ldptrs,
 				       sdp->sd_inptrs);
 	max_revokes -= sdp->sd_log_num_revoke;
-	if (!sdp->sd_log_num_revoke) {
-		atomic_dec(&sdp->sd_log_blks_free);
-		/* If no blocks have been reserved, we need to also
-		 * reserve a block for the header */
-		if (!tr) {
-			atomic_dec(&sdp->sd_log_blks_free);
-			trace_gfs2_log_blocks(sdp, -2);
-		} else {
-			trace_gfs2_log_blocks(sdp, -1);
-		}
-	}
 	gfs2_ail1_empty(sdp, max_revokes);
 	gfs2_log_unlock(sdp);
-
-	if (!sdp->sd_log_num_revoke) {
-		atomic_inc(&sdp->sd_log_blks_free);
-		if (!tr) {
-			atomic_inc(&sdp->sd_log_blks_free);
-			trace_gfs2_log_blocks(sdp, 2);
-		} else {
-			trace_gfs2_log_blocks(sdp, 1);
-		}
-	}
 }
 
 /**
@@ -858,6 +866,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags)
 	gfs2_log_incr_head(sdp);
 	log_flush_wait(sdp);
 	log_pull_tail(sdp);
+	gfs2_log_update_head(sdp);
 }
 
 /**
@@ -957,36 +966,57 @@ static void trans_drain(struct gfs2_trans *tr)
 void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 {
 	struct gfs2_trans *tr = NULL;
+	unsigned int reserved_blocks = 0, used_blocks = 0;
 	enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
+	unsigned int first_log_head;
 
 	down_write(&sdp->sd_log_flush_lock);
+	trace_gfs2_log_flush(sdp, 1, flags);
 
+repeat:
 	/*
 	 * Do this check while holding the log_flush_lock to prevent new
 	 * buffers from being added to the ail via gfs2_pin()
 	 */
 	if (gfs2_withdrawn(sdp))
-		goto out;
+		goto out_release;
 
 	/* Log might have been flushed while we waited for the flush lock */
 	if (gl && !test_bit(GLF_LFLUSH, &gl->gl_flags))
-		goto out;
-	trace_gfs2_log_flush(sdp, 1, flags);
+		goto out_release;
 
-	if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN)
-		clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
+	first_log_head = sdp->sd_log_head;
+	sdp->sd_log_flush_head = first_log_head;
 
-	sdp->sd_log_flush_head = sdp->sd_log_head;
 	tr = sdp->sd_log_tr;
 	if (tr) {
+		if (reserved_blocks)
+			gfs2_log_release(sdp, reserved_blocks);
+		reserved_blocks = tr->tr_reserved;
 		sdp->sd_log_tr = NULL;
-		tr->tr_first = sdp->sd_log_flush_head;
+		tr->tr_first = first_log_head;
 		if (unlikely (state == SFS_FROZEN))
 			if (gfs2_assert_withdraw_delayed(sdp,
 			       !tr->tr_num_buf_new && !tr->tr_num_databuf_new))
 				goto out_withdraw;
+	} else if (!reserved_blocks) {
+		unsigned int taboo_blocks = GFS2_LOG_FLUSH_MIN_BLOCKS;
+
+		reserved_blocks = GFS2_LOG_FLUSH_MIN_BLOCKS;
+		if (current == sdp->sd_logd_process)
+			taboo_blocks = 0;
+
+		if (!__gfs2_log_try_reserve(sdp, reserved_blocks, taboo_blocks)) {
+			up_write(&sdp->sd_log_flush_lock);
+			__gfs2_log_reserve(sdp, reserved_blocks, taboo_blocks);
+			down_write(&sdp->sd_log_flush_lock);
+			goto repeat;
+		}
 	}
 
+	if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN)
+		clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
+
 	if (unlikely(state == SFS_FROZEN))
 		if (gfs2_assert_withdraw_delayed(sdp, !sdp->sd_log_num_revoke))
 			goto out_withdraw;
@@ -1008,8 +1038,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	if (sdp->sd_log_head != sdp->sd_log_flush_head) {
 		log_write_header(sdp, flags);
 	} else if (sdp->sd_log_tail != sdp->sd_log_flush_tail && !sdp->sd_log_idle) {
-		atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
-		trace_gfs2_log_blocks(sdp, -1);
 		log_write_header(sdp, flags);
 	}
 	if (gfs2_withdrawn(sdp))
@@ -1017,8 +1045,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	lops_after_commit(sdp, tr);
 
 	gfs2_log_lock(sdp);
-	gfs2_log_update_head(sdp);
-
 	spin_lock(&sdp->sd_ail_lock);
 	if (tr && !list_empty(&tr->tr_ail1_list)) {
 		list_add(&tr->tr_list, &sdp->sd_ail1_list);
@@ -1032,10 +1058,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 			empty_ail1_list(sdp);
 			if (gfs2_withdrawn(sdp))
 				goto out_withdraw;
-			atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
-			trace_gfs2_log_blocks(sdp, -1);
 			log_write_header(sdp, flags);
-			gfs2_log_update_head(sdp);
 		}
 		if (flags & (GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
 			     GFS2_LOG_HEAD_FLUSH_FREEZE))
@@ -1045,8 +1068,20 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	}
 
 out_end:
-	trace_gfs2_log_flush(sdp, 0, flags);
+	used_blocks = log_distance(sdp, sdp->sd_log_flush_head, first_log_head);
+	if (gfs2_assert_withdraw_delayed(sdp, used_blocks <= reserved_blocks))
+		goto out;
+out_release:
+	if (reserved_blocks != used_blocks) {
+		unsigned int unused_blocks = reserved_blocks - used_blocks;
+
+		atomic_add(unused_blocks, &sdp->sd_log_blks_free);
+		trace_gfs2_log_blocks(sdp, unused_blocks);
+		gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
+				     sdp->sd_jdesc->jd_blocks);
+	}
 out:
+	trace_gfs2_log_flush(sdp, 0, flags);
 	up_write(&sdp->sd_log_flush_lock);
 	gfs2_trans_free(sdp, tr);
 	if (gfs2_withdrawing(sdp))
@@ -1159,15 +1194,11 @@ static void gfs2_log_shutdown(struct gfs2_sbd *sdp)
 	gfs2_assert_withdraw(sdp, !sdp->sd_log_num_revoke);
 	gfs2_assert_withdraw(sdp, list_empty(&sdp->sd_ail1_list));
 
-	sdp->sd_log_flush_head = sdp->sd_log_head;
-
 	log_write_header(sdp, GFS2_LOG_HEAD_UNMOUNT | GFS2_LFC_SHUTDOWN);
+	log_pull_tail(sdp);
 
 	gfs2_assert_warn(sdp, sdp->sd_log_head == sdp->sd_log_tail);
 	gfs2_assert_warn(sdp, list_empty(&sdp->sd_ail2_list));
-
-	gfs2_log_update_head(sdp);
-	sdp->sd_log_tail = sdp->sd_log_head;
 }
 
 static inline int gfs2_jrnl_flush_reqd(struct gfs2_sbd *sdp)
@@ -1222,7 +1253,7 @@ int gfs2_logd(void *data)
 		if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
 			gfs2_ail1_empty(sdp, 0);
 			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
-				       GFS2_LFC_LOGD_JFLUSH_REQD);
+						  GFS2_LFC_LOGD_JFLUSH_REQD);
 		}
 
 		if (gfs2_ail_flush_reqd(sdp)) {
@@ -1230,7 +1261,7 @@ int gfs2_logd(void *data)
 			gfs2_ail1_wait(sdp);
 			gfs2_ail1_empty(sdp, 0);
 			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
-				       GFS2_LFC_LOGD_AIL_FLUSH_REQD);
+						  GFS2_LFC_LOGD_AIL_FLUSH_REQD);
 		}
 
 		t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 911276be0e01..bc910960ae6c 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -13,6 +13,13 @@
 #include "incore.h"
 #include "inode.h"
 
+/*
+ * The minimum amount of log space required for a log flush is one block for
+ * revokes and one block for the log header.  Log flushes other than
+ * GFS2_LOG_HEAD_FLUSH_NORMAL may write one or two more log headers.
+ */
+#define GFS2_LOG_FLUSH_MIN_BLOCKS 4
+
 /**
  * gfs2_log_lock - acquire the right to mess with the log manager
  * @sdp: the filesystem
@@ -81,6 +88,6 @@ extern void log_flush_wait(struct gfs2_sbd *sdp);
 extern int gfs2_logd(void *data);
 extern void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd);
 extern void gfs2_glock_remove_revoke(struct gfs2_glock *gl);
-extern void gfs2_add_aux_revokes(struct gfs2_sbd *sdp, struct gfs2_trans *tr);
+extern void gfs2_add_aux_revokes(struct gfs2_sbd *sdp);
 
 #endif /* __LOG_DOT_H__ */
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 4266195d1f1e..3b4e51089bbd 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -845,7 +845,7 @@ static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	struct page *page;
 	unsigned int length;
 
-	gfs2_add_aux_revokes(sdp, tr);
+	gfs2_add_aux_revokes(sdp);
 	if (!sdp->sd_log_num_revoke)
 		return;
 
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 6a6b07499d07..d11977340e74 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -59,7 +59,7 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 	tr->tr_ip = _RET_IP_;
 	tr->tr_blocks = blocks;
 	tr->tr_revokes = revokes;
-	tr->tr_reserved = 1;
+	tr->tr_reserved = GFS2_LOG_FLUSH_MIN_BLOCKS;
 	if (blocks) {
 		/*
 		 * The reserved blocks are either used for data or metadata.
@@ -69,7 +69,7 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 		tr->tr_reserved += blocks + 1 + DIV_ROUND_UP(blocks - 1, databuf_limit(sdp));
 	}
 	if (revokes)
-		tr->tr_reserved += gfs2_struct2blk(sdp, revokes);
+		tr->tr_reserved += gfs2_struct2blk(sdp, revokes) - 1;
 	INIT_LIST_HEAD(&tr->tr_databuf);
 	INIT_LIST_HEAD(&tr->tr_buf);
 	INIT_LIST_HEAD(&tr->tr_list);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 20/20] gfs2: Hand out revokes to transactions one by one
  2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
                   ` (18 preceding siblings ...)
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 19/20] gfs2: Rework the log space allocation logic Andreas Gruenbacher
@ 2020-12-19 20:48 ` Andreas Gruenbacher
  19 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-19 20:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In the log, revokes are stored as a revoke descriptor (struct
gfs2_log_descriptor), followed by zero or more additional revoke blocks
(struct gfs2_meta_header).  On filesystems with a blocksize of 4k, the
revoke descriptor contains up to 503 revokes, and the metadata blocks
contain up to 509 revokes each.  We've so far been reserving space for revokes
in transactions on a block granularity, so a lot more space than needed was
being allocated and then immediately released again.

This patch switches to assigning revokes to transactions individually instead.
Initially, space for the revoke descriptor is reserved and handed out to
transactions.  When more revokes than that are reserved, additional revoke
blocks are added.  When the log is flushed, the space for the additional revoke
blocks is released, but we never release the revoke descriptor block.

Transactions may still reserve more revokes than they will actually need in the
end, but now we won't overshoot the target as much, and by only returning the
space for excess revokes at log flush time, we further reduce the amount of
contention between processes.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/incore.h     |   2 +-
 fs/gfs2/log.c        | 108 ++++++++++++++++++++++++++++++++++++-------
 fs/gfs2/log.h        |   6 ++-
 fs/gfs2/lops.c       |   8 ++--
 fs/gfs2/ops_fstype.c |   7 +++
 fs/gfs2/trans.c      |  32 ++++++++++---
 6 files changed, 135 insertions(+), 28 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 4298df94768a..5988d2235046 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -506,7 +506,6 @@ struct gfs2_trans {
 	unsigned int tr_num_buf_rm;
 	unsigned int tr_num_databuf_rm;
 	unsigned int tr_num_revoke;
-	unsigned int tr_num_revoke_rm;
 
 	struct list_head tr_list;
 	struct list_head tr_databuf;
@@ -832,6 +831,7 @@ struct gfs2_sbd {
 	atomic_t sd_log_thresh2;
 	atomic_t sd_log_blks_free;
 	atomic_t sd_log_blks_needed;
+	atomic_t sd_log_revokes_available;
 	wait_queue_head_t sd_log_waitq;
 	wait_queue_head_t sd_logd_waitq;
 
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index b0980769890e..e93afd8ac9ea 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -272,13 +272,16 @@ static void gfs2_ail_empty_tr(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
 			      struct list_head *head)
 {
 	struct gfs2_bufdata *bd;
+	unsigned int revokes = 0;
 
 	while (!list_empty(head)) {
 		bd = list_first_entry(head, struct gfs2_bufdata,
 				      bd_ail_st_list);
 		gfs2_assert(sdp, bd->bd_tr == tr);
 		gfs2_remove_from_ail(bd);
+		revokes++;
 	}
+	gfs2_log_release_revokes(sdp, revokes);
 }
 
 /**
@@ -445,6 +448,39 @@ bool gfs2_log_is_empty(struct gfs2_sbd *sdp) {
 	return atomic_read(&sdp->sd_log_blks_free) == sdp->sd_jdesc->jd_blocks;
 }
 
+static void gfs2_log_shrink_revokes(struct gfs2_sbd *sdp)
+{
+	unsigned int revokes = atomic_read(&sdp->sd_log_revokes_available);
+
+	while (revokes > sdp->sd_inptrs &&
+	       !atomic_try_cmpxchg(&sdp->sd_log_revokes_available,
+				   &revokes, revokes % sdp->sd_inptrs))
+		/* do nothing */ ;
+	if (revokes > sdp->sd_inptrs) {
+		gfs2_log_release(sdp, revokes / sdp->sd_inptrs);
+		revokes = revokes % sdp->sd_inptrs;
+	}
+	gfs2_assert_withdraw(sdp, revokes == sdp->sd_ldptrs);
+}
+
+static bool __gfs2_log_try_reserve_revokes(struct gfs2_sbd *sdp, unsigned int revokes)
+{
+	unsigned int available;
+
+	available = atomic_read(&sdp->sd_log_revokes_available);
+	while (available >= revokes) {
+		if (atomic_try_cmpxchg(&sdp->sd_log_revokes_available,
+				       &available, available - revokes))
+			return true;
+	}
+	return false;
+}
+
+void gfs2_log_release_revokes(struct gfs2_sbd *sdp, unsigned int revokes)
+{
+	atomic_add(revokes, &sdp->sd_log_revokes_available);
+}
+
 /**
  * gfs2_log_release - Release a given number of log blocks
  * @sdp: The GFS2 superblock
@@ -535,17 +571,64 @@ static void __gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks,
 		wake_up(&sdp->sd_log_waitq);
 }
 
+/**
+ * gfs2_log_try_reserve - Try to make a log reservation
+ * @sdp: The GFS2 superblock
+ * @blks: The number of blocks to reserve
+ * @revokes: The number of revokes to reserve
+ * @extra_revokes: The number of additional revokes reserved (output)
+ *
+ * This is similar to gfs2_log_reserve, but can be called while holding
+ * sd_log_flush_lock.
+ */
+
+bool gfs2_log_try_reserve(struct gfs2_sbd *sdp, unsigned int blks,
+			  unsigned int revokes, unsigned int *extra_revokes)
+{
+	unsigned int revoke_blks = 0;
+
+	if (revokes && !__gfs2_log_try_reserve_revokes(sdp, revokes)) {
+		revoke_blks = DIV_ROUND_UP(revokes, sdp->sd_inptrs);
+		blks += revoke_blks;
+	}
+	if (!blks)
+		goto blocks_reserved;
+
+	if (__gfs2_log_try_reserve(sdp, blks, GFS2_LOG_FLUSH_MIN_BLOCKS))
+		goto blocks_reserved;
+	if (revokes && !revoke_blks)
+		gfs2_log_release_revokes(sdp, revokes);
+	return false;
+
+blocks_reserved:
+	*extra_revokes = revoke_blks * sdp->sd_inptrs - revokes;
+	return true;
+}
+
 /**
  * gfs2_log_reserve - Make a log reservation
  * @sdp: The GFS2 superblock
  * @blks: The number of blocks to reserve
+ * @revokes: The number of revokes to reserve
+ * @extra_revokes: The number of additional revokes reserved (output)
  */
 
-void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
+void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks,
+		      unsigned int revokes, unsigned int *extra_revokes)
 {
+	unsigned int revoke_blks = 0;
+
+	if (revokes) {
+		revoke_blks = DIV_ROUND_UP(revokes, sdp->sd_inptrs);
+		blks += revoke_blks;
+	}
+
 	if (__gfs2_log_try_reserve(sdp, blks, GFS2_LOG_FLUSH_MIN_BLOCKS))
-		return;
+		goto blocks_reserved;
 	__gfs2_log_reserve(sdp, blks, GFS2_LOG_FLUSH_MIN_BLOCKS);
+
+blocks_reserved:
+	*extra_revokes = revoke_blks * sdp->sd_inptrs - revokes;
 }
 
 /**
@@ -603,8 +686,6 @@ static unsigned int calc_reserved(struct gfs2_sbd *sdp)
 	reserved += blocks + DIV_ROUND_UP(blocks, buf_limit(sdp));
 	blocks = tr->tr_num_databuf_new - tr->tr_num_databuf_rm;
 	reserved += blocks + DIV_ROUND_UP(blocks, databuf_limit(sdp));
-	if (tr->tr_num_revoke - tr->tr_num_revoke_rm)
-		reserved += gfs2_struct2blk(sdp, tr->tr_num_revoke - tr->tr_num_revoke_rm) - 1;
 	return reserved;
 }
 
@@ -743,15 +824,9 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
  */
 void gfs2_add_aux_revokes(struct gfs2_sbd *sdp)
 {
-	/* number of revokes we still have room for */
-	unsigned int max_revokes;
+	unsigned int max_revokes = atomic_read(&sdp->sd_log_revokes_available);
 
 	gfs2_log_lock(sdp);
-	max_revokes = sdp->sd_ldptrs;
-	if (sdp->sd_log_num_revoke > sdp->sd_ldptrs)
-		max_revokes += roundup(sdp->sd_log_num_revoke - sdp->sd_ldptrs,
-				       sdp->sd_inptrs);
-	max_revokes -= sdp->sd_log_num_revoke;
 	gfs2_ail1_empty(sdp, max_revokes);
 	gfs2_log_unlock(sdp);
 }
@@ -1020,10 +1095,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	if (unlikely(state == SFS_FROZEN))
 		if (gfs2_assert_withdraw_delayed(sdp, !sdp->sd_log_num_revoke))
 			goto out_withdraw;
-	if (gfs2_assert_withdraw_delayed(sdp,
-			sdp->sd_log_num_revoke ==
-			(tr ? tr->tr_num_revoke - tr->tr_num_revoke_rm : 0)))
-		goto out_withdraw;
 
 	gfs2_ordered_write(sdp);
 	if (gfs2_withdrawn(sdp))
@@ -1080,6 +1151,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 		gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
 				     sdp->sd_jdesc->jd_blocks);
 	}
+	gfs2_log_shrink_revokes(sdp);
 out:
 	trace_gfs2_log_flush(sdp, 0, flags);
 	up_write(&sdp->sd_log_flush_lock);
@@ -1118,12 +1190,12 @@ static void gfs2_merge_trans(struct gfs2_sbd *sdp, struct gfs2_trans *new)
 	WARN_ON_ONCE(!test_bit(TR_ATTACHED, &old->tr_flags));
 
 	old->tr_reserved	+= new->tr_reserved;
+	old->tr_revokes		+= new->tr_revokes;
 	old->tr_num_buf_new	+= new->tr_num_buf_new;
 	old->tr_num_databuf_new	+= new->tr_num_databuf_new;
 	old->tr_num_buf_rm	+= new->tr_num_buf_rm;
 	old->tr_num_databuf_rm	+= new->tr_num_databuf_rm;
 	old->tr_num_revoke	+= new->tr_num_revoke;
-	old->tr_num_revoke_rm	+= new->tr_num_revoke_rm;
 
 	list_splice_tail_init(&new->tr_databuf, &old->tr_databuf);
 	list_splice_tail_init(&new->tr_buf, &old->tr_buf);
@@ -1141,6 +1213,10 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 
 	gfs2_log_lock(sdp);
 
+	unused = tr->tr_revokes - tr->tr_num_revoke;
+        if (unused)
+                gfs2_log_release_revokes(sdp, unused);
+
 	if (sdp->sd_log_tr) {
 		gfs2_merge_trans(sdp, tr);
 		tr = sdp->sd_log_tr;
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index bc910960ae6c..f1dbd8220a34 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -74,8 +74,12 @@ extern void gfs2_ordered_del_inode(struct gfs2_inode *ip);
 extern unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct);
 extern void gfs2_remove_from_ail(struct gfs2_bufdata *bd);
 extern bool gfs2_log_is_empty(struct gfs2_sbd *sdp);
+extern void gfs2_log_release_revokes(struct gfs2_sbd *sdp, unsigned int revokes);
 extern void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
-extern void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
+extern bool gfs2_log_try_reserve(struct gfs2_sbd *sdp, unsigned int blks,
+				 unsigned int revokes, unsigned int *extra_revokes);
+extern void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks,
+			     unsigned int revokes, unsigned int *extra_revokes);
 extern void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 				  u64 seq, u32 tail, u32 lblock, u32 flags,
 				  int op_flags);
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 3b4e51089bbd..3eae7e4a4f5f 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -844,6 +844,7 @@ static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	struct gfs2_bufdata *bd;
 	struct page *page;
 	unsigned int length;
+	unsigned int revokes = 0;
 
 	gfs2_add_aux_revokes(sdp);
 	if (!sdp->sd_log_num_revoke)
@@ -854,10 +855,9 @@ static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	offset = sizeof(struct gfs2_log_descriptor);
 
 	list_for_each_entry(bd, head, bd_list) {
-		sdp->sd_log_num_revoke--;
+		revokes++;
 
 		if (offset + sizeof(u64) > sdp->sd_sb.sb_bsize) {
-
 			gfs2_log_write_page(sdp, page);
 			page = mempool_alloc(gfs2_page_pool, GFP_NOIO);
 			mh = page_address(page);
@@ -871,7 +871,9 @@ static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 		*(__be64 *)(page_address(page) + offset) = cpu_to_be64(bd->bd_blkno);
 		offset += sizeof(u64);
 	}
-	gfs2_assert_withdraw(sdp, !sdp->sd_log_num_revoke);
+	gfs2_assert_withdraw(sdp, revokes == sdp->sd_log_num_revoke);
+	gfs2_log_release_revokes(sdp, revokes);
+	sdp->sd_log_num_revoke = 0;
 
 	gfs2_log_write_page(sdp, page);
 }
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 986dc2ebebf0..316ca5fc99e8 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -315,6 +315,13 @@ static int gfs2_read_sb(struct gfs2_sbd *sdp, int silent)
 				     sizeof(struct gfs2_meta_header))
 		* GFS2_NBBY; /* not the rgrp bitmap, subsequent bitmaps only */
 
+	/*
+	 * We always keep at least one block reserved for revokes in
+	 * transactions.  This greatly simplifies allocating additional
+	 * revoke blocks.
+	 */
+	atomic_set(&sdp->sd_log_revokes_available, sdp->sd_ldptrs);
+
 	/* Compute maximum reservation required to add a entry to a directory */
 
 	hash_blocks = DIV_ROUND_UP(sizeof(u64) * BIT(GFS2_DIR_MAX_DEPTH),
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index d11977340e74..2340e2120c97 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -31,10 +31,10 @@ static void gfs2_print_trans(struct gfs2_sbd *sdp, const struct gfs2_trans *tr)
 	fs_warn(sdp, "blocks=%u revokes=%u reserved=%u touched=%u\n",
 		tr->tr_blocks, tr->tr_revokes, tr->tr_reserved,
 		test_bit(TR_TOUCHED, &tr->tr_flags));
-	fs_warn(sdp, "Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
+	fs_warn(sdp, "Buf %u/%u Databuf %u/%u Revoke %u\n",
 		tr->tr_num_buf_new, tr->tr_num_buf_rm,
 		tr->tr_num_databuf_new, tr->tr_num_databuf_rm,
-		tr->tr_num_revoke, tr->tr_num_revoke_rm);
+		tr->tr_num_revoke);
 }
 
 int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
@@ -68,8 +68,6 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 		 */
 		tr->tr_reserved += blocks + 1 + DIV_ROUND_UP(blocks - 1, databuf_limit(sdp));
 	}
-	if (revokes)
-		tr->tr_reserved += gfs2_struct2blk(sdp, revokes) - 1;
 	INIT_LIST_HEAD(&tr->tr_databuf);
 	INIT_LIST_HEAD(&tr->tr_buf);
 	INIT_LIST_HEAD(&tr->tr_list);
@@ -83,10 +81,27 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 
 	sb_start_intwrite(sdp->sd_vfs);
 
-	gfs2_log_reserve(sdp, tr->tr_reserved);
+	/*
+	 * Try the reservations under sd_log_flush_lock to prevent log flushes
+	 * from creating inconsistencies between the number of allocated and
+	 * reserved revokes.  If that fails, do a full-block allocation outside
+	 * of the lock to avoid stalling log flushes.  Then, allot the
+	 * appropriate number of blocks to revokes, use as many revokes locally
+	 * as needed, and "release" the surplus into the revokes pool.
+	 */
 
 	down_read(&sdp->sd_log_flush_lock);
+	if (gfs2_log_try_reserve(sdp, tr->tr_reserved, tr->tr_revokes, &revokes))
+		goto reserved;
+	up_read(&sdp->sd_log_flush_lock);
+	gfs2_log_reserve(sdp, tr->tr_reserved, tr->tr_revokes, &revokes);
+	down_read(&sdp->sd_log_flush_lock);
+
+reserved:
+	if (revokes)
+		gfs2_log_release_revokes(sdp, revokes);
 	if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
+		gfs2_log_release_revokes(sdp, tr->tr_revokes);
 		gfs2_log_release(sdp, tr->tr_reserved);
 		up_read(&sdp->sd_log_flush_lock);
 		sb_end_intwrite(sdp->sd_vfs);
@@ -118,6 +133,7 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 	current->journal_info = NULL;
 
 	if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
+		gfs2_log_release_revokes(sdp, tr->tr_revokes);
 		gfs2_log_release(sdp, tr->tr_reserved);
 		up_read(&sdp->sd_log_flush_lock);
 		sb_end_intwrite(sdp->sd_vfs);
@@ -125,6 +141,9 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 		return;
 	}
 
+	if (tr->tr_revokes - tr->tr_num_revoke)
+		gfs2_log_release_revokes(sdp, tr->tr_revokes - tr->tr_num_revoke);
+
 	nbuf = tr->tr_num_buf_new + tr->tr_num_databuf_new;
 	nbuf -= tr->tr_num_buf_rm;
 	nbuf -= tr->tr_num_databuf_rm;
@@ -281,7 +300,6 @@ void gfs2_trans_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd)
 void gfs2_trans_remove_revoke(struct gfs2_sbd *sdp, u64 blkno, unsigned int len)
 {
 	struct gfs2_bufdata *bd, *tmp;
-	struct gfs2_trans *tr = current->journal_info;
 	unsigned int n = len;
 
 	gfs2_log_lock(sdp);
@@ -293,7 +311,7 @@ void gfs2_trans_remove_revoke(struct gfs2_sbd *sdp, u64 blkno, unsigned int len)
 			if (bd->bd_gl)
 				gfs2_glock_remove_revoke(bd->bd_gl);
 			kmem_cache_free(gfs2_bufdata_cachep, bd);
-			tr->tr_num_revoke_rm++;
+			gfs2_log_release_revokes(sdp, 1);
 			if (--n == 0)
 				break;
 		}
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 12/20] gfs2: Clean up gfs2_log_reserve
  2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 12/20] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
@ 2020-12-22 14:19   ` Bob Peterson
  2020-12-22 15:12     ` Andreas Gruenbacher
  0 siblings, 1 reply; 23+ messages in thread
From: Bob Peterson @ 2020-12-22 14:19 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Comments inline.

----- Original Message -----
> Wake up log waiters in gfs2_log_release when log space has actually become
> available.  This is a much better place for the wakeup than gfs2_logd.
> 
> Check if enough log space is immeditely available before anything else.  If
> there isn't, use io_wait_event to wait instead of open-coding it.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> ---
>  fs/gfs2/log.c | 54 ++++++++++++++++++++++-----------------------------
>  1 file changed, 23 insertions(+), 31 deletions(-)
> 
> diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
> index 95ad444bd3dc..7a65823ad1f3 100644
> --- a/fs/gfs2/log.c
> +++ b/fs/gfs2/log.c
> @@ -420,6 +420,8 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int
> blks)
>  	trace_gfs2_log_blocks(sdp, blks);
>  	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
>  				  sdp->sd_jdesc->jd_blocks);
> +	if (atomic_read(&sdp->sd_log_blks_needed))
> +		wake_up(&sdp->sd_log_waitq);
>  }
>  
>  /**
> @@ -444,36 +446,33 @@ void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned
> int blks)
>  {
>  	unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
>  	unsigned wanted = blks + reserved_blks;
> -	DEFINE_WAIT(wait);
> -	int did_wait = 0;
>  	unsigned int free_blocks;
>  
> -	atomic_add(blks, &sdp->sd_log_blks_needed);
> -retry:
>  	free_blocks = atomic_read(&sdp->sd_log_blks_free);
> -	if (unlikely(free_blocks <= wanted)) {
> -		do {
> -			prepare_to_wait_exclusive(&sdp->sd_log_waitq, &wait,
> -					TASK_UNINTERRUPTIBLE);
> +	while (free_blocks >= wanted) {
> +		if (atomic_try_cmpxchg(&sdp->sd_log_blks_free, &free_blocks,
> +				       free_blocks - blks))
> +			return;

This would be a good place to have cond_resched() or schedule() or something, no?

> +	}
> +
> +	atomic_add(blks, &sdp->sd_log_blks_needed);
> +	for (;;) {
> +		if (current != sdp->sd_logd_process)
>  			wake_up(&sdp->sd_logd_waitq);
> -			did_wait = 1;
> -			if (atomic_read(&sdp->sd_log_blks_free) <= wanted)
> -				io_schedule();
> -			free_blocks = atomic_read(&sdp->sd_log_blks_free);
> -		} while(free_blocks <= wanted);
> -		finish_wait(&sdp->sd_log_waitq, &wait);
> +		io_wait_event(sdp->sd_log_waitq,
> +			(free_blocks = atomic_read(&sdp->sd_log_blks_free),
> +			 free_blocks >= wanted));
> +		do {
> +			if (atomic_try_cmpxchg(&sdp->sd_log_blks_free,
> +					       &free_blocks,
> +					       free_blocks - blks))
> +				goto reserved;

Same here. We need to test these patches with a minimal number of cpus.

> +		} while (free_blocks >= wanted);
>  	}
> -	if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
> -				free_blocks - blks) != free_blocks)
> -		goto retry;
> -	atomic_sub(blks, &sdp->sd_log_blks_needed);
> -	trace_gfs2_log_blocks(sdp, -blks);
>  
> -	/*
> -	 * If we waited, then so might others, wake them up _after_ we get
> -	 * our share of the log.
> -	 */
> -	if (unlikely(did_wait))
> +reserved:
> +	trace_gfs2_log_blocks(sdp, -blks);
> +	if (atomic_sub_return(blks, &sdp->sd_log_blks_needed))
>  		wake_up(&sdp->sd_log_waitq);
>  }
>  
> @@ -1190,7 +1189,6 @@ int gfs2_logd(void *data)
>  	struct gfs2_sbd *sdp = data;
>  	unsigned long t = 1;
>  	DEFINE_WAIT(wait);
> -	bool did_flush;
>  
>  	while (!kthread_should_stop()) {
>  
> @@ -1209,12 +1207,10 @@ int gfs2_logd(void *data)
>  			continue;
>  		}
>  
> -		did_flush = false;
>  		if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
>  			gfs2_ail1_empty(sdp, 0);
>  			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
>  				       GFS2_LFC_LOGD_JFLUSH_REQD);
> -			did_flush = true;
>  		}
>  
>  		if (gfs2_ail_flush_reqd(sdp)) {
> @@ -1223,12 +1219,8 @@ int gfs2_logd(void *data)
>  			gfs2_ail1_empty(sdp, 0);
>  			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
>  				       GFS2_LFC_LOGD_AIL_FLUSH_REQD);
> -			did_flush = true;
>  		}
>  
> -		if (!gfs2_ail_flush_reqd(sdp) || did_flush)
> -			wake_up(&sdp->sd_log_waitq);
> -
>  		t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
>  
>  		try_to_freeze();
> --
> 2.26.2
> 
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Cluster-devel] [PATCH v2 12/20] gfs2: Clean up gfs2_log_reserve
  2020-12-22 14:19   ` Bob Peterson
@ 2020-12-22 15:12     ` Andreas Gruenbacher
  0 siblings, 0 replies; 23+ messages in thread
From: Andreas Gruenbacher @ 2020-12-22 15:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Dec 22, 2020 at 3:19 PM Bob Peterson <rpeterso@redhat.com> wrote:
>
> Hi,
>
> Comments inline.
>
> ----- Original Message -----
> > Wake up log waiters in gfs2_log_release when log space has actually become
> > available.  This is a much better place for the wakeup than gfs2_logd.
> >
> > Check if enough log space is immeditely available before anything else.  If
> > there isn't, use io_wait_event to wait instead of open-coding it.
> >
> > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > ---
> >  fs/gfs2/log.c | 54 ++++++++++++++++++++++-----------------------------
> >  1 file changed, 23 insertions(+), 31 deletions(-)
> >
> > diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
> > index 95ad444bd3dc..7a65823ad1f3 100644
> > --- a/fs/gfs2/log.c
> > +++ b/fs/gfs2/log.c
> > @@ -420,6 +420,8 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int
> > blks)
> >       trace_gfs2_log_blocks(sdp, blks);
> >       gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
> >                                 sdp->sd_jdesc->jd_blocks);
> > +     if (atomic_read(&sdp->sd_log_blks_needed))
> > +             wake_up(&sdp->sd_log_waitq);
> >  }
> >
> >  /**
> > @@ -444,36 +446,33 @@ void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned
> > int blks)
> >  {
> >       unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
> >       unsigned wanted = blks + reserved_blks;
> > -     DEFINE_WAIT(wait);
> > -     int did_wait = 0;
> >       unsigned int free_blocks;
> >
> > -     atomic_add(blks, &sdp->sd_log_blks_needed);
> > -retry:
> >       free_blocks = atomic_read(&sdp->sd_log_blks_free);
> > -     if (unlikely(free_blocks <= wanted)) {
> > -             do {
> > -                     prepare_to_wait_exclusive(&sdp->sd_log_waitq, &wait,
> > -                                     TASK_UNINTERRUPTIBLE);
> > +     while (free_blocks >= wanted) {
> > +             if (atomic_try_cmpxchg(&sdp->sd_log_blks_free, &free_blocks,
> > +                                    free_blocks - blks))
> > +                     return;
>
> This would be a good place to have cond_resched() or schedule() or something, no?

No, this is just an atomic_sub_if_positive(); see
atomic_dec_if_positive in atomic-fallback.h.

> > +     }
> > +
> > +     atomic_add(blks, &sdp->sd_log_blks_needed);
> > +     for (;;) {
> > +             if (current != sdp->sd_logd_process)
> >                       wake_up(&sdp->sd_logd_waitq);
> > -                     did_wait = 1;
> > -                     if (atomic_read(&sdp->sd_log_blks_free) <= wanted)
> > -                             io_schedule();
> > -                     free_blocks = atomic_read(&sdp->sd_log_blks_free);
> > -             } while(free_blocks <= wanted);
> > -             finish_wait(&sdp->sd_log_waitq, &wait);
> > +             io_wait_event(sdp->sd_log_waitq,
> > +                     (free_blocks = atomic_read(&sdp->sd_log_blks_free),
> > +                      free_blocks >= wanted));
> > +             do {
> > +                     if (atomic_try_cmpxchg(&sdp->sd_log_blks_free,
> > +                                            &free_blocks,
> > +                                            free_blocks - blks))
> > +                             goto reserved;
>
> Same here. We need to test these patches with a minimal number of cpus.
>
> > +             } while (free_blocks >= wanted);
> >       }
> > -     if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
> > -                             free_blocks - blks) != free_blocks)
> > -             goto retry;
> > -     atomic_sub(blks, &sdp->sd_log_blks_needed);
> > -     trace_gfs2_log_blocks(sdp, -blks);
> >
> > -     /*
> > -      * If we waited, then so might others, wake them up _after_ we get
> > -      * our share of the log.
> > -      */
> > -     if (unlikely(did_wait))
> > +reserved:
> > +     trace_gfs2_log_blocks(sdp, -blks);
> > +     if (atomic_sub_return(blks, &sdp->sd_log_blks_needed))
> >               wake_up(&sdp->sd_log_waitq);
> >  }
> >
> > @@ -1190,7 +1189,6 @@ int gfs2_logd(void *data)
> >       struct gfs2_sbd *sdp = data;
> >       unsigned long t = 1;
> >       DEFINE_WAIT(wait);
> > -     bool did_flush;
> >
> >       while (!kthread_should_stop()) {
> >
> > @@ -1209,12 +1207,10 @@ int gfs2_logd(void *data)
> >                       continue;
> >               }
> >
> > -             did_flush = false;
> >               if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
> >                       gfs2_ail1_empty(sdp, 0);
> >                       gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
> >                                      GFS2_LFC_LOGD_JFLUSH_REQD);
> > -                     did_flush = true;
> >               }
> >
> >               if (gfs2_ail_flush_reqd(sdp)) {
> > @@ -1223,12 +1219,8 @@ int gfs2_logd(void *data)
> >                       gfs2_ail1_empty(sdp, 0);
> >                       gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
> >                                      GFS2_LFC_LOGD_AIL_FLUSH_REQD);
> > -                     did_flush = true;
> >               }
> >
> > -             if (!gfs2_ail_flush_reqd(sdp) || did_flush)
> > -                     wake_up(&sdp->sd_log_waitq);
> > -
> >               t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
> >
> >               try_to_freeze();
> > --
> > 2.26.2
> >
> >

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-12-22 15:12 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-19 20:48 [Cluster-devel] [PATCH v2 00/20] Some log space management cleanups Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 01/20] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 02/20] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 03/20] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 04/20] gfs2: Some documentation updates Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 05/20] gfs2: A minor debugging improvement Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 06/20] gfs2: Rename gfs2_{write => add_aux}_revokes Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 07/20] gfs2: Clean up ail2_empty Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 08/20] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 09/20] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 10/20] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 11/20] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 12/20] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
2020-12-22 14:19   ` Bob Peterson
2020-12-22 15:12     ` Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 13/20] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 14/20] gfs2: Get rid of current_tail() Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 15/20] gfs2: Move function gfs2_ail_empty_tr Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 16/20] gfs2: No revokes for transactions at the tail of the log Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 17/20] gfs2: Remove sd_log_committed_revoke Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 18/20] gfs2: Remove sd_log_blks_reserved Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 19/20] gfs2: Rework the log space allocation logic Andreas Gruenbacher
2020-12-19 20:48 ` [Cluster-devel] [PATCH v2 20/20] gfs2: Hand out revokes to transactions one by one Andreas Gruenbacher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.