* [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups
@ 2020-12-14 8:54 Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
` (11 more replies)
0 siblings, 12 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hello,
here's a bit of fallout in the log space management code that resulted
from looking into a revokes accounting problem in conjunction with the
resource group glock sharing patches. I'm still working on some
follow-up changes, but this patch set should be reasonably
self-contained. Please review.
Thanks,
Andreas
Andreas Gruenbacher (12):
gfs2: Deobfuscate function jdesc_find_i
gfs2: Simplify the buf_limit and databuf_limit definitions
gfs2: Minor gfs2_write_revokes cleanups
gfs2: Some documentation clarifications
gfs2: A minor debugging improvement
gfs2: Clean up ail2_empty
gfs2: Get rid of on-stack transactions
gfs2: Get rid of sd_reserving_log
gfs2: Move lock flush locking to gfs2_trans_{begin,end}
gfs2: Don't wait for journal flush in clean_journal
gfs2: Clean up gfs2_log_reserve
gfs2: Use a tighter bound in gfs2_trans_begin
fs/gfs2/glops.c | 29 ++------
fs/gfs2/incore.h | 4 -
fs/gfs2/log.c | 174 ++++++++++++++++++-------------------------
fs/gfs2/log.h | 3 +-
fs/gfs2/lops.h | 17 +----
fs/gfs2/ops_fstype.c | 2 -
fs/gfs2/super.c | 25 +++----
fs/gfs2/trans.c | 59 ++++++++++-----
fs/gfs2/trans.h | 2 +
9 files changed, 135 insertions(+), 180 deletions(-)
--
2.26.2
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
` (10 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Clean up this function to show that it is trivial.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/super.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index b3d951ab8068..08aef061b29d 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -81,19 +81,12 @@ void gfs2_jindex_free(struct gfs2_sbd *sdp)
static struct gfs2_jdesc *jdesc_find_i(struct list_head *head, unsigned int jid)
{
struct gfs2_jdesc *jd;
- int found = 0;
list_for_each_entry(jd, head, jd_list) {
- if (jd->jd_jid == jid) {
- found = 1;
- break;
- }
+ if (jd->jd_jid == jid)
+ return jd;
}
-
- if (!found)
- jd = NULL;
-
- return jd;
+ return NULL;
}
struct gfs2_jdesc *gfs2_jdesc_find(struct gfs2_sbd *sdp, unsigned int jid)
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
` (9 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
The BUF_OFFSET and DATABUF_OFFSET definitions are only used in buf_limit
and databuf_limit, respectively, and the rounding done in those
definitions is immediately wiped out by dividing by the element size.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/lops.h | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index fbdbb08dcec6..3fca5bf239d3 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -10,13 +10,6 @@
#include <linux/list.h>
#include "incore.h"
-#define BUF_OFFSET \
- ((sizeof(struct gfs2_log_descriptor) + sizeof(__be64) - 1) & \
- ~(sizeof(__be64) - 1))
-#define DATABUF_OFFSET \
- ((sizeof(struct gfs2_log_descriptor) + (2 * sizeof(__be64) - 1)) & \
- ~(2 * sizeof(__be64) - 1))
-
extern const struct gfs2_log_operations *gfs2_log_ops[];
extern void gfs2_log_incr_head(struct gfs2_sbd *sdp);
extern u64 gfs2_log_bmap(struct gfs2_jdesc *jd, unsigned int lbn);
@@ -29,18 +22,12 @@ extern int gfs2_find_jhead(struct gfs2_jdesc *jd,
struct gfs2_log_header_host *head, bool keep_cache);
static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
{
- unsigned int limit;
-
- limit = (sdp->sd_sb.sb_bsize - BUF_OFFSET) / sizeof(__be64);
- return limit;
+ return sdp->sd_ldptrs;
}
static inline unsigned int databuf_limit(struct gfs2_sbd *sdp)
{
- unsigned int limit;
-
- limit = (sdp->sd_sb.sb_bsize - DATABUF_OFFSET) / (2 * sizeof(__be64));
- return limit;
+ return sdp->sd_ldptrs / 2;
}
static inline void lops_before_commit(struct gfs2_sbd *sdp,
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications Andreas Gruenbacher
` (8 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Clean up the computations in gfs2_write_revokes (no change in functionality).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/log.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 2e9314091c81..c65fdb1a30a0 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -712,11 +712,12 @@ void gfs2_glock_remove_revoke(struct gfs2_glock *gl)
void gfs2_write_revokes(struct gfs2_sbd *sdp)
{
/* number of revokes we still have room for */
- int max_revokes = (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_log_descriptor)) / sizeof(u64);
+ unsigned int max_revokes = sdp->sd_ldptrs;
gfs2_log_lock(sdp);
- while (sdp->sd_log_num_revoke > max_revokes)
- max_revokes += (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_meta_header)) / sizeof(u64);
+ if (sdp->sd_log_num_revoke > sdp->sd_ldptrs)
+ max_revokes += roundup(sdp->sd_log_num_revoke - sdp->sd_ldptrs,
+ sdp->sd_inptrs);
max_revokes -= sdp->sd_log_num_revoke;
if (!sdp->sd_log_num_revoke) {
atomic_dec(&sdp->sd_log_blks_free);
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (2 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement Andreas Gruenbacher
` (7 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
The calc_reserved description claims that buf_limit is 502 (on 4k
filesystems), but it is actually 503. Fix / clarify the entire
description.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/log.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index c65fdb1a30a0..f7c225520c38 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -50,10 +50,12 @@ unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct)
unsigned int blks;
unsigned int first, second;
+ /* The initial struct gfs2_log_descriptor block */
blks = 1;
first = sdp->sd_ldptrs;
if (nstruct > first) {
+ /* Subsequent struct gfs2_meta_header blocks */
second = sdp->sd_inptrs;
blks += DIV_ROUND_UP(nstruct - first, second);
}
@@ -507,24 +509,20 @@ static inline unsigned int log_distance(struct gfs2_sbd *sdp, unsigned int newer
}
/**
- * calc_reserved - Calculate the number of blocks to reserve when
- * refunding a transaction's unused buffers.
+ * calc_reserved - Calculate the number of blocks to keep reserved
* @sdp: The GFS2 superblock
*
* This is complex. We need to reserve room for all our currently used
- * metadata buffers (e.g. normal file I/O rewriting file time stamps) and
- * all our journaled data buffers for journaled files (e.g. files in the
+ * metadata blocks (e.g. normal file I/O rewriting file time stamps) and
+ * all our journaled data blocks for journaled files (e.g. files in the
* meta_fs like rindex, or files for which chattr +j was done.)
- * If we don't reserve enough space, gfs2_log_refund and gfs2_log_flush
- * will count it as free space (sd_log_blks_free) and corruption will follow.
+ * If we don't reserve enough space, corruption will follow.
*
- * We can have metadata bufs and jdata bufs in the same journal. So each
- * type gets its own log header, for which we need to reserve a block.
- * In fact, each type has the potential for needing more than one header
- * in cases where we have more buffers than will fit on a journal page.
+ * We can have metadata blocks and jdata blocks in the same journal. Each
+ * type gets its own log descriptor, for which we need to reserve a block.
+ * In fact, each type has the potential for needing more than one log descriptor
+ * in cases where we have more blocks than will fit in a log descriptor.
* Metadata journal entries take up half the space of journaled buffer entries.
- * Thus, metadata entries have buf_limit (502) and journaled buffers have
- * databuf_limit (251) before they cause a wrap around.
*
* Also, we need to reserve blocks for revoke journal entries and one for an
* overall header for the lot.
@@ -1007,7 +1005,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
if (sdp->sd_log_head != sdp->sd_log_flush_head) {
log_flush_wait(sdp);
log_write_header(sdp, flags);
- } else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle){
+ } else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle) {
atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
trace_gfs2_log_blocks(sdp, -1);
log_write_header(sdp, flags);
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (3 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty Andreas Gruenbacher
` (6 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Split the assert in gfs2_trans_end into two parts.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/trans.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 6d4bf7ea7b3b..7705f04621f4 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -109,8 +109,8 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
nbuf -= tr->tr_num_buf_rm;
nbuf -= tr->tr_num_databuf_rm;
- if (gfs2_assert_withdraw(sdp, (nbuf <= tr->tr_blocks) &&
- (tr->tr_num_revoke <= tr->tr_revokes)))
+ if (gfs2_assert_withdraw(sdp, nbuf <= tr->tr_blocks) ||
+ gfs2_assert_withdraw(sdp, tr->tr_num_revoke <= tr->tr_revokes))
gfs2_print_trans(sdp, tr);
gfs2_log_commit(sdp, tr);
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (4 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
` (5 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Clean up the logic in ail2_empty (no functional change).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/log.c | 38 +++++++++++++++++++++-----------------
1 file changed, 21 insertions(+), 17 deletions(-)
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index f7c225520c38..33f7b1544958 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -367,29 +367,33 @@ static void gfs2_ail_empty_tr(struct gfs2_sbd *sdp, struct gfs2_trans *tr,
}
}
+static void __ail2_empty(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
+{
+ gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
+ list_del(&tr->tr_list);
+ gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));
+ gfs2_assert_warn(sdp, list_empty(&tr->tr_ail2_list));
+ gfs2_trans_free(sdp, tr);
+}
+
static void ail2_empty(struct gfs2_sbd *sdp, unsigned int new_tail)
{
- struct gfs2_trans *tr, *safe;
+ struct list_head *ail2_list = &sdp->sd_ail2_list;
unsigned int old_tail = sdp->sd_log_tail;
- int wrap = (new_tail < old_tail);
- int a, b, rm;
+ struct gfs2_trans *tr, *safe;
spin_lock(&sdp->sd_ail_lock);
-
- list_for_each_entry_safe(tr, safe, &sdp->sd_ail2_list, tr_list) {
- a = (old_tail <= tr->tr_first);
- b = (tr->tr_first < new_tail);
- rm = (wrap) ? (a || b) : (a && b);
- if (!rm)
- continue;
-
- gfs2_ail_empty_tr(sdp, tr, &tr->tr_ail2_list);
- list_del(&tr->tr_list);
- gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));
- gfs2_assert_warn(sdp, list_empty(&tr->tr_ail2_list));
- gfs2_trans_free(sdp, tr);
+ if (old_tail <= new_tail) {
+ list_for_each_entry_safe(tr, safe, ail2_list, tr_list) {
+ if (old_tail <= tr->tr_first && tr->tr_first < new_tail)
+ __ail2_empty(sdp, tr);
+ }
+ } else {
+ list_for_each_entry_safe(tr, safe, ail2_list, tr_list) {
+ if (old_tail <= tr->tr_first || tr->tr_first < new_tail)
+ __ail2_empty(sdp, tr);
+ }
}
-
spin_unlock(&sdp->sd_ail_lock);
}
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (5 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 14:02 ` Bob Peterson
2020-12-14 8:54 ` [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
` (4 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
On-stack transactions were introduced to work around a transaction glock
deadlock in gfs2_trans_begin in commit d8348de06f70 ("GFS2: Fix deadlock
on journal flush"). Subsequently, transaction glocks were eliminated in
favor of the more efficient freeze glocks in commit 24972557b12c ("GFS2:
remove transaction glock") without also removing the on-stack
transactions.
It has now turned out that committing on-stack transactions
significantly complicates journal free space accounting when no system
transaction (sdp->sd_log_tr) is active at the time. It doesn't seem
that on-stack transactions provide a significant benefit beyond their
original purpose (as an optimization), so remove them to allow fixing
the journal free space accounting in a reasonable way in a subsequent
patch.
FIXME: Can we better handle a gfs2_trans_begin failure in gfs2_ail_empty_gl?
If we skip the __gfs2_ail_flush, we'll just end up with leftover items on
gl_ail_list.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/glops.c | 29 +++++++----------------------
fs/gfs2/incore.h | 1 -
fs/gfs2/log.c | 1 -
fs/gfs2/trans.c | 25 +++++++++++++------------
fs/gfs2/trans.h | 2 ++
5 files changed, 22 insertions(+), 36 deletions(-)
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 3faa421568b0..853e590ccc15 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -84,18 +84,11 @@ static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync,
static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
{
+ unsigned int revokes = atomic_read(&gl->gl_ail_count);
struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
- struct gfs2_trans tr;
int ret;
- memset(&tr, 0, sizeof(tr));
- INIT_LIST_HEAD(&tr.tr_buf);
- INIT_LIST_HEAD(&tr.tr_databuf);
- INIT_LIST_HEAD(&tr.tr_ail1_list);
- INIT_LIST_HEAD(&tr.tr_ail2_list);
- tr.tr_revokes = atomic_read(&gl->gl_ail_count);
-
- if (!tr.tr_revokes) {
+ if (!revokes) {
bool have_revokes;
bool log_in_flight;
@@ -122,20 +115,12 @@ static int gfs2_ail_empty_gl(struct gfs2_glock *gl)
return 0;
}
- /* A shortened, inline version of gfs2_trans_begin()
- * tr->alloced is not set since the transaction structure is
- * on the stack */
- tr.tr_reserved = 1 + gfs2_struct2blk(sdp, tr.tr_revokes);
- tr.tr_ip = _RET_IP_;
- ret = gfs2_log_reserve(sdp, tr.tr_reserved);
- if (ret < 0)
- return ret;
- WARN_ON_ONCE(current->journal_info);
- current->journal_info = &tr;
-
- __gfs2_ail_flush(gl, 0, tr.tr_revokes);
-
+ ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
+ if (ret)
+ goto flush;
+ __gfs2_ail_flush(gl, 0, revokes);
gfs2_trans_end(sdp);
+
flush:
gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
GFS2_LFC_AIL_EMPTY_GL);
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index f8858d995b24..adea010124aa 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -491,7 +491,6 @@ struct gfs2_quota_data {
enum {
TR_TOUCHED = 1,
TR_ATTACHED = 2,
- TR_ALLOCED = 3,
};
struct gfs2_trans {
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 33f7b1544958..4b090f8afb37 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1113,7 +1113,6 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
if (sdp->sd_log_tr) {
gfs2_merge_trans(sdp, tr);
} else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
- gfs2_assert_withdraw(sdp, test_bit(TR_ALLOCED, &tr->tr_flags));
sdp->sd_log_tr = tr;
set_bit(TR_ATTACHED, &tr->tr_flags);
}
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 7705f04621f4..4f461ab37ced 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -37,8 +37,8 @@ static void gfs2_print_trans(struct gfs2_sbd *sdp, const struct gfs2_trans *tr)
tr->tr_num_revoke, tr->tr_num_revoke_rm);
}
-int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
- unsigned int revokes)
+int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+ unsigned int revokes, gfp_t gfp_mask)
{
struct gfs2_trans *tr;
int error;
@@ -52,7 +52,7 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
if (!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
return -EROFS;
- tr = kmem_cache_zalloc(gfs2_trans_cachep, GFP_NOFS);
+ tr = kmem_cache_zalloc(gfs2_trans_cachep, gfp_mask);
if (!tr)
return -ENOMEM;
@@ -60,7 +60,6 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
tr->tr_blocks = blocks;
tr->tr_revokes = revokes;
tr->tr_reserved = 1;
- set_bit(TR_ALLOCED, &tr->tr_flags);
if (blocks)
tr->tr_reserved += 6 + blocks;
if (revokes)
@@ -88,20 +87,23 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
return error;
}
+int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+ unsigned int revokes)
+{
+ return __gfs2_trans_begin(sdp, blocks, revokes, GFP_NOFS);
+}
+
void gfs2_trans_end(struct gfs2_sbd *sdp)
{
struct gfs2_trans *tr = current->journal_info;
s64 nbuf;
- int alloced = test_bit(TR_ALLOCED, &tr->tr_flags);
current->journal_info = NULL;
if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
gfs2_log_release(sdp, tr->tr_reserved);
- if (alloced) {
- gfs2_trans_free(sdp, tr);
- sb_end_intwrite(sdp->sd_vfs);
- }
+ gfs2_trans_free(sdp, tr);
+ sb_end_intwrite(sdp->sd_vfs);
return;
}
@@ -114,15 +116,14 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
gfs2_print_trans(sdp, tr);
gfs2_log_commit(sdp, tr);
- if (alloced && !test_bit(TR_ATTACHED, &tr->tr_flags))
+ if (!test_bit(TR_ATTACHED, &tr->tr_flags))
gfs2_trans_free(sdp, tr);
up_read(&sdp->sd_log_flush_lock);
if (sdp->sd_vfs->s_flags & SB_SYNCHRONOUS)
gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
GFS2_LFC_TRANS_END);
- if (alloced)
- sb_end_intwrite(sdp->sd_vfs);
+ sb_end_intwrite(sdp->sd_vfs);
}
static struct gfs2_bufdata *gfs2_alloc_bufdata(struct gfs2_glock *gl,
diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
index 83199ce5a5c5..9c732a5f28bf 100644
--- a/fs/gfs2/trans.h
+++ b/fs/gfs2/trans.h
@@ -34,6 +34,8 @@ static inline unsigned int gfs2_rg_blocks(const struct gfs2_inode *ip, unsigned
return rgd->rd_length;
}
+extern int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
+ unsigned int revokes, gfp_t gfp_mask);
extern int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
unsigned int revokes);
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (6 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
` (3 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
This counter and the associated wait queue are only used so that
gfs2_make_fs_ro can efficiently wait for all pending log space
allocations to fail after setting the filesystem to read-only. This
comes at the cost of waking up that wait queue very frequently.
Instead, when gfs2_log_reserve fails because the filesystem has become
read-only, Wake up sd_log_waitq. In gfs2_make_fs_ro, set the file
system read-only and then wait until all the log space has been
released. Give up and report the problem after a while. With that,
sd_reserving_log and sd_reserving_log_wait can be removed.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/incore.h | 3 ---
fs/gfs2/log.c | 17 ++++++++++-------
fs/gfs2/log.h | 1 +
fs/gfs2/ops_fstype.c | 2 --
fs/gfs2/super.c | 12 ++++++------
fs/gfs2/trans.c | 5 ++++-
6 files changed, 21 insertions(+), 19 deletions(-)
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index adea010124aa..ecc3f596319c 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -849,9 +849,6 @@ struct gfs2_sbd {
int sd_log_error; /* First log error */
wait_queue_head_t sd_withdraw_wait;
- atomic_t sd_reserving_log;
- wait_queue_head_t sd_reserving_log_wait;
-
unsigned int sd_log_flush_head;
spinlock_t sd_ail_lock;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 4b090f8afb37..8c46f8f64c9e 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -414,6 +414,15 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
up_read(&sdp->sd_log_flush_lock);
}
+/**
+ * gfs2_log_is_empty - Check if the log is empty
+ * @sdp: The GFS2 superblock
+ */
+
+bool gfs2_log_is_empty(struct gfs2_sbd *sdp) {
+ return atomic_read(&sdp->sd_log_blks_free) == sdp->sd_jdesc->jd_blocks;
+}
+
/**
* gfs2_log_reserve - Make a log reservation
* @sdp: The GFS2 superblock
@@ -461,13 +470,9 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
} while(free_blocks <= wanted);
finish_wait(&sdp->sd_log_waitq, &wait);
}
- atomic_inc(&sdp->sd_reserving_log);
if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
- free_blocks - blks) != free_blocks) {
- if (atomic_dec_and_test(&sdp->sd_reserving_log))
- wake_up(&sdp->sd_reserving_log_wait);
+ free_blocks - blks) != free_blocks)
goto retry;
- }
atomic_sub(blks, &sdp->sd_log_blks_needed);
trace_gfs2_log_blocks(sdp, -blks);
@@ -483,8 +488,6 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
gfs2_log_release(sdp, blks);
ret = -EROFS;
}
- if (atomic_dec_and_test(&sdp->sd_reserving_log))
- wake_up(&sdp->sd_reserving_log_wait);
return ret;
}
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 79f97290146e..f483e706db3a 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -66,6 +66,7 @@ extern unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct);
extern void gfs2_remove_from_ail(struct gfs2_bufdata *bd);
extern void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
extern int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
+extern bool gfs2_log_is_empty(struct gfs2_sbd *sdp);
extern void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
u64 seq, u32 tail, u32 lblock, u32 flags,
int op_flags);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 61fce59cb4d3..986dc2ebebf0 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -136,8 +136,6 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
init_rwsem(&sdp->sd_log_flush_lock);
atomic_set(&sdp->sd_log_in_flight, 0);
- atomic_set(&sdp->sd_reserving_log, 0);
- init_waitqueue_head(&sdp->sd_reserving_log_wait);
init_waitqueue_head(&sdp->sd_log_flush_wait);
atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
mutex_init(&sdp->sd_freeze_mutex);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 08aef061b29d..7da55d9a9792 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -647,13 +647,13 @@ int gfs2_make_fs_ro(struct gfs2_sbd *sdp)
gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
GFS2_LFC_MAKE_FS_RO);
- wait_event(sdp->sd_reserving_log_wait,
- atomic_read(&sdp->sd_reserving_log) == 0);
- gfs2_assert_warn(sdp, atomic_read(&sdp->sd_log_blks_free) ==
- sdp->sd_jdesc->jd_blocks);
+ wait_event_timeout(sdp->sd_log_waitq,
+ gfs2_log_is_empty(sdp),
+ HZ * 5);
+ gfs2_assert_warn(sdp, gfs2_log_is_empty(sdp));
} else {
- wait_event_timeout(sdp->sd_reserving_log_wait,
- atomic_read(&sdp->sd_reserving_log) == 0,
+ wait_event_timeout(sdp->sd_log_waitq,
+ gfs2_log_is_empty(sdp),
HZ * 5);
}
if (gfs2_holder_initialized(&freeze_gh))
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 4f461ab37ced..e33cb8da056a 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -73,8 +73,11 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
sb_start_intwrite(sdp->sd_vfs);
error = gfs2_log_reserve(sdp, tr->tr_reserved);
- if (error)
+ if (error) {
+ if (error == -EROFS)
+ wake_up(&sdp->sd_log_waitq);
goto fail;
+ }
current->journal_info = tr;
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end}
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (7 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
` (2 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Move the read locking of sd_log_flush_lock from gfs2_log_reserve to
gfs2_trans_begin, and its unlocking from gfs2_log_release to
gfs2_trans_end. Use gfs2_log_release in two places in which it was open
coded before.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/log.c | 28 +++-------------------------
fs/gfs2/log.h | 2 +-
fs/gfs2/trans.c | 23 ++++++++++++++++-------
3 files changed, 20 insertions(+), 33 deletions(-)
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 8c46f8f64c9e..b0d4ce4992ed 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -411,7 +411,6 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
trace_gfs2_log_blocks(sdp, blks);
gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
sdp->sd_jdesc->jd_blocks);
- up_read(&sdp->sd_log_flush_lock);
}
/**
@@ -439,22 +438,16 @@ bool gfs2_log_is_empty(struct gfs2_sbd *sdp) {
* with queued waiters, we use an exclusive wait. This means that when we
* get woken with enough journal space to get our reservation, we need to
* wake the next waiter on the list.
- *
- * Returns: errno
*/
-int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
+void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
{
- int ret = 0;
unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
unsigned wanted = blks + reserved_blks;
DEFINE_WAIT(wait);
int did_wait = 0;
unsigned int free_blocks;
- if (gfs2_assert_warn(sdp, blks) ||
- gfs2_assert_warn(sdp, blks <= sdp->sd_jdesc->jd_blocks))
- return -EINVAL;
atomic_add(blks, &sdp->sd_log_blks_needed);
retry:
free_blocks = atomic_read(&sdp->sd_log_blks_free);
@@ -482,13 +475,6 @@ int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
*/
if (unlikely(did_wait))
wake_up(&sdp->sd_log_waitq);
-
- down_read(&sdp->sd_log_flush_lock);
- if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
- gfs2_log_release(sdp, blks);
- ret = -EROFS;
- }
- return ret;
}
/**
@@ -585,12 +571,7 @@ static void log_pull_tail(struct gfs2_sbd *sdp, unsigned int new_tail)
unsigned int dist = log_distance(sdp, new_tail, sdp->sd_log_tail);
ail2_empty(sdp, new_tail);
-
- atomic_add(dist, &sdp->sd_log_blks_free);
- trace_gfs2_log_blocks(sdp, dist);
- gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
- sdp->sd_jdesc->jd_blocks);
-
+ gfs2_log_release(sdp, dist);
sdp->sd_log_tail = new_tail;
}
@@ -1125,10 +1106,7 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
maxres = sdp->sd_log_blks_reserved + tr->tr_reserved;
gfs2_assert_withdraw(sdp, maxres >= reserved);
unused = maxres - reserved;
- atomic_add(unused, &sdp->sd_log_blks_free);
- trace_gfs2_log_blocks(sdp, unused);
- gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
- sdp->sd_jdesc->jd_blocks);
+ gfs2_log_release(sdp, unused);
sdp->sd_log_blks_reserved = reserved;
gfs2_log_unlock(sdp);
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index f483e706db3a..db3bb3564ea0 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -65,7 +65,7 @@ extern void gfs2_ordered_del_inode(struct gfs2_inode *ip);
extern unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct);
extern void gfs2_remove_from_ail(struct gfs2_bufdata *bd);
extern void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
-extern int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
+extern void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
extern bool gfs2_log_is_empty(struct gfs2_sbd *sdp);
extern void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
u64 seq, u32 tail, u32 lblock, u32 flags,
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index e33cb8da056a..c9d8247ffa19 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -70,12 +70,22 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
INIT_LIST_HEAD(&tr->tr_ail1_list);
INIT_LIST_HEAD(&tr->tr_ail2_list);
+ if (gfs2_assert_warn(sdp, tr->tr_reserved <= sdp->sd_jdesc->jd_blocks)) {
+ error = -EINVAL;
+ goto fail;
+ }
+
sb_start_intwrite(sdp->sd_vfs);
- error = gfs2_log_reserve(sdp, tr->tr_reserved);
- if (error) {
- if (error == -EROFS)
- wake_up(&sdp->sd_log_waitq);
+ gfs2_log_reserve(sdp, tr->tr_reserved);
+
+ down_read(&sdp->sd_log_flush_lock);
+ if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
+ gfs2_log_release(sdp, tr->tr_reserved);
+ up_read(&sdp->sd_log_flush_lock);
+ sb_end_intwrite(sdp->sd_vfs);
+ wake_up(&sdp->sd_log_waitq);
+ error = -EROFS;
goto fail;
}
@@ -84,9 +94,7 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
return 0;
fail:
- sb_end_intwrite(sdp->sd_vfs);
kmem_cache_free(gfs2_trans_cachep, tr);
-
return error;
}
@@ -105,8 +113,9 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
gfs2_log_release(sdp, tr->tr_reserved);
- gfs2_trans_free(sdp, tr);
+ up_read(&sdp->sd_log_flush_lock);
sb_end_intwrite(sdp->sd_vfs);
+ gfs2_trans_free(sdp, tr);
return;
}
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (8 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Commit 588bff95c94e added gfs2_write_log_header() and started using it in
clean_journal(), with an additional call to log_flush_wait() at the end of
gfs2_write_log_header() which is unnecessary for clean_journal(). Move
that call out of gfs2_write_log_header() to restore the previous behavior.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/log.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index b0d4ce4992ed..a8cc71ba4852 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -756,7 +756,7 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
u64 dblock;
if (gfs2_withdrawn(sdp))
- goto out;
+ return;
page = mempool_alloc(gfs2_page_pool, GFP_NOIO);
lh = page_address(page);
@@ -811,8 +811,6 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
gfs2_log_write(sdp, page, sb->s_blocksize, 0, dblock);
gfs2_log_submit_bio(&sdp->sd_log_bio, REQ_OP_WRITE | op_flags);
-out:
- log_flush_wait(sdp);
}
/**
@@ -841,6 +839,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags)
gfs2_write_log_header(sdp, sdp->sd_jdesc, sdp->sd_log_sequence++, tail,
sdp->sd_log_flush_head, flags, op_flags);
gfs2_log_incr_head(sdp);
+ log_flush_wait(sdp);
if (sdp->sd_log_tail != tail)
log_pull_tail(sdp, tail);
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (9 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Wake up log waiters in gfs2_log_release when log space has actually become
available. This is a much better place for the wakeup than gfs2_logd.
Check if enough log space is immeditely available before anything else. If
there isn't, use io_wait_event to wait instead of open coding it.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/log.c | 58 ++++++++++++++++++++++-----------------------------
1 file changed, 25 insertions(+), 33 deletions(-)
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index a8cc71ba4852..36dbec4f9a54 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -411,6 +411,8 @@ void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
trace_gfs2_log_blocks(sdp, blks);
gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
sdp->sd_jdesc->jd_blocks);
+ if (atomic_read(&sdp->sd_log_blks_needed))
+ wake_up(&sdp->sd_log_waitq);
}
/**
@@ -444,37 +446,34 @@ void gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
{
unsigned reserved_blks = 7 * (4096 / sdp->sd_vfs->s_blocksize);
unsigned wanted = blks + reserved_blks;
- DEFINE_WAIT(wait);
- int did_wait = 0;
unsigned int free_blocks;
- atomic_add(blks, &sdp->sd_log_blks_needed);
-retry:
free_blocks = atomic_read(&sdp->sd_log_blks_free);
- if (unlikely(free_blocks <= wanted)) {
- do {
- prepare_to_wait_exclusive(&sdp->sd_log_waitq, &wait,
- TASK_UNINTERRUPTIBLE);
- wake_up(&sdp->sd_logd_waitq);
- did_wait = 1;
- if (atomic_read(&sdp->sd_log_blks_free) <= wanted)
- io_schedule();
- free_blocks = atomic_read(&sdp->sd_log_blks_free);
- } while(free_blocks <= wanted);
- finish_wait(&sdp->sd_log_waitq, &wait);
+ while (free_blocks >= wanted) {
+ if (atomic_try_cmpxchg(&sdp->sd_log_blks_free, &free_blocks,
+ free_blocks - blks))
+ return;
}
- if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
- free_blocks - blks) != free_blocks)
- goto retry;
- atomic_sub(blks, &sdp->sd_log_blks_needed);
- trace_gfs2_log_blocks(sdp, -blks);
- /*
- * If we waited, then so might others, wake them up _after_ we get
- * our share of the log.
- */
- if (unlikely(did_wait))
- wake_up(&sdp->sd_log_waitq);
+ atomic_add(blks, &sdp->sd_log_blks_needed);
+ for (;;) {
+ if (current != sdp->sd_logd_process)
+ wake_up(&sdp->sd_logd_waitq);
+ io_wait_event(sdp->sd_log_waitq,
+ (free_blocks = atomic_read(&sdp->sd_log_blks_free),
+ free_blocks >= wanted));
+ do {
+ if (atomic_try_cmpxchg(&sdp->sd_log_blks_free,
+ &free_blocks,
+ free_blocks - blks)) {
+ trace_gfs2_log_blocks(sdp, -blks);
+ if (atomic_sub_return(blks,
+ &sdp->sd_log_blks_needed))
+ wake_up(&sdp->sd_log_waitq);
+ return;
+ }
+ } while (free_blocks >= wanted);
+ }
}
/**
@@ -1190,7 +1189,6 @@ int gfs2_logd(void *data)
struct gfs2_sbd *sdp = data;
unsigned long t = 1;
DEFINE_WAIT(wait);
- bool did_flush;
while (!kthread_should_stop()) {
@@ -1209,12 +1207,10 @@ int gfs2_logd(void *data)
continue;
}
- did_flush = false;
if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
gfs2_ail1_empty(sdp, 0);
gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
GFS2_LFC_LOGD_JFLUSH_REQD);
- did_flush = true;
}
if (gfs2_ail_flush_reqd(sdp)) {
@@ -1223,12 +1219,8 @@ int gfs2_logd(void *data)
gfs2_ail1_empty(sdp, 0);
gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
GFS2_LFC_LOGD_AIL_FLUSH_REQD);
- did_flush = true;
}
- if (!gfs2_ail_flush_reqd(sdp) || did_flush)
- wake_up(&sdp->sd_log_waitq);
-
t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
try_to_freeze();
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
` (10 preceding siblings ...)
2020-12-14 8:54 ` [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
@ 2020-12-14 8:54 ` Andreas Gruenbacher
11 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 8:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
Use a tighter bound for the number of blocks required by transactions in
gfs2_trans_begin: in the worst case, we'll have mixed data and metadata,
so we'll need a log desciptor for each type.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/trans.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index c9d8247ffa19..6a6b07499d07 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -60,8 +60,14 @@ int __gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
tr->tr_blocks = blocks;
tr->tr_revokes = revokes;
tr->tr_reserved = 1;
- if (blocks)
- tr->tr_reserved += 6 + blocks;
+ if (blocks) {
+ /*
+ * The reserved blocks are either used for data or metadata.
+ * We can have mixed data and metadata, each with its own log
+ * descriptor block; see calc_reserved().
+ */
+ tr->tr_reserved += blocks + 1 + DIV_ROUND_UP(blocks - 1, databuf_limit(sdp));
+ }
if (revokes)
tr->tr_reserved += gfs2_struct2blk(sdp, revokes);
INIT_LIST_HEAD(&tr->tr_databuf);
--
2.26.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
2020-12-14 8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
@ 2020-12-14 14:02 ` Bob Peterson
2020-12-14 14:05 ` Steven Whitehouse
0 siblings, 1 reply; 16+ messages in thread
From: Bob Peterson @ 2020-12-14 14:02 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
----- Original Message -----
> + ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
The addition of __GFP_NOFAIL means that this operation can now block.
Looking at the code, I don't think it will be a problem because it can
already block in the log_flush operations that precede it, but it
makes me nervous. Obviously, we need to test this really well.
Bob
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
2020-12-14 14:02 ` Bob Peterson
@ 2020-12-14 14:05 ` Steven Whitehouse
2020-12-14 17:08 ` Andreas Gruenbacher
0 siblings, 1 reply; 16+ messages in thread
From: Steven Whitehouse @ 2020-12-14 14:05 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On 14/12/2020 14:02, Bob Peterson wrote:
> Hi,
>
> ----- Original Message -----
>> + ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
> The addition of __GFP_NOFAIL means that this operation can now block.
> Looking at the code, I don't think it will be a problem because it can
> already block in the log_flush operations that precede it, but it
> makes me nervous. Obviously, we need to test this really well.
>
> Bob
>
Not sure of the context here exactly, but why are we adding an instance
of __GFP_NOFAIL? There is already a return code there so that we can
fail in that case if required,
Steve.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions
2020-12-14 14:05 ` Steven Whitehouse
@ 2020-12-14 17:08 ` Andreas Gruenbacher
0 siblings, 0 replies; 16+ messages in thread
From: Andreas Gruenbacher @ 2020-12-14 17:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Mon, Dec 14, 2020 at 3:06 PM Steven Whitehouse <swhiteho@redhat.com> wrote:
> On 14/12/2020 14:02, Bob Peterson wrote:
> > Hi,
> >
> > ----- Original Message -----
> >> + ret = __gfs2_trans_begin(sdp, 0, revokes, GFP_NOFS | __GFP_NOFAIL);
> > The addition of __GFP_NOFAIL means that this operation can now block.
> > Looking at the code, I don't think it will be a problem because it can
> > already block in the log_flush operations that precede it, but it
> > makes me nervous. Obviously, we need to test this really well.
> >
> > Bob
> >
> Not sure of the context here exactly, but why are we adding an instance
> of __GFP_NOFAIL? There is already a return code there so that we can
> fail in that case if required,
__GFS_NOFAIL is used in {inode,rgrp}_go_sync -> gfs2_ail_empty_gl ->
__gfs2_trans_begin, and there isn't any reasonable way to react an
-ENOMEM error but to repeat the allocation.
Thanks,
Andreas
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2020-12-14 17:08 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-14 8:54 [Cluster-devel] [RFC PATCH 00/12] Some log space management cleanups Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 01/12] gfs2: Deobfuscate function jdesc_find_i Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 02/12] gfs2: Simplify the buf_limit and databuf_limit definitions Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 03/12] gfs2: Minor gfs2_write_revokes cleanups Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 04/12] gfs2: Some documentation clarifications Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 05/12] gfs2: A minor debugging improvement Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up ail2_empty Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 07/12] gfs2: Get rid of on-stack transactions Andreas Gruenbacher
2020-12-14 14:02 ` Bob Peterson
2020-12-14 14:05 ` Steven Whitehouse
2020-12-14 17:08 ` Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 08/12] gfs2: Get rid of sd_reserving_log Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 09/12] gfs2: Move lock flush locking to gfs2_trans_{begin, end} Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 10/12] gfs2: Don't wait for journal flush in clean_journal Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 11/12] gfs2: Clean up gfs2_log_reserve Andreas Gruenbacher
2020-12-14 8:54 ` [Cluster-devel] [PATCH 12/12] gfs2: Use a tighter bound in gfs2_trans_begin Andreas Gruenbacher
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.