* [Cluster-devel] [RFC v2 PATCH 0/5] Speed up journal head lookup
@ 2018-08-13  4:48 Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 1/5] gfs2: allow map_journal_extents() to take a journal descriptor as argument Abhi Das
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Abhi Das @ 2018-08-13  4:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This is a revised version of the patch set I posted earlier to speed up
journal head (jhead) lookup during recovery.

It incorporates Steve's suggestions on the previous version:
https://www.redhat.com/archives/cluster-devel/2018-May/msg00088.html

As before, this patch set is based on the latest RHEL7 codebase because
that is easier for me to test. The upstream version shouldn't differ much,
and I'll post the upstream port if this looks good.

I'll do a bit more testing and report some performance numbers shortly.

Cheers!
--Abhi

Abhi Das (5):
  gfs2: allow map_journal_extents() to take a journal descriptor as
    argument
  gfs2: add timing info for various stages of journal recovery
  gfs2: changes to gfs2_log_XXX_bio
  gfs2: read journal in large chunks to locate the head
  gfs2: add tracepoint debugging for gfs2_end_log_read

 fs/gfs2/incore.h     |   9 +++-
 fs/gfs2/log.c        |   4 +-
 fs/gfs2/log.h        |   1 +
 fs/gfs2/lops.c       | 142 +++++++++++++++++++++++++++++++++++++++++++--------
 fs/gfs2/lops.h       |  15 +++++-
 fs/gfs2/ops_fstype.c |  12 +++--
 fs/gfs2/recovery.c   | 138 ++++++++++---------------------------------------
 fs/gfs2/recovery.h   |   1 +
 fs/gfs2/trace_gfs2.h |  25 +++++++++
 9 files changed, 208 insertions(+), 139 deletions(-)

-- 
2.4.11




* [Cluster-devel] [RFC v2 PATCH 1/5] gfs2: allow map_journal_extents() to take a journal descriptor as argument
  2018-08-13  4:48 [Cluster-devel] [RFC v2 PATCH 0/5] Speed up journal head lookup Abhi Das
@ 2018-08-13  4:48 ` Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 2/5] gfs2: add timing info for various stages of journal recovery Abhi Das
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Abhi Das @ 2018-08-13  4:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

map_journal_extents() now maps the extents of the journal whose
descriptor is passed in as an argument, instead of always using
sdp->sd_jdesc.
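
For readers unfamiliar with extent mapping, here is a minimal userspace
sketch of the idea, not the kernel code: runs of contiguous disk blocks are
coalesced into extents so that later I/O can be issued per extent. The names
struct extent and build_extents() are hypothetical, the lblock field is an
assumption, and the input array stands in for the per-block lookup that the
real function performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct gfs2_journal_extent. */
struct extent {
	uint64_t dblock;	/* first disk block of the extent */
	unsigned int lblock;	/* first logical block of the extent */
	unsigned int blocks;	/* number of contiguous blocks */
};

/*
 * Coalesce a logical-to-disk block map into extents. map[lb] is the disk
 * block backing logical block lb. Returns the number of extents written
 * to out, which must have room for nblocks entries in the worst case.
 */
static size_t build_extents(const uint64_t *map, unsigned int nblocks,
			    struct extent *out)
{
	size_t n = 0;
	unsigned int lb;

	assert(map && out);
	for (lb = 0; lb < nblocks; lb++) {
		struct extent *last = n ? &out[n - 1] : NULL;

		/* Extend the current extent if this block follows it on disk. */
		if (last && map[lb] == last->dblock + last->blocks) {
			last->blocks++;
			continue;
		}
		/* Otherwise start a new extent at this block. */
		out[n].dblock = map[lb];
		out[n].lblock = lb;
		out[n].blocks = 1;
		n++;
	}
	return n;
}
```

A journal that is laid out contiguously collapses to a single extent, which
is the case where reading it in large chunks should help most.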

Signed-off-by: Abhi Das <adas@redhat.com>
---
 fs/gfs2/log.h        | 1 +
 fs/gfs2/ops_fstype.c | 5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 92dcbe7..19c93df 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -75,4 +75,5 @@ extern int gfs2_logd(void *data);
 extern void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd);
 extern void gfs2_write_revokes(struct gfs2_sbd *sdp);
 
+extern int map_journal_extents(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd);
 #endif /* __LOG_DOT_H__ */
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 228f38e..cf3e366 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -524,9 +524,8 @@ out:
  *       but since it's only done at mount time, I'm not worried about the
  *       time it takes.
  */
-static int map_journal_extents(struct gfs2_sbd *sdp)
+int map_journal_extents(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd)
 {
-	struct gfs2_jdesc *jd = sdp->sd_jdesc;
 	unsigned int lb;
 	u64 db, prev_db; /* logical block, disk block, prev disk block */
 	struct gfs2_inode *ip = GFS2_I(jd->jd_inode);
@@ -772,7 +771,7 @@ static int init_journal(struct gfs2_sbd *sdp, int undo)
 		atomic_set(&sdp->sd_log_thresh2, 4*sdp->sd_jdesc->jd_blocks/5);
 
 		/* Map the extents for this journal's blocks */
-		map_journal_extents(sdp);
+		map_journal_extents(sdp, sdp->sd_jdesc);
 	}
 	trace_gfs2_log_blocks(sdp, atomic_read(&sdp->sd_log_blks_free));
 
-- 
2.4.11




* [Cluster-devel] [RFC v2 PATCH 2/5] gfs2: add timing info for various stages of journal recovery
  2018-08-13  4:48 [Cluster-devel] [RFC v2 PATCH 0/5] Speed up journal head lookup Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 1/5] gfs2: allow map_journal_extents() to take a journal descriptor as argument Abhi Das
@ 2018-08-13  4:48 ` Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 3/5] gfs2: changes to gfs2_log_XXX_bio Abhi Das
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Abhi Das @ 2018-08-13  4:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Report how many milliseconds each stage of journal recovery takes, and
how long map_journal_extents() takes to map a journal's extents.
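
As a rough illustration of the timing pattern (take a timestamp at each
stage boundary, then print the differences in milliseconds), here is a
userspace sketch. It uses clock_gettime(CLOCK_MONOTONIC) in place of
ktime_get(); ms_delta() and now() are hypothetical helpers, and ms_delta()
is only analogous in spirit to ktime_ms_delta().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Milliseconds elapsed from 'earlier' to 'later', truncated toward zero. */
static int64_t ms_delta(struct timespec later, struct timespec earlier)
{
	int64_t ns = (int64_t)(later.tv_sec - earlier.tv_sec) * 1000000000LL
		   + (int64_t)(later.tv_nsec - earlier.tv_nsec);

	return ns / 1000000;
}

/* Read the monotonic clock, which is not affected by wall-clock changes. */
static struct timespec now(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return ts;
}
```

A caller would record t_start = now() before the first stage and one more
timestamp after each stage, then report ms_delta(t_stage, t_prev) for each
stage and ms_delta(t_end, t_start) for the total.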

Signed-off-by: Abhi Das <adas@redhat.com>
---
 fs/gfs2/ops_fstype.c |  5 +++++
 fs/gfs2/recovery.c   | 20 ++++++++++++++------
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index cf3e366..fd460c1 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -532,7 +532,9 @@ int map_journal_extents(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd)
 	struct gfs2_journal_extent *jext = NULL;
 	struct buffer_head bh;
 	int rc = 0;
+	ktime_t start, end;
 
+	start = ktime_get();
 	prev_db = 0;
 
 	for (lb = 0; lb < i_size_read(jd->jd_inode) >> sdp->sd_sb.sb_bsize_shift; lb++) {
@@ -564,6 +566,9 @@ int map_journal_extents(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd)
 		}
 		prev_db = db;
 	}
+	end = ktime_get();
+	fs_info(sdp, "jid=%u: Journal extent mapped in %lldms\n", jd->jd_jid,
+		ktime_ms_delta(end, start));
 	return rc;
 }
 
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 56dea44..4b042db 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -14,6 +14,7 @@
 #include <linux/buffer_head.h>
 #include <linux/gfs2_ondisk.h>
 #include <linux/crc32.h>
+#include <linux/ktime.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -455,12 +456,13 @@ void gfs2_recover_func(struct work_struct *work)
 	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
 	struct gfs2_log_header_host head;
 	struct gfs2_holder j_gh, ji_gh, t_gh;
-	unsigned long t;
+	ktime_t t_start, t_jlck, t_jhd, t_tlck, t_rep;
 	int ro = 0;
 	unsigned int pass;
 	int error;
 	int jlocked = 0;
 
+	t_start = ktime_get();
 	if (sdp->sd_args.ar_spectator ||
 	    (jd->jd_jid != sdp->sd_lockstruct.ls_jid)) {
 		fs_info(sdp, "jid=%u: Trying to acquire journal lock...\n",
@@ -492,6 +494,7 @@ void gfs2_recover_func(struct work_struct *work)
 		fs_info(sdp, "jid=%u, already locked for use\n", jd->jd_jid);
 	}
 
+	t_jlck = ktime_get();
 	fs_info(sdp, "jid=%u: Looking at journal...\n", jd->jd_jid);
 
 	error = gfs2_jdesc_check(jd);
@@ -501,13 +504,12 @@ void gfs2_recover_func(struct work_struct *work)
 	error = gfs2_find_jhead(jd, &head);
 	if (error)
 		goto fail_gunlock_ji;
+	t_jhd = ktime_get();
 
 	if (!(head.lh_flags & GFS2_LOG_HEAD_UNMOUNT)) {
 		fs_info(sdp, "jid=%u: Acquiring the transaction lock...\n",
 			jd->jd_jid);
 
-		t = jiffies;
-
 		/* Acquire a shared hold on the transaction lock */
 
 		error = gfs2_glock_nq_init(sdp->sd_trans_gl, LM_ST_SHARED,
@@ -541,6 +543,7 @@ void gfs2_recover_func(struct work_struct *work)
 			goto fail_gunlock_tr;
 		}
 
+		t_tlck = ktime_get();
 		fs_info(sdp, "jid=%u: Replaying journal...\n", jd->jd_jid);
 
 		for (pass = 0; pass < 2; pass++) {
@@ -557,9 +560,14 @@ void gfs2_recover_func(struct work_struct *work)
 			goto fail_gunlock_tr;
 
 		gfs2_glock_dq_uninit(&t_gh);
-		t = DIV_ROUND_UP(jiffies - t, HZ);
-		fs_info(sdp, "jid=%u: Journal replayed in %lus\n",
-			jd->jd_jid, t);
+		t_rep = ktime_get();
+		fs_info(sdp, "jid=%u: Journal replayed in %lldms [jlck:%lldms, "
+			"jhead:%lldms, tlck:%lldms, replay:%lldms]\n",
+			jd->jd_jid, ktime_ms_delta(t_rep, t_start),
+			ktime_ms_delta(t_jlck, t_start),
+			ktime_ms_delta(t_jhd, t_jlck),
+			ktime_ms_delta(t_tlck, t_jhd),
+			ktime_ms_delta(t_rep, t_tlck));
 	}
 
 	gfs2_recovery_done(sdp, jd->jd_jid, LM_RD_SUCCESS);
-- 
2.4.11




* [Cluster-devel] [RFC v2 PATCH 3/5] gfs2: changes to gfs2_log_XXX_bio
  2018-08-13  4:48 [Cluster-devel] [RFC v2 PATCH 0/5] Speed up journal head lookup Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 1/5] gfs2: allow map_journal_extents() to take a journal descriptor as argument Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 2/5] gfs2: add timing info for various stages of journal recovery Abhi Das
@ 2018-08-13  4:48 ` Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 4/5] gfs2: read journal in large chunks to locate the head Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 5/5] gfs2: add tracepoint debugging for gfs2_end_log_read Abhi Das
  4 siblings, 0 replies; 6+ messages in thread
From: Abhi Das @ 2018-08-13  4:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Change gfs2_log_flush_bio to accept a pointer to the struct bio
to be flushed. Change gfs2_log_alloc_bio and gfs2_log_get_bio to
take a struct gfs2_jdesc instead of gfs2_sbd.
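
The point of passing a struct bio ** is that one flush routine can serve
several cached-bio slots and clear whichever slot it was given. A minimal
userspace sketch of that pattern follows; struct fake_bio and flush_slot()
are hypothetical, and the counter increment merely stands in for
submit_bio().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; it only counts submissions. */
struct fake_bio {
	int submitted;
};

/*
 * Submit the request cached in *slot, if any, and clear the slot so the
 * caller cannot keep adding to a request that is already in flight.
 * Returns 1 if something was submitted, 0 if the slot was empty.
 */
static int flush_slot(struct fake_bio **slot)
{
	struct fake_bio *bio;

	assert(slot);
	bio = *slot;
	if (!bio)
		return 0;	/* nothing pending: no-op */

	bio->submitted++;	/* stands in for submit_bio() */
	*slot = NULL;
	return 1;
}
```

With this shape, a write slot and a per-journal read slot can share the
same helper, and flushing one leaves the other untouched.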

Signed-off-by: Abhi Das <adas@redhat.com>
---
 fs/gfs2/log.c  |  4 ++--
 fs/gfs2/lops.c | 32 ++++++++++++++++++--------------
 fs/gfs2/lops.h |  2 +-
 3 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 15a3a8c..87b7d87 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -655,7 +655,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags)
 
 	sdp->sd_log_idle = (tail == sdp->sd_log_flush_head);
 	gfs2_log_write_page(sdp, page);
-	gfs2_log_flush_bio(sdp, rw);
+	gfs2_log_flush_bio(&sdp->sd_log_bio, rw);
 	log_flush_wait(sdp);
 
 	if (sdp->sd_log_tail != tail)
@@ -699,7 +699,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl)
 
 	gfs2_ordered_write(sdp);
 	lops_before_commit(sdp, tr);
-	gfs2_log_flush_bio(sdp, WRITE);
+	gfs2_log_flush_bio(&sdp->sd_log_bio, WRITE);
 
 	if (sdp->sd_log_head != sdp->sd_log_flush_head) {
 		log_flush_wait(sdp);
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 4da6055..0284648 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -230,25 +230,27 @@ static void gfs2_end_log_write(struct bio *bio, int error)
 
 /**
  * gfs2_log_flush_bio - Submit any pending log bio
- * @sdp: The superblock
+ * @biop: Pointer to the bio we want to flush
  * @rw: The rw flags
  *
  * Submit any pending part-built or full bio to the block device. If
  * there is no pending bio, then this is a no-op.
  */
 
-void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw)
+void gfs2_log_flush_bio(struct bio **biop, int rw)
 {
-	if (sdp->sd_log_bio) {
+	struct bio *bio = *biop;
+	if (bio) {
+		struct gfs2_sbd *sdp = bio->bi_private;
 		atomic_inc(&sdp->sd_log_in_flight);
-		submit_bio(rw, sdp->sd_log_bio);
-		sdp->sd_log_bio = NULL;
+		submit_bio(rw, bio);
+		*biop = NULL;
 	}
 }
 
 /**
  * gfs2_log_alloc_bio - Allocate a new bio for log writing
- * @sdp: The superblock
+ * @jd: The journal descriptor
  * @blkno: The next device block number we want to write to
  *
  * This should never be called when there is a cached bio in the
@@ -259,8 +261,9 @@ void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw)
  * Returns: Newly allocated bio
  */
 
-static struct bio *gfs2_log_alloc_bio(struct gfs2_sbd *sdp, u64 blkno)
+static struct bio *gfs2_log_alloc_bio(struct gfs2_jdesc *jd, u64 blkno)
 {
+	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
 	struct super_block *sb = sdp->sd_vfs;
 	unsigned nrvecs = bio_get_nr_vecs(sb->s_bdev);
 	struct bio *bio;
@@ -286,7 +289,7 @@ static struct bio *gfs2_log_alloc_bio(struct gfs2_sbd *sdp, u64 blkno)
 
 /**
  * gfs2_log_get_bio - Get cached log bio, or allocate a new one
- * @sdp: The superblock
+ * @jd: The journal descriptor
  * @blkno: The device block number we want to write to
  *
  * If there is a cached bio, then if the next block number is sequential
@@ -297,8 +300,9 @@ static struct bio *gfs2_log_alloc_bio(struct gfs2_sbd *sdp, u64 blkno)
  * Returns: The bio to use for log writes
  */
 
-static struct bio *gfs2_log_get_bio(struct gfs2_sbd *sdp, u64 blkno)
+static struct bio *gfs2_log_get_bio(struct gfs2_jdesc *jd, u64 blkno)
 {
+	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
 	struct bio *bio = sdp->sd_log_bio;
 	u64 nblk;
 
@@ -307,10 +311,10 @@ static struct bio *gfs2_log_get_bio(struct gfs2_sbd *sdp, u64 blkno)
 		nblk >>= sdp->sd_fsb2bb_shift;
 		if (blkno == nblk)
 			return bio;
-		gfs2_log_flush_bio(sdp, WRITE);
+		gfs2_log_flush_bio(&sdp->sd_log_bio, WRITE);
 	}
 
-	return gfs2_log_alloc_bio(sdp, blkno);
+	return gfs2_log_alloc_bio(sdp->sd_jdesc, blkno);
 }
 
 
@@ -333,11 +337,11 @@ static void gfs2_log_write(struct gfs2_sbd *sdp, struct page *page,
 	struct bio *bio;
 	int ret;
 
-	bio = gfs2_log_get_bio(sdp, blkno);
+	bio = gfs2_log_get_bio(sdp->sd_jdesc, blkno);
 	ret = bio_add_page(bio, page, size, offset);
 	if (ret == 0) {
-		gfs2_log_flush_bio(sdp, WRITE);
-		bio = gfs2_log_alloc_bio(sdp, blkno);
+		gfs2_log_flush_bio(&sdp->sd_log_bio, WRITE);
+		bio = gfs2_log_alloc_bio(sdp->sd_jdesc, blkno);
 		ret = bio_add_page(bio, page, size, offset);
 		WARN_ON(ret == 0);
 	}
diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index 06793e3..3044347 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -28,7 +28,7 @@ extern const struct gfs2_log_operations gfs2_databuf_lops;
 
 extern const struct gfs2_log_operations *gfs2_log_ops[];
 extern void gfs2_log_write_page(struct gfs2_sbd *sdp, struct page *page);
-extern void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw);
+extern void gfs2_log_flush_bio(struct bio **biop, int rw);
 extern void gfs2_pin(struct gfs2_sbd *sdp, struct buffer_head *bh);
 
 static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
-- 
2.4.11




* [Cluster-devel] [RFC v2 PATCH 4/5] gfs2: read journal in large chunks to locate the head
  2018-08-13  4:48 [Cluster-devel] [RFC v2 PATCH 0/5] Speed up journal head lookup Abhi Das
                   ` (2 preceding siblings ...)
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 3/5] gfs2: changes to gfs2_log_XXX_bio Abhi Das
@ 2018-08-13  4:48 ` Abhi Das
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 5/5] gfs2: add tracepoint debugging for gfs2_end_log_read Abhi Das
  4 siblings, 0 replies; 6+ messages in thread
From: Abhi Das @ 2018-08-13  4:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Use bio(s) to read the journal sequentially in large chunks and locate
the head of the journal.
In most cases this is faster than the existing bisect method, which
reads one block at a time.
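
The scan logic can be sketched in userspace as follows. This is an
illustration under assumptions, not the kernel implementation: blocks are
visited in journal order starting at block 0, blocks without a valid log
header are skipped, and the head is the last header seen before the
sequence number stops increasing. struct log_hdr and find_head() are
hypothetical names, and the valid flag stands in for the header and hash
checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one journal block. */
struct log_hdr {
	int valid;		/* non-zero if the block holds a good log header */
	uint64_t sequence;	/* log header sequence number */
	unsigned int blkno;	/* journal block the header was read from */
};

/*
 * Scan the blocks in order and keep the header with the highest sequence
 * number. Stop at the first valid header whose sequence does not exceed
 * the best one so far: that is the wrap point of the circular log.
 * Returns 0 and fills *head on success, 1 if no valid header was found.
 */
static int find_head(const struct log_hdr *blocks, size_t nblocks,
		     struct log_hdr *head)
{
	struct log_hdr best = { 0, 0, 0 };
	size_t i;

	assert(blocks && head);
	for (i = 0; i < nblocks; i++) {
		if (!blocks[i].valid)
			continue;
		if (blocks[i].sequence <= best.sequence)
			break;
		best = blocks[i];
	}

	if (best.sequence == 0)
		return 1;
	*head = best;
	return 0;
}
```

Unlike bisection, this touches every block up to the wrap point, so any
gain has to come from issuing the reads as a few large sequential requests
rather than many single-block ones.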

Signed-off-by: Abhi Das <adas@redhat.com>
---
 fs/gfs2/incore.h     |   8 +++-
 fs/gfs2/lops.c       | 121 +++++++++++++++++++++++++++++++++++++++++++++------
 fs/gfs2/lops.h       |  13 ++++++
 fs/gfs2/ops_fstype.c |   1 +
 fs/gfs2/recovery.c   | 118 +++++--------------------------------------------
 fs/gfs2/recovery.h   |   1 +
 6 files changed, 142 insertions(+), 120 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index f303616..31188c0 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -494,18 +494,24 @@ struct gfs2_journal_extent {
 	u64 blocks;
 };
 
+enum {
+	JDF_RECOVERY = 1,
+	JDF_JHEAD    = 2,
+};
+
 struct gfs2_jdesc {
 	struct list_head jd_list;
 	struct list_head extent_list;
 	struct work_struct jd_work;
 	struct inode *jd_inode;
 	unsigned long jd_flags;
-#define JDF_RECOVERY 1
 	unsigned int jd_jid;
 	unsigned int jd_blocks;
 	int jd_recover_error;
 	/* Replay stuff */
 
+	struct gfs2_log_header_host jd_jhead;
+	struct bio *jd_rd_bio; /* bio used for reading this journal */
 	unsigned int jd_found_blocks;
 	unsigned int jd_found_revokes;
 	unsigned int jd_replayed_blocks;
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 0284648..518b786 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -228,6 +228,53 @@ static void gfs2_end_log_write(struct bio *bio, int error)
 		wake_up(&sdp->sd_log_flush_wait);
 }
 
+static void gfs2_end_log_read(struct bio *bio, int error)
+{
+	struct gfs2_jdesc *jd = bio->bi_private;
+	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
+	struct page *page;
+	struct bio_vec *bvec;
+	int i, last;
+
+	if (error) {
+		sdp->sd_log_error = error;
+		fs_err(sdp, "Error %d reading from journal, jid=%u\n", error,
+		       jd->jd_jid);
+	}
+
+	bio_for_each_segment_all(bvec, bio, i) {
+		struct gfs2_log_header_host uninitialized_var(lh);
+		void *ptr;
+
+		page = bvec->bv_page;
+		ptr = page_address(page);
+		error = gfs2_log_header_in(&lh, ptr);
+		last = page_private(page);
+
+		if (!test_bit(JDF_JHEAD, &jd->jd_flags)) {
+			mempool_free(page, gfs2_page_pool);
+			continue;
+		}
+
+		if (!error && lh.lh_hash == compute_hash(ptr)) {
+			if (lh.lh_sequence > jd->jd_jhead.lh_sequence)
+				jd->jd_jhead = lh;
+			else
+				goto found;
+		}
+
+		if (last) {
+		found:
+			clear_bit(JDF_JHEAD, &jd->jd_flags);
+			smp_mb__after_clear_bit();
+			wake_up_bit(&jd->jd_flags, JDF_JHEAD);
+		}
+		mempool_free(page, gfs2_page_pool);
+	}
+
+	bio_put(bio);
+}
+
 /**
  * gfs2_log_flush_bio - Submit any pending log bio
  * @biop: Pointer to the bio we want to flush
@@ -241,8 +288,10 @@ void gfs2_log_flush_bio(struct bio **biop, int rw)
 {
 	struct bio *bio = *biop;
 	if (bio) {
-		struct gfs2_sbd *sdp = bio->bi_private;
-		atomic_inc(&sdp->sd_log_in_flight);
+		if (rw != READ) {
+			struct gfs2_sbd *sdp = bio->bi_private;
+			atomic_inc(&sdp->sd_log_in_flight);
+		}
 		submit_bio(rw, bio);
 		*biop = NULL;
 	}
@@ -261,14 +310,14 @@ void gfs2_log_flush_bio(struct bio **biop, int rw)
  * Returns: Newly allocated bio
  */
 
-static struct bio *gfs2_log_alloc_bio(struct gfs2_jdesc *jd, u64 blkno)
+static struct bio *gfs2_log_alloc_bio(struct gfs2_jdesc *jd, u64 blkno, int rw)
 {
 	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
 	struct super_block *sb = sdp->sd_vfs;
 	unsigned nrvecs = bio_get_nr_vecs(sb->s_bdev);
 	struct bio *bio;
 
-	BUG_ON(sdp->sd_log_bio);
+	BUG_ON((rw == READ ? jd->jd_rd_bio : sdp->sd_log_bio));
 
 	while (1) {
 		bio = bio_alloc(GFP_NOIO, nrvecs);
@@ -279,10 +328,13 @@ static struct bio *gfs2_log_alloc_bio(struct gfs2_jdesc *jd, u64 blkno)
 
 	bio->bi_sector = blkno * (sb->s_blocksize >> 9);
 	bio->bi_bdev = sb->s_bdev;
-	bio->bi_end_io = gfs2_end_log_write;
-	bio->bi_private = sdp;
+	bio->bi_end_io = rw == READ ? gfs2_end_log_read : gfs2_end_log_write;
+	bio->bi_private = rw == READ ? (void *)jd : (void *)sdp;
 
-	sdp->sd_log_bio = bio;
+	if (rw == READ)
+		jd->jd_rd_bio = bio;
+	else
+		sdp->sd_log_bio = bio;
 
 	return bio;
 }
@@ -300,10 +352,10 @@ static struct bio *gfs2_log_alloc_bio(struct gfs2_jdesc *jd, u64 blkno)
  * Returns: The bio to use for log writes
  */
 
-static struct bio *gfs2_log_get_bio(struct gfs2_jdesc *jd, u64 blkno)
+static struct bio *gfs2_log_get_bio(struct gfs2_jdesc *jd, u64 blkno, int rw)
 {
 	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
-	struct bio *bio = sdp->sd_log_bio;
+	struct bio *bio = rw == READ ? jd->jd_rd_bio : sdp->sd_log_bio;
 	u64 nblk;
 
 	if (bio) {
@@ -311,10 +363,11 @@ static struct bio *gfs2_log_get_bio(struct gfs2_jdesc *jd, u64 blkno)
 		nblk >>= sdp->sd_fsb2bb_shift;
 		if (blkno == nblk)
 			return bio;
-		gfs2_log_flush_bio(&sdp->sd_log_bio, WRITE);
+		gfs2_log_flush_bio(rw == READ ? &jd->jd_rd_bio
+				   : &sdp->sd_log_bio, rw);
 	}
 
-	return gfs2_log_alloc_bio(sdp->sd_jdesc, blkno);
+	return gfs2_log_alloc_bio(rw == READ ? jd : sdp->sd_jdesc, blkno, rw);
 }
 
 
@@ -337,11 +390,11 @@ static void gfs2_log_write(struct gfs2_sbd *sdp, struct page *page,
 	struct bio *bio;
 	int ret;
 
-	bio = gfs2_log_get_bio(sdp->sd_jdesc, blkno);
+	bio = gfs2_log_get_bio(sdp->sd_jdesc, blkno, WRITE);
 	ret = bio_add_page(bio, page, size, offset);
 	if (ret == 0) {
 		gfs2_log_flush_bio(&sdp->sd_log_bio, WRITE);
-		bio = gfs2_log_alloc_bio(sdp->sd_jdesc, blkno);
+		bio = gfs2_log_alloc_bio(sdp->sd_jdesc, blkno, WRITE);
 		ret = bio_add_page(bio, page, size, offset);
 		WARN_ON(ret == 0);
 	}
@@ -379,6 +432,48 @@ void gfs2_log_write_page(struct gfs2_sbd *sdp, struct page *page)
 	gfs2_log_write(sdp, page, sb->s_blocksize, 0);
 }
 
+void gfs2_log_read_extent(struct gfs2_jdesc *jd, u64 dblock,
+			  unsigned int blocks, int last)
+{
+	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
+	struct super_block *sb = sdp->sd_vfs;
+	struct page *page;
+	int i, ret;
+	struct bio *bio;
+
+	for (i = 0; i < blocks; i++) {
+		page = mempool_alloc(gfs2_page_pool, GFP_NOIO);
+		/* flag the last page of the journal we plan to read in */
+		page_private(page) = (last && i == (blocks - 1));
+
+		bio = gfs2_log_get_bio(jd, dblock + i, READ);
+		ret = bio_add_page(bio, page, sb->s_blocksize, 0);
+		if (ret == 0) {
+			gfs2_log_flush_bio(&jd->jd_rd_bio, READ);
+			bio = gfs2_log_alloc_bio(jd, dblock + i, READ);
+			ret = bio_add_page(bio, page, sb->s_blocksize, 0);
+			WARN_ON(ret == 0);
+		}
+		bio->bi_private = jd;
+	}
+}
+
+void gfs2_log_read(struct gfs2_jdesc *jd)
+{
+	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
+	int last = 0;
+	struct gfs2_journal_extent *je;
+
+	if (list_empty(&jd->extent_list))
+		map_journal_extents(sdp, jd);
+
+	list_for_each_entry(je, &jd->extent_list, extent_list) {
+		last = list_is_last(&je->extent_list, &jd->extent_list);
+		gfs2_log_read_extent(jd, je->dblock, je->blocks, last);
+		gfs2_log_flush_bio(&jd->jd_rd_bio, READ);
+	}
+}
+
 static struct page *gfs2_get_log_desc(struct gfs2_sbd *sdp, u32 ld_type,
 				      u32 ld_length, u32 ld_data1)
 {
diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index 3044347..4d7841f 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -11,6 +11,7 @@
 #define __LOPS_DOT_H__
 
 #include <linux/list.h>
+#include <linux/crc32.h>
 #include "incore.h"
 
 #define BUF_OFFSET \
@@ -30,6 +31,7 @@ extern const struct gfs2_log_operations *gfs2_log_ops[];
 extern void gfs2_log_write_page(struct gfs2_sbd *sdp, struct page *page);
 extern void gfs2_log_flush_bio(struct bio **biop, int rw);
 extern void gfs2_pin(struct gfs2_sbd *sdp, struct buffer_head *bh);
+extern void gfs2_log_read(struct gfs2_jdesc *jd);
 
 static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
 {
@@ -101,5 +103,16 @@ static inline void lops_after_scan(struct gfs2_jdesc *jd, int error,
 			gfs2_log_ops[x]->lo_after_scan(jd, error, pass);
 }
 
+static inline u32 compute_hash(const void *ptr)
+{
+	const u32 nothing = 0;
+	u32 hash;
+
+	hash = crc32_le((u32)~0, ptr, sizeof(struct gfs2_log_header) - sizeof(u32));
+	hash = crc32_le(hash, (unsigned char const *)&nothing, sizeof(nothing));
+	hash ^= (u32)~0;
+
+	return hash;
+}
 #endif /* __LOPS_DOT_H__ */
 
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index fd460c1..4a17eaf 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -642,6 +642,7 @@ static int gfs2_jindex_hold(struct gfs2_sbd *sdp, struct gfs2_holder *ji_gh)
 			kfree(jd);
 			break;
 		}
+		jd->jd_rd_bio = NULL;
 
 		spin_lock(&sdp->sd_jindex_spin);
 		jd->jd_jid = sdp->sd_journals++;
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 4b042db..7a844c4 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -118,7 +118,7 @@ void gfs2_revoke_clean(struct gfs2_jdesc *jd)
 	}
 }
 
-static int gfs2_log_header_in(struct gfs2_log_header_host *lh, const void *buf)
+int gfs2_log_header_in(struct gfs2_log_header_host *lh, const void *buf)
 {
 	const struct gfs2_log_header *str = buf;
 
@@ -177,85 +177,11 @@ static int get_log_header(struct gfs2_jdesc *jd, unsigned int blk,
 }
 
 /**
- * find_good_lh - find a good log header
- * @jd: the journal
- * @blk: the segment to start searching from
- * @lh: the log header to fill in
- * @forward: if true search forward in the log, else search backward
- *
- * Call get_log_header() to get a log header for a segment, but if the
- * segment is bad, either scan forward or backward until we find a good one.
- *
- * Returns: errno
- */
-
-static int find_good_lh(struct gfs2_jdesc *jd, unsigned int *blk,
-			struct gfs2_log_header_host *head)
-{
-	unsigned int orig_blk = *blk;
-	int error;
-
-	for (;;) {
-		error = get_log_header(jd, *blk, head);
-		if (error <= 0)
-			return error;
-
-		if (++*blk == jd->jd_blocks)
-			*blk = 0;
-
-		if (*blk == orig_blk) {
-			gfs2_consist_inode(GFS2_I(jd->jd_inode));
-			return -EIO;
-		}
-	}
-}
-
-/**
- * jhead_scan - make sure we've found the head of the log
- * @jd: the journal
- * @head: this is filled in with the log descriptor of the head
- *
- * At this point, seg and lh should be either the head of the log or just
- * before.  Scan forward until we find the head.
- *
- * Returns: errno
- */
-
-static int jhead_scan(struct gfs2_jdesc *jd, struct gfs2_log_header_host *head)
-{
-	unsigned int blk = head->lh_blkno;
-	struct gfs2_log_header_host lh;
-	int error;
-
-	for (;;) {
-		if (++blk == jd->jd_blocks)
-			blk = 0;
-
-		error = get_log_header(jd, blk, &lh);
-		if (error < 0)
-			return error;
-		if (error == 1)
-			continue;
-
-		if (lh.lh_sequence == head->lh_sequence) {
-			gfs2_consist_inode(GFS2_I(jd->jd_inode));
-			return -EIO;
-		}
-		if (lh.lh_sequence < head->lh_sequence)
-			break;
-
-		*head = lh;
-	}
-
-	return 0;
-}
-
-/**
  * gfs2_find_jhead - find the head of a log
  * @jd: the journal
  * @head: the log descriptor for the head of the log is returned here
  *
- * Do a binary search of a journal and find the valid log entry with the
+ * Do a search of a journal and find the valid log entry with the
  * highest sequence number.  (i.e. the log head)
  *
  * Returns: errno
@@ -263,39 +189,19 @@ static int jhead_scan(struct gfs2_jdesc *jd, struct gfs2_log_header_host *head)
 
 int gfs2_find_jhead(struct gfs2_jdesc *jd, struct gfs2_log_header_host *head)
 {
-	struct gfs2_log_header_host lh_1, lh_m;
-	u32 blk_1, blk_2, blk_m;
-	int error;
-
-	blk_1 = 0;
-	blk_2 = jd->jd_blocks - 1;
-
-	for (;;) {
-		blk_m = (blk_1 + blk_2) / 2;
-
-		error = find_good_lh(jd, &blk_1, &lh_1);
-		if (error)
-			return error;
-
-		error = find_good_lh(jd, &blk_m, &lh_m);
-		if (error)
-			return error;
-
-		if (blk_1 == blk_m || blk_m == blk_2)
-			break;
-
-		if (lh_1.lh_sequence <= lh_m.lh_sequence)
-			blk_1 = blk_m;
-		else
-			blk_2 = blk_m;
-	}
+	int error = 0;
 
-	error = jhead_scan(jd, &lh_1);
-	if (error)
-		return error;
+	memset(&jd->jd_jhead, 0, sizeof(struct gfs2_log_header_host));
+	set_bit(JDF_JHEAD, &jd->jd_flags);
+	gfs2_log_read(jd);
 
-	*head = lh_1;
+	if (test_bit(JDF_JHEAD, &jd->jd_flags))
+		wait_on_bit(&jd->jd_flags, JDF_JHEAD, TASK_INTERRUPTIBLE);
 
+	if (jd->jd_jhead.lh_sequence == 0)
+		error = 1;
+	else
+		*head = jd->jd_jhead;
 	return error;
 }
 
diff --git a/fs/gfs2/recovery.h b/fs/gfs2/recovery.h
index 11fdfab..cd691ff 100644
--- a/fs/gfs2/recovery.h
+++ b/fs/gfs2/recovery.h
@@ -29,6 +29,7 @@ extern void gfs2_revoke_clean(struct gfs2_jdesc *jd);
 
 extern int gfs2_find_jhead(struct gfs2_jdesc *jd,
 		    struct gfs2_log_header_host *head);
+extern int gfs2_log_header_in(struct gfs2_log_header_host *lh, const void *buf);
 extern int gfs2_recover_journal(struct gfs2_jdesc *gfs2_jd, bool wait);
 extern void gfs2_recover_func(struct work_struct *work);
 
-- 
2.4.11




* [Cluster-devel] [RFC v2 PATCH 5/5] gfs2: add tracepoint debugging for gfs2_end_log_read
  2018-08-13  4:48 [Cluster-devel] [RFC v2 PATCH 0/5] Speed up journal head lookup Abhi Das
                   ` (3 preceding siblings ...)
  2018-08-13  4:48 ` [Cluster-devel] [RFC v2 PATCH 4/5] gfs2: read journal in large chunks to locate the head Abhi Das
@ 2018-08-13  4:48 ` Abhi Das
  4 siblings, 0 replies; 6+ messages in thread
From: Abhi Das @ 2018-08-13  4:48 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Use a tracepoint and a counter in gfs2_jdesc to count the number of
outstanding reads (in pages) as we read through a journal to aid
debugging.
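
The counting scheme is simple: increment when a page is queued for read,
decrement when its read completes, and report the current value. A
userspace sketch using C11 atomics in place of the kernel's atomic_t is
shown below; struct reader and the three helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-journal counter in struct gfs2_jdesc. */
struct reader {
	atomic_int outstanding;	/* pages queued but not yet completed */
};

static void reader_init(struct reader *r)
{
	assert(r);
	atomic_init(&r->outstanding, 0);
}

/* Called when a page is added to a read request. */
static void page_queued(struct reader *r)
{
	assert(r);
	atomic_fetch_add(&r->outstanding, 1);
}

/*
 * Called from the completion path. Returns how many pages are still
 * outstanding after this one; atomic_fetch_sub() yields the old value.
 */
static int page_completed(struct reader *r)
{
	assert(r);
	return atomic_fetch_sub(&r->outstanding, 1) - 1;
}
```

A value that never returns to zero after the scan finishes would point at
a read that was queued but never completed.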

Signed-off-by: Abhi Das <adas@redhat.com>
---
 fs/gfs2/incore.h     |  1 +
 fs/gfs2/lops.c       |  3 +++
 fs/gfs2/ops_fstype.c |  1 +
 fs/gfs2/trace_gfs2.h | 25 +++++++++++++++++++++++++
 4 files changed, 30 insertions(+)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 31188c0..bb4446d 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -512,6 +512,7 @@ struct gfs2_jdesc {
 
 	struct gfs2_log_header_host jd_jhead;
 	struct bio *jd_rd_bio; /* bio used for reading this journal */
+	atomic_t jd_rd_pg_ct;
 	unsigned int jd_found_blocks;
 	unsigned int jd_found_revokes;
 	unsigned int jd_replayed_blocks;
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 518b786..a261398 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -250,6 +250,7 @@ static void gfs2_end_log_read(struct bio *bio, int error)
 		ptr = page_address(page);
 		error = gfs2_log_header_in(&lh, ptr);
 		last = page_private(page);
+		atomic_dec(&jd->jd_rd_pg_ct);
 
 		if (!test_bit(JDF_JHEAD, &jd->jd_flags)) {
 			mempool_free(page, gfs2_page_pool);
@@ -273,6 +274,7 @@ static void gfs2_end_log_read(struct bio *bio, int error)
 	}
 
 	bio_put(bio);
+	trace_gfs2_end_log_read(jd);
 }
 
 /**
@@ -454,6 +456,7 @@ void gfs2_log_read_extent(struct gfs2_jdesc *jd, u64 dblock,
 			ret = bio_add_page(bio, page, sb->s_blocksize, 0);
 			WARN_ON(ret == 0);
 		}
+		atomic_inc(&jd->jd_rd_pg_ct);
 		bio->bi_private = jd;
 	}
 }
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 4a17eaf..ac9855a 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -643,6 +643,7 @@ static int gfs2_jindex_hold(struct gfs2_sbd *sdp, struct gfs2_holder *ji_gh)
 			break;
 		}
 		jd->jd_rd_bio = NULL;
+		atomic_set(&jd->jd_rd_pg_ct, 0);
 
 		spin_lock(&sdp->sd_jindex_spin);
 		jd->jd_jid = sdp->sd_journals++;
diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
index d1de2ed..9f0cc8d 100644
--- a/fs/gfs2/trace_gfs2.h
+++ b/fs/gfs2/trace_gfs2.h
@@ -613,6 +613,31 @@ TRACE_EVENT(gfs2_rs,
 		  rs_func_name(__entry->func), (unsigned long)__entry->free)
 );
 
+TRACE_EVENT(gfs2_end_log_read,
+
+	TP_PROTO(const struct gfs2_jdesc *jd),
+
+
+	TP_ARGS(jd),
+
+	TP_STRUCT__entry(
+		__field(        dev_t,        dev               )
+		__field(	unsigned int, jid		)
+		__field(	unsigned int, pages 	        )
+	),
+
+	TP_fast_assign(
+		__entry->dev            = jd->jd_inode->i_sb->s_dev;
+		__entry->jid		= jd->jd_jid;
+		__entry->pages		= atomic_read(&jd->jd_rd_pg_ct);
+	),
+
+	TP_printk("%u,%u end_log_read jid:%u outstanding pages:%u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  (unsigned int)__entry->jid,
+		  (unsigned int)__entry->pages)
+);
+
 #endif /* _TRACE_GFS2_H */
 
 /* This part must be outside protection */
-- 
2.4.11


