* [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation
@ 2014-09-02 12:07 Andrew Price
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

This patch set introduces extent allocation to libgfs2 and adds functions which
decouple file creation, allocation and writing so that mkfs.gfs2 can be
re-worked to write journals and resource groups sequentially.

With these patches, mkfs.gfs2 typically takes around 20% of the time that it
did before in my tests.  The main speed-up comes from the journal data
allocation functions no longer having to re-read and re-write a resource group
for each block allocated (that was a performance regression introduced by the
earlier memory footprint improvement patches, which is why the improvement is
so large).  Journals now each occupy an extent spanning an entire resource
group specifically sized for the journal, and the resource group headers are
written only once, after the journal blocks have been allocated in the
in-memory bitmaps. Resource groups are still only kept in memory for as long as
they are needed, so peak memory usage should be largely unchanged.
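
To illustrate the idea of allocating a whole extent in an in-memory bitmap
before anything is written out, here is a minimal sketch. It is not the libgfs2
code: the function names and constants are hypothetical, and it only assumes
the two-bits-per-block bitmap layout visible in the diffs of this series
(0x03 mask, 0x01 for an allocated block).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 2 bits of state per block, 4 blocks per bitmap byte. */
#define BIT_SIZE      2
#define BLKS_PER_BYTE 4
#define BLKST_FREE    0u
#define BLKST_USED    1u

/* Mark blocks [start, start + len) with `state` in an in-memory bitmap.
 * Nothing is written to disk here; the caller writes the bitmap once,
 * after the whole extent has been marked. */
static void bitmap_set_extent(uint8_t *bitmap, uint32_t nblocks,
                              uint32_t start, uint32_t len, unsigned state)
{
	uint32_t blk;

	assert(start <= nblocks && len <= nblocks - start);
	for (blk = start; blk < start + len; blk++) {
		unsigned shift = (blk % BLKS_PER_BYTE) * BIT_SIZE;
		uint8_t *byte = &bitmap[blk / BLKS_PER_BYTE];

		*byte &= (uint8_t)~(0x3u << shift);   /* clear the old state */
		*byte |= (uint8_t)(state << shift);   /* store the new state */
	}
}

/* Read back the state of a single block. */
static unsigned bitmap_get(const uint8_t *bitmap, uint32_t blk)
{
	unsigned shift = (blk % BLKS_PER_BYTE) * BIT_SIZE;

	return (bitmap[blk / BLKS_PER_BYTE] >> shift) & 0x3u;
}
```

The point of the sketch is that the loop touches only memory, so the cost per
allocated block no longer includes a read and a write of the resource group.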

One thing to note is that, with these patches, the root and master inodes are
no longer the first objects in the first resource group. The master inode is
written in the first free block after the journals and then the other metafs
structures are placed. The root directory inode is then finally created. This
is not a format change, but it may cause some confusion after years of
expecting the root and master inodes to be at certain addresses, so I thought
it worth mentioning.

Coverity and valgrind are happy about these patches and I've encountered no
problems after various tests which mount the fs.

Cheers,
Andy

Andrew Price (19):
  libgfs2: Keep a pointer to the sbd in lgfs2_rgrps_t
  libgfs2: Move bitmap buffers inside struct gfs2_bitmap
  libgfs2: Fix an impossible loop condition in gfs2_rgrp_read
  libgfs2: Introduce struct lgfs2_rbm
  libgfs2: Move struct _lgfs2_rgrps into rgrp.h
  libgfs2: Add functions for finding free extents
  tests: Add unit tests for the new extent search functions
  libgfs2: Ignore an empty rgrp plan if a length is specified
  libgfs2: Add back-pointer to rgrps in lgfs2_rgrp_t
  libgfs2: Const-ify the parameters of print functions
  libgfs2: Allow init_dinode to accept a preallocated bh
  libgfs2: Add extent allocation functions
  libgfs2: Add support for allocating entire rgrp headers
  libgfs2: Write file metadata sequentially
  libgfs2: Fix alignment in lgfs2_rgsize_for_data
  libgfs2: Handle non-zero bitmaps in lgfs2_rgrp_write
  libgfs2: Add a speedier journal data block writing function
  libgfs2: Create jindex directory separately from journals
  mkfs.gfs2: Improve journal creation performance

 .gitignore                  |   3 +-
 gfs2/convert/gfs2_convert.c |  49 +++--
 gfs2/edit/journal.c         |   6 +-
 gfs2/fsck/fs_recovery.c     |   2 +-
 gfs2/fsck/initialize.c      |  27 +--
 gfs2/fsck/metawalk.c        |  10 +-
 gfs2/fsck/pass5.c           |   9 +-
 gfs2/fsck/rgrepair.c        |  14 +-
 gfs2/fsck/util.c            |   2 +-
 gfs2/libgfs2/Makefile.am    |   2 +-
 gfs2/libgfs2/fs_bits.c      |  10 +-
 gfs2/libgfs2/fs_geometry.c  |   6 +-
 gfs2/libgfs2/fs_ops.c       | 184 ++++++++++++++---
 gfs2/libgfs2/libgfs2.h      |  50 +++--
 gfs2/libgfs2/ondisk.c       |  26 +--
 gfs2/libgfs2/rgrp.c         | 491 ++++++++++++++++++++++++++++++++++++--------
 gfs2/libgfs2/rgrp.h         |  50 +++++
 gfs2/libgfs2/structures.c   | 103 +++++++++-
 gfs2/mkfs/main_grow.c       |   4 +-
 gfs2/mkfs/main_mkfs.c       | 155 ++++++++++----
 tests/Makefile.am           |  33 ++-
 tests/check_rgrp.c          | 143 +++++++++++++
 tests/libgfs2.at            |   8 +-
 23 files changed, 1113 insertions(+), 274 deletions(-)
 create mode 100644 gfs2/libgfs2/rgrp.h
 create mode 100644 tests/check_rgrp.c

-- 
1.9.3




* [Cluster-devel] [PATCH 01/19] libgfs2: Keep a pointer to the sbd in lgfs2_rgrps_t
@ 2014-09-02 12:07 ` Andrew Price
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

As a set of resource groups is tied to a particular file system (block
size, device length...) it makes sense to keep a reference to the file
system in the lgfs2_rgrps_t type. This allows us to avoid duplication of
the bsize and device length fields, reduces parameter counts and
provides a convenient way to access file system-specific values in
resource group-related functions.
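
A minimal sketch of the pattern described above, with hypothetical stand-in
types rather than the real struct gfs2_sbd and lgfs2_rgrps_t: the resource
group set borrows a const pointer to the file system descriptor instead of
copying the block size and device length into its own fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the file system descriptor (struct gfs2_sbd). */
struct fs_desc {
	unsigned bsize;        /* block size in bytes */
	uint64_t dev_length;   /* device length in blocks */
};

/* Hypothetical stand-in for the resource group set (lgfs2_rgrps_t). */
struct rg_set {
	const struct fs_desc *fs;  /* borrowed reference, not owned */
	uint64_t align;
	uint64_t align_off;
};

/* One pointer replaces the separate bsize and devlen parameters. */
static void rg_set_init(struct rg_set *rgs, const struct fs_desc *fs,
                        uint64_t align, uint64_t offset)
{
	assert(rgs != NULL && fs != NULL);
	rgs->fs = fs;
	rgs->align = align;
	rgs->align_off = offset;
}

/* Returns 1 if a resource group of `len` blocks at `addr` fits on the device. */
static int rg_fits(const struct rg_set *rgs, uint64_t addr, uint32_t len)
{
	return addr + len <= rgs->fs->dev_length;
}

/* Size in bytes of `nblocks` blocks, using the file system's block size. */
static uint64_t rg_bytes(const struct rg_set *rgs, uint32_t nblocks)
{
	return (uint64_t)nblocks * rgs->fs->bsize;
}
```

One consequence of borrowing the pointer is that the descriptor must outlive
the resource group set, which the caller has to guarantee.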

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/libgfs2.h |  5 +++--
 gfs2/libgfs2/rgrp.c    | 24 +++++++++++-------------
 gfs2/mkfs/main_grow.c  |  2 +-
 gfs2/mkfs/main_mkfs.c  |  2 +-
 4 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 041e5fd..2ba97d6 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -175,6 +175,8 @@ struct gfs2_bitmap
 	uint32_t   bi_len;     /* The number of bytes in this block */
 };
 
+struct gfs2_sbd;
+
 struct rgrp_tree {
 	struct osi_node node;
 	uint64_t start;	   /* The offset of the beginning of this resource group */
@@ -189,7 +191,7 @@ struct rgrp_tree {
 typedef struct rgrp_tree *lgfs2_rgrp_t;
 typedef struct _lgfs2_rgrps *lgfs2_rgrps_t;
 
-extern lgfs2_rgrps_t lgfs2_rgrps_init(unsigned bsize, uint64_t devlen, uint64_t align, uint64_t offset);
+extern lgfs2_rgrps_t lgfs2_rgrps_init(const struct gfs2_sbd *sdp, uint64_t align, uint64_t offset);
 extern void lgfs2_rgrps_free(lgfs2_rgrps_t *rgs);
 extern uint64_t lgfs2_rindex_entry_new(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry, uint64_t addr, uint32_t len);
 extern unsigned lgfs2_rindex_read_fd(int fd, lgfs2_rgrps_t rgs);
@@ -223,7 +225,6 @@ struct special_blocks {
 	uint64_t block;
 };
 
-struct gfs2_sbd;
 struct gfs2_inode {
 	int bh_owned; /* Is this bh owned, iow, should we release it later? */
 	struct gfs2_dinode i_di;
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index 901a7bf..c529594 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -227,10 +227,9 @@ struct rgplan {
 struct _lgfs2_rgrps {
 	struct osi_root root;
 	struct rgplan plan[2];
-	unsigned bsize;
+	const struct gfs2_sbd *sdp;
 	unsigned long align;
 	unsigned long align_off;
-	uint64_t devlen;
 };
 
 static uint64_t align_block(const uint64_t base, const uint64_t align)
@@ -280,8 +279,8 @@ uint32_t lgfs2_rgrp_align_len(const lgfs2_rgrps_t rgs, uint32_t len)
  */
 uint32_t lgfs2_rgrps_plan(const lgfs2_rgrps_t rgs, uint64_t space, uint32_t tgtsize)
 {
-	uint32_t maxlen = (GFS2_MAX_RGSIZE << 20) / rgs->bsize;
-	uint32_t minlen = (GFS2_MIN_RGSIZE << 20) / rgs->bsize;
+	uint32_t maxlen = (GFS2_MAX_RGSIZE << 20) / rgs->sdp->bsize;
+	uint32_t minlen = (GFS2_MIN_RGSIZE << 20) / rgs->sdp->bsize;
 
 	/* Apps should already have checked that the rg size is <=
 	   GFS2_MAX_RGSIZE but just in case alignment pushes it over we clamp
@@ -340,7 +339,7 @@ uint32_t lgfs2_rgrps_plan(const lgfs2_rgrps_t rgs, uint64_t space, uint32_t tgts
  * offset: The required stripe offset of the resource groups
  * Returns an initialised lgfs2_rgrps_t or NULL if unsuccessful with errno set
  */
-lgfs2_rgrps_t lgfs2_rgrps_init(unsigned bsize, uint64_t devlen, uint64_t align, uint64_t offset)
+lgfs2_rgrps_t lgfs2_rgrps_init(const struct gfs2_sbd *sdp, uint64_t align, uint64_t offset)
 {
 	lgfs2_rgrps_t rgs;
 
@@ -352,8 +351,7 @@ lgfs2_rgrps_t lgfs2_rgrps_init(unsigned bsize, uint64_t devlen, uint64_t align,
 	if (rgs == NULL)
 		return NULL;
 
-	rgs->bsize = bsize;
-	rgs->devlen = devlen;
+	rgs->sdp = sdp;
 	rgs->align = align;
 	rgs->align_off = offset;
 	memset(&rgs->root, 0, sizeof(rgs->root));
@@ -451,11 +449,11 @@ uint64_t lgfs2_rindex_entry_new(lgfs2_rgrps_t rgs, struct gfs2_rindex *ri, uint6
 		rgs->plan[plan].num--;
 	}
 
-	if (addr + len > rgs->devlen)
+	if (addr + len > rgs->sdp->device.length)
 		return 0;
 
 	ri->ri_addr = addr;
-	ri->ri_length = rgblocks2bitblocks(rgs->bsize, len, &ri->ri_data);
+	ri->ri_length = rgblocks2bitblocks(rgs->sdp->bsize, len, &ri->ri_data);
 	ri->__pad = 0;
 	ri->ri_data0 = ri->ri_addr + ri->ri_length;
 	ri->ri_bitbytes = ri->ri_data / GFS2_NBBY;
@@ -541,7 +539,7 @@ lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry)
 	rg->rg.rg_header.mh_format = GFS2_FORMAT_RG;
 	rg->rg.rg_free = rg->ri.ri_data;
 
-	compute_bitmaps(rg, rgs->bsize);
+	compute_bitmaps(rg, rgs->sdp->bsize);
 	return rg;
 }
 
@@ -552,7 +550,7 @@ lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry)
 int lgfs2_rgrp_write(const lgfs2_rgrps_t rgs, int fd, const lgfs2_rgrp_t rg)
 {
 	ssize_t ret = 0;
-	size_t len = rg->ri.ri_length * rgs->bsize;
+	size_t len = rg->ri.ri_length * rgs->sdp->bsize;
 	unsigned int i;
 	const struct gfs2_meta_header bmh = {
 		.mh_magic = GFS2_MAGIC,
@@ -565,9 +563,9 @@ int lgfs2_rgrp_write(const lgfs2_rgrps_t rgs, int fd, const lgfs2_rgrp_t rg)
 
 	gfs2_rgrp_out(&rg->rg, buff);
 	for (i = 1; i < rg->ri.ri_length; i++)
-		gfs2_meta_header_out(&bmh, buff + (i * rgs->bsize));
+		gfs2_meta_header_out(&bmh, buff + (i * rgs->sdp->bsize));
 
-	ret = pwrite(fd, buff, len, rg->ri.ri_addr * rgs->bsize);
+	ret = pwrite(fd, buff, len, rg->ri.ri_addr * rgs->sdp->bsize);
 	if (ret != len) {
 		free(buff);
 		return -1;
diff --git a/gfs2/mkfs/main_grow.c b/gfs2/mkfs/main_grow.c
index 5da809a..95fbd1d 100644
--- a/gfs2/mkfs/main_grow.c
+++ b/gfs2/mkfs/main_grow.c
@@ -178,7 +178,7 @@ static lgfs2_rgrps_t rgrps_init(struct gfs2_sbd *sdp)
 	}
 
 	blkid_free_probe(pr);
-	return lgfs2_rgrps_init(sdp->bsize, sdp->device.length, al_base, al_off);
+	return lgfs2_rgrps_init(sdp, al_base, al_off);
 }
 
 /**
diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index f778e8d..bf7c9cd 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -600,7 +600,7 @@ static lgfs2_rgrps_t rgs_init(struct mkfs_opts *opts, struct gfs2_sbd *sdp)
 		}
 	}
 
-	rgs = lgfs2_rgrps_init(sdp->bsize, sdp->device.length, al_base, al_off);
+	rgs = lgfs2_rgrps_init(sdp, al_base, al_off);
 	if (rgs == NULL) {
 		perror(_("Could not initialise resource groups"));
 		exit(-1);
-- 
1.9.3




* [Cluster-devel] [PATCH 02/19] libgfs2: Move bitmap buffers inside struct gfs2_bitmap
@ 2014-09-02 12:07 ` Andrew Price
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Keeping an array of buffers alongside the array of bitmaps meant some
extra management and bookkeeping for arrays of buffer pointers. Move the
buffer pointers into the bitmap structures.

The only downside to this is that reading resource groups gets a little
more complicated and an intermediate array of buffer pointers is created
in gfs2_rgrp_read to make sure we can still use vector i/o. Fortunately
this doesn't cause any noticeable slowdown.
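
The intermediate-array approach can be sketched roughly as follows. The types
and the read_many() helper are hypothetical simplifications standing in for
struct gfs2_buffer_head, struct gfs2_bitmap and a vectored read; the sketch
only shows how a temporary pointer array lets one multi-block read feed
buffers that end up stored inside the per-bitmap structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified buffer head and bitmap descriptor. */
struct buf {
	uint64_t blkno;
};

struct bitmap {
	struct buf *bh;    /* the buffer now lives inside the bitmap */
};

/* Hypothetical stand-in for a vectored read: fills bhs[0..n-1] with
 * buffers for n consecutive blocks starting at `addr`.
 * Returns 0 on success, -1 on failure with nothing left allocated. */
static int read_many(struct buf **bhs, unsigned n, uint64_t addr)
{
	unsigned i;

	for (i = 0; i < n; i++) {
		bhs[i] = malloc(sizeof(*bhs[i]));
		if (bhs[i] == NULL) {
			while (i > 0)
				free(bhs[--i]);
			return -1;
		}
		bhs[i]->blkno = addr + i;
	}
	return 0;
}

/* Read n blocks in one call, then hand each buffer to its bitmap. */
static int rgrp_load(struct bitmap *bits, unsigned n, uint64_t addr)
{
	struct buf **bhs = calloc(n, sizeof(*bhs));
	unsigned i;

	if (bhs == NULL)
		return -1;
	if (read_many(bhs, n, addr) != 0) {
		free(bhs);
		return -1;
	}
	for (i = 0; i < n; i++)
		bits[i].bh = bhs[i];
	free(bhs);   /* only the temporary pointer array is freed */
	return 0;
}

/* Release every buffer and clear the pointers, one bitmap at a time. */
static void rgrp_release(struct bitmap *bits, unsigned n)
{
	unsigned i;

	for (i = 0; i < n; i++) {
		free(bits[i].bh);
		bits[i].bh = NULL;
	}
}
```

The temporary array exists only for the duration of the read, so callers never
see a second array that has to be kept in step with the bitmaps.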

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/convert/gfs2_convert.c | 42 ++++++++++++---------------
 gfs2/edit/journal.c         |  6 ++--
 gfs2/fsck/fs_recovery.c     |  2 +-
 gfs2/fsck/initialize.c      | 15 +++++-----
 gfs2/fsck/metawalk.c        | 10 +++----
 gfs2/fsck/pass5.c           |  9 +++---
 gfs2/fsck/rgrepair.c        | 14 ++++-----
 gfs2/fsck/util.c            |  2 +-
 gfs2/libgfs2/fs_bits.c      | 10 ++++---
 gfs2/libgfs2/fs_geometry.c  |  6 ++--
 gfs2/libgfs2/fs_ops.c       | 16 +++++------
 gfs2/libgfs2/libgfs2.h      |  2 +-
 gfs2/libgfs2/rgrp.c         | 70 ++++++++++++++++++++++-----------------------
 gfs2/libgfs2/structures.c   |  2 +-
 14 files changed, 98 insertions(+), 108 deletions(-)

diff --git a/gfs2/convert/gfs2_convert.c b/gfs2/convert/gfs2_convert.c
index 87bec8c..61ed320 100644
--- a/gfs2/convert/gfs2_convert.c
+++ b/gfs2/convert/gfs2_convert.c
@@ -142,16 +142,18 @@ static void convert_bitmaps(struct gfs2_sbd *sdp, struct rgrp_tree *rg)
 
 	ri = &rg->ri;
 	for (blk = 0; blk < ri->ri_length; blk++) {
+		struct gfs2_bitmap *bi;
 		x = (blk) ? sizeof(struct gfs2_meta_header) :
 			sizeof(struct gfs2_rgrp);
 
+		bi = &rg->bits[blk];
 		for (; x < sdp->bsize; x++)
 			for (y = 0; y < GFS2_NBBY; y++) {
-				state = (rg->bh[blk]->b_data[x] >>
+				state = (bi->bi_bh->b_data[x] >>
 					 (GFS2_BIT_SIZE * y)) & 0x03;
 				if (state == 0x02) {/* unallocated metadata state invalid */
-					rg->bh[blk]->b_data[x] &= ~(0x02 << (GFS2_BIT_SIZE * y));
-					bmodified(rg->bh[blk]);
+					bi->bi_bh->b_data[x] &= ~(0x02 << (GFS2_BIT_SIZE * y));
+					bmodified(bi->bi_bh);
 				}
 			}
 	}
@@ -188,7 +190,7 @@ static int convert_rgs(struct gfs2_sbd *sbp)
 		sbp->dinodes_alloced += rgd1->rg_useddi;
 		convert_bitmaps(sbp, rgd);
 		/* Write the updated rgrp to the gfs2 buffer */
-		gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+		gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 		rgs++;
 		if (rgs % 100 == 0) {
 			printf(".");
@@ -969,8 +971,8 @@ static int next_rg_meta(struct rgrp_tree *rgd, uint64_t *block, int first)
 	}
 	for (; i < length; i++){
 		bits = &rgd->bits[i];
-		blk = gfs2_bitfit((unsigned char *)rgd->bh[i]->b_data +
-				  bits->bi_offset, bits->bi_len, blk, GFS2_BLKST_DINODE);
+		blk = gfs2_bitfit((uint8_t *)bits->bi_bh->b_data + bits->bi_offset,
+		                   bits->bi_len, blk, GFS2_BLKST_DINODE);
 		if(blk != BFITNOENT){
 			*block = blk + (bits->bi_start * GFS2_NBBY) +
 				rgd->ri.ri_data0;
@@ -1073,16 +1075,17 @@ static int inode_renumber(struct gfs2_sbd *sbp, uint64_t root_inode_addr, osi_li
 				byte_bit = (block - rgd->ri.ri_data0) % GFS2_NBBY;
 				/* Now figure out which bitmap block the byte is on */
 				for (blk = 0; blk < rgd->ri.ri_length; blk++) {
+					struct gfs2_bitmap *bi = &rgd->bits[blk];
 					/* figure out offset of first bitmap byte for this map: */
 					buf_offset = (blk) ? sizeof(struct gfs2_meta_header) :
 						sizeof(struct gfs2_rgrp);
 					/* if it's on this page */
 					if (buf_offset + bitmap_byte < sbp->bsize) {
-						rgd->bh[blk]->b_data[buf_offset + bitmap_byte] &=
+						bi->bi_bh->b_data[buf_offset + bitmap_byte] &=
 							~(0x03 << (GFS2_BIT_SIZE * byte_bit));
-						rgd->bh[blk]->b_data[buf_offset + bitmap_byte] |=
+						bi->bi_bh->b_data[buf_offset + bitmap_byte] |=
 							(0x01 << (GFS2_BIT_SIZE * byte_bit));
-						bmodified(rgd->bh[blk]);
+						bmodified(bi->bi_bh);
 						break;
 					}
 					bitmap_byte -= (sbp->bsize - buf_offset);
@@ -1854,29 +1857,20 @@ static int journ_space_to_rg(struct gfs2_sbd *sdp)
 		rgd->rg.rg_free = rgd->ri.ri_data;
 		rgd->ri.ri_bitbytes = rgd->ri.ri_data / GFS2_NBBY;
 
-		if(!(rgd->bh = (struct gfs2_buffer_head **)
-		     malloc(rgd->ri.ri_length *
-			    sizeof(struct gfs2_buffer_head *))))
-			return -1;
-		if(!memset(rgd->bh, 0, rgd->ri.ri_length *
-			   sizeof(struct gfs2_buffer_head *))) {
-			free(rgd->bh);
-			return -1;
-		}
-		for (x = 0; x < rgd->ri.ri_length; x++) {
-			rgd->bh[x] = bget(sdp, rgd->ri.ri_addr + x);
-			memset(rgd->bh[x]->b_data, 0, sdp->bsize);
-		}
 		if (gfs2_compute_bitstructs(sdp->sd_sb.sb_bsize, rgd)) {
 			log_crit(_("gfs2_convert: Error converting bitmaps.\n"));
 			exit(-1);
 		}
+
+		for (x = 0; x < rgd->ri.ri_length; x++)
+			rgd->bits[x].bi_bh = bget(sdp, rgd->ri.ri_addr + x);
+
 		convert_bitmaps(sdp, rgd);
 		for (x = 0; x < rgd->ri.ri_length; x++) {
 			if (x)
-				gfs2_meta_header_out_bh(&mh, rgd->bh[x]);
+				gfs2_meta_header_out_bh(&mh, rgd->bits[x].bi_bh);
 			else
-				gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[x]);
+				gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[x].bi_bh);
 		}
 	} /* for each journal */
 	return error;
diff --git a/gfs2/edit/journal.c b/gfs2/edit/journal.c
index a72a044..5b824a4 100644
--- a/gfs2/edit/journal.c
+++ b/gfs2/edit/journal.c
@@ -259,14 +259,14 @@ static int print_ld_blks(const uint64_t *b, const char *end, int start_line,
 						      sizeof(struct gfs2_meta_header))
 							* GFS2_NBBY;
 					bmap = o / sbd.sd_blocks_per_bitmap;
-					save_bh = rgd->bh[bmap];
+					save_bh = rgd->bits[bmap].bi_bh;
 					j_bmap_bh = bread(&sbd, abs_block +
 							  bcount);
-					rgd->bh[bmap] = j_bmap_bh;
+					rgd->bits[bmap].bi_bh = j_bmap_bh;
 					type = lgfs2_get_bitmap(&sbd, tblk,
 								rgd);
 					brelse(j_bmap_bh);
-					rgd->bh[bmap] = save_bh;
+					rgd->bits[bmap].bi_bh = save_bh;
 					print_gfs2("bit for blk 0x%llx is %d "
 						   "(%s)",
 						   (unsigned long long)tblk,
diff --git a/gfs2/fsck/fs_recovery.c b/gfs2/fsck/fs_recovery.c
index a052487..bc4210d 100644
--- a/gfs2/fsck/fs_recovery.c
+++ b/gfs2/fsck/fs_recovery.c
@@ -652,7 +652,7 @@ int replay_journals(struct gfs2_sbd *sdp, int preen, int force_check,
 			 * don't segfault. */
 			rgd.start = sdp->sb_addr + 1;
 			rgd.length = 1;
-			rgd.bh = NULL;
+			bits.bi_bh = NULL;
 			bits.bi_start = 0;
 			bits.bi_len = sdp->fssize / GFS2_NBBY;
 			rgd.bits = &bits;
diff --git a/gfs2/fsck/initialize.c b/gfs2/fsck/initialize.c
index f7ea45f..02ecd3f 100644
--- a/gfs2/fsck/initialize.c
+++ b/gfs2/fsck/initialize.c
@@ -222,7 +222,7 @@ static void check_rgrp_integrity(struct gfs2_sbd *sdp, struct rgrp_tree *rgd,
 		for (x = 0; x < bytes_to_check; x++) {
 			unsigned char *byte;
 
-			byte = (unsigned char *)&rgd->bh[rgb]->b_data[off + x];
+			byte = (unsigned char *)&rgd->bits[rgb].bi_bh->b_data[off + x];
 			if (*byte == 0x55) {
 				diblock += GFS2_NBBY;
 				continue;
@@ -277,7 +277,7 @@ static void check_rgrp_integrity(struct gfs2_sbd *sdp, struct rgrp_tree *rgd,
 				}
 				*byte &= ~(GFS2_BIT_MASK <<
 					   (GFS2_BIT_SIZE * y));
-				bmodified(rgd->bh[rgb]);
+				bmodified(rgd->bits[rgb].bi_bh);
 				rg_reclaimed++;
 				rg_free++;
 				rgd->rg.rg_free++;
@@ -314,9 +314,9 @@ static void check_rgrp_integrity(struct gfs2_sbd *sdp, struct rgrp_tree *rgd,
 	   will be reported. */
 	if (rg_reclaimed && *fixit) {
 		if (sdp->gfs1)
-			gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bh[0]);
+			gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 		else
-			gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+			gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 		*this_rg_cleaned = 1;
 		log_info( _("The rgrp at %lld (0x%llx) was cleaned of %d "
 			    "free metadata blocks.\n"),
@@ -335,10 +335,9 @@ static void check_rgrp_integrity(struct gfs2_sbd *sdp, struct rgrp_tree *rgd,
 		if (query( _("Fix the rgrp free blocks count? (y/n)"))) {
 			rgd->rg.rg_free = rg_free;
 			if (sdp->gfs1)
-				gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg,
-					     rgd->bh[0]);
+				gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 			else
-				gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+				gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 			*this_rg_fixed = 1;
 			log_err( _("The rgrp was fixed.\n"));
 		} else
@@ -354,7 +353,7 @@ static void check_rgrp_integrity(struct gfs2_sbd *sdp, struct rgrp_tree *rgd,
 			 gfs1rg->rg_freemeta, rg_unlinked);
 		if (query( _("Fix the rgrp free meta blocks count? (y/n)"))) {
 			gfs1rg->rg_freemeta = rg_unlinked;
-			gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bh[0]);
+			gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 			*this_rg_fixed = 1;
 			log_err( _("The rgrp was fixed.\n"));
 		} else
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 8da17c6..329fc3b 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -91,17 +91,15 @@ int check_n_fix_bitmap(struct gfs2_sbd *sdp, uint64_t blk, int error_on_dinode,
 				}
 				rgd->rg.rg_free++;
 				if (sdp->gfs1)
-					gfs_rgrp_out((struct gfs_rgrp *)
-						     &rgd->rg, rgd->bh[0]);
+					gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 				else
-					gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+					gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 			} else if (old_bitmap_state == GFS2_BLKST_FREE) {
 				rgd->rg.rg_free--;
 				if (sdp->gfs1)
-					gfs_rgrp_out((struct gfs_rgrp *)
-						     &rgd->rg, rgd->bh[0]);
+					gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 				else
-					gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+					gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 			}
 			log_err( _("The bitmap was fixed.\n"));
 		} else {
diff --git a/gfs2/fsck/pass5.c b/gfs2/fsck/pass5.c
index 68b1373..b2e8adf 100644
--- a/gfs2/fsck/pass5.c
+++ b/gfs2/fsck/pass5.c
@@ -218,9 +218,8 @@ static void update_rgrp(struct gfs2_sbd *sdp, struct rgrp_tree *rgp,
 		bits = &rgp->bits[i];
 
 		/* update the bitmaps */
-		if (check_block_status(sdp, rgp->bh[i]->b_data +
-				       bits->bi_offset, bits->bi_len,
-				       &rg_block, rgp->ri.ri_data0, count))
+		if (check_block_status(sdp, bits->bi_bh->b_data + bits->bi_offset,
+		                       bits->bi_len, &rg_block, rgp->ri.ri_data0, count))
 			return;
 		if (skip_this_pass || fsck_abort) /* if asked to skip the rest */
 			return;
@@ -275,9 +274,9 @@ static void update_rgrp(struct gfs2_sbd *sdp, struct rgrp_tree *rgp,
 			log_warn( _("Resource group counts updated\n"));
 			/* write out the rgrp */
 			if (sdp->gfs1)
-				gfs_rgrp_out(gfs1rg, rgp->bh[0]);
+				gfs_rgrp_out(gfs1rg, rgp->bits[0].bi_bh);
 			else
-				gfs2_rgrp_out_bh(&rgp->rg, rgp->bh[0]);
+				gfs2_rgrp_out_bh(&rgp->rg, rgp->bits[0].bi_bh);
 		} else
 			log_err( _("Resource group counts left inconsistent\n"));
 	}
diff --git a/gfs2/fsck/rgrepair.c b/gfs2/fsck/rgrepair.c
index 0466dd8..dd197d6 100644
--- a/gfs2/fsck/rgrepair.c
+++ b/gfs2/fsck/rgrepair.c
@@ -658,14 +658,14 @@ static int rewrite_rg_block(struct gfs2_sbd *sdp, struct rgrp_tree *rg,
 		 (int)x+1, (int)rg->ri.ri_length, typedesc);
 	if (query( _("Fix the Resource Group? (y/n)"))) {
 		log_err( _("Attempting to repair the rgrp.\n"));
-		rg->bh[x] = bread(sdp, rg->ri.ri_addr + x);
+		rg->bits[x].bi_bh = bread(sdp, rg->ri.ri_addr + x);
 		if (x) {
 			struct gfs2_meta_header mh;
 
 			mh.mh_magic = GFS2_MAGIC;
 			mh.mh_type = GFS2_METATYPE_RB;
 			mh.mh_format = GFS2_FORMAT_RB;
-			gfs2_meta_header_out_bh(&mh, rg->bh[x]);
+			gfs2_meta_header_out_bh(&mh, rg->bits[x].bi_bh);
 		} else {
 			if (sdp->gfs1)
 				memset(&rg->rg, 0, sizeof(struct gfs_rgrp));
@@ -676,13 +676,12 @@ static int rewrite_rg_block(struct gfs2_sbd *sdp, struct rgrp_tree *rg,
 			rg->rg.rg_header.mh_format = GFS2_FORMAT_RG;
 			rg->rg.rg_free = rg->ri.ri_data;
 			if (sdp->gfs1)
-				gfs_rgrp_out((struct gfs_rgrp *)&rg->rg,
-					     rg->bh[x]);
+				gfs_rgrp_out((struct gfs_rgrp *)&rg->rg, rg->bits[x].bi_bh);
 			else
-				gfs2_rgrp_out_bh(&rg->rg, rg->bh[x]);
+				gfs2_rgrp_out_bh(&rg->rg, rg->bits[x].bi_bh);
 		}
-		brelse(rg->bh[x]);
-		rg->bh[x] = NULL;
+		brelse(rg->bits[x].bi_bh);
+		rg->bits[x].bi_bh = NULL;
 		return 0;
 	}
 	return 1;
@@ -712,7 +711,6 @@ static int expect_rindex_sanity(struct gfs2_sbd *sdp, int *num_rgs)
 		memcpy(&exp->ri, &rgd->ri, sizeof(exp->ri));
 		memcpy(&exp->rg, &rgd->rg, sizeof(exp->rg));
 		exp->bits = NULL;
-		exp->bh = NULL;
 		gfs2_compute_bitstructs(sdp->sd_sb.sb_bsize, exp);
 	}
 	sdp->rgrps = *num_rgs;
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index 8b439a9..a0f6009 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -680,7 +680,7 @@ uint64_t find_free_blk(struct gfs2_sbd *sdp)
 	rg = &rl->rg;
 
 	for (block = 0; block < ri->ri_length; block++) {
-		bh = rl->bh[block];
+		bh = rl->bits[block].bi_bh;
 		x = (block) ? sizeof(struct gfs2_meta_header) : sizeof(struct gfs2_rgrp);
 
 		for (; x < sdp->bsize; x++)
diff --git a/gfs2/libgfs2/fs_bits.c b/gfs2/libgfs2/fs_bits.c
index 7194949..93ddc13 100644
--- a/gfs2/libgfs2/fs_bits.c
+++ b/gfs2/libgfs2/fs_bits.c
@@ -148,7 +148,7 @@ int gfs2_set_bitmap(lgfs2_rgrp_t rgd, uint64_t blkno, int state)
 
 	if (bits == NULL)
 		return -1;
-	byte = (unsigned char *)(rgd->bh[buf]->b_data + bits->bi_offset) +
+	byte = (unsigned char *)(bits->bi_bh->b_data + bits->bi_offset) +
 		(rgrp_block/GFS2_NBBY - bits->bi_start);
 	bit = (rgrp_block % GFS2_NBBY) * GFS2_BIT_SIZE;
 
@@ -156,7 +156,7 @@ int gfs2_set_bitmap(lgfs2_rgrp_t rgd, uint64_t blkno, int state)
 	*byte ^= cur_state << bit;
 	*byte |= state << bit;
 
-	bmodified(rgd->bh[buf]);
+	bmodified(bits->bi_bh);
 	return 0;
 }
 
@@ -181,6 +181,7 @@ int lgfs2_get_bitmap(struct gfs2_sbd *sdp, uint64_t blkno, struct rgrp_tree *rgd
 	uint32_t i = 0;
 	char *byte;
 	unsigned int bit;
+	struct gfs2_bitmap *bi;
 
 	if (rgd == NULL) {
 		rgd = gfs2_blk2rgrpd(sdp, blkno);
@@ -205,10 +206,11 @@ int lgfs2_get_bitmap(struct gfs2_sbd *sdp, uint64_t blkno, struct rgrp_tree *rgd
 		offset -= i * sdp->sd_blocks_per_bitmap;
 	}
 
-	if (!rgd->bh || !rgd->bh[i])
+	bi = &rgd->bits[i];
+	if (!bi->bi_bh)
 		return GFS2_BLKST_FREE;
 
-	byte = (rgd->bh[i]->b_data + rgd->bits[i].bi_offset) + (offset/GFS2_NBBY);
+	byte = (bi->bi_bh->b_data + bi->bi_offset) + (offset/GFS2_NBBY);
 	bit = (offset % GFS2_NBBY) * GFS2_BIT_SIZE;
 
 	return (*byte >> bit) & GFS2_BIT_MASK;
diff --git a/gfs2/libgfs2/fs_geometry.c b/gfs2/libgfs2/fs_geometry.c
index 587ceb8..c378dba 100644
--- a/gfs2/libgfs2/fs_geometry.c
+++ b/gfs2/libgfs2/fs_geometry.c
@@ -213,11 +213,11 @@ int build_rgrps(struct gfs2_sbd *sdp, int do_write)
 
 		if (do_write) {
 			for (x = 0; x < bitblocks; x++) {
-				rl->bh[x] = bget(sdp, rl->start + x);
+				rl->bits[x].bi_bh = bget(sdp, rl->start + x);
 				if (x)
-					gfs2_meta_header_out_bh(&mh, rl->bh[x]);
+					gfs2_meta_header_out_bh(&mh, rl->bits[x].bi_bh);
 				else
-					gfs2_rgrp_out_bh(&rl->rg, rl->bh[x]);
+					gfs2_rgrp_out_bh(&rl->rg, rl->bits[x].bi_bh);
 			}
 		}
 
diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index 015b974..aa95aa8 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -126,7 +126,7 @@ static uint64_t find_free_block(struct rgrp_tree *rgd)
 		unsigned long blk = 0;
 		struct gfs2_bitmap *bits = &rgd->bits[bm];
 
-		blk = gfs2_bitfit((unsigned char *)rgd->bh[bm]->b_data + bits->bi_offset,
+		blk = gfs2_bitfit((uint8_t *)bits->bi_bh->b_data + bits->bi_offset,
 		                  bits->bi_len, blk, GFS2_BLKST_FREE);
 		if (blk != BFITNOENT) {
 			blkno = blk + (bits->bi_start * GFS2_NBBY) + rgd->ri.ri_data0;
@@ -149,9 +149,9 @@ static int blk_alloc_in_rg(struct gfs2_sbd *sdp, unsigned state, struct rgrp_tre
 
 	rgd->rg.rg_free--;
 	if (sdp->gfs1)
-		gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bh[0]);
+		gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 	else
-		gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+		gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 
 	sdp->blks_alloced++;
 	return 0;
@@ -178,7 +178,7 @@ static int block_alloc(struct gfs2_sbd *sdp, const uint64_t blksreq, int state,
 	if (rgt == NULL)
 		return -1;
 
-	if (rgt->bh[0] == NULL) {
+	if (rgt->bits[0].bi_bh == NULL) {
 		if (gfs2_rgrp_read(sdp, rgt))
 			return -1;
 		release = 1;
@@ -1767,9 +1767,9 @@ void gfs2_free_block(struct gfs2_sbd *sdp, uint64_t block)
 		gfs2_set_bitmap(rgd, block, GFS2_BLKST_FREE);
 		rgd->rg.rg_free++; /* adjust the free count */
 		if (sdp->gfs1)
-			gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bh[0]);
+			gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 		else
-			gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+			gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 		sdp->blks_alloced--;
 	}
 }
@@ -1836,9 +1836,9 @@ int gfs2_freedi(struct gfs2_sbd *sdp, uint64_t diblock)
 	rgd->rg.rg_free++;
 	rgd->rg.rg_dinodes--;
 	if (sdp->gfs1)
-		gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bh[0]);
+		gfs_rgrp_out((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 	else
-		gfs2_rgrp_out_bh(&rgd->rg, rgd->bh[0]);
+		gfs2_rgrp_out_bh(&rgd->rg, rgd->bits[0].bi_bh);
 	sdp->dinodes_alloced--;
 	return 0;
 }
diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 2ba97d6..71da81e 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -170,6 +170,7 @@ struct device {
 
 struct gfs2_bitmap
 {
+	struct gfs2_buffer_head *bi_bh;
 	uint32_t   bi_offset;  /* The offset in the buffer of the first byte */
 	uint32_t   bi_start;   /* The position of the first byte in this block */
 	uint32_t   bi_len;     /* The number of bytes in this block */
@@ -185,7 +186,6 @@ struct rgrp_tree {
 	struct gfs2_rindex ri;
 	struct gfs2_rgrp rg;
 	struct gfs2_bitmap *bits;
-	struct gfs2_buffer_head **bh;
 };
 
 typedef struct rgrp_tree *lgfs2_rgrp_t;
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index c529594..e929846 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -67,11 +67,6 @@ int gfs2_compute_bitstructs(const uint32_t bsize, struct rgrp_tree *rgd)
 	    rgd->bits[length - 1].bi_len) * GFS2_NBBY != rgd->ri.ri_data)
 		goto errbits;
 
-	if (rgd->bh == NULL) {
-		rgd->bh = calloc(length, sizeof(struct gfs2_buffer_head *));
-		if (rgd->bh == NULL)
-			goto errbits;
-	}
 	return 0;
 errbits:
 	if (ownbits)
@@ -108,8 +103,9 @@ struct rgrp_tree *gfs2_blk2rgrpd(struct gfs2_sbd *sdp, uint64_t blk)
  */
 uint64_t gfs2_rgrp_read(struct gfs2_sbd *sdp, struct rgrp_tree *rgd)
 {
-	int x, length = rgd->ri.ri_length;
+	unsigned x, length = rgd->ri.ri_length;
 	uint64_t max_rgrp_bitbytes, max_rgrp_len;
+	struct gfs2_buffer_head **bhs;
 
 	/* Max size of an rgrp is 2GB.  Figure out how many blocks that is: */
 	max_rgrp_bitbytes = ((2147483648 / sdp->bsize) / GFS2_NBBY);
@@ -118,27 +114,38 @@ uint64_t gfs2_rgrp_read(struct gfs2_sbd *sdp, struct rgrp_tree *rgd)
 		return -1;
 	if (gfs2_check_range(sdp, rgd->ri.ri_addr))
 		return -1;
-	if (breadm(sdp, rgd->bh, length, rgd->ri.ri_addr))
+
+	bhs = calloc(length, sizeof(struct gfs2_buffer_head *));
+	if (bhs == NULL)
+		return -1;
+
+	if (breadm(sdp, bhs, length, rgd->ri.ri_addr)) {
+		free(bhs);
 		return -1;
-	for (x = 0; x < length; x++){
-		if(gfs2_check_meta(rgd->bh[x], (x) ? GFS2_METATYPE_RB : GFS2_METATYPE_RG))
-		{
-			uint64_t error;
+	}
+
+	for (x = 0; x < length; x++) {
+		struct gfs2_bitmap *bi = &rgd->bits[x];
+		int mtype = (x ? GFS2_METATYPE_RB : GFS2_METATYPE_RG);
 
-			error = rgd->ri.ri_addr + x;
+		bi->bi_bh = bhs[x];
+		if (gfs2_check_meta(bi->bi_bh, mtype)) {
+			unsigned err = x;
 			for (; x >= 0; x--) {
-				brelse(rgd->bh[x]);
-				rgd->bh[x] = NULL;
+				brelse(rgd->bits[x].bi_bh);
+				rgd->bits[x].bi_bh = NULL;
 			}
-			return error;
+			free(bhs);
+			return rgd->ri.ri_addr + err;
 		}
 	}
-	if (rgd->bh && rgd->bh[0]) {
+	if (x > 0) {
 		if (sdp->gfs1)
-			gfs_rgrp_in((struct gfs_rgrp *)&rgd->rg, rgd->bh[0]);
+			gfs_rgrp_in((struct gfs_rgrp *)&rgd->rg, rgd->bits[0].bi_bh);
 		else
-			gfs2_rgrp_in(&rgd->rg, rgd->bh[0]);
+			gfs2_rgrp_in(&rgd->rg, rgd->bits[0].bi_bh);
 	}
+	free(bhs);
 	return 0;
 }
 
@@ -147,10 +154,9 @@ void gfs2_rgrp_relse(struct rgrp_tree *rgd)
 	int x, length = rgd->ri.ri_length;
 
 	for (x = 0; x < length; x++) {
-		if (rgd->bh) {
-			if (rgd->bh[x])
-				brelse(rgd->bh[x]);
-			rgd->bh[x] = NULL;
+		if (rgd->bits[x].bi_bh) {
+			brelse(rgd->bits[x].bi_bh);
+			rgd->bits[x].bi_bh = NULL;
 		}
 	}
 }
@@ -193,11 +199,12 @@ void gfs2_rgrp_free(struct osi_root *rgrp_tree)
 
 	while ((n = osi_first(rgrp_tree))) {
 		rgd = (struct rgrp_tree *)n;
-		if (rgd->bh && rgd->bh[0]) { /* if a buffer exists        */
+
+		if (rgd->bits[0].bi_bh) { /* if a buffer exists */
 			rgs_since_sync++;
 			if (rgs_since_sync >= RG_SYNC_TOLERANCE) {
 				if (!sdp)
-					sdp = rgd->bh[0]->sdp;
+					sdp = rgd->bits[0].bi_bh->sdp;
 				fsync(sdp->device_fd);
 				rgs_since_sync = 0;
 			}
@@ -205,10 +212,6 @@ void gfs2_rgrp_free(struct osi_root *rgrp_tree)
 		}
 		if(rgd->bits)
 			free(rgd->bits);
-		if(rgd->bh) {
-			free(rgd->bh);
-			rgd->bh = NULL;
-		}
 		osi_erase(&rgd->node, rgrp_tree);
 		free(rgd);
 	}
@@ -407,9 +410,9 @@ void lgfs2_rgrps_free(lgfs2_rgrps_t *rgs)
 	while ((rg = (struct rgrp_tree *)osi_first(tree))) {
 		int i;
 		for (i = 0; i < rg->ri.ri_length; i++) {
-			if (rg->bh[i] != NULL) {
-				free(rg->bh[i]);
-				rg->bh[i] = NULL;
+			if (rg->bits[i].bi_bh != NULL) {
+				free(rg->bits[i].bi_bh);
+				rg->bits[i].bi_bh = NULL;
 			}
 		}
 		osi_erase(&rg->node, tree);
@@ -521,14 +524,11 @@ lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry)
 		link = &lastrg->node.osi_right;
 	}
 
-	rg = calloc(1, sizeof(*rg) +
-	              (entry->ri_length * sizeof(struct gfs2_bitmap)) +
-	              (entry->ri_length * sizeof(struct gfs2_buffer_head *)));
+	rg = calloc(1, sizeof(*rg) + (entry->ri_length * sizeof(struct gfs2_bitmap)));
 	if (rg == NULL)
 		return NULL;
 
 	rg->bits = (struct gfs2_bitmap *)(rg + 1);
-	rg->bh = (struct gfs2_buffer_head **)(rg->bits + entry->ri_length);
 
 	osi_link_node(&rg->node, parent, link);
 	osi_insert_color(&rg->node, &rgs->root);
diff --git a/gfs2/libgfs2/structures.c b/gfs2/libgfs2/structures.c
index ee49dce..9d90657 100644
--- a/gfs2/libgfs2/structures.c
+++ b/gfs2/libgfs2/structures.c
@@ -560,7 +560,7 @@ unsigned lgfs2_bm_scan(struct rgrp_tree *rgd, unsigned idx, uint64_t *buf, uint8
 	uint32_t blk = 0;
 
 	while(blk < (bi->bi_len * GFS2_NBBY)) {
-		blk = gfs2_bitfit((const unsigned char *)rgd->bh[idx]->b_data + bi->bi_offset,
+		blk = gfs2_bitfit((uint8_t *)bi->bi_bh->b_data + bi->bi_offset,
 				  bi->bi_len, blk, state);
 		if (blk == BFITNOENT)
 			break;
-- 
1.9.3




* [Cluster-devel] [PATCH 03/19] libgfs2: Fix an impossible loop condition in gfs2_rgrp_read
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 01/19] libgfs2: Keep a pointer to the sbd in lgfs2_rgrps_t Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 02/19] libgfs2: Move bitmap buffers inside struct gfs2_bitmap Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 04/19] libgfs2: Introduce struct lgfs2_rbm Andrew Price
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Correct a loop which expects an unsigned int to become negative.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/rgrp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index e929846..56b73ae 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -131,10 +131,10 @@ uint64_t gfs2_rgrp_read(struct gfs2_sbd *sdp, struct rgrp_tree *rgd)
 		bi->bi_bh = bhs[x];
 		if (gfs2_check_meta(bi->bi_bh, mtype)) {
 			unsigned err = x;
-			for (; x >= 0; x--) {
+			do {
 				brelse(rgd->bits[x].bi_bh);
 				rgd->bits[x].bi_bh = NULL;
-			}
+			} while (x-- != 0);
 			free(bhs);
 			return rgd->ri.ri_addr + err;
 		}
-- 
1.9.3
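
The loop fix in the patch above depends on a property of unsigned arithmetic in C: an unsigned counter is never less than zero, so `x >= 0` is always true, and decrementing past zero wraps around to a large value. The following is a minimal standalone sketch of the do/while countdown idiom the patch adopts. The names `release_down_from` and `slots` are hypothetical and not part of libgfs2.

```c
#include <assert.h>

/*
 * Clear slots[x], slots[x-1], ..., slots[0] using an unsigned index.
 * Hypothetical helper for illustration only; not part of libgfs2.
 * Returns the number of slots visited.
 */
unsigned release_down_from(int *slots, unsigned x)
{
	unsigned visited = 0;

	assert(slots != 0);

	/*
	 * "for (; x >= 0; x--)" would never terminate here: x is unsigned,
	 * so the condition is always true and x wraps around after 0.
	 */
	do {
		slots[x] = 0;
		visited++;
	} while (x-- != 0);	/* compare the old value, then decrement */

	return visited;
}
```

The post-decrement in the loop condition compares the value of `x` before it is decremented, so index 0 is processed exactly once before the loop exits.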




* [Cluster-devel] [PATCH 04/19] libgfs2: Introduce struct lgfs2_rbm
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (2 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 03/19] libgfs2: Fix an impossible loop condition in gfs2_rgrp_read Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 05/19] libgfs2: Move struct _lgfs2_rgrps into rgrp.h Andrew Price
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Add struct lgfs2_rbm, which is similar to struct gfs2_rbm in the kernel,
in order to support the coming work on extent allocation in libgfs2.

This structure and its supporting functions are added to a new private
header file rather than libgfs2.h until we have reason to export it.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/Makefile.am |  2 +-
 gfs2/libgfs2/rgrp.c      | 67 ++++++++++++++++++++++++++++++++++++++++++++++++
 gfs2/libgfs2/rgrp.h      | 29 +++++++++++++++++++++
 3 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 gfs2/libgfs2/rgrp.h

diff --git a/gfs2/libgfs2/Makefile.am b/gfs2/libgfs2/Makefile.am
index 4af27be..1ce8c13 100644
--- a/gfs2/libgfs2/Makefile.am
+++ b/gfs2/libgfs2/Makefile.am
@@ -5,7 +5,7 @@ BUILT_SOURCES		= parser.h lexer.h
 AM_LFLAGS		= --header-file=lexer.h
 AM_YFLAGS		= -d
 
-noinst_HEADERS		= libgfs2.h lang.h config.h
+noinst_HEADERS		= libgfs2.h lang.h config.h rgrp.h
 
 noinst_LTLIBRARIES	= libgfs2.la
 
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index 56b73ae..dd8811b 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -1,12 +1,14 @@
 #include "clusterautoconfig.h"
 
 #include <inttypes.h>
+#include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 
 #include "libgfs2.h"
+#include "rgrp.h"
 
 #define RG_SYNC_TOLERANCE 1000
 
@@ -589,3 +591,68 @@ lgfs2_rgrp_t lgfs2_rgrp_last(lgfs2_rgrps_t rgs)
 {
 	return (lgfs2_rgrp_t)osi_last(&rgs->root);
 }
+
+/**
+ * lgfs2_rbm_from_block - Set the rbm based upon rgd and block number
+ * @rbm: The rbm with rgd already set correctly
+ * @block: The block number (filesystem relative)
+ *
+ * This sets the bi and offset members of an rbm based on a
+ * resource group and a filesystem relative block number. The
+ * resource group must be set in the rbm on entry, the bi and
+ * offset members will be set by this function.
+ *
+ * Returns: 0 on success, or non-zero with errno set
+ */
+static int lgfs2_rbm_from_block(struct lgfs2_rbm *rbm, uint64_t block)
+{
+	uint64_t rblock = block - rbm->rgd->ri.ri_data0;
+	struct gfs2_sbd *sdp = rbm_bi(rbm)->bi_bh->sdp;
+
+	if (rblock > UINT_MAX) {
+		errno = EINVAL;
+		return 1;
+	}
+	if (block >= rbm->rgd->ri.ri_data0 + rbm->rgd->ri.ri_data) {
+		errno = E2BIG;
+		return 1;
+	}
+
+	rbm->bii = 0;
+	rbm->offset = (uint32_t)(rblock);
+	/* Check if the block is within the first block */
+	if (rbm->offset < (rbm_bi(rbm)->bi_len * GFS2_NBBY))
+		return 0;
+
+	/* Adjust for the size diff between gfs2_meta_header and gfs2_rgrp */
+	rbm->offset += (sizeof(struct gfs2_rgrp) -
+			sizeof(struct gfs2_meta_header)) * GFS2_NBBY;
+	rbm->bii = rbm->offset / sdp->sd_blocks_per_bitmap;
+	rbm->offset -= rbm->bii * sdp->sd_blocks_per_bitmap;
+	return 0;
+}
+
+/**
+ * lgfs2_rbm_incr - increment an rbm structure
+ * @rbm: The rbm with rgd already set correctly
+ *
+ * This function takes an existing rbm structure and increments it to the next
+ * viable block offset.
+ *
+ * Returns: If incrementing the offset would cause the rbm to go past the
+ *          end of the rgrp, true is returned, otherwise false.
+ *
+ */
+static int lgfs2_rbm_incr(struct lgfs2_rbm *rbm)
+{
+	if (rbm->offset + 1 < (rbm_bi(rbm)->bi_len * GFS2_NBBY)) { /* in the same bitmap */
+		rbm->offset++;
+		return 0;
+	}
+	if (rbm->bii == rbm->rgd->ri.ri_length - 1) /* at the last bitmap */
+		return 1;
+
+	rbm->offset = 0;
+	rbm->bii++;
+	return 0;
+}
diff --git a/gfs2/libgfs2/rgrp.h b/gfs2/libgfs2/rgrp.h
new file mode 100644
index 0000000..99c52d3
--- /dev/null
+++ b/gfs2/libgfs2/rgrp.h
@@ -0,0 +1,29 @@
+#ifndef __RGRP_DOT_H__
+#define __RGRP_DOT_H__
+
+#include "libgfs2.h"
+
+struct lgfs2_rbm {
+	lgfs2_rgrp_t rgd;
+	uint32_t offset;    /* The offset is bitmap relative */
+	unsigned bii;       /* Bitmap index */
+};
+
+static inline struct gfs2_bitmap *rbm_bi(const struct lgfs2_rbm *rbm)
+{
+	return rbm->rgd->bits + rbm->bii;
+}
+
+static inline uint64_t lgfs2_rbm_to_block(const struct lgfs2_rbm *rbm)
+{
+	return rbm->rgd->ri.ri_data0 + (rbm_bi(rbm)->bi_start * GFS2_NBBY) +
+	        rbm->offset;
+}
+
+static inline int lgfs2_rbm_eq(const struct lgfs2_rbm *rbm1, const struct lgfs2_rbm *rbm2)
+{
+	return (rbm1->rgd == rbm2->rgd) && (rbm1->bii == rbm2->bii) &&
+	        (rbm1->offset == rbm2->offset);
+}
+
+#endif /* __RGRP_DOT_H__ */
-- 
1.9.3
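
The relationship between a filesystem block number and the (bitmap index, offset) pair held in struct lgfs2_rbm can be illustrated with a simplified model. This is a sketch under stated assumptions, not the libgfs2 code: the `toy_` types and functions are hypothetical stand-ins, and the lookup walks the bitmaps linearly instead of using the arithmetic shortcut in lgfs2_rbm_from_block. The only facts taken from the patch are that each bitmap byte describes GFS2_NBBY (4) blocks and that the block number is `ri_data0 + bi_start * GFS2_NBBY + offset`.

```c
#include <assert.h>
#include <stdint.h>

#define NBBY 4	/* blocks described per bitmap byte, as GFS2_NBBY */

/* Simplified, hypothetical stand-in for struct gfs2_bitmap */
struct toy_bitmap {
	uint32_t start;	/* first byte of this bitmap in the rgrp's bitmap space */
	uint32_t len;	/* bytes of bitmap data held in this bitmap block */
};

/* Simplified, hypothetical stand-in for struct lgfs2_rbm plus its rgrp */
struct toy_rbm {
	const struct toy_bitmap *bits;
	unsigned nbits;		/* number of bitmap blocks */
	uint64_t data0;		/* first data block of the resource group */
	uint32_t offset;	/* bitmap-relative block offset */
	unsigned bii;		/* bitmap index */
};

/* Mirrors the formula in lgfs2_rbm_to_block */
uint64_t toy_rbm_to_block(const struct toy_rbm *rbm)
{
	assert(rbm->bii < rbm->nbits);
	return rbm->data0 + (uint64_t)rbm->bits[rbm->bii].start * NBBY + rbm->offset;
}

/* Returns 0 on success, 1 if the block lies outside the resource group */
int toy_rbm_from_block(struct toy_rbm *rbm, uint64_t block)
{
	uint64_t rblock;
	unsigned i;

	if (block < rbm->data0)
		return 1;
	rblock = block - rbm->data0;
	for (i = 0; i < rbm->nbits; i++) {
		uint64_t first = (uint64_t)rbm->bits[i].start * NBBY;
		uint64_t count = (uint64_t)rbm->bits[i].len * NBBY;

		if (rblock >= first && rblock < first + count) {
			rbm->bii = i;
			rbm->offset = (uint32_t)(rblock - first);
			return 0;
		}
	}
	return 1;	/* rbm is left unchanged on failure */
}
```

With two bitmaps of 10 and 20 bytes and `data0` set to 100, block 140 maps to bitmap index 1 at offset 0, and converting back yields 140 again.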




* [Cluster-devel] [PATCH 05/19] libgfs2: Move struct _lgfs2_rgrps into rgrp.h
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (3 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 04/19] libgfs2: Introduce struct lgfs2_rbm Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 06/19] libgfs2: Add functions for finding free extents Andrew Price
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Other parts of libgfs2 will need to know the struct behind lgfs2_rgrps_t
in future patches so move it into rgrp.h. Also make its sdp field
non-const as the struct it points to will be modified.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/libgfs2.h |  2 +-
 gfs2/libgfs2/rgrp.c    | 20 +-------------------
 gfs2/libgfs2/rgrp.h    | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 71da81e..b996be9 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -191,7 +191,7 @@ struct rgrp_tree {
 typedef struct rgrp_tree *lgfs2_rgrp_t;
 typedef struct _lgfs2_rgrps *lgfs2_rgrps_t;
 
-extern lgfs2_rgrps_t lgfs2_rgrps_init(const struct gfs2_sbd *sdp, uint64_t align, uint64_t offset);
+extern lgfs2_rgrps_t lgfs2_rgrps_init(struct gfs2_sbd *sdp, uint64_t align, uint64_t offset);
 extern void lgfs2_rgrps_free(lgfs2_rgrps_t *rgs);
 extern uint64_t lgfs2_rindex_entry_new(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry, uint64_t addr, uint32_t len);
 extern unsigned lgfs2_rindex_read_fd(int fd, lgfs2_rgrps_t rgs);
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index dd8811b..0f36b86 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -219,24 +219,6 @@ void gfs2_rgrp_free(struct osi_root *rgrp_tree)
 	}
 }
 
-struct rgplan {
-	uint32_t num;
-	uint32_t len;
-};
-
-/**
- * This structure is defined in libgfs2.h as an opaque type. It stores the
- * constants and context required for creating resource groups from any point
- * in an application.
- */
-struct _lgfs2_rgrps {
-	struct osi_root root;
-	struct rgplan plan[2];
-	const struct gfs2_sbd *sdp;
-	unsigned long align;
-	unsigned long align_off;
-};
-
 static uint64_t align_block(const uint64_t base, const uint64_t align)
 {
 	if ((align > 0) && ((base % align) > 0))
@@ -344,7 +326,7 @@ uint32_t lgfs2_rgrps_plan(const lgfs2_rgrps_t rgs, uint64_t space, uint32_t tgts
  * offset: The required stripe offset of the resource groups
  * Returns an initialised lgfs2_rgrps_t or NULL if unsuccessful with errno set
  */
-lgfs2_rgrps_t lgfs2_rgrps_init(const struct gfs2_sbd *sdp, uint64_t align, uint64_t offset)
+lgfs2_rgrps_t lgfs2_rgrps_init(struct gfs2_sbd *sdp, uint64_t align, uint64_t offset)
 {
 	lgfs2_rgrps_t rgs;
 
diff --git a/gfs2/libgfs2/rgrp.h b/gfs2/libgfs2/rgrp.h
index 99c52d3..384231e 100644
--- a/gfs2/libgfs2/rgrp.h
+++ b/gfs2/libgfs2/rgrp.h
@@ -3,6 +3,24 @@
 
 #include "libgfs2.h"
 
+struct rgplan {
+	uint32_t num;
+	uint32_t len;
+};
+
+/**
+ * This structure is defined in libgfs2.h as an opaque type. It stores the
+ * constants and context required for creating resource groups from any point
+ * in an application.
+ */
+struct _lgfs2_rgrps {
+	struct osi_root root;
+	struct rgplan plan[2];
+	struct gfs2_sbd *sdp;
+	unsigned long align;
+	unsigned long align_off;
+};
+
 struct lgfs2_rbm {
 	lgfs2_rgrp_t rgd;
 	uint32_t offset;    /* The offset is bitmap relative */
-- 
1.9.3




* [Cluster-devel] [PATCH 06/19] libgfs2: Add functions for finding free extents
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (4 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 05/19] libgfs2: Move struct _lgfs2_rgrps into rgrp.h Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-03 10:17   ` Steven Whitehouse
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 07/19] tests: Add unit tests for the new extent search functions Andrew Price
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Port gfs2_rbm_find and some functions which it depends on from the gfs2
kernel code. This will set the base for allocation of single-extent
files. The functions have been simplified where possible as libgfs2
doesn't have a concept of reservations for the time being.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/rgrp.c | 197 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 gfs2/libgfs2/rgrp.h |   2 +
 2 files changed, 199 insertions(+)

diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index 0f36b86..7063288 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -638,3 +638,200 @@ static int lgfs2_rbm_incr(struct lgfs2_rbm *rbm)
 	rbm->bii++;
 	return 0;
 }
+
+/**
+ * lgfs2_testbit - test a bit in the bitmaps
+ * @rbm: The bit to test
+ *
+ * Returns: The two bit block state of the requested bit
+ */
+static inline uint8_t lgfs2_testbit(const struct lgfs2_rbm *rbm)
+{
+	struct gfs2_bitmap *bi = rbm_bi(rbm);
+	const uint8_t *buffer = (uint8_t *)bi->bi_bh->b_data + bi->bi_offset;
+	const uint8_t *byte;
+	unsigned int bit;
+
+	byte = buffer + (rbm->offset / GFS2_NBBY);
+	bit = (rbm->offset % GFS2_NBBY) * GFS2_BIT_SIZE;
+
+	return (*byte >> bit) & GFS2_BIT_MASK;
+}
+
+/**
+ * lgfs2_unaligned_extlen - Look for free blocks which are not byte aligned
+ * @rbm: Position to search (value/result)
+ * @n_unaligned: Number of unaligned blocks to check
+ * @len: Decremented for each block found (terminate on zero)
+ *
+ * Returns: true on a non-free block, on @len reaching zero or at the rgrp end
+ */
+static int lgfs2_unaligned_extlen(struct lgfs2_rbm *rbm, uint32_t n_unaligned, uint32_t *len)
+{
+	uint32_t n;
+	uint8_t res;
+
+	for (n = 0; n < n_unaligned; n++) {
+		res = lgfs2_testbit(rbm);
+		if (res != GFS2_BLKST_FREE)
+			return 1;
+		(*len)--;
+		if (*len == 0)
+			return 1;
+		if (lgfs2_rbm_incr(rbm))
+			return 1;
+	}
+
+	return 0;
+}
+
+static uint8_t *check_bytes8(const uint8_t *start, uint8_t value, unsigned bytes)
+{
+	while (bytes) {
+		if (*start != value)
+			return (void *)start;
+		start++;
+		bytes--;
+	}
+	return NULL;
+}
+
+/**
+ * lgfs2_free_extlen - Return extent length of free blocks
+ * @rbm: Starting position
+ * @len: Max length to check
+ *
+ * Starting at the block specified by the rbm, see how many free blocks
+ * there are, not reading more than len blocks ahead. This can be done
+ * using check_bytes8 when the blocks are byte aligned, but has to be done
+ * on a block by block basis in case of unaligned blocks. Also this
+ * function can cope with bitmap boundaries (although it must stop on
+ * a resource group boundary)
+ *
+ * Returns: Number of free blocks in the extent
+ */
+static uint32_t lgfs2_free_extlen(const struct lgfs2_rbm *rrbm, uint32_t len)
+{
+	struct lgfs2_rbm rbm = *rrbm;
+	uint32_t n_unaligned = rbm.offset & 3;
+	uint32_t size = len;
+	uint32_t bytes;
+	uint32_t chunk_size;
+	uint8_t *ptr, *start, *end;
+	uint64_t block;
+	struct gfs2_bitmap *bi;
+
+	if (n_unaligned &&
+	    lgfs2_unaligned_extlen(&rbm, 4 - n_unaligned, &len))
+		goto out;
+
+	n_unaligned = len & 3;
+	/* Start is now byte aligned */
+	while (len > 3) {
+		bi = rbm_bi(&rbm);
+		start = (uint8_t *)bi->bi_bh->b_data;
+		end = start + bi->bi_bh->sdp->bsize;
+		start += bi->bi_offset;
+		start += (rbm.offset / GFS2_NBBY);
+		bytes = (len / GFS2_NBBY) < (end - start) ? (len / GFS2_NBBY):(end - start);
+		ptr = check_bytes8(start, 0, bytes);
+		chunk_size = ((ptr == NULL) ? bytes : (ptr - start));
+		chunk_size *= GFS2_NBBY;
+		len -= chunk_size;
+		block = lgfs2_rbm_to_block(&rbm);
+		if (lgfs2_rbm_from_block(&rbm, block + chunk_size)) {
+			n_unaligned = 0;
+			break;
+		}
+		if (ptr) {
+			n_unaligned = 3;
+			break;
+		}
+		n_unaligned = len & 3;
+	}
+
+	/* Deal with any bits left over at the end */
+	if (n_unaligned)
+		lgfs2_unaligned_extlen(&rbm, n_unaligned, &len);
+out:
+	return size - len;
+}
+
+/**
+ * lgfs2_rbm_find - Look for blocks of a particular state
+ * @rbm: Value/result starting position and final position
+ * @state: The state which we want to find
+ * @minext: Pointer to the requested extent length (NULL for a single block)
+ *          This is updated to be the actual reservation size.
+ *
+ * Returns: 0 on success, non-zero with errno == ENOSPC if there is no block of the requested state
+ */
+int lgfs2_rbm_find(struct lgfs2_rbm *rbm, uint8_t state, uint32_t *minext)
+{
+	int initial_bii;
+	uint32_t offset;
+	int n = 0;
+	int iters = rbm->rgd->ri.ri_length;
+	uint32_t extlen;
+
+	/* If we are not starting at the beginning of a bitmap, then we
+	 * need to add one to the bitmap count to ensure that we search
+	 * the starting bitmap twice.
+	 */
+	if (rbm->offset != 0)
+		iters++;
+
+	for (n = 0; n < iters; n++) {
+		struct gfs2_bitmap *bi = rbm_bi(rbm);
+		struct gfs2_buffer_head *bh = bi->bi_bh;
+		uint8_t *buf = (uint8_t *)bh->b_data + bi->bi_offset;
+		uint64_t block;
+		int ret;
+
+		if ((rbm->rgd->rg.rg_free < *minext) && (state == GFS2_BLKST_FREE))
+			goto next_bitmap;
+
+		offset = gfs2_bitfit(buf, bi->bi_len, rbm->offset, state);
+		if (offset == BFITNOENT)
+			goto next_bitmap;
+
+		rbm->offset = offset;
+		initial_bii = rbm->bii;
+		block = lgfs2_rbm_to_block(rbm);
+		extlen = 1;
+
+		if (*minext != 0)
+			extlen = lgfs2_free_extlen(rbm, *minext);
+
+		if (extlen >= *minext)
+			return 0;
+
+		ret = lgfs2_rbm_from_block(rbm, block + extlen);
+		if (ret == 0) {
+			n += (rbm->bii - initial_bii);
+			continue;
+		}
+
+		if (errno == E2BIG) {
+			rbm->bii = 0;
+			rbm->offset = 0;
+			n += (rbm->bii - initial_bii);
+			goto res_covered_end_of_rgrp;
+		}
+
+		return ret;
+
+next_bitmap:	/* Find next bitmap in the rgrp */
+		rbm->offset = 0;
+		rbm->bii++;
+		if (rbm->bii == rbm->rgd->ri.ri_length)
+			rbm->bii = 0;
+
+res_covered_end_of_rgrp:
+		if (rbm->bii == 0)
+			break;
+	}
+
+	errno = ENOSPC;
+	return 1;
+}
diff --git a/gfs2/libgfs2/rgrp.h b/gfs2/libgfs2/rgrp.h
index 384231e..1634fbc 100644
--- a/gfs2/libgfs2/rgrp.h
+++ b/gfs2/libgfs2/rgrp.h
@@ -44,4 +44,6 @@ static inline int lgfs2_rbm_eq(const struct lgfs2_rbm *rbm1, const struct lgfs2_
 	        (rbm1->offset == rbm2->offset);
 }
 
+extern int lgfs2_rbm_find(struct lgfs2_rbm *rbm, uint8_t state, uint32_t *minext);
+
 #endif /* __RGRP_DOT_H__ */
-- 
1.9.3
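
The extent search in the patch above combines a per-block test of the two-bit state with a byte-wise fast path for aligned runs of free blocks. The following sketch shows the same idea on a single flat bitmap. It is a simplified illustration, not the libgfs2 implementation: the `toy_` functions are hypothetical, there is no handling of multiple bitmap blocks, and the fast path checks one byte at a time rather than calling a helper like check_bytes8.

```c
#include <stdint.h>

#define NBBY       4	/* blocks per byte, as GFS2_NBBY */
#define BIT_SIZE   2	/* bits per block, as GFS2_BIT_SIZE */
#define BIT_MASK   3	/* as GFS2_BIT_MASK */
#define BLKST_FREE 0	/* as GFS2_BLKST_FREE */

/* Return the two-bit state of block number blk in a flat bitmap */
uint8_t toy_testbit(const uint8_t *map, uint32_t blk)
{
	unsigned shift = (blk % NBBY) * BIT_SIZE;

	return (map[blk / NBBY] >> shift) & BIT_MASK;
}

/*
 * Count the free blocks starting at blk, examining no more than max
 * blocks and never going past nblks (the number of blocks in the map).
 */
uint32_t toy_free_extlen(const uint8_t *map, uint32_t nblks,
			 uint32_t blk, uint32_t max)
{
	uint32_t found = 0;

	while (found < max && blk < nblks) {
		/*
		 * Fast path: on a byte boundary with at least a whole byte
		 * still wanted, a zero byte means four free blocks.
		 */
		if ((blk % NBBY) == 0 && max - found >= NBBY &&
		    nblks - blk >= NBBY && map[blk / NBBY] == 0) {
			found += NBBY;
			blk += NBBY;
			continue;
		}
		/* Slow path: unaligned start, short tail, or a mixed byte */
		if (toy_testbit(map, blk) != BLKST_FREE)
			break;
		found++;
		blk++;
	}
	return found;
}
```

A caller looking for an extent of a given minimum length would compare the returned count against that minimum and, if it falls short, resume the search after the blocks already examined, which is the role lgfs2_rbm_find plays in the patch.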




* [Cluster-devel] [PATCH 07/19] tests: Add unit tests for the new extent search functions
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (5 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 06/19] libgfs2: Add functions for finding free extents Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 08/19] libgfs2: Ignore an empty rgrp plan if a length is specified Andrew Price
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The rbm search functions introduced by the previous commit add a certain
amount of complexity. These unit tests go a long way to making sure
they're doing what they're meant to do, and also demonstrate how the new
rgrp functions work without any i/o required.

The unit test sources have also been reorganised to match the names of
the libgfs2 source files they test.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 .gitignore         |   3 +-
 tests/Makefile.am  |  33 ++++++++-----
 tests/check_rgrp.c | 143 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/libgfs2.at   |   8 ++-
 4 files changed, 173 insertions(+), 14 deletions(-)
 create mode 100644 tests/check_rgrp.c

diff --git a/.gitignore b/.gitignore
index ae72f2b..4ffdb71 100644
--- a/.gitignore
+++ b/.gitignore
@@ -42,7 +42,8 @@ gfs2/fsck/fsck.gfs2
 gfs2/mkfs/mkfs.gfs2
 gfs2/tune/tunegfs2
 test-driver
-tests/check_libgfs2
+tests/check_meta
+tests/check_rgrp
 tests/testvol
 tests/testsuite.log
 tests/testsuite.dir
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 5f02d3a..70e77ef 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -3,20 +3,28 @@ DISTCLEANFILES = atlocal atconfig
 CLEANFILES = testvol
 
 if BUILD_TESTS
-check_PROGRAMS = check_libgfs2
-
-check_libgfs2_SOURCES = \
-	check_meta.c \
+UNIT_TESTS = \
+	check_meta \
+	check_rgrp
+UNIT_SOURCES = \
 	$(top_srcdir)/gfs2/libgfs2/libgfs2.h
-
-check_libgfs2_CFLAGS = \
+UNIT_CFLAGS = \
 	-I$(top_srcdir)/gfs2/libgfs2 \
         -I$(top_srcdir)/gfs2/include \
 	@check_CFLAGS@
-
-check_libgfs2_LDADD = \
+UNIT_LDADD = \
 	$(top_builddir)/gfs2/libgfs2/libgfs2.la \
 	@check_LIBS@
+
+check_PROGRAMS = $(UNIT_TESTS)
+
+check_meta_SOURCES = $(UNIT_SOURCES) check_meta.c
+check_meta_CFLAGS = $(UNIT_CFLAGS)
+check_meta_LDADD = $(UNIT_LDADD)
+
+check_rgrp_SOURCES = $(UNIT_SOURCES) check_rgrp.c
+check_rgrp_CFLAGS = $(UNIT_CFLAGS)
+check_rgrp_LDADD = $(UNIT_LDADD)
 endif
 
 # The `:;' works around a Bash 3.2 bug when the output is not writable.
@@ -41,8 +49,11 @@ TESTSUITE_AT = \
 	testsuite.at \
 	mkfs.at \
 	fsck.at \
-	edit.at \
-	libgfs2.at
+	edit.at
+
+if BUILD_TESTS
+TESTSUITE_AT += libgfs2.at
+endif
 
 TESTSUITE = $(srcdir)/testsuite
 
@@ -61,6 +72,6 @@ atconfig: $(top_builddir)/config.status
 AUTOM4TE = $(SHELL) $(top_srcdir)/missing --run autom4te
 AUTOTEST = $(AUTOM4TE) --language=autotest
 
-$(TESTSUITE): $(TESTSUITE_AT) package.m4
+$(TESTSUITE): $(TESTSUITE_AT) package.m4 $(UNIT_TESTS)
 	$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at
 	mv $@.tmp $@
diff --git a/tests/check_rgrp.c b/tests/check_rgrp.c
new file mode 100644
index 0000000..d113846
--- /dev/null
+++ b/tests/check_rgrp.c
@@ -0,0 +1,143 @@
+#include <check.h>
+#include <libgfs2.h>
+#include <rgrp.h> /* Private header libgfs2/rgrp.h for convenience */
+
+// TODO: Remove this when the extern is removed from libgfs2
+void print_it(const char *label, const char *fmt, const char *fmt2, ...) {}
+
+static lgfs2_rgrps_t mockup_rgrp(void)
+{
+	struct gfs2_sbd *sdp;
+	lgfs2_rgrps_t rgs;
+	unsigned i;
+	uint64_t addr;
+	struct gfs2_rindex ri = {0};
+	lgfs2_rgrp_t rg;
+	uint32_t rgsize = (1024 << 20) / 4096;
+
+	sdp = calloc(1, sizeof(*sdp));
+	ck_assert_ptr_ne(sdp, NULL);
+
+	sdp->device.length = rgsize + 20;
+	sdp->device_fd = -1;
+	sdp->bsize = sdp->sd_sb.sb_bsize = 4096;
+	compute_constants(sdp);
+
+	rgs = lgfs2_rgrps_init(sdp, 0, 0);
+	ck_assert_ptr_ne(rgs, NULL);
+
+	lgfs2_rgrps_plan(rgs, sdp->device.length - 16, rgsize);
+
+	addr = lgfs2_rindex_entry_new(rgs, &ri, 16, rgsize);
+	ck_assert(addr != 0);
+
+	rg = lgfs2_rgrps_append(rgs, &ri);
+	ck_assert_ptr_ne(rg, NULL);
+
+	for (i = 0; i < rg->ri.ri_length; i++) {
+		rg->bits[i].bi_bh = bget(sdp, rg->ri.ri_addr + i);
+		ck_assert_ptr_ne(rg->bits[i].bi_bh, NULL);
+	}
+	return rgs;
+}
+
+START_TEST(test_mockup_rgrp)
+{
+	lgfs2_rgrps_t rgs = mockup_rgrp();
+	ck_assert_ptr_ne(rgs, NULL);
+}
+END_TEST
+
+START_TEST(test_rbm_find_good)
+{
+	uint32_t minext;
+	struct lgfs2_rbm rbm = {0};
+	lgfs2_rgrps_t rgs = mockup_rgrp();
+	rbm.rgd = lgfs2_rgrp_first(rgs);
+
+	/* Check that extent sizes up to the whole rg can be found */
+	for (minext = 1; minext <= rbm.rgd->ri.ri_data; minext++) {
+		int err;
+		uint64_t addr;
+
+		rbm.offset = rbm.bii = 0;
+
+		err = lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &minext);
+		ck_assert_int_eq(err, 0);
+
+		addr = lgfs2_rbm_to_block(&rbm);
+		ck_assert_uint_eq(addr, rbm.rgd->ri.ri_data0);
+	}
+}
+END_TEST
+
+START_TEST(test_rbm_find_bad)
+{
+	int err;
+	uint32_t minext;
+	struct lgfs2_rbm rbm = {0};
+	lgfs2_rgrps_t rgs = mockup_rgrp();
+
+	rbm.rgd = lgfs2_rgrp_first(rgs);
+	minext = rbm.rgd->ri.ri_data + 1;
+
+	err = lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &minext);
+	ck_assert_int_eq(err, 1);
+}
+END_TEST
+
+START_TEST(test_rbm_find_lastblock)
+{
+	int err;
+	unsigned i;
+	uint64_t addr;
+	uint32_t minext = 1; /* Only looking for one block */
+	struct lgfs2_rbm rbm = {0};
+	lgfs2_rgrp_t rg;
+	lgfs2_rgrps_t rgs = mockup_rgrp();
+
+	rbm.rgd = rg = lgfs2_rgrp_first(rgs);
+
+	/* Flag all blocks as allocated... */
+	for (i = 0; i < rg->ri.ri_length; i++)
+		memset(rg->bits[i].bi_bh->b_data, 0xff, rgs->sdp->bsize);
+
+	/* ...except the final one */
+	err = gfs2_set_bitmap(rg, rg->ri.ri_data0 + rg->ri.ri_data - 1, GFS2_BLKST_FREE);
+	ck_assert_int_eq(err, 0);
+
+	err = lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &minext);
+	ck_assert_int_eq(err, 0);
+
+	addr = lgfs2_rbm_to_block(&rbm);
+	ck_assert_uint_eq(addr, rg->ri.ri_data0 + rg->ri.ri_data - 1);
+}
+END_TEST
+
+static Suite * libgfs2_suite(void)
+{
+
+	Suite *s = suite_create("libgfs2");
+
+	TCase *tc_rgrp = tcase_create("rgrp");
+
+	tcase_add_test(tc_rgrp, test_mockup_rgrp);
+	tcase_add_test(tc_rgrp, test_rbm_find_good);
+	tcase_add_test(tc_rgrp, test_rbm_find_bad);
+	tcase_add_test(tc_rgrp, test_rbm_find_lastblock);
+	tcase_set_timeout(tc_rgrp, 60);
+	suite_add_tcase(s, tc_rgrp);
+
+	return s;
+}
+
+int main(void)
+{
+	int failures;
+	Suite *s = libgfs2_suite();
+	SRunner *sr = srunner_create(s);
+	srunner_run_all(sr, CK_NORMAL);
+	failures = srunner_ntests_failed(sr);
+	srunner_free(sr);
+	return failures ? EXIT_FAILURE : EXIT_SUCCESS;
+}
diff --git a/tests/libgfs2.at b/tests/libgfs2.at
index a6b478a..8c44cdd 100644
--- a/tests/libgfs2.at
+++ b/tests/libgfs2.at
@@ -1,5 +1,9 @@
 AT_BANNER([libgfs2 unit tests])
 
-AT_SETUP([Metadata check])
-AT_CHECK([check_libgfs2], 0, [ignore], [ignore])
+AT_SETUP([meta.c])
+AT_CHECK([check_meta], 0, [ignore], [ignore])
+AT_CLEANUP
+
+AT_SETUP([rgrp.c])
+AT_CHECK([check_rgrp], 0, [ignore], [ignore])
 AT_CLEANUP
-- 
1.9.3




* [Cluster-devel] [PATCH 08/19] libgfs2: Ignore an empty rgrp plan if a length is specified
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (6 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 07/19] tests: Add unit tests for the new extent search functions Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 09/19] libgfs2: Add back-pointer to rgrps in lgfs2_rgrp_t Andrew Price
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

lgfs2_rindex_entry_new previously failed if the rgrp layout plan was
empty even if the caller specified an rgrp length.

Also tidy up a couple of function comments.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/rgrp.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index 7063288..f780e00 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -428,7 +428,7 @@ uint64_t lgfs2_rindex_entry_new(lgfs2_rgrps_t rgs, struct gfs2_rindex *ri, uint6
 		plan = 0;
 	else if (rgs->plan[1].num > 0)
 		plan = 1;
-	else
+	else if (len == 0)
 		return 0;
 
 	if (plan >= 0 && (len == 0 || len == rgs->plan[plan].len)) {
@@ -450,7 +450,9 @@ uint64_t lgfs2_rindex_entry_new(lgfs2_rgrps_t rgs, struct gfs2_rindex *ri, uint6
 }
 
 /**
- * Return the rindex structure relating to a a resource group.
+ * Return the rindex structure relating to a resource group.
+ * The return type is const to advise callers that making changes to this
+ * structure directly isn't wise. libgfs2 functions should be used instead.
  */
 const struct gfs2_rindex *lgfs2_rgrp_index(lgfs2_rgrp_t rg)
 {
@@ -458,7 +460,9 @@ const struct gfs2_rindex *lgfs2_rgrp_index(lgfs2_rgrp_t rg)
 }
 
 /**
- * Return the rgrp structure relating to a a resource group.
+ * Return the rgrp structure relating to a resource group.
+ * The return type is const to advise callers that making changes to this
+ * structure directly isn't wise. libgfs2 functions should be used instead.
  */
 const struct gfs2_rgrp *lgfs2_rgrp_rgrp(lgfs2_rgrp_t rg)
 {
-- 
1.9.3
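
The effect of the one-line change above can be sketched as a small decision function. Only the plan selection and the early return for `len == 0` are taken from the hunk. The rest is an assumption made for illustration: the body of the `plan >= 0` branch is not visible in the patch, so this sketch simply returns the planned length there, and it returns the caller's length in every other case. `toy_plan` and `toy_entry_len` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rgplan */
struct toy_plan {
	uint32_t num;	/* resource groups still to be created at this size */
	uint32_t len;	/* length of each of those resource groups */
};

/*
 * Decide which length a new rindex entry should get.
 * Returns 0 when there is neither a usable plan nor an explicit length.
 */
uint32_t toy_entry_len(const struct toy_plan plan[2], uint32_t len)
{
	int p = -1;

	if (plan[0].num > 0)
		p = 0;
	else if (plan[1].num > 0)
		p = 1;
	else if (len == 0)
		return 0;	/* nothing planned and nothing requested */

	if (p >= 0 && (len == 0 || len == plan[p].len))
		return plan[p].len;	/* assumption: follow the plan */

	return len;	/* assumption: honour the caller-specified length */
}
```

Before the change, the third branch returned 0 unconditionally, so an empty plan caused a failure even when the caller had supplied a length.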




* [Cluster-devel] [PATCH 09/19] libgfs2: Add back-pointer to rgrps in lgfs2_rgrp_t
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (7 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 08/19] libgfs2: Ignore an empty rgrp plan if a length is specified Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 10/19] libgfs2: Const-ify the parameters of print functions Andrew Price
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Keep a pointer from a resource group to the resource group set it
belongs to for convenience and also to make lgfs2_rgrp_write a more
sensible interface.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/libgfs2.h | 5 +++--
 gfs2/libgfs2/rgrp.c    | 4 +++-
 gfs2/mkfs/main_grow.c  | 2 +-
 gfs2/mkfs/main_mkfs.c  | 2 +-
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index b996be9..fca90b0 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -177,6 +177,7 @@ struct gfs2_bitmap
 };
 
 struct gfs2_sbd;
+typedef struct _lgfs2_rgrps *lgfs2_rgrps_t;
 
 struct rgrp_tree {
 	struct osi_node node;
@@ -186,10 +187,10 @@ struct rgrp_tree {
 	struct gfs2_rindex ri;
 	struct gfs2_rgrp rg;
 	struct gfs2_bitmap *bits;
+	lgfs2_rgrps_t rgrps;
 };
 
 typedef struct rgrp_tree *lgfs2_rgrp_t;
-typedef struct _lgfs2_rgrps *lgfs2_rgrps_t;
 
 extern lgfs2_rgrps_t lgfs2_rgrps_init(struct gfs2_sbd *sdp, uint64_t align, uint64_t offset);
 extern void lgfs2_rgrps_free(lgfs2_rgrps_t *rgs);
@@ -200,7 +201,7 @@ extern uint32_t lgfs2_rgrp_align_len(const lgfs2_rgrps_t rgs, uint32_t len);
 extern unsigned lgfs2_rgsize_for_data(uint64_t blksreq, unsigned bsize);
 extern uint32_t lgfs2_rgrps_plan(const lgfs2_rgrps_t rgs, uint64_t space, uint32_t tgtsize);
 extern lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry);
-extern int lgfs2_rgrp_write(lgfs2_rgrps_t rgs, int fd, lgfs2_rgrp_t rg);
+extern int lgfs2_rgrp_write(int fd, lgfs2_rgrp_t rg);
 extern const struct gfs2_rindex *lgfs2_rgrp_index(lgfs2_rgrp_t rg);
 extern const struct gfs2_rgrp *lgfs2_rgrp_rgrp(lgfs2_rgrp_t rg);
 extern lgfs2_rgrp_t lgfs2_rgrp_first(lgfs2_rgrps_t rgs);
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index f780e00..a3fa1a4 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -528,6 +528,7 @@ lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry)
 	rg->rg.rg_free = rg->ri.ri_data;
 
 	compute_bitmaps(rg, rgs->sdp->bsize);
+	rg->rgrps = rgs;
 	return rg;
 }
 
@@ -535,9 +536,10 @@ lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry)
  * Write a resource group to a file descriptor.
  * Returns 0 on success or non-zero on failure with errno set
  */
-int lgfs2_rgrp_write(const lgfs2_rgrps_t rgs, int fd, const lgfs2_rgrp_t rg)
+int lgfs2_rgrp_write(int fd, const lgfs2_rgrp_t rg)
 {
 	ssize_t ret = 0;
+	lgfs2_rgrps_t rgs = rg->rgrps;
 	size_t len = rg->ri.ri_length * rgs->sdp->bsize;
 	unsigned int i;
 	const struct gfs2_meta_header bmh = {
diff --git a/gfs2/mkfs/main_grow.c b/gfs2/mkfs/main_grow.c
index 95fbd1d..c4f5055 100644
--- a/gfs2/mkfs/main_grow.c
+++ b/gfs2/mkfs/main_grow.c
@@ -219,7 +219,7 @@ static unsigned initialize_new_portion(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs)
 		if (metafs_interrupted)
 			return 0;
 		if (!test)
-			err = lgfs2_rgrp_write(rgs, sdp->device_fd, rg);
+			err = lgfs2_rgrp_write(sdp->device_fd, rg);
 		if (err != 0) {
 			perror(_("Failed to write resource group"));
 			return 0;
diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index bf7c9cd..39b9609 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -627,7 +627,7 @@ static int place_rgrp(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct gfs2_rinde
 		perror(_("Failed to create resource group"));
 		return -1;
 	}
-	err = lgfs2_rgrp_write(rgs, sdp->device_fd, rg);
+	err = lgfs2_rgrp_write(sdp->device_fd, rg);
 	if (err != 0) {
 		perror(_("Failed to write resource group"));
 		return -1;
-- 
1.9.3




* [Cluster-devel] [PATCH 10/19] libgfs2: Const-ify the parameters of print functions
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (8 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 09/19] libgfs2: Add back-pointer to rgrps in lgfs2_rgrp_t Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 11/19] libgfs2: Allow init_dinode to accept a preallocated bh Andrew Price
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This allows us to use the *_print functions to print const types, which
works better with values returned from functions such as
lgfs2_rgrp_index().

Signed-off-by: Andrew Price <anprice@redhat.com>
---
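A small sketch of why the qualifier matters, using hypothetical names
(struct index_entry, struct group, entry_get, entry_format): an accessor
that returns a pointer-to-const can be passed straight to a function
taking a pointer-to-const, whereas a non-const parameter would discard
the qualifier and draw a compiler warning.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical structure standing in for an on-disk index entry. */
struct index_entry {
	uint64_t addr;
	uint32_t length;
};

struct group {
	struct index_entry ri;
};

/* Accessor in the style of lgfs2_rgrp_index(): callers get a read-only view. */
static const struct index_entry *entry_get(const struct group *g)
{
	return &g->ri;
}

/* The const parameter lets this accept entry_get()'s result without a cast.
 * Returns the number of characters written, as snprintf() does. */
static int entry_format(const struct index_entry *e, char *buf, size_t len)
{
	return snprintf(buf, len, "addr=%llu length=%u",
	                (unsigned long long)e->addr, (unsigned)e->length);
}
```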
 gfs2/libgfs2/libgfs2.h | 26 +++++++++++++-------------
 gfs2/libgfs2/ondisk.c  | 26 +++++++++++++-------------
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index fca90b0..ed1030c 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -799,19 +799,19 @@ extern void gfs2_quota_change_out(struct gfs2_quota_change *qc,
 
 /* Printing functions */
 
-extern void gfs2_inum_print(struct gfs2_inum *no);
-extern void gfs2_meta_header_print(struct gfs2_meta_header *mh);
-extern void gfs2_sb_print(struct gfs2_sb *sb);
-extern void gfs2_rindex_print(struct gfs2_rindex *ri);
-extern void gfs2_rgrp_print(struct gfs2_rgrp *rg);
-extern void gfs2_quota_print(struct gfs2_quota *qu);
-extern void gfs2_dinode_print(struct gfs2_dinode *di);
-extern void gfs2_leaf_print(struct gfs2_leaf *lf);
-extern void gfs2_ea_header_print(struct gfs2_ea_header *ea, char *name);
-extern void gfs2_log_header_print(struct gfs2_log_header *lh);
-extern void gfs2_log_descriptor_print(struct gfs2_log_descriptor *ld);
-extern void gfs2_statfs_change_print(struct gfs2_statfs_change *sc);
-extern void gfs2_quota_change_print(struct gfs2_quota_change *qc);
+extern void gfs2_inum_print(const struct gfs2_inum *no);
+extern void gfs2_meta_header_print(const struct gfs2_meta_header *mh);
+extern void gfs2_sb_print(const struct gfs2_sb *sb);
+extern void gfs2_rindex_print(const struct gfs2_rindex *ri);
+extern void gfs2_rgrp_print(const struct gfs2_rgrp *rg);
+extern void gfs2_quota_print(const struct gfs2_quota *qu);
+extern void gfs2_dinode_print(const struct gfs2_dinode *di);
+extern void gfs2_leaf_print(const struct gfs2_leaf *lf);
+extern void gfs2_ea_header_print(const struct gfs2_ea_header *ea, char *name);
+extern void gfs2_log_header_print(const struct gfs2_log_header *lh);
+extern void gfs2_log_descriptor_print(const struct gfs2_log_descriptor *ld);
+extern void gfs2_statfs_change_print(const struct gfs2_statfs_change *sc);
+extern void gfs2_quota_change_print(const struct gfs2_quota_change *qc);
 
 /* Language functions */
 
diff --git a/gfs2/libgfs2/ondisk.c b/gfs2/libgfs2/ondisk.c
index 1f81b5f..4744337 100644
--- a/gfs2/libgfs2/ondisk.c
+++ b/gfs2/libgfs2/ondisk.c
@@ -56,7 +56,7 @@ void gfs2_inum_out(const struct gfs2_inum *no, char *buf)
 	CPOUT_64(no, str, no_addr);
 }
 
-void gfs2_inum_print(struct gfs2_inum *no)
+void gfs2_inum_print(const struct gfs2_inum *no)
 {
 	pv(no, no_formal_ino, "%llu", "0x%llx");
 	pv(no, no_addr, "%llu", "0x%llx");
@@ -90,7 +90,7 @@ void gfs2_meta_header_out_bh(const struct gfs2_meta_header *mh,
 	bmodified(bh);
 }
 
-void gfs2_meta_header_print(struct gfs2_meta_header *mh)
+void gfs2_meta_header_print(const struct gfs2_meta_header *mh)
 {
 	pv(mh, mh_magic, "0x%08X", NULL);
 	pv(mh, mh_type, "%u", "0x%x");
@@ -177,7 +177,7 @@ void gfs2_print_uuid(const unsigned char *uuid)
 }
 #endif
 
-void gfs2_sb_print(struct gfs2_sb *sb)
+void gfs2_sb_print(const struct gfs2_sb *sb)
 {
 	gfs2_meta_header_print(&sb->sb_header);
 
@@ -229,7 +229,7 @@ void gfs2_rindex_out(const struct gfs2_rindex *ri, char *buf)
 	CPOUT_08(ri, str, ri_reserved, 64);
 }
 
-void gfs2_rindex_print(struct gfs2_rindex *ri)
+void gfs2_rindex_print(const struct gfs2_rindex *ri)
 {
 	pv(ri, ri_addr, "%llu", "0x%llx");
 	pv(ri, ri_length, "%u", "0x%x");
@@ -270,7 +270,7 @@ void gfs2_rgrp_out_bh(const struct gfs2_rgrp *rg, struct gfs2_buffer_head *bh)
 	bmodified(bh);
 }
 
-void gfs2_rgrp_print(struct gfs2_rgrp *rg)
+void gfs2_rgrp_print(const struct gfs2_rgrp *rg)
 {
 	gfs2_meta_header_print(&rg->rg_header);
 	pv(rg, rg_flags, "%u", "0x%x");
@@ -298,7 +298,7 @@ void gfs2_quota_out(struct gfs2_quota *qu, char *buf)
 	memset(qu->qu_reserved, 0, sizeof(qu->qu_reserved));
 }
 
-void gfs2_quota_print(struct gfs2_quota *qu)
+void gfs2_quota_print(const struct gfs2_quota *qu)
 {
 	pv(qu, qu_limit, "%llu", "0x%llx");
 	pv(qu, qu_warn, "%llu", "0x%llx");
@@ -376,7 +376,7 @@ void gfs2_dinode_out(struct gfs2_dinode *di, struct gfs2_buffer_head *bh)
 	bmodified(bh);
 }
 
-void gfs2_dinode_print(struct gfs2_dinode *di)
+void gfs2_dinode_print(const struct gfs2_dinode *di)
 {
 	gfs2_meta_header_print(&di->di_header);
 	gfs2_inum_print(&di->di_num);
@@ -470,7 +470,7 @@ void gfs2_leaf_out(struct gfs2_leaf *lf, struct gfs2_buffer_head *bh)
 	bmodified(bh);
 }
 
-void gfs2_leaf_print(struct gfs2_leaf *lf)
+void gfs2_leaf_print(const struct gfs2_leaf *lf)
 {
 	gfs2_meta_header_print(&lf->lf_header);
 	pv(lf, lf_depth, "%u", "0x%x");
@@ -497,7 +497,7 @@ void gfs2_ea_header_in(struct gfs2_ea_header *ea, char *buf)
 	ea->ea_num_ptrs = str->ea_num_ptrs;
 }
 
-void gfs2_ea_header_print(struct gfs2_ea_header *ea, char *name)
+void gfs2_ea_header_print(const struct gfs2_ea_header *ea, char *name)
 {
 	char buf[GFS2_EA_MAX_NAME_LEN + 1];
 
@@ -540,7 +540,7 @@ void gfs2_log_header_out(struct gfs2_log_header *lh,
 	bmodified(bh);
 }
 
-void gfs2_log_header_print(struct gfs2_log_header *lh)
+void gfs2_log_header_print(const struct gfs2_log_header *lh)
 {
 	gfs2_meta_header_print(&lh->lh_header);
 	pv(lh, lh_sequence, "%llu", "0x%llx");
@@ -579,7 +579,7 @@ void gfs2_log_descriptor_out(struct gfs2_log_descriptor *ld,
 	bmodified(bh);
 }
 
-void gfs2_log_descriptor_print(struct gfs2_log_descriptor *ld)
+void gfs2_log_descriptor_print(const struct gfs2_log_descriptor *ld)
 {
 	gfs2_meta_header_print(&ld->ld_header);
 	pv(ld, ld_type, "%u", "0x%x");
@@ -606,7 +606,7 @@ void gfs2_statfs_change_out(struct gfs2_statfs_change *sc, char *buf)
 	CPOUT_64(sc, str, sc_dinodes);
 }
 
-void gfs2_statfs_change_print(struct gfs2_statfs_change *sc)
+void gfs2_statfs_change_print(const struct gfs2_statfs_change *sc)
 {
 	pv(sc, sc_total, "%lld", "0x%llx");
 	pv(sc, sc_free, "%lld", "0x%llx");
@@ -636,7 +636,7 @@ void gfs2_quota_change_out(struct gfs2_quota_change *qc,
 	bmodified(bh);
 }
 
-void gfs2_quota_change_print(struct gfs2_quota_change *qc)
+void gfs2_quota_change_print(const struct gfs2_quota_change *qc)
 {
 	pv(qc, qc_change, "%lld", "0x%llx");
 	pv(qc, qc_flags, "0x%.8X", NULL);
-- 
1.9.3




* [Cluster-devel] [PATCH 11/19] libgfs2: Allow init_dinode to accept a preallocated bh
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (9 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 10/19] libgfs2: Const-ify the parameters of print functions Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 12/19] libgfs2: Add extent allocation functions Andrew Price
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Previously init_dinode() always allocated the bh with bget() which meant
less control over memory allocation higher up. This patch changes the
signature of init_dinode to accept a bh pointer and only allocates a new
bh if it is NULL. Also adds error checking to init_dinode()'s callers.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
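The calling convention described above can be sketched in isolation as
follows. This is not the init_dinode() code; struct buf, BLOCK_SIZE and
init_block are hypothetical, and calloc() stands in for bget().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 512   /* hypothetical block size for the sketch */

struct buf {
	unsigned char data[BLOCK_SIZE];
};

/* Initialise a block in *bp. If *bp is NULL a buffer is allocated and
 * handed back through bp; otherwise the caller's buffer is reused.
 * Returns 0 on success or non-zero with errno set. */
static int init_block(struct buf **bp, unsigned char fill)
{
	if (bp == NULL) {
		errno = EINVAL;
		return 1;
	}
	if (*bp == NULL) {
		*bp = calloc(1, sizeof(**bp));
		if (*bp == NULL)
			return 1;   /* errno is ENOMEM on POSIX systems */
	}
	memset((*bp)->data, fill, sizeof((*bp)->data));
	return 0;
}
```

Callers that want the old behaviour initialise their pointer to NULL
before the call; callers that manage their own memory pass a pointer to
an existing buffer.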
 gfs2/convert/gfs2_convert.c |  7 +++++--
 gfs2/fsck/initialize.c      | 12 ++++++++----
 gfs2/libgfs2/fs_ops.c       | 42 ++++++++++++++++++++++++------------------
 gfs2/libgfs2/libgfs2.h      |  6 ++----
 gfs2/libgfs2/structures.c   | 13 +++++++++----
 5 files changed, 48 insertions(+), 32 deletions(-)

diff --git a/gfs2/convert/gfs2_convert.c b/gfs2/convert/gfs2_convert.c
index 61ed320..19a9839 100644
--- a/gfs2/convert/gfs2_convert.c
+++ b/gfs2/convert/gfs2_convert.c
@@ -1389,7 +1389,7 @@ static int fix_cdpn_symlinks(struct gfs2_sbd *sbp, osi_list_t *cdpn_to_fix)
 	osi_list_foreach_safe(tmp, cdpn_to_fix, x) {
 		struct gfs2_inum fix, dir;
 		struct inode_dir_block *l_fix;
-		struct gfs2_buffer_head *bh;
+		struct gfs2_buffer_head *bh = NULL;
 		struct gfs2_inode *fix_inode;
 		uint64_t eablk;
 
@@ -1411,7 +1411,10 @@ static int fix_cdpn_symlinks(struct gfs2_sbd *sbp, osi_list_t *cdpn_to_fix)
 		}
 
 		/* initialize the symlink inode to be a directory */
-		bh = init_dinode(sbp, &fix, S_IFDIR | 0755, 0, &dir);
+		error = init_dinode(sbp, &bh, &fix, S_IFDIR | 0755, 0, &dir);
+		if (error != 0)
+			return -1;
+
 		fix_inode = lgfs2_inode_get(sbp, bh);
 		if (fix_inode == NULL)
 			return -1;
diff --git a/gfs2/fsck/initialize.c b/gfs2/fsck/initialize.c
index 02ecd3f..4dedec2 100644
--- a/gfs2/fsck/initialize.c
+++ b/gfs2/fsck/initialize.c
@@ -419,7 +419,7 @@ static void check_rgrps_integrity(struct gfs2_sbd *sdp)
 static int rebuild_master(struct gfs2_sbd *sdp)
 {
 	struct gfs2_inum inum;
-	struct gfs2_buffer_head *bh;
+	struct gfs2_buffer_head *bh = NULL;
 	int err = 0;
 
 	log_err(_("The system master directory seems to be destroyed.\n"));
@@ -430,7 +430,9 @@ static int rebuild_master(struct gfs2_sbd *sdp)
 	log_err(_("Trying to rebuild the master directory.\n"));
 	inum.no_formal_ino = sdp->md.next_inum++;
 	inum.no_addr = sdp->sd_sb.sb_master_dir.no_addr;
-	bh = init_dinode(sdp, &inum, S_IFDIR | 0755, GFS2_DIF_SYSTEM, &inum);
+	err = init_dinode(sdp, &bh, &inum, S_IFDIR | 0755, GFS2_DIF_SYSTEM, &inum);
+	if (err != 0)
+		return -1;
 	sdp->master_dir = lgfs2_inode_get(sdp, bh);
 	if (sdp->master_dir == NULL) {
 		log_crit(_("Error reading master: %s\n"), strerror(errno));
@@ -1210,7 +1212,7 @@ static int sb_repair(struct gfs2_sbd *sdp)
 		sdp->md.rooti = lgfs2_inode_read(sdp, possible_root);
 		if (!sdp->md.rooti ||
 		    sdp->md.rooti->i_di.di_header.mh_magic != GFS2_MAGIC) {
-			struct gfs2_buffer_head *bh;
+			struct gfs2_buffer_head *bh = NULL;
 
 			log_err(_("The root dinode block is destroyed.\n"));
 			log_err(_("At this point I recommend "
@@ -1225,7 +1227,9 @@ static int sb_repair(struct gfs2_sbd *sdp)
 			}
 			inum.no_formal_ino = 1;
 			inum.no_addr = possible_root;
-			bh = init_dinode(sdp, &inum, S_IFDIR | 0755, 0, &inum);
+			error = init_dinode(sdp, &bh, &inum, S_IFDIR | 0755, 0, &inum);
+			if (error != 0)
+				return -1;
 			brelse(bh);
 		}
 	}
diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index aa95aa8..c8b90ad 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -1273,24 +1273,30 @@ int dir_add(struct gfs2_inode *dip, const char *filename, int len,
 	return err;
 }
 
-static struct gfs2_buffer_head *__init_dinode(struct gfs2_sbd *sdp,
-					      struct gfs2_inum *inum,
-					      unsigned int mode,
-					      uint32_t flags,
-					      struct gfs2_inum *parent,
-					      int gfs1)
+static int __init_dinode(struct gfs2_sbd *sdp, struct gfs2_buffer_head **bhp, struct gfs2_inum *inum,
+                         unsigned int mode, uint32_t flags, struct gfs2_inum *parent, int gfs1)
 {
 	struct gfs2_buffer_head *bh;
-	struct gfs2_dinode di;
+	struct gfs2_dinode di = {{0}};
 	int is_dir;
 
 	if (gfs1)
 		is_dir = (IF2DT(mode) == GFS_FILE_DIR);
 	else
 		is_dir = S_ISDIR(mode);
-	bh = bget(sdp, inum->no_addr);
 
-	memset(&di, 0, sizeof(struct gfs2_dinode));
+	errno = EINVAL;
+	if (bhp == NULL)
+		return 1;
+
+	if (*bhp == NULL) {
+		*bhp = bget(sdp, inum->no_addr);
+		if (*bhp == NULL)
+			return 1;
+	}
+
+	bh = *bhp;
+
 	di.di_header.mh_magic = GFS2_MAGIC;
 	di.di_header.mh_type = GFS2_METATYPE_DI;
 	di.di_header.mh_format = GFS2_FORMAT_DI;
@@ -1340,15 +1346,13 @@ static struct gfs2_buffer_head *__init_dinode(struct gfs2_sbd *sdp,
 
 	gfs2_dinode_out(&di, bh);
 
-	return bh;
+	return 0;
 }
 
-struct gfs2_buffer_head *init_dinode(struct gfs2_sbd *sdp,
-				     struct gfs2_inum *inum,
-				     unsigned int mode, uint32_t flags,
-				     struct gfs2_inum *parent)
+int init_dinode(struct gfs2_sbd *sdp, struct gfs2_buffer_head **bhp, struct gfs2_inum *inum,
+                unsigned int mode, uint32_t flags, struct gfs2_inum *parent)
 {
-	return __init_dinode(sdp, inum, mode, flags, parent, 0);
+	return __init_dinode(sdp, bhp, inum, mode, flags, parent, 0);
 }
 
 static struct gfs2_inode *__createi(struct gfs2_inode *dip,
@@ -1358,7 +1362,7 @@ static struct gfs2_inode *__createi(struct gfs2_inode *dip,
 	struct gfs2_sbd *sdp = dip->i_sbd;
 	uint64_t bn;
 	struct gfs2_inum inum;
-	struct gfs2_buffer_head *bh;
+	struct gfs2_buffer_head *bh = NULL;
 	struct gfs2_inode *ip;
 	int err = 0;
 	int is_dir;
@@ -1388,8 +1392,10 @@ static struct gfs2_inode *__createi(struct gfs2_inode *dip,
 			dip->i_di.di_nlink++;
 		}
 
-		bh = __init_dinode(sdp, &inum, mode, flags, &dip->i_di.di_num,
-				   if_gfs1);
+		err = __init_dinode(sdp, &bh, &inum, mode, flags, &dip->i_di.di_num, if_gfs1);
+		if (err != 0)
+			return NULL;
+
 		ip = lgfs2_inode_get(sdp, bh);
 		if (ip == NULL)
 			return NULL;
diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index ed1030c..9b1bdc2 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -462,10 +462,8 @@ extern int __gfs2_writei(struct gfs2_inode *ip, void *buf, uint64_t offset,
 			 unsigned int size, int resize);
 extern struct gfs2_buffer_head *get_file_buf(struct gfs2_inode *ip,
 					     uint64_t lbn, int prealloc);
-extern struct gfs2_buffer_head *init_dinode(struct gfs2_sbd *sdp,
-					    struct gfs2_inum *inum,
-					    unsigned int mode, uint32_t flags,
-					    struct gfs2_inum *parent);
+extern int init_dinode(struct gfs2_sbd *sdp, struct gfs2_buffer_head **bhp, struct gfs2_inum *inum,
+                       unsigned int mode, uint32_t flags, struct gfs2_inum *parent);
 extern struct gfs2_inode *createi(struct gfs2_inode *dip, const char *filename,
 				  unsigned int mode, uint32_t flags);
 extern struct gfs2_inode *gfs_createi(struct gfs2_inode *dip,
diff --git a/gfs2/libgfs2/structures.c b/gfs2/libgfs2/structures.c
index 9d90657..1836255 100644
--- a/gfs2/libgfs2/structures.c
+++ b/gfs2/libgfs2/structures.c
@@ -20,7 +20,7 @@ int build_master(struct gfs2_sbd *sdp)
 {
 	struct gfs2_inum inum;
 	uint64_t bn;
-	struct gfs2_buffer_head *bh;
+	struct gfs2_buffer_head *bh = NULL;
 	int err = lgfs2_dinode_alloc(sdp, 1, &bn);
 
 	if (err != 0)
@@ -29,7 +29,9 @@ int build_master(struct gfs2_sbd *sdp)
 	inum.no_formal_ino = sdp->md.next_inum++;
 	inum.no_addr = bn;
 
-	bh = init_dinode(sdp, &inum, S_IFDIR | 0755, GFS2_DIF_SYSTEM, &inum);
+	err = init_dinode(sdp, &bh, &inum, S_IFDIR | 0755, GFS2_DIF_SYSTEM, &inum);
+	if (err != 0)
+		return -1;
 
 	sdp->master_dir = lgfs2_inode_get(sdp, bh);
 	if (sdp->master_dir == NULL)
@@ -479,7 +481,7 @@ int build_root(struct gfs2_sbd *sdp)
 {
 	struct gfs2_inum inum;
 	uint64_t bn;
-	struct gfs2_buffer_head *bh;
+	struct gfs2_buffer_head *bh = NULL;
 	int err = lgfs2_dinode_alloc(sdp, 1, &bn);
 
 	if (err != 0)
@@ -488,7 +490,10 @@ int build_root(struct gfs2_sbd *sdp)
 	inum.no_formal_ino = sdp->md.next_inum++;
 	inum.no_addr = bn;
 
-	bh = init_dinode(sdp, &inum, S_IFDIR | 0755, 0, &inum);
+	err = init_dinode(sdp, &bh, &inum, S_IFDIR | 0755, 0, &inum);
+	if (err != 0)
+		return -1;
+
 	sdp->md.rooti = lgfs2_inode_get(sdp, bh);
 	if (sdp->md.rooti == NULL)
 		return -1;
-- 
1.9.3




* [Cluster-devel] [PATCH 12/19] libgfs2: Add extent allocation functions
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (10 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 11/19] libgfs2: Allow init_dinode to accept a preallocated bh Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 13/19] libgfs2: Add support for allocating entire rgrp headers Andrew Price
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In order to preallocate journals as single extents we need functions
which allow bitmap allocation separate from buffer allocation and
writing. This adds two functions, lgfs2_file_alloc and
lgfs2_alloc_extent, which solve this problem, making use of the new
lgfs2_rbm functions.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
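To show the find-then-mark split on its own, here is a simplified sketch
that uses one byte per block instead of the packed two-bit GFS2 bitmaps.
The names (enum blk_state, find_free_extent, alloc_extent) are
hypothetical and nothing here touches buffers or the disk, which is the
point of separating bitmap allocation from writing.

```c
#include <stddef.h>

/* Block states for the sketch; one byte per block for simplicity. */
enum blk_state { BLK_FREE = 0, BLK_USED = 1, BLK_DINODE = 3 };

/* Find the first run of at least `want` free blocks.
 * Returns the starting index, or `nblocks` if there is no such run. */
static size_t find_free_extent(const unsigned char *map, size_t nblocks, size_t want)
{
	size_t start = 0, run = 0, i;

	for (i = 0; i < nblocks; i++) {
		if (map[i] != BLK_FREE) {
			run = 0;
			continue;
		}
		if (run == 0)
			start = i;
		if (++run >= want)
			return start;
	}
	return nblocks;
}

/* Mark up to `want` blocks from `start`: the first gets `state`, the rest
 * BLK_USED. Stops early at a non-free block. Returns the length marked. */
static size_t alloc_extent(unsigned char *map, size_t nblocks, size_t start,
                           enum blk_state state, size_t want)
{
	size_t len;

	if (start >= nblocks || want == 0)
		return 0;
	map[start] = (unsigned char)state;
	for (len = 1; len < want; len++) {
		if (start + len >= nblocks || map[start + len] != BLK_FREE)
			break;
		map[start + len] = BLK_USED;
	}
	return len;
}
```

A caller compares the returned length with the requested one to detect a
short extent, much as lgfs2_file_alloc() does with lgfs2_alloc_extent().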
 gfs2/libgfs2/fs_ops.c  | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++
 gfs2/libgfs2/libgfs2.h |  1 +
 gfs2/libgfs2/rgrp.c    | 24 +++++++++++++++++++++
 gfs2/libgfs2/rgrp.h    |  1 +
 4 files changed, 83 insertions(+)

diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index c8b90ad..98db34d 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -13,6 +13,7 @@
 
 #include <linux/types.h>
 #include "libgfs2.h"
+#include "rgrp.h"
 
 static __inline__ uint64_t *metapointer(struct gfs2_buffer_head *bh,
 					unsigned int height,
@@ -291,6 +292,62 @@ uint64_t lgfs2_space_for_data(const struct gfs2_sbd *sdp, const unsigned bsize,
 	return blks + 1;
 }
 
+/**
+ * Allocate an extent for a file in a resource group's bitmaps.
+ * rg: The resource group in which to allocate the extent
+ * di_size: The size of the file in bytes
+ * ip: A pointer to the inode structure, whose fields will be set appropriately
+ * flags: GFS2_DIF_* flags
+ * mode: File mode flags, see creat(2)
+ * Returns 0 on success with the contents of ip set accordingly, or non-zero
+ * with errno set on error. If errno is ENOSPC then rg does not contain a
+ * large enough free extent for the given di_size.
+ */
+int lgfs2_file_alloc(lgfs2_rgrp_t rg, uint64_t di_size, struct gfs2_inode *ip, uint32_t flags, unsigned mode)
+{
+	unsigned extlen;
+	struct gfs2_dinode *di = &ip->i_di;
+	struct gfs2_sbd *sdp = rg->rgrps->sdp;
+	struct lgfs2_rbm rbm = { .rgd = rg, .offset = 0, .bii = 0 };
+	uint32_t blocks = lgfs2_space_for_data(sdp, sdp->bsize, di_size);
+	int err;
+
+	err = lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &blocks);
+	if (err != 0)
+		return err;
+
+	extlen = lgfs2_alloc_extent(&rbm, GFS2_BLKST_DINODE, blocks);
+	if (extlen < blocks) {
+		errno = EINVAL;
+		return 1;
+	}
+
+	ip->i_sbd = sdp;
+
+	di->di_header.mh_magic = GFS2_MAGIC;
+	di->di_header.mh_type = GFS2_METATYPE_DI;
+	di->di_header.mh_format = GFS2_FORMAT_DI;
+	di->di_size = di_size;
+	di->di_num.no_addr = lgfs2_rbm_to_block(&rbm);
+	di->di_num.no_formal_ino = sdp->md.next_inum++;
+	di->di_mode = mode;
+	di->di_nlink = 1;
+	di->di_blocks = blocks;
+	di->di_atime = di->di_mtime = di->di_ctime = sdp->time;
+	di->di_goal_data = di->di_num.no_addr + di->di_blocks - 1;
+	di->di_goal_meta = di->di_goal_data - ((di_size + sdp->bsize - 1) / sdp->bsize);
+	di->di_height = calc_tree_height(ip, di_size);
+	di->di_flags = flags;
+
+	rg->rg.rg_free -= blocks;
+	rg->rg.rg_dinodes += 1;
+
+	sdp->dinodes_alloced++;
+	sdp->blks_alloced += blocks;
+
+	return 0;
+}
+
 unsigned int calc_tree_height(struct gfs2_inode *ip, uint64_t size)
 {
 	struct gfs2_sbd *sdp = ip->i_sbd;
diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 9b1bdc2..43529a0 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -453,6 +453,7 @@ extern uint64_t data_alloc(struct gfs2_inode *ip);
 extern int lgfs2_meta_alloc(struct gfs2_inode *ip, uint64_t *blkno);
 extern int lgfs2_dinode_alloc(struct gfs2_sbd *sdp, const uint64_t blksreq, uint64_t *blkno);
 extern uint64_t lgfs2_space_for_data(const struct gfs2_sbd *sdp, unsigned bsize, uint64_t bytes);
+extern int lgfs2_file_alloc(lgfs2_rgrp_t rg, uint64_t di_size, struct gfs2_inode *ip, uint32_t flags, unsigned mode);
 
 extern int gfs2_readi(struct gfs2_inode *ip, void *buf, uint64_t offset,
 		      unsigned int size);
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index a3fa1a4..772e6d0 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -841,3 +841,27 @@ res_covered_end_of_rgrp:
 	errno = ENOSPC;
 	return 1;
 }
+
+/**
+ * lgfs2_alloc_extent - allocate an extent from a given bitmap
+ * @rbm: the resource group information
+ * @state: The state of the first block, GFS2_BLKST_DINODE or GFS2_BLKST_USED
+ * @elen: The requested extent length
+ * Returns the length of the extent allocated.
+ */
+unsigned lgfs2_alloc_extent(const struct lgfs2_rbm *rbm, int state, const unsigned elen)
+{
+	struct lgfs2_rbm pos = { .rgd = rbm->rgd, };
+	const uint64_t block = lgfs2_rbm_to_block(rbm);
+	unsigned len;
+
+	gfs2_set_bitmap(rbm->rgd, block, state);
+
+	for (len = 1; len < elen; len++) {
+		int ret = lgfs2_rbm_from_block(&pos, block + len);
+		if (ret || lgfs2_testbit(&pos) != GFS2_BLKST_FREE)
+			break;
+		gfs2_set_bitmap(pos.rgd, block + len, GFS2_BLKST_USED);
+	}
+	return len;
+}
diff --git a/gfs2/libgfs2/rgrp.h b/gfs2/libgfs2/rgrp.h
index 1634fbc..bd89289 100644
--- a/gfs2/libgfs2/rgrp.h
+++ b/gfs2/libgfs2/rgrp.h
@@ -45,5 +45,6 @@ static inline int lgfs2_rbm_eq(const struct lgfs2_rbm *rbm1, const struct lgfs2_
 }
 
 extern int lgfs2_rbm_find(struct lgfs2_rbm *rbm, uint8_t state, uint32_t *minext);
+extern unsigned lgfs2_alloc_extent(const struct lgfs2_rbm *rbm, int state, const unsigned elen);
 
 #endif /* __RGRP_DOT_H__ */
-- 
1.9.3




* [Cluster-devel] [PATCH 13/19] libgfs2: Add support for allocating entire rgrp headers
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (11 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 12/19] libgfs2: Add extent allocation functions Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 14/19] libgfs2: Write file metadata sequentially Andrew Price
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Before we can allocate files without writing buffers back to disk
immediately, we need to hold the bitmap blocks for one rgrp in memory.
Add an lgfs2_rgrp_bitbuf_alloc() function which allocates the memory for
a resource group's bitmap blocks in one chunk, and also a matching _free
function.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
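The single-chunk layout can be sketched independently of libgfs2 as
below: each block's header is followed directly by its data, and all
header and data pairs share one allocation. struct blk_hdr, chunk_alloc
and chunk_free are hypothetical names, and the sketch assumes the block
size is a multiple of 8 so that every header stays suitably aligned.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical buffer header; the block data follows it directly in memory. */
struct blk_hdr {
	uint64_t blocknr;
	char *data;
	size_t len;
};

/* Allocate `count` header+data pairs in one zeroed chunk and fill in
 * hdrs[0..count-1]. Returns 0 on success, non-zero on failure. */
static int chunk_alloc(struct blk_hdr **hdrs, unsigned count, size_t bsize, uint64_t addr)
{
	const size_t stride = sizeof(struct blk_hdr) + bsize;
	char *chunk;
	unsigned i;

	/* Conservative alignment check: keeps every header 8-byte aligned. */
	if (count == 0 || bsize % sizeof(uint64_t) != 0)
		return 1;
	chunk = calloc(count, stride);
	if (chunk == NULL)
		return 1;
	for (i = 0; i < count; i++) {
		hdrs[i] = (struct blk_hdr *)(chunk + i * stride);
		hdrs[i]->data = (char *)(hdrs[i] + 1);
		hdrs[i]->len = bsize;
		hdrs[i]->blocknr = addr + i;
	}
	return 0;
}

/* Free the chunk. hdrs[0] is the start of the single allocation, so one
 * free() releases every header and data block. */
static void chunk_free(struct blk_hdr **hdrs, unsigned count)
{
	unsigned i;

	if (count == 0)
		return;
	free(hdrs[0]);
	for (i = 0; i < count; i++)
		hdrs[i] = NULL;
}
```

One allocation per resource group keeps the allocator out of the
per-block path, at the cost that individual blocks cannot be released
separately.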
 gfs2/libgfs2/libgfs2.h |  2 ++
 gfs2/libgfs2/rgrp.c    | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 43529a0..d683dc7 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -201,6 +201,8 @@ extern uint32_t lgfs2_rgrp_align_len(const lgfs2_rgrps_t rgs, uint32_t len);
 extern unsigned lgfs2_rgsize_for_data(uint64_t blksreq, unsigned bsize);
 extern uint32_t lgfs2_rgrps_plan(const lgfs2_rgrps_t rgs, uint64_t space, uint32_t tgtsize);
 extern lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry);
+extern int lgfs2_rgrp_bitbuf_alloc(lgfs2_rgrp_t rg);
+extern void lgfs2_rgrp_bitbuf_free(lgfs2_rgrp_t rg);
 extern int lgfs2_rgrp_write(int fd, lgfs2_rgrp_t rg);
 extern const struct gfs2_rindex *lgfs2_rgrp_index(lgfs2_rgrp_t rg);
 extern const struct gfs2_rgrp *lgfs2_rgrp_rgrp(lgfs2_rgrp_t rg);
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index 772e6d0..57551c6 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -99,6 +99,54 @@ struct rgrp_tree *gfs2_blk2rgrpd(struct gfs2_sbd *sdp, uint64_t blk)
 }
 
 /**
+ * Allocate a multi-block buffer for a resource group's bitmaps. This is done
+ * as one chunk and should be freed using lgfs2_rgrp_bitbuf_free().
+ * Returns 0 on success with the bitmap buffer allocated in the resource group,
+ * or non-zero on failure with errno set.
+ */
+int lgfs2_rgrp_bitbuf_alloc(lgfs2_rgrp_t rg)
+{
+	struct gfs2_sbd *sdp = rg->rgrps->sdp;
+	unsigned i;
+	char *bufs;
+
+	bufs = calloc(rg->ri.ri_length, sizeof(struct gfs2_buffer_head) + sdp->bsize);
+	if (bufs == NULL)
+		return 1;
+
+	rg->bits[0].bi_bh = (struct gfs2_buffer_head *)bufs;
+	rg->bits[0].bi_bh->iov.iov_base = (char *)(rg->bits[0].bi_bh + 1);
+	rg->bits[0].bi_bh->iov.iov_len = sdp->bsize;
+	rg->bits[0].bi_bh->b_blocknr = rg->ri.ri_addr;
+	rg->bits[0].bi_bh->sdp = sdp;
+
+	for (i = 1; i < rg->ri.ri_length; i++) {
+		char *nextbuf = rg->bits[i - 1].bi_bh->b_data + sdp->bsize;
+		rg->bits[i].bi_bh = (struct gfs2_buffer_head *)(nextbuf);
+		rg->bits[i].bi_bh->iov.iov_base = (char *)(rg->bits[i].bi_bh + 1);
+		rg->bits[i].bi_bh->iov.iov_len = sdp->bsize;
+		rg->bits[i].bi_bh->b_blocknr = rg->ri.ri_addr + i;
+		rg->bits[i].bi_bh->sdp = sdp;
+	}
+	return 0;
+}
+
+/**
+ * Free the multi-block bitmap buffer from a resource group. The buffer should
+ * have been allocated as a single chunk as in lgfs2_rgrp_bitbuf_alloc().
+ * This does not implicitly write the bitmaps to disk. Use lgfs2_rgrp_write()
+ * for that.
+ * rg: The resource group whose bitmap buffer should be freed.
+ */
+void lgfs2_rgrp_bitbuf_free(lgfs2_rgrp_t rg)
+{
+	unsigned i;
+	free(rg->bits[0].bi_bh);
+	for (i = 0; i < rg->ri.ri_length; i++)
+		rg->bits[i].bi_bh = NULL;
+}
+
+/**
  * gfs2_rgrp_read - read in the resource group information from disk.
  * @rgd - resource group structure
  * returns: 0 if no error, otherwise the block number that failed
-- 
1.9.3




* [Cluster-devel] [PATCH 14/19] libgfs2: Write file metadata sequentially
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (12 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 13/19] libgfs2: Add support for allocating entire rgrp headers Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 15/19] libgfs2: Fix alignment in lgfs2_rgsize_for_data Andrew Price
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Until now the journal creation functions built up the height of the
journal metadata using incremental allocation and the build_height
function which grows a file naively. Two things are about to change:

1. We will guarantee that a journal will occupy a single extent.
2. Journals will be written out sequentially and will only require one
   write of the resource group header, before the journal is written.

Since we know the size of the extent and can predict the layout of the
journal's metadata tree, it is possible to generate and write the
metadata blocks in sequence. This patch adds an lgfs2_write_filemeta()
function which does just that.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
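To make the "predict the layout" step concrete, here is a simplified
sketch that works out how many pointers each height of the tree holds for
a file of a given number of data blocks. Note that it uses bottom-up
ceiling division rather than the find_metapath() approach taken in the
patch, and that plan_pointers, MAX_HEIGHT and the parameters diptrs
(pointers that fit in the inode block) and inptrs (pointers that fit in
an indirect block) are names chosen for this sketch.

```c
#include <stdint.h>

#define MAX_HEIGHT 10   /* bound chosen for the sketch */

/* Fill ptrs[h] with the number of pointers stored at height h, where
 * ptrs[0] is the count held in the inode block itself.
 * Returns the tree height, or 0 if the arguments are unusable or the
 * tree would need more than MAX_HEIGHT levels. */
static unsigned plan_pointers(uint64_t dblocks, uint64_t diptrs, uint64_t inptrs,
                              uint64_t ptrs[MAX_HEIGHT])
{
	uint64_t counts[MAX_HEIGHT];
	unsigned levels = 1, h;

	if (dblocks == 0 || diptrs == 0 || inptrs < 2)
		return 0;

	counts[0] = dblocks;   /* deepest level: one pointer per data block */
	while (counts[levels - 1] > diptrs) {
		uint64_t below = counts[levels - 1];

		if (levels == MAX_HEIGHT)
			return 0;
		/* One pointer per indirect block needed to hold `below` pointers. */
		counts[levels] = below / inptrs + (below % inptrs != 0);
		levels++;
	}
	/* counts[] runs bottom-up; report it top-down, inode first. */
	for (h = 0; h < levels; h++)
		ptrs[h] = counts[levels - 1 - h];
	return levels;
}
```

For example, with 1000 data blocks, 4 pointers in the inode and 10 per
indirect block, the heights hold 1, 10, 100 and 1000 pointers, so 111
indirect blocks sit between the inode and the data. Because the file is
one extent, those blocks can be numbered consecutively and written in
order.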
 gfs2/libgfs2/fs_ops.c  | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++
 gfs2/libgfs2/libgfs2.h |  1 +
 2 files changed, 70 insertions(+)

diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index 98db34d..9c9cc82 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -1412,6 +1412,75 @@ int init_dinode(struct gfs2_sbd *sdp, struct gfs2_buffer_head **bhp, struct gfs2
 	return __init_dinode(sdp, bhp, inum, mode, flags, parent, 0);
 }
 
+static void lgfs2_fill_indir(char *start, char *end, uint64_t ptr0, unsigned n, unsigned *p)
+{
+	char *bp;
+	memset(start, 0, end - start);
+	for (bp = start; bp < end && *p < n; bp += sizeof(uint64_t)) {
+		uint64_t pn = ptr0 + *p;
+		*(uint64_t *)bp = cpu_to_be64(pn);
+		(*p)++;
+	}
+}
+
+/**
+ * Calculate and write the indirect blocks for a single-extent file of a given
+ * size.
+ * ip: The inode for which to write indirect blocks, with fields already set
+ *     appropriately (see lgfs2_file_alloc).
+ * Returns 0 on success or non-zero with errno set on failure.
+ */
+int lgfs2_write_filemeta(struct gfs2_inode *ip)
+{
+	unsigned height = 0;
+	struct metapath mp;
+	struct gfs2_sbd *sdp = ip->i_sbd;
+	uint64_t dblocks = (ip->i_di.di_size + sdp->bsize - 1) / sdp->bsize;
+	uint64_t ptr0 = ip->i_di.di_num.no_addr + 1;
+	unsigned ptrs = 1;
+	struct gfs2_meta_header mh = {
+		.mh_magic = GFS2_MAGIC,
+		.mh_type = GFS2_METATYPE_IN,
+		.mh_format = GFS2_FORMAT_IN,
+	};
+	struct gfs2_buffer_head *bh = bget(sdp, ip->i_di.di_num.no_addr);
+	if (bh == NULL)
+		return 1;
+
+	/* Using find_metapath() to find the last data block in the file will
+	   effectively give a remainder for the number of pointers at each
+	   height. Just need to add 1 to convert ptr index to quantity later. */
+	find_metapath(ip, dblocks - 1, &mp);
+
+	for (height = 0; height < ip->i_di.di_height; height++) {
+		unsigned p;
+		/* The number of pointers in this height will be the number of
+		   full indirect blocks pointed to by the previous height
+		   multiplied by the pointer capacity of an indirect block,
+		   plus the remainder which find_metapath() gave us. */
+		ptrs = ((ptrs - 1) * sdp->sd_inptrs) + mp.mp_list[height] + 1;
+
+		for (p = 0; p < ptrs; bh->b_blocknr++) {
+			char *start = bh->b_data;
+			if (height == 0) {
+				start += sizeof(struct gfs2_dinode);
+				gfs2_dinode_out(&ip->i_di, bh);
+			} else {
+				start += sizeof(struct gfs2_meta_header);
+				gfs2_meta_header_out(&mh, bh->b_data);
+			}
+			lgfs2_fill_indir(start, bh->b_data + sdp->bsize, ptr0, ptrs, &p);
+			if(bwrite(bh)) {
+				free(bh);
+				return 1;
+			}
+		}
+		ptr0 += ptrs;
+	}
+	free(bh);
+	return 0;
+}
+
 static struct gfs2_inode *__createi(struct gfs2_inode *dip,
 				    const char *filename, unsigned int mode,
 				    uint32_t flags, int if_gfs1)
diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index d683dc7..c7adbc1 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -500,6 +500,7 @@ extern void build_height(struct gfs2_inode *ip, int height);
 extern void unstuff_dinode(struct gfs2_inode *ip);
 extern unsigned int calc_tree_height(struct gfs2_inode *ip, uint64_t size);
 extern int write_journal(struct gfs2_inode *jnl, unsigned bsize, unsigned blocks);
+extern int lgfs2_write_filemeta(struct gfs2_inode *ip);
 
 /* gfs1.c - GFS1 backward compatibility structures and functions */
 
-- 
1.9.3




* [Cluster-devel] [PATCH 15/19] libgfs2: Fix alignment in lgfs2_rgsize_for_data
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (13 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 14/19] libgfs2: Write file metadata sequentially Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 16/19] libgfs2: Handle non-zero bitmaps in lgfs2_rgrp_write Andrew Price
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Align the result of this function such that it matches the alignment
done by other functions which calculate a resource group size. This
avoids a situation where the resource group size is smaller than the
contents meant to fill it (e.g. journal file extents).
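
As a rough illustration of the rounding and its effect on the result, here is
a standalone sketch. The names align_blocks and rgsize_for_data are
hypothetical, and the two header sizes are passed in as parameters rather than
taken from the real on-disk structures, so this is not the libgfs2 code itself.

```c
#include <assert.h>
#include <stdint.h>

#define NBBY 4u /* blocks tracked per bitmap byte (2 bits each), mirroring GFS2_NBBY */

/* Round a block count up to a multiple of 4 so that it fills whole bitmap bytes. */
static uint64_t align_blocks(uint64_t blksreq)
{
	return (blksreq + 3) & ~(uint64_t)3;
}

/*
 * Size, in blocks, of a resource group able to hold blksreq blocks.
 * rgrp_hdr and meta_hdr are the header sizes in bytes of the first and
 * of each subsequent bitmap block (assumed values supplied by the caller).
 */
static uint64_t rgsize_for_data(uint64_t blksreq, unsigned bsize,
                                unsigned rgrp_hdr, unsigned meta_hdr)
{
	uint64_t blks_rgrp, blks_meta;
	uint64_t bitblocks = 1;

	assert(bsize > rgrp_hdr && bsize > meta_hdr);
	blks_rgrp = (uint64_t)NBBY * (bsize - rgrp_hdr);
	blks_meta = (uint64_t)NBBY * (bsize - meta_hdr);

	blksreq = align_blocks(blksreq);
	if (blksreq > blks_rgrp)
		bitblocks += ((blksreq - blks_rgrp) + blks_meta - 1) / blks_meta;
	return bitblocks + blksreq;
}
```

Without the alignment step, a request that is not a multiple of 4 could yield
a resource group a few blocks smaller than the one the other sizing functions
would produce for the same contents.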

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/rgrp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index 57551c6..8f96c9f 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -525,6 +525,7 @@ unsigned lgfs2_rgsize_for_data(uint64_t blksreq, unsigned bsize)
 	const uint32_t blks_rgrp = GFS2_NBBY * (bsize - sizeof(struct gfs2_rgrp));
 	const uint32_t blks_meta = GFS2_NBBY * (bsize - sizeof(struct gfs2_meta_header));
 	unsigned bitblocks = 1;
+	blksreq = (blksreq + 3) & ~3;
 	if (blksreq > blks_rgrp)
 		bitblocks += ((blksreq - blks_rgrp) + blks_meta - 1) / blks_meta;
 	return bitblocks + blksreq;
-- 
1.9.3




* [Cluster-devel] [PATCH 16/19] libgfs2: Handle non-zero bitmaps in lgfs2_rgrp_write
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (14 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 15/19] libgfs2: Fix alignment in lgfs2_rgsize_for_data Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 17/19] libgfs2: Add a speedier journal data block writing function Andrew Price
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Previously, lgfs2_rgrp_write had been used only for writing new resource
groups where the bitmaps were all zero. Fix it to write the bitmap
buffers already present in the resource group structure directly.
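
The "allocate only if missing, free only what we allocated" pattern used here
can be sketched in isolation. Everything below (struct demo_rgrp,
demo_rgrp_write, the flat byte buffer standing in for the device) is
hypothetical and only mirrors the shape of the change, not the libgfs2 types.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a resource group with optional in-memory bitmaps. */
struct demo_rgrp {
	unsigned char *bits; /* NULL when no bitmap buffers are currently held */
	size_t len;          /* total bitmap length in bytes */
};

/*
 * Copy the bitmaps to 'out' (standing in for the device). If the caller
 * has not populated any bitmaps, temporary zeroed ones are allocated and
 * released again before returning; existing bitmaps are written as they
 * are and left in place.
 */
static int demo_rgrp_write(struct demo_rgrp *rg, unsigned char *out)
{
	int freebufs = 0;

	assert(rg != NULL && out != NULL);
	if (rg->bits == NULL) {
		rg->bits = calloc(rg->len, 1);
		if (rg->bits == NULL)
			return -1;
		freebufs = 1;
	}

	memcpy(out, rg->bits, rg->len);

	if (freebufs) {
		free(rg->bits);
		rg->bits = NULL;
	}
	return 0;
}
```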

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/rgrp.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index 8f96c9f..f57ae3a 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -587,31 +587,33 @@ lgfs2_rgrp_t lgfs2_rgrps_append(lgfs2_rgrps_t rgs, struct gfs2_rindex *entry)
  */
 int lgfs2_rgrp_write(int fd, const lgfs2_rgrp_t rg)
 {
-	ssize_t ret = 0;
-	lgfs2_rgrps_t rgs = rg->rgrps;
-	size_t len = rg->ri.ri_length * rgs->sdp->bsize;
+	int ret = 0;
 	unsigned int i;
 	const struct gfs2_meta_header bmh = {
 		.mh_magic = GFS2_MAGIC,
 		.mh_type = GFS2_METATYPE_RB,
 		.mh_format = GFS2_FORMAT_RB,
 	};
-	char *buff = calloc(len, 1);
-	if (buff == NULL)
-		return -1;
+	int freebufs = 0;
 
-	gfs2_rgrp_out(&rg->rg, buff);
-	for (i = 1; i < rg->ri.ri_length; i++)
-		gfs2_meta_header_out(&bmh, buff + (i * rgs->sdp->bsize));
+	if (rg->bits[0].bi_bh == NULL) {
+		freebufs = 1;
+		if (lgfs2_rgrp_bitbuf_alloc(rg) != 0)
+			return -1;
+	}
 
-	ret = pwrite(fd, buff, len, rg->ri.ri_addr * rgs->sdp->bsize);
-	if (ret != len) {
-		free(buff);
-		return -1;
+	gfs2_rgrp_out(&rg->rg, rg->bits[0].bi_bh->b_data);
+	ret = bwrite(rg->bits[0].bi_bh);
+
+	for (i = 1; ret == 0 && i < rg->ri.ri_length; i++) {
+		gfs2_meta_header_out(&bmh, rg->bits[i].bi_bh->b_data);
+		ret = bwrite(rg->bits[i].bi_bh);
 	}
 
-	free(buff);
-	return 0;
+	if (freebufs)
+		lgfs2_rgrp_bitbuf_free(rg);
+
+	return ret;
 }
 
 lgfs2_rgrp_t lgfs2_rgrp_first(lgfs2_rgrps_t rgs)
-- 
1.9.3




* [Cluster-devel] [PATCH 17/19] libgfs2: Add a speedier journal data block writing function
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (15 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 16/19] libgfs2: Handle non-zero bitmaps in lgfs2_rgrp_write Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 18/19] libgfs2: Create jindex directory separately from journals Andrew Price
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Now that we guarantee journals to be allocated contiguously, and the
block allocation has been separated out, we can speed up the journal
data writing process by simply generating the blocks and writing them
sequentially without consulting the resource group bitmaps each time a
new block is required.
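
To make the arithmetic concrete, here is a small sketch of the two
calculations the new function relies on: where the data extent starts, and
which sequence number each log header gets. The helper names are hypothetical
and the layout (inode, then indirect blocks, then data, all contiguous) is the
assumption stated above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * First data block of a contiguous journal: the data blocks are the tail
 * end of the extent, after the inode and any indirect blocks.
 */
static uint64_t journal_data_start(uint64_t inode_addr, uint64_t total_blocks,
                                   uint64_t data_blocks)
{
	assert(data_blocks <= total_blocks);
	return inode_addr + total_blocks - data_blocks;
}

/*
 * Sequence number for the log header at journal block index blkno, given
 * the sequence number chosen for block 0. The numbers wrap back to 0 once
 * they reach the number of blocks in the journal.
 */
static uint64_t journal_seq(uint64_t first_seq, uint64_t blkno, uint64_t blocks)
{
	assert(blocks > 0 && first_seq < blocks && blkno < blocks);
	return (first_seq + blkno) % blocks;
}
```

For instance, a journal of 4 blocks whose first header gets sequence 2 would
carry the sequence numbers 2, 3, 0, 1 in block order.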

lgfs2_write_journal_data() is added and the old write_journal() function
is left in place for now, until other journal creation scenarios (e.g.
potentially fragmented journals in gfs2_jadd and fsck.gfs2) have been
catered for, and those tools migrated to the new functions.

A further speed-up may be possible by using async i/o but we already
have a significant performance improvement with this strategy alone so
the added complexity of async i/o may not be worth it.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/libgfs2.h    |  1 +
 gfs2/libgfs2/structures.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index c7adbc1..406fbbe 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -500,6 +500,7 @@ extern void build_height(struct gfs2_inode *ip, int height);
 extern void unstuff_dinode(struct gfs2_inode *ip);
 extern unsigned int calc_tree_height(struct gfs2_inode *ip, uint64_t size);
 extern int write_journal(struct gfs2_inode *jnl, unsigned bsize, unsigned blocks);
+extern int lgfs2_write_journal_data(struct gfs2_inode *ip);
 extern int lgfs2_write_filemeta(struct gfs2_inode *ip);
 
 /* gfs1.c - GFS1 backward compatibility structures and functions */
diff --git a/gfs2/libgfs2/structures.c b/gfs2/libgfs2/structures.c
index 1836255..c4b9ebc 100644
--- a/gfs2/libgfs2/structures.c
+++ b/gfs2/libgfs2/structures.c
@@ -143,6 +143,54 @@ out_buf:
 	return err;
 }
 
+/**
+ * Initialise and write the data blocks for a new journal as a contiguous
+ * extent. The indirect blocks pointing to these data blocks should have been
+ * written separately using lgfs2_write_filemeta() and the extent should have
+ * been allocated using lgfs2_file_alloc().
+ * ip: The journal's inode
+ * Returns 0 on success or -1 with errno set on error.
+ */
+int lgfs2_write_journal_data(struct gfs2_inode *ip)
+{
+	uint32_t hash;
+	struct gfs2_log_header lh = {
+		.lh_header.mh_magic = GFS2_MAGIC,
+		.lh_header.mh_type = GFS2_METATYPE_LH,
+		.lh_header.mh_format = GFS2_FORMAT_LH,
+		.lh_flags = GFS2_LOG_HEAD_UNMOUNT,
+	};
+	struct gfs2_buffer_head *bh;
+	struct gfs2_sbd *sdp = ip->i_sbd;
+	unsigned blocks = (ip->i_di.di_size + sdp->bsize - 1) / sdp->bsize;
+	uint64_t jext0 = ip->i_di.di_num.no_addr + ip->i_di.di_blocks - blocks;
+	uint64_t seq = ((blocks) * (random() / (RAND_MAX + 1.0)));
+
+	bh = bget(sdp, jext0);
+	if (bh == NULL)
+		return -1;
+
+	do {
+		lh.lh_sequence = seq;
+		lh.lh_blkno = bh->b_blocknr - jext0;
+		gfs2_log_header_out(&lh, bh);
+		hash = gfs2_disk_hash(bh->b_data, sizeof(struct gfs2_log_header));
+		((struct gfs2_log_header *)bh->b_data)->lh_hash = cpu_to_be32(hash);
+
+		if (bwrite(bh)) {
+			free(bh);
+			return -1;
+		}
+
+		if (++seq == blocks)
+			seq = 0;
+
+	} while (++bh->b_blocknr < jext0 + blocks);
+
+	free(bh);
+	return 0;
+}
+
 int write_journal(struct gfs2_inode *jnl, unsigned bsize, unsigned int blocks)
 {
 	struct gfs2_log_header lh;
-- 
1.9.3




* [Cluster-devel] [PATCH 18/19] libgfs2: Create jindex directory separately from journals
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (16 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 17/19] libgfs2: Add a speedier journal data block writing function Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 19/19] mkfs.gfs2: Improve journal creation performance Andrew Price
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Previously journals were created indirectly via build_jindex and the
jindex inode (and therefore the master inode) was created before the
journals. Now that we're allocating the journals in whole resource
groups at the start of the mkfs process we need a way to create the
jindex after the journals are created. This adds lgfs2_build_jindex
which takes a list of inums relating to journals and builds the jindex
from them.
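
The directory entries are named after the journal's position in the list. A
minimal sketch of that naming step, with truncation treated as an error, might
look like the following; journal_name is a hypothetical helper, not part of
libgfs2.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the jindex entry name for journal number j into buf.
 * Returns the name length, or -1 if the name did not fit in size bytes.
 */
static int journal_name(char *buf, size_t size, unsigned j)
{
	int n;

	assert(buf != NULL && size > 0);
	n = snprintf(buf, size, "journal%u", j);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}
```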

The old build_jindex function is left in place until the other tools
have been switched to use the new function.

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/libgfs2/libgfs2.h    |  1 +
 gfs2/libgfs2/structures.c | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 406fbbe..831d45b 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -729,6 +729,7 @@ extern int lgfs2_sb_write(const struct gfs2_sb *sb, int fd, const unsigned bsize
 extern int build_journal(struct gfs2_sbd *sdp, int j,
 			 struct gfs2_inode *jindex);
 extern int build_jindex(struct gfs2_sbd *sdp);
+extern int lgfs2_build_jindex(struct gfs2_inode *master, struct gfs2_inum *jnls, size_t nmemb);
 extern int build_per_node(struct gfs2_sbd *sdp);
 extern int build_inum(struct gfs2_sbd *sdp);
 extern int build_statfs(struct gfs2_sbd *sdp);
diff --git a/gfs2/libgfs2/structures.c b/gfs2/libgfs2/structures.c
index c4b9ebc..87ffde7 100644
--- a/gfs2/libgfs2/structures.c
+++ b/gfs2/libgfs2/structures.c
@@ -255,6 +255,46 @@ int build_journal(struct gfs2_sbd *sdp, int j, struct gfs2_inode *jindex)
 	return ret;
 }
 
+/**
+ * Write a jindex file given a list of journal inums.
+ * master: Inode of the master directory
+ * jnls: List of inum structures relating to previously created journals.
+ * nmemb: The number of entries in the list (number of journals).
+ * Returns 0 on success or non-zero on error with errno set.
+ */
+int lgfs2_build_jindex(struct gfs2_inode *master, struct gfs2_inum *jnls, size_t nmemb)
+{
+	char fname[GFS2_FNAMESIZE + 1];
+	struct gfs2_inode *jindex;
+	unsigned j;
+	int ret;
+
+	if (nmemb == 0 || jnls == NULL) {
+		errno = EINVAL;
+		return 1;
+	}
+	jindex = createi(master, "jindex", S_IFDIR | 0700, GFS2_DIF_SYSTEM);
+	if (jindex == NULL)
+		return 1;
+
+	fname[GFS2_FNAMESIZE] = '\0';
+
+	for (j = 0; j < nmemb; j++) {
+		snprintf(fname, GFS2_FNAMESIZE, "journal%u", j);
+		ret = dir_add(jindex, fname, strlen(fname), &jnls[j], IF2DT(S_IFREG | 0600));
+		if (ret)
+			return 1;
+	}
+
+	if (cfg_debug) {
+		printf("\nJindex:\n");
+		gfs2_dinode_print(&jindex->i_di);
+	}
+
+	inode_put(&jindex);
+	return 0;
+}
+
 int build_jindex(struct gfs2_sbd *sdp)
 {
 	struct gfs2_inode *jindex;
-- 
1.9.3




* [Cluster-devel] [PATCH 19/19] mkfs.gfs2: Improve journal creation performance
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (17 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 18/19] libgfs2: Create jindex directory separately from journals Andrew Price
@ 2014-09-02 12:07 ` Andrew Price
  2014-09-02 14:06 ` [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Bob Peterson
  2014-09-03 10:20 ` Steven Whitehouse
  20 siblings, 0 replies; 25+ messages in thread
From: Andrew Price @ 2014-09-02 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Now that all of the journal extent allocation and writing building
blocks are in place in libgfs2, we can make use of them in mkfs.gfs2 to
write the journals sequentially and in-order with the resource group
blocks. This patch is a little messy because the changes had to be
introduced at the same time to avoid mismatching old and new behaviour,
but the end result should be fairly easy to follow.

Instead of writing all the resource group headers and then building the
journals afterwards, requiring the resource group headers to be read
back in, we now create a resource group header in memory, allocate the
blocks that we'll use for a journal in its bitmaps, write the resource
group header out, and then write the journal inode, indirect blocks and
data blocks in that order. This gives a substantial speed-up. For
example, running the test suite on my machine takes around 12 minutes
before and just over 2 minutes after this patch set.
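
Each journal's resource group is sized from the number of blocks the journal
file will occupy. As an illustration only, that count could be derived along
the following lines. This is not the lgfs2_space_for_data implementation;
file_blocks and its parameters (pointer capacity of the inode block and of an
indirect block) are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Total blocks needed for an unstuffed file with dblocks data blocks:
 * one inode block, the data blocks, and enough levels of indirect blocks
 * that the top level fits in the inode's own pointer area.
 */
static uint64_t file_blocks(uint64_t dblocks, unsigned dinode_ptrs,
                            unsigned indir_ptrs)
{
	uint64_t total = 1 + dblocks;
	uint64_t n = dblocks; /* blocks that the next level up must point to */

	assert(dinode_ptrs > 0 && indir_ptrs > 1);
	while (n > dinode_ptrs) {
		n = (n + indir_ptrs - 1) / indir_ptrs;
		total += n;
	}
	return total;
}
```

Because the result is known before anything is written, the resource group
header, inode, indirect blocks and data blocks can be laid out back to back
and written in a single forward pass.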

Signed-off-by: Andrew Price <anprice@redhat.com>
---
 gfs2/mkfs/main_mkfs.c | 151 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 114 insertions(+), 37 deletions(-)

diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index 39b9609..530383d 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -166,6 +166,8 @@ static void opts_init(struct mkfs_opts *opts)
 	opts->align = 1;
 }
 
+struct gfs2_inum *mkfs_journals = NULL;
+
 #ifndef BLKDISCARD
 #define BLKDISCARD      _IO(0x12,119)
 #endif
@@ -617,16 +619,11 @@ static lgfs2_rgrps_t rgs_init(struct mkfs_opts *opts, struct gfs2_sbd *sdp)
 	return rgs;
 }
 
-static int place_rgrp(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct gfs2_rindex *ri, int debug)
+static int place_rgrp(struct gfs2_sbd *sdp, lgfs2_rgrp_t rg, int debug)
 {
 	int err = 0;
-	lgfs2_rgrp_t rg = NULL;
+	const struct gfs2_rindex *ri = lgfs2_rgrp_index(rg);
 
-	rg = lgfs2_rgrps_append(rgs, ri);
-	if (rg == NULL) {
-		perror(_("Failed to create resource group"));
-		return -1;
-	}
 	err = lgfs2_rgrp_write(sdp->device_fd, rg);
 	if (err != 0) {
 		perror(_("Failed to write resource group"));
@@ -642,46 +639,121 @@ static int place_rgrp(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct gfs2_rinde
 	return 0;
 }
 
-static int place_rgrps(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct mkfs_opts *opts)
+static int add_rgrp(lgfs2_rgrps_t rgs, uint64_t *addr, uint32_t len, lgfs2_rgrp_t *rg)
 {
 	struct gfs2_rindex ri;
-	uint64_t jfsize = lgfs2_space_for_data(sdp, sdp->bsize, opts->jsize << 20);
-	uint32_t jrgsize = lgfs2_rgsize_for_data(jfsize, sdp->bsize);
-	uint64_t rgaddr = lgfs2_rgrp_align_addr(rgs, sdp->sb_addr + 1);
-	uint32_t rgsize = lgfs2_rgrps_plan(rgs, sdp->device.length - rgaddr, ((opts->rgsize << 20) / sdp->bsize));
-	unsigned j;
 
-	if (rgsize >= jrgsize)
-		jrgsize = rgsize;
+	/* When we get to the end of the device, it's only an error if we have
+	   more structures left to write, i.e. when len is != 0. */
+	*addr = lgfs2_rindex_entry_new(rgs, &ri, *addr, len);
+	if (*addr == 0) {
+		if (len != 0) {
+			perror(_("Failed to create resource group index entry"));
+			return -1;
+		} else {
+			return 1;
+		}
+	}
 
-	if (rgsize < ((GFS2_MIN_RGSIZE << 20) / sdp->bsize)) {
-		fprintf(stderr, _("Resource group size is too small\n"));
+	*rg = lgfs2_rgrps_append(rgs, &ri);
+	if (*rg == NULL) {
+		perror(_("Failed to create resource group"));
 		return -1;
-	} else if (rgsize < ((GFS2_DEFAULT_RGSIZE << 20) / sdp->bsize)) {
-		fprintf(stderr, _("Warning: small resource group size could impact performance\n"));
 	}
+	return 0;
+}
+
+static int place_journals(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct mkfs_opts *opts, uint64_t *rgaddr)
+{
+	uint64_t jfsize = lgfs2_space_for_data(sdp, sdp->bsize, opts->jsize << 20);
+	uint32_t rgsize = lgfs2_rgsize_for_data(jfsize, sdp->bsize);
+	unsigned j;
+
+	/* We'll build the jindex later so remember where we put the journals */
+	mkfs_journals = calloc(opts->journals, sizeof(*mkfs_journals));
+	if (mkfs_journals == NULL)
+		return 1;
+	*rgaddr = lgfs2_rgrp_align_addr(rgs, sdp->sb_addr + 1);
 
 	for (j = 0; j < opts->journals; j++) {
 		int result;
-		rgaddr = lgfs2_rindex_entry_new(rgs, &ri, rgaddr, jrgsize);
-		if (rgaddr == 0) /* Reached the end when we still have journals to write */
-			return 1;
-		result = place_rgrp(sdp, rgs, &ri, opts->debug);
+		lgfs2_rgrp_t rg;
+		struct gfs2_inode in = {0};
+
+		if (opts->debug)
+			printf(_("Placing resource group for journal%u\n"), j);
+
+		result = add_rgrp(rgs, rgaddr, rgsize, &rg);
+		if (result > 0)
+			break;
+		else if (result < 0)
+			return result;
+
+		result = lgfs2_rgrp_bitbuf_alloc(rg);
+		if (result != 0) {
+			perror(_("Failed to allocate space for bitmap buffer"));
+			return result;
+		}
+		/* In order to keep writes sequential here, we have to allocate
+		   the journal, then write the rgrp header (which is now in its
+		   final form) and then write the journal out */
+		result = lgfs2_file_alloc(rg, opts->jsize << 20, &in, GFS2_DIF_SYSTEM, S_IFREG | 0600);
+		if (result != 0) {
+			fprintf(stderr, _("Failed to allocate space for journal %u\n"), j);
+			return result;
+		}
+
+		if (opts->debug)
+			gfs2_dinode_print(&in.i_di);
+
+		result = place_rgrp(sdp, rg, opts->debug);
 		if (result != 0)
 			return result;
+
+		lgfs2_rgrp_bitbuf_free(rg);
+
+		result = lgfs2_write_filemeta(&in);
+		if (result != 0) {
+			fprintf(stderr, _("Failed to write journal %u\n"), j);
+			return result;
+		}
+
+		result = lgfs2_write_journal_data(&in);
+		if (result != 0) {
+			fprintf(stderr, _("Failed to write data blocks for journal %u\n"), j);
+			return result;
+		}
+		mkfs_journals[j] = in.i_di.di_num;
 	}
 
-	if (rgsize != jrgsize)
-		lgfs2_rgrps_plan(rgs, sdp->device.length - rgaddr, ((opts->rgsize << 20) / sdp->bsize));
+	return 0;
+}
+
+static int place_rgrps(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct mkfs_opts *opts)
+{
+	uint64_t rgaddr = lgfs2_rgrp_align_addr(rgs, sdp->sb_addr + 1);
+	uint32_t rgblks = ((opts->rgsize << 20) / sdp->bsize);
+	int result;
+
+	result = place_journals(sdp, rgs, opts, &rgaddr);
+	if (result != 0)
+		return result;
+
+	lgfs2_rgrps_plan(rgs, sdp->device.length - rgaddr, rgblks);
 
 	while (1) {
-		int result;
-		rgaddr = lgfs2_rindex_entry_new(rgs, &ri, rgaddr, 0);
-		if (rgaddr == 0)
-			break; /* Done */
-		result = place_rgrp(sdp, rgs, &ri, opts->debug);
-		if (result)
+		lgfs2_rgrp_t rg;
+		result = add_rgrp(rgs, &rgaddr, 0, &rg);
+		if (result > 0)
+			break;
+		else if (result < 0)
+			return result;
+
+		result = place_rgrp(sdp, rg, opts->debug);
+		if (result != 0) {
+			fprintf(stderr, _("Failed to build resource groups\n"));
 			return result;
+		}
 	}
 	return 0;
 }
@@ -696,7 +768,7 @@ static void sbd_init(struct gfs2_sbd *sdp, struct mkfs_opts *opts, unsigned bsiz
 	sdp->jsize = opts->jsize;
 	sdp->md.journals = opts->journals;
 	sdp->device_fd = opts->dev.fd;
-	sdp->bsize = bsize;
+	sdp->bsize = sdp->sd_sb.sb_bsize = bsize;
 
 	if (compute_constants(sdp)) {
 		perror(_("Failed to compute file system constants"));
@@ -848,17 +920,19 @@ void main_mkfs(int argc, char *argv[])
 	}
 	sbd.rgtree.osi_node = lgfs2_rgrps_root(rgs); // Temporary
 
-	build_root(&sbd);
-	sb.sb_root_dir = sbd.md.rooti->i_di.di_num;
-
-	build_master(&sbd);
+	error = build_master(&sbd);
+	if (error) {
+		fprintf(stderr, _("Error building '%s': %s\n"), "master", strerror(errno));
+		exit(EXIT_FAILURE);
+	}
 	sb.sb_master_dir = sbd.master_dir->i_di.di_num;
 
-	error = build_jindex(&sbd);
+	error = lgfs2_build_jindex(sbd.master_dir, mkfs_journals, opts.journals);
 	if (error) {
 		fprintf(stderr, _("Error building '%s': %s\n"), "jindex", strerror(errno));
 		exit(EXIT_FAILURE);
 	}
+	free(mkfs_journals);
 	error = build_per_node(&sbd);
 	if (error) {
 		fprintf(stderr, _("Error building '%s': %s\n"), "per_node", strerror(errno));
@@ -887,6 +961,9 @@ void main_mkfs(int argc, char *argv[])
 		exit(EXIT_FAILURE);
 	}
 
+	build_root(&sbd);
+	sb.sb_root_dir = sbd.md.rooti->i_di.di_num;
+
 	strcpy(sb.sb_lockproto, opts.lockproto);
 	strcpy(sb.sb_locktable, opts.locktable);
 
-- 
1.9.3




* [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (18 preceding siblings ...)
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 19/19] mkfs.gfs2: Improve journal creation performance Andrew Price
@ 2014-09-02 14:06 ` Bob Peterson
  2014-09-03 10:20 ` Steven Whitehouse
  20 siblings, 0 replies; 25+ messages in thread
From: Bob Peterson @ 2014-09-02 14:06 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> One thing to note is that, with these patches, the root and master inodes are
> no longer the first objects in the first resource group. The master inode is
> written in the first free block after the journals and then the other metafs
> structures are placed. The root directory inode is then finally created. This
> is not a format change but it may cause some confusion after years of expecting
> the root and master inodes to be at certain addresses so I thought it worth
> mentioning.

Hi,

I know that fsck.gfs2, in initialize.c, plays some games trying to find
and repair damaged system dinodes. For example, it looks for a missing master
directory by searching for "no_formal_ino==2". So I'd be very cautious
and check to make sure these repairs still work properly. In the past, I've
run a for loop that wipes out the first X blocks of the file system, runs
fsck.gfs2, and checks whether it can properly repair it.

Another concern is gfs2_convert. I don't know if it makes any assumptions
about the master directory, but it's much less likely. I think it just
assumes the file system is healthy. But fsck.gfs2 is a concern.

Regards,

Bob Peterson
Red Hat File Systems




* [Cluster-devel] [PATCH 06/19] libgfs2: Add functions for finding free extents
  2014-09-02 12:07 ` [Cluster-devel] [PATCH 06/19] libgfs2: Add functions for finding free extents Andrew Price
@ 2014-09-03 10:17   ` Steven Whitehouse
  2014-09-03 12:13     ` Andrew Price
  0 siblings, 1 reply; 25+ messages in thread
From: Steven Whitehouse @ 2014-09-03 10:17 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

You shouldn't need this for allocation in mkfs since you already know
where the free extents are, so there is no need to read the bitmaps to
find that out.

Steve.

On 02/09/14 13:07, Andrew Price wrote:
> Port gfs2_rbm_find and some functions which it depends on from the gfs2
> kernel code. This will set the base for allocation of single-extent
> files. The functions have been simplified where possible as libgfs2
> doesn't have a concept of reservations for the time being.
>
> Signed-off-by: Andrew Price <anprice@redhat.com>
> ---
>   gfs2/libgfs2/rgrp.c | 197 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>   gfs2/libgfs2/rgrp.h |   2 +
>   2 files changed, 199 insertions(+)
>
> diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
> index 0f36b86..7063288 100644
> --- a/gfs2/libgfs2/rgrp.c
> +++ b/gfs2/libgfs2/rgrp.c
> @@ -638,3 +638,200 @@ static int lgfs2_rbm_incr(struct lgfs2_rbm *rbm)
>   	rbm->bii++;
>   	return 0;
>   }
> +
> +/**
> + * lgfs2_testbit - test a bit in the bitmaps
> + * @rbm: The bit to test
> + *
> + * Returns: The two bit block state of the requested bit
> + */
> +static inline uint8_t lgfs2_testbit(const struct lgfs2_rbm *rbm)
> +{
> +	struct gfs2_bitmap *bi = rbm_bi(rbm);
> +	const uint8_t *buffer = (uint8_t *)bi->bi_bh->b_data + bi->bi_offset;
> +	const uint8_t *byte;
> +	unsigned int bit;
> +
> +	byte = buffer + (rbm->offset / GFS2_NBBY);
> +	bit = (rbm->offset % GFS2_NBBY) * GFS2_BIT_SIZE;
> +
> +	return (*byte >> bit) & GFS2_BIT_MASK;
> +}
> +
> +/**
> + * lgfs2_unaligned_extlen - Look for free blocks which are not byte aligned
> + * @rbm: Position to search (value/result)
> + * @n_unaligned: Number of unaligned blocks to check
> + * @len: Decremented for each block found (terminate on zero)
> + *
> + * Returns: true if a non-free block is encountered
> + */
> +static int lgfs2_unaligned_extlen(struct lgfs2_rbm *rbm, uint32_t n_unaligned, uint32_t *len)
> +{
> +	uint32_t n;
> +	uint8_t res;
> +
> +	for (n = 0; n < n_unaligned; n++) {
> +		res = lgfs2_testbit(rbm);
> +		if (res != GFS2_BLKST_FREE)
> +			return 1;
> +		(*len)--;
> +		if (*len == 0)
> +			return 1;
> +		if (lgfs2_rbm_incr(rbm))
> +			return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +static uint8_t *check_bytes8(const uint8_t *start, uint8_t value, unsigned bytes)
> +{
> +	while (bytes) {
> +		if (*start != value)
> +			return (void *)start;
> +		start++;
> +		bytes--;
> +	}
> +	return NULL;
> +}
> +
> +/**
> + * lgfs2_free_extlen - Return extent length of free blocks
> + * @rbm: Starting position
> + * @len: Max length to check
> + *
> + * Starting at the block specified by the rbm, see how many free blocks
> + * there are, not reading more than len blocks ahead. This can be done
> + * using check_bytes8 when the blocks are byte aligned, but has to be done
> + * on a block by block basis in case of unaligned blocks. Also this
> + * function can cope with bitmap boundaries (although it must stop on
> + * a resource group boundary)
> + *
> + * Returns: Number of free blocks in the extent
> + */
> +static uint32_t lgfs2_free_extlen(const struct lgfs2_rbm *rrbm, uint32_t len)
> +{
> +	struct lgfs2_rbm rbm = *rrbm;
> +	uint32_t n_unaligned = rbm.offset & 3;
> +	uint32_t size = len;
> +	uint32_t bytes;
> +	uint32_t chunk_size;
> +	uint8_t *ptr, *start, *end;
> +	uint64_t block;
> +	struct gfs2_bitmap *bi;
> +
> +	if (n_unaligned &&
> +	    lgfs2_unaligned_extlen(&rbm, 4 - n_unaligned, &len))
> +		goto out;
> +
> +	n_unaligned = len & 3;
> +	/* Start is now byte aligned */
> +	while (len > 3) {
> +		bi = rbm_bi(&rbm);
> +		start = (uint8_t *)bi->bi_bh->b_data;
> +		end = start + bi->bi_bh->sdp->bsize;
> +		start += bi->bi_offset;
> +		start += (rbm.offset / GFS2_NBBY);
> +		bytes = (len / GFS2_NBBY) < (end - start) ? (len / GFS2_NBBY):(end - start);
> +		ptr = check_bytes8(start, 0, bytes);
> +		chunk_size = ((ptr == NULL) ? bytes : (ptr - start));
> +		chunk_size *= GFS2_NBBY;
> +		len -= chunk_size;
> +		block = lgfs2_rbm_to_block(&rbm);
> +		if (lgfs2_rbm_from_block(&rbm, block + chunk_size)) {
> +			n_unaligned = 0;
> +			break;
> +		}
> +		if (ptr) {
> +			n_unaligned = 3;
> +			break;
> +		}
> +		n_unaligned = len & 3;
> +	}
> +
> +	/* Deal with any bits left over at the end */
> +	if (n_unaligned)
> +		lgfs2_unaligned_extlen(&rbm, n_unaligned, &len);
> +out:
> +	return size - len;
> +}
> +
> +/**
> + * lgfs2_rbm_find - Look for blocks of a particular state
> + * @rbm: Value/result starting position and final position
> + * @state: The state which we want to find
> + * @minext: Pointer to the requested extent length (NULL for a single block)
> + *          This is updated to be the actual reservation size.
> + *
> + * Returns: 0 on success, non-zero with errno == ENOSPC if there is no block of the requested state
> + */
> +int lgfs2_rbm_find(struct lgfs2_rbm *rbm, uint8_t state, uint32_t *minext)
> +{
> +	int initial_bii;
> +	uint32_t offset;
> +	int n = 0;
> +	int iters = rbm->rgd->ri.ri_length;
> +	uint32_t extlen;
> +
> +	/* If we are not starting at the beginning of a bitmap, then we
> +	 * need to add one to the bitmap count to ensure that we search
> +	 * the starting bitmap twice.
> +	 */
> +	if (rbm->offset != 0)
> +		iters++;
> +
> +	for (n = 0; n < iters; n++) {
> +		struct gfs2_bitmap *bi = rbm_bi(rbm);
> +		struct gfs2_buffer_head *bh = bi->bi_bh;
> +		uint8_t *buf = (uint8_t *)bh->b_data + bi->bi_offset;
> +		uint64_t block;
> +		int ret;
> +
> +		if ((rbm->rgd->rg.rg_free < *minext) && (state == GFS2_BLKST_FREE))
> +			goto next_bitmap;
> +
> +		offset = gfs2_bitfit(buf, bi->bi_len, rbm->offset, state);
> +		if (offset == BFITNOENT)
> +			goto next_bitmap;
> +
> +		rbm->offset = offset;
> +		initial_bii = rbm->bii;
> +		block = lgfs2_rbm_to_block(rbm);
> +		extlen = 1;
> +
> +		if (*minext != 0)
> +			extlen = lgfs2_free_extlen(rbm, *minext);
> +
> +		if (extlen >= *minext)
> +			return 0;
> +
> +		ret = lgfs2_rbm_from_block(rbm, block + extlen);
> +		if (ret == 0) {
> +			n += (rbm->bii - initial_bii);
> +			continue;
> +		}
> +
> +		if (errno == E2BIG) {
> +			rbm->bii = 0;
> +			rbm->offset = 0;
> +			n += (rbm->bii - initial_bii);
> +			goto res_covered_end_of_rgrp;
> +		}
> +
> +		return ret;
> +
> +next_bitmap:	/* Find next bitmap in the rgrp */
> +		rbm->offset = 0;
> +		rbm->bii++;
> +		if (rbm->bii == rbm->rgd->ri.ri_length)
> +			rbm->bii = 0;
> +
> +res_covered_end_of_rgrp:
> +		if (rbm->bii == 0)
> +			break;
> +	}
> +
> +	errno = ENOSPC;
> +	return 1;
> +}
> diff --git a/gfs2/libgfs2/rgrp.h b/gfs2/libgfs2/rgrp.h
> index 384231e..1634fbc 100644
> --- a/gfs2/libgfs2/rgrp.h
> +++ b/gfs2/libgfs2/rgrp.h
> @@ -44,4 +44,6 @@ static inline int lgfs2_rbm_eq(const struct lgfs2_rbm *rbm1, const struct lgfs2_
>   	        (rbm1->offset == rbm2->offset);
>   }
>   
> +extern int lgfs2_rbm_find(struct lgfs2_rbm *rbm, uint8_t state, uint32_t *minext);
> +
>   #endif /* __RGRP_DOT_H__ */




* [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation
  2014-09-02 12:07 [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Andrew Price
                   ` (19 preceding siblings ...)
  2014-09-02 14:06 ` [Cluster-devel] [PATCH 00/19] gfs2-utils: Introduce extent allocation and speed up journal creation Bob Peterson
@ 2014-09-03 10:20 ` Steven Whitehouse
  20 siblings, 0 replies; 25+ messages in thread
From: Steven Whitehouse @ 2014-09-03 10:20 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Aside from the question on patch 6, the other patches all look good,

Steve.

On 02/09/14 13:07, Andrew Price wrote:
> Hi,
>
> This patch set introduces extent allocation to libgfs2 and adds functions which
> decouple file creation, allocation and writing so that mkfs.gfs2 can be
> re-worked to write journals and resource groups sequentially.
>
> With these patches, mkfs.gfs2 typically takes around 20% of the time that it
> did before in my tests.  The main speed-up has been from the journal data
> allocation functions not having to re-read and write a resource group for each
> block allocated as it did before (this was a performance regression introduced
> by previous memory footprint improvement patches, hence the significant perf
> improvement).  Journals now each occupy an extent spanning an entire resource
> group specifically sized for the journal, and the resource group headers are
> written only once, after the journal blocks have been allocated in the
> in-memory bitmaps. Resource groups are still only kept in memory for as long as
> they are needed so peak memory usage should be largely unchanged.
>
> One thing to note is that, with these patches, the root and master inodes are
> no longer the first objects in the first resource group. The master inode is
> written in the first free block after the journals and then the other metafs
> structures are placed. The root directory inode is then finally created. This
> is not a format change but it may cause some confusion after years of expecting
> the root and master inodes to be at certain addresses so I thought it worth
> mentioning.
>
> Coverity and valgrind are happy about these patches and I've encountered no
> problems after various tests which mount the fs.
>
> Cheers,
> Andy
>
> Andrew Price (19):
>    libgfs2: Keep a pointer to the sbd in lgfs2_rgrps_t
>    libgfs2: Move bitmap buffers inside struct gfs2_bitmap
>    libgfs2: Fix an impossible loop condition in gfs2_rgrp_read
>    libgfs2: Introduce struct lgfs2_rbm
>    libgfs2: Move struct _lgfs2_rgrps into rgrp.h
>    libgfs2: Add functions for finding free extents
>    tests: Add unit tests for the new extent search functions
>    libgfs2: Ignore an empty rgrp plan if a length is specified
>    libgfs2: Add back-pointer to rgrps in lgfs2_rgrp_t
>    libgfs2: Const-ify the parameters of print functions
>    libgfs2: Allow init_dinode to accept a preallocated bh
>    libgfs2: Add extent allocation functions
>    libgfs2: Add support for allocating entire rgrp headers
>    libgfs2: Write file metadata sequentially
>    libgfs2: Fix alignment in lgfs2_rgsize_for_data
>    libgfs2: Handle non-zero bitmaps in lgfs2_rgrp_write
>    libgfs2: Add a speedier journal data block writing function
>    libgfs2: Create jindex directory separately from journals
>    mkfs.gfs2: Improve journal creation performance
>
>   .gitignore                  |   3 +-
>   gfs2/convert/gfs2_convert.c |  49 +++--
>   gfs2/edit/journal.c         |   6 +-
>   gfs2/fsck/fs_recovery.c     |   2 +-
>   gfs2/fsck/initialize.c      |  27 +--
>   gfs2/fsck/metawalk.c        |  10 +-
>   gfs2/fsck/pass5.c           |   9 +-
>   gfs2/fsck/rgrepair.c        |  14 +-
>   gfs2/fsck/util.c            |   2 +-
>   gfs2/libgfs2/Makefile.am    |   2 +-
>   gfs2/libgfs2/fs_bits.c      |  10 +-
>   gfs2/libgfs2/fs_geometry.c  |   6 +-
>   gfs2/libgfs2/fs_ops.c       | 184 ++++++++++++++---
>   gfs2/libgfs2/libgfs2.h      |  50 +++--
>   gfs2/libgfs2/ondisk.c       |  26 +--
>   gfs2/libgfs2/rgrp.c         | 491 ++++++++++++++++++++++++++++++++++++--------
>   gfs2/libgfs2/rgrp.h         |  50 +++++
>   gfs2/libgfs2/structures.c   | 103 +++++++++-
>   gfs2/mkfs/main_grow.c       |   4 +-
>   gfs2/mkfs/main_mkfs.c       | 155 ++++++++++----
>   tests/Makefile.am           |  33 ++-
>   tests/check_rgrp.c          | 143 +++++++++++++
>   tests/libgfs2.at            |   8 +-
>   23 files changed, 1113 insertions(+), 274 deletions(-)
>   create mode 100644 gfs2/libgfs2/rgrp.h
>   create mode 100644 tests/check_rgrp.c
>




* [Cluster-devel] [PATCH 06/19] libgfs2: Add functions for finding free extents
  2014-09-03 10:17   ` Steven Whitehouse
@ 2014-09-03 12:13     ` Andrew Price
  2014-09-03 12:24       ` Steven Whitehouse
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Price @ 2014-09-03 12:13 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 03/09/14 11:17, Steven Whitehouse wrote:
> Hi,
>
> You shouldn't need this for allocation in mkfs since you already know
> where the free extents are, so no need to be reading the bitmaps to find
> that out,
>
> Steve.

Well, that's true, but I'd like to use generic file allocation functions 
where possible and I wanted to make sure the new functions were able to 
allocate extents. I'd like to keep these changes in libgfs2 anyway 
because they'll be useful for future work in other tools, so how about 
an additional patch like the one below?

Andy

diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index 9c9cc82..b84b7f4 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -296,7 +296,9 @@ uint64_t lgfs2_space_for_data(const struct gfs2_sbd *sdp, const unsigned bsize,
   * Allocate an extent for a file in a resource group's bitmaps.
   * rg: The resource group in which to allocate the extent
   * di_size: The size of the file in bytes
- * ip: A pointer to the inode structure, whose fields will be set appropriately
+ * ip: A pointer to the inode structure, whose fields will be set appropriately.
+ *     If ip->i_di.di_num.no_addr is not 0, the extent search will be skipped and
+ *     the file allocated from that address.
   * flags: GFS2_DIF_* flags
   * mode: File mode flags, see creat(2)
   * Returns 0 on success with the contents of ip set accordingly, or non-zero
@@ -310,11 +312,13 @@ int lgfs2_file_alloc(lgfs2_rgrp_t rg, uint64_t di_size, struct gfs2_inode *ip, u
  	struct gfs2_sbd *sdp = rg->rgrps->sdp;
  	struct lgfs2_rbm rbm = { .rgd = rg, .offset = 0, .bii = 0 };
  	uint32_t blocks = lgfs2_space_for_data(sdp, sdp->bsize, di_size);
-	int err;

-	err = lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &blocks);
-	if (err != 0)
-		return err;
+	if (ip->i_di.di_num.no_addr != 0) {
+		if (lgfs2_rbm_from_block(&rbm, ip->i_di.di_num.no_addr) != 0)
+			return 1;
+	} else if (lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &blocks) != 0) {
+		return 1;
+	}

  	extlen = lgfs2_alloc_extent(&rbm, GFS2_BLKST_DINODE, blocks);
  	if (extlen < blocks) {
diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
index f57ae3a..0d0f000 100644
--- a/gfs2/libgfs2/rgrp.c
+++ b/gfs2/libgfs2/rgrp.c
@@ -643,7 +643,7 @@ lgfs2_rgrp_t lgfs2_rgrp_last(lgfs2_rgrps_t rgs)
   *
   * Returns: 0 on success, or non-zero with errno set
   */
-static int lgfs2_rbm_from_block(struct lgfs2_rbm *rbm, uint64_t block)
+int lgfs2_rbm_from_block(struct lgfs2_rbm *rbm, uint64_t block)
  {
  	uint64_t rblock = block - rbm->rgd->ri.ri_data0;
  	struct gfs2_sbd *sdp = rbm_bi(rbm)->bi_bh->sdp;
diff --git a/gfs2/libgfs2/rgrp.h b/gfs2/libgfs2/rgrp.h
index bd89289..fd442b1 100644
--- a/gfs2/libgfs2/rgrp.h
+++ b/gfs2/libgfs2/rgrp.h
@@ -44,6 +44,7 @@ static inline int lgfs2_rbm_eq(const struct lgfs2_rbm *rbm1, const struct lgfs2_
  	        (rbm1->offset == rbm2->offset);
  }

+extern int lgfs2_rbm_from_block(struct lgfs2_rbm *rbm, uint64_t block);
  extern int lgfs2_rbm_find(struct lgfs2_rbm *rbm, uint8_t state, uint32_t *minext);
  extern unsigned lgfs2_alloc_extent(const struct lgfs2_rbm *rbm, int state, const unsigned elen);

diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index 530383d..e927d82 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -694,6 +694,8 @@ static int place_journals(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct mkfs_o
  			perror(_("Failed to allocate space for bitmap buffer"));
  			return result;
  		}
+		/* Allocate at the beginning of the rgrp, bypassing extent search */
+		in.i_di.di_num.no_addr = lgfs2_rgrp_index(rg)->ri_data0;
  		/* In order to keep writes sequential here, we have to allocate
  		   the journal, then write the rgrp header (which is now in its
  		   final form) and then write the journal out */




* [Cluster-devel] [PATCH 06/19] libgfs2: Add functions for finding free extents
  2014-09-03 12:13     ` Andrew Price
@ 2014-09-03 12:24       ` Steven Whitehouse
  0 siblings, 0 replies; 25+ messages in thread
From: Steven Whitehouse @ 2014-09-03 12:24 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 03/09/14 13:13, Andrew Price wrote:
> On 03/09/14 11:17, Steven Whitehouse wrote:
>> Hi,
>>
>> You shouldn't need this for allocation in mkfs since you already know
>> where the free extents are, so no need to be reading the bitmaps to find
>> that out,
>>
>> Steve.
>
> Well, that's true, but I'd like to use generic file allocation 
> functions where possible and I wanted to make sure the new functions 
> were able to allocate extents. I'd like to keep these changes in 
> libgfs2 anyway because they'll be useful for future work in other 
> tools, so how about an additional patch like the one below?
>
> Andy
>
That may be ok, depending on the context. I was expecting to see a
two-stage process: first calculating how many blocks are required, and
then figuring out how to divide the blocks between the rgrps (i.e. fixing
the location),

Steve.
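
The two-stage approach described above might look roughly like the following. This is a deliberately simplified sketch under stated assumptions, not code from gfs2-utils: struct placement, blocks_needed() and place_sequentially() are hypothetical names, the block count covers data only (no inode, indirect blocks or rgrp headers), and files are simply laid out back to back from a given starting block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of where one file ends up. */
struct placement {
	uint64_t start;  /* first block address assigned to the file */
	uint64_t blocks; /* number of blocks reserved for it */
};

/* Stage 1: how many blocks are needed to hold `bytes` of data, rounded up. */
uint64_t blocks_needed(uint64_t bytes, uint32_t bsize)
{
	assert(bsize > 0);
	return bytes / bsize + (bytes % bsize != 0);
}

/*
 * Stage 2: fix the locations. Each file is placed immediately after the
 * previous one, starting at `first_block`, so no bitmap search is needed.
 * Returns the block address following the last placed file.
 */
uint64_t place_sequentially(const uint64_t *sizes, unsigned count,
                            uint32_t bsize, uint64_t first_block,
                            struct placement *out)
{
	uint64_t next = first_block;

	for (unsigned i = 0; i < count; i++) {
		out[i].start = next;
		out[i].blocks = blocks_needed(sizes[i], bsize);
		next += out[i].blocks;
	}
	return next;
}
```

Since every location is known once stage 2 has run, the allocator could be handed the start address directly, which is what the preset no_addr in the patch quoted below is meant to allow.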

> diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
> index 9c9cc82..b84b7f4 100644
> --- a/gfs2/libgfs2/fs_ops.c
> +++ b/gfs2/libgfs2/fs_ops.c
> @@ -296,7 +296,9 @@ uint64_t lgfs2_space_for_data(const struct gfs2_sbd *sdp, const unsigned bsize,
>   * Allocate an extent for a file in a resource group's bitmaps.
>   * rg: The resource group in which to allocate the extent
>   * di_size: The size of the file in bytes
> - * ip: A pointer to the inode structure, whose fields will be set appropriately
> + * ip: A pointer to the inode structure, whose fields will be set appropriately.
> + *     If ip->i_di.di_num.no_addr is not 0, the extent search will be skipped and
> + *     the file allocated from that address.
>   * flags: GFS2_DIF_* flags
>   * mode: File mode flags, see creat(2)
>   * Returns 0 on success with the contents of ip set accordingly, or non-zero
> @@ -310,11 +312,13 @@ int lgfs2_file_alloc(lgfs2_rgrp_t rg, uint64_t di_size, struct gfs2_inode *ip, u
>      struct gfs2_sbd *sdp = rg->rgrps->sdp;
>      struct lgfs2_rbm rbm = { .rgd = rg, .offset = 0, .bii = 0 };
>      uint32_t blocks = lgfs2_space_for_data(sdp, sdp->bsize, di_size);
> -    int err;
>
> -    err = lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &blocks);
> -    if (err != 0)
> -        return err;
> +    if (ip->i_di.di_num.no_addr != 0) {
> +        if (lgfs2_rbm_from_block(&rbm, ip->i_di.di_num.no_addr) != 0)
> +            return 1;
> +    } else if (lgfs2_rbm_find(&rbm, GFS2_BLKST_FREE, &blocks) != 0) {
> +        return 1;
> +    }
>
>      extlen = lgfs2_alloc_extent(&rbm, GFS2_BLKST_DINODE, blocks);
>      if (extlen < blocks) {
> diff --git a/gfs2/libgfs2/rgrp.c b/gfs2/libgfs2/rgrp.c
> index f57ae3a..0d0f000 100644
> --- a/gfs2/libgfs2/rgrp.c
> +++ b/gfs2/libgfs2/rgrp.c
> @@ -643,7 +643,7 @@ lgfs2_rgrp_t lgfs2_rgrp_last(lgfs2_rgrps_t rgs)
>   *
>   * Returns: 0 on success, or non-zero with errno set
>   */
> -static int lgfs2_rbm_from_block(struct lgfs2_rbm *rbm, uint64_t block)
> +int lgfs2_rbm_from_block(struct lgfs2_rbm *rbm, uint64_t block)
>  {
>      uint64_t rblock = block - rbm->rgd->ri.ri_data0;
>      struct gfs2_sbd *sdp = rbm_bi(rbm)->bi_bh->sdp;
> diff --git a/gfs2/libgfs2/rgrp.h b/gfs2/libgfs2/rgrp.h
> index bd89289..fd442b1 100644
> --- a/gfs2/libgfs2/rgrp.h
> +++ b/gfs2/libgfs2/rgrp.h
> @@ -44,6 +44,7 @@ static inline int lgfs2_rbm_eq(const struct lgfs2_rbm *rbm1, const struct lgfs2_
>              (rbm1->offset == rbm2->offset);
>  }
>
> +extern int lgfs2_rbm_from_block(struct lgfs2_rbm *rbm, uint64_t block);
>  extern int lgfs2_rbm_find(struct lgfs2_rbm *rbm, uint8_t state, uint32_t *minext);
>  extern unsigned lgfs2_alloc_extent(const struct lgfs2_rbm *rbm, int state, const unsigned elen);
>
> diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
> index 530383d..e927d82 100644
> --- a/gfs2/mkfs/main_mkfs.c
> +++ b/gfs2/mkfs/main_mkfs.c
> @@ -694,6 +694,8 @@ static int place_journals(struct gfs2_sbd *sdp, lgfs2_rgrps_t rgs, struct mkfs_o
>              perror(_("Failed to allocate space for bitmap buffer"));
>              return result;
>          }
> +        /* Allocate at the beginning of the rgrp, bypassing extent search */
> +        in.i_di.di_num.no_addr = lgfs2_rgrp_index(rg)->ri_data0;
>          /* In order to keep writes sequential here, we have to allocate
>             the journal, then write the rgrp header (which is now in its
>             final form) and then write the journal out */
>



