* [Cluster-devel] [PATCH v2 0/3] Implement iomap for fiemap in GFS2
@ 2016-10-28 19:29 Bob Peterson
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 1/3] GFS2: Make height info part of metapath Bob Peterson
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Bob Peterson @ 2016-10-28 19:29 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 20 October, I submitted a set of 4 patches that implement iomap and
fiemap-with-iomap for GFS2. (It was a rework of an earlier patch set from
12 August.) I got some good feedback. Steve Whitehouse suggested I roll
the GFS2 allocation path into it and Christoph suggested I remove a
couple of useless casts. There are only three patches because when I
followed Steve's suggestion to include the block allocator stuff, it
simplified the code and made one of the patches completely unnecessary.
I also found and fixed some bugs during testing. This is a revised
version. The three patches are:

1. GFS2: Make height info part of metapath
   This patch is another "decoupling" needed to make the GFS2 blockmap
   function work with iomap under the covers. In this case, it eliminates
   the height parameters from function gfs2_bmap_alloc in favor of
   simply fetching the values from the metapath.
2. GFS2: Implement iomap for block_map
   This patch now implements the new GFS2 get_iomap function, and makes
   gfs2_block_map use it under the covers (allocation now included).
3. GFS2: Switch fiemap implementation to use iomap
   This patch switches GFS2's fiemap implementation to use the new iomap
   interface. It also uses a "select" in Kconfig to pull it all in, as
   Christoph had suggested.

This has been tested with large holey files up to 1EB, and I also tested
some boundary conditions, such as 1PB plus and minus 1 byte, and
1PB plus and minus 4K. It seems to have basic functionality and excellent
performance, but it probably needs more testing.

As before, feedback is encouraged.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
Bob Peterson (3):
  GFS2: Make height info part of metapath
  GFS2: Implement iomap for block_map
  GFS2: Switch fiemap implementation to use iomap

 fs/gfs2/Kconfig       |   1 +
 fs/gfs2/bmap.c        | 291 +++++++++++++++++++++++++++++++++++++-------------
 fs/gfs2/bmap.h        |   4 +
 fs/gfs2/inode.c       |  60 ++++++++---
 include/linux/iomap.h |   1 +
 5 files changed, 269 insertions(+), 88 deletions(-)

-- 
2.7.4




* [Cluster-devel] [PATCH v2 1/3] GFS2: Make height info part of metapath
  2016-10-28 19:29 [Cluster-devel] [PATCH v2 0/3] Implement iomap for fiemap in GFS2 Bob Peterson
@ 2016-10-28 19:29 ` Bob Peterson
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map Bob Peterson
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 3/3] GFS2: Switch fiemap implementation to use iomap Bob Peterson
  2 siblings, 0 replies; 13+ messages in thread
From: Bob Peterson @ 2016-10-28 19:29 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch eliminates the height parameters from function gfs2_bmap_alloc.
Function find_metapath determines the metapath's "find height", also
known as the desired height. Function lookup_metapath determines the
metapath's "actual height", previously known as the starting height or
sheight. Function gfs2_bmap_alloc now gets both height values from
the metapath. This simplification was done as a step toward switching
the block_map functions over to iomap. The bh_map responsibilities
are also removed from gfs2_bmap_alloc for the same reason.
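
For illustration, the resulting call pattern looks roughly like this
(a condensed sketch of gfs2_block_map() after this patch, not the patch
itself):

        struct metapath mp;

        memset(mp.mp_bh, 0, sizeof(mp.mp_bh));
        find_metapath(sdp, lblock, &mp, height);   /* records mp_fheight, the "find height" */
        mp.mp_aheight = 1;                         /* default until lookup_metapath() runs */
        ret = lookup_metapath(ip, &mp);            /* records mp_aheight, the height actually mapped */
        if (ret >= 0)
                ret = gfs2_bmap_alloc(inode, lblock, zero_new, &mp, maxlen,
                                      &dblock, &dblks);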

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/bmap.c | 115 +++++++++++++++++++++++++++++++++------------------------
 1 file changed, 66 insertions(+), 49 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index fc5da4c..59b45b8 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -36,6 +36,8 @@
 struct metapath {
 	struct buffer_head *mp_bh[GFS2_MAX_META_HEIGHT];
 	__u16 mp_list[GFS2_MAX_META_HEIGHT];
+	int mp_fheight; /* find_metapath height */
+	int mp_aheight; /* actual height (lookup height) */
 };
 
 struct strip_mine {
@@ -240,9 +242,9 @@ static void find_metapath(const struct gfs2_sbd *sdp, u64 block,
 {
 	unsigned int i;
 
+	mp->mp_fheight = height;
 	for (i = height; i--;)
 		mp->mp_list[i] = do_div(block, sdp->sd_inptrs);
-
 }
 
 static inline unsigned int metapath_branch_start(const struct metapath *mp)
@@ -323,15 +325,20 @@ static int lookup_metapath(struct gfs2_inode *ip, struct metapath *mp)
 	for (x = 0; x < end_of_metadata; x++) {
 		ptr = metapointer(x, mp);
 		dblock = be64_to_cpu(*ptr);
-		if (!dblock)
-			return x + 1;
+		if (!dblock) {
+			ret = x + 1;
+			goto out;
+		}
 
 		ret = gfs2_meta_indirect_buffer(ip, x+1, dblock, &mp->mp_bh[x+1]);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
-	return ip->i_height;
+	ret = ip->i_height;
+out:
+	mp->mp_aheight = ret;
+	return ret;
 }
 
 static inline void release_metapath(struct metapath *mp)
@@ -427,10 +434,11 @@ enum alloc_state {
  * @inode: The GFS2 inode
  * @lblock: The logical starting block of the extent
  * @bh_map: This is used to return the mapping details
- * @mp: The metapath
- * @sheight: The starting height (i.e. whats already mapped)
- * @height: The height to build to
+ * @zero_new: True if newly allocated blocks should be zeroed
+ * @mp: The metapath, with proper height information calculated
  * @maxlen: The max number of data blocks to alloc
+ * @dblock: Pointer to return the resulting new block
+ * @dblks: Pointer to return the number of blocks allocated
  *
  * In this routine we may have to alloc:
  *   i) Indirect blocks to grow the metadata tree height
@@ -447,62 +455,63 @@ enum alloc_state {
  */
 
 static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
-			   struct buffer_head *bh_map, struct metapath *mp,
-			   const unsigned int sheight,
-			   const unsigned int height,
-			   const size_t maxlen)
+			   bool zero_new, struct metapath *mp,
+			   const size_t maxlen, sector_t *dblock,
+			   unsigned *dblks)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct super_block *sb = sdp->sd_vfs;
 	struct buffer_head *dibh = mp->mp_bh[0];
-	u64 bn, dblock = 0;
+	u64 bn;
 	unsigned n, i, blks, alloced = 0, iblks = 0, branch_start = 0;
-	unsigned dblks = 0;
 	unsigned ptrs_per_blk;
-	const unsigned end_of_metadata = height - 1;
+	const unsigned end_of_metadata = mp->mp_fheight - 1;
 	int ret;
 	int eob = 0;
 	enum alloc_state state;
 	__be64 *ptr;
 	__be64 zero_bn = 0;
 
-	BUG_ON(sheight < 1);
+	BUG_ON(mp->mp_aheight < 1);
 	BUG_ON(dibh == NULL);
 
+	*dblock = 0;
+	*dblks = 0;
 	gfs2_trans_add_meta(ip->i_gl, dibh);
 
-	if (height == sheight) {
+	if (mp->mp_fheight == mp->mp_aheight) {
 		struct buffer_head *bh;
 		/* Bottom indirect block exists, find unalloced extent size */
 		ptr = metapointer(end_of_metadata, mp);
 		bh = mp->mp_bh[end_of_metadata];
-		dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr, maxlen,
-					   &eob);
-		BUG_ON(dblks < 1);
+		*dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
+					    maxlen, &eob);
+		BUG_ON(*dblks < 1);
 		state = ALLOC_DATA;
 	} else {
 		/* Need to allocate indirect blocks */
-		ptrs_per_blk = height > 1 ? sdp->sd_inptrs : sdp->sd_diptrs;
-		dblks = min(maxlen, (size_t)(ptrs_per_blk -
-					     mp->mp_list[end_of_metadata]));
-		if (height == ip->i_height) {
+		ptrs_per_blk = mp->mp_fheight > 1 ? sdp->sd_inptrs :
+			sdp->sd_diptrs;
+		*dblks = min(maxlen, (size_t)(ptrs_per_blk -
+					      mp->mp_list[end_of_metadata]));
+		if (mp->mp_fheight == ip->i_height) {
 			/* Writing into existing tree, extend tree down */
-			iblks = height - sheight;
+			iblks = mp->mp_fheight - mp->mp_aheight;
 			state = ALLOC_GROW_DEPTH;
 		} else {
 			/* Building up tree height */
 			state = ALLOC_GROW_HEIGHT;
-			iblks = height - ip->i_height;
+			iblks = mp->mp_fheight - ip->i_height;
 			branch_start = metapath_branch_start(mp);
-			iblks += (height - branch_start);
+			iblks += (mp->mp_fheight - branch_start);
 		}
 	}
 
 	/* start of the second part of the function (state machine) */
 
-	blks = dblks + iblks;
-	i = sheight;
+	blks = *dblks + iblks;
+	i = mp->mp_aheight;
 	do {
 		int error;
 		n = blks - alloced;
@@ -520,9 +529,10 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 						 sizeof(struct gfs2_dinode));
 				zero_bn = *ptr;
 			}
-			for (; i - 1 < height - ip->i_height && n > 0; i++, n--)
+			for (; i - 1 < mp->mp_fheight - ip->i_height && n > 0;
+			     i++, n--)
 				gfs2_indirect_init(mp, ip->i_gl, i, 0, bn++);
-			if (i - 1 == height - ip->i_height) {
+			if (i - 1 == mp->mp_fheight - ip->i_height) {
 				i--;
 				gfs2_buffer_copy_tail(mp->mp_bh[i],
 						sizeof(struct gfs2_meta_header),
@@ -534,7 +544,7 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 					sizeof(struct gfs2_meta_header));
 				*ptr = zero_bn;
 				state = ALLOC_GROW_DEPTH;
-				for(i = branch_start; i < height; i++) {
+				for(i = branch_start; i < mp->mp_fheight; i++) {
 					if (mp->mp_bh[i] == NULL)
 						break;
 					brelse(mp->mp_bh[i]);
@@ -546,44 +556,40 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 				break;
 		/* Branching from existing tree */
 		case ALLOC_GROW_DEPTH:
-			if (i > 1 && i < height)
+			if (i > 1 && i < mp->mp_fheight)
 				gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[i-1]);
-			for (; i < height && n > 0; i++, n--)
+			for (; i < mp->mp_fheight && n > 0; i++, n--)
 				gfs2_indirect_init(mp, ip->i_gl, i,
 						   mp->mp_list[i-1], bn++);
-			if (i == height)
+			if (i == mp->mp_fheight)
 				state = ALLOC_DATA;
 			if (n == 0)
 				break;
 		/* Tree complete, adding data blocks */
 		case ALLOC_DATA:
-			BUG_ON(n > dblks);
+			BUG_ON(n > *dblks);
 			BUG_ON(mp->mp_bh[end_of_metadata] == NULL);
 			gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[end_of_metadata]);
-			dblks = n;
+			*dblks = n;
 			ptr = metapointer(end_of_metadata, mp);
-			dblock = bn;
+			*dblock = bn;
 			while (n-- > 0)
 				*ptr++ = cpu_to_be64(bn++);
-			if (buffer_zeronew(bh_map)) {
-				ret = sb_issue_zeroout(sb, dblock, dblks,
+			if (zero_new) {
+				ret = sb_issue_zeroout(sb, *dblock, *dblks,
 						       GFP_NOFS);
 				if (ret) {
 					fs_err(sdp,
 					       "Failed to zero data buffers\n");
-					clear_buffer_zeronew(bh_map);
 				}
 			}
 			break;
 		}
-	} while ((state != ALLOC_DATA) || !dblock);
+	} while ((state != ALLOC_DATA) || !(*dblock));
 
-	ip->i_height = height;
+	ip->i_height = mp->mp_fheight;
 	gfs2_add_inode_blocks(&ip->i_inode, alloced);
 	gfs2_dinode_out(ip, mp->mp_bh[0]->b_data);
-	map_bh(bh_map, inode->i_sb, dblock);
-	bh_map->b_size = dblks << inode->i_blkbits;
-	set_buffer_new(bh_map);
 	return 0;
 }
 
@@ -617,6 +623,9 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 	unsigned int len;
 	struct buffer_head *bh;
 	u8 height;
+	bool zero_new = false;
+	sector_t dblock = 0;
+	unsigned dblks;
 
 	BUG_ON(maxlen == 0);
 
@@ -640,13 +649,13 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 	while (size > arr[height])
 		height++;
 	find_metapath(sdp, lblock, &mp, height);
-	ret = 1;
+	mp.mp_aheight = 1;
 	if (height > ip->i_height || gfs2_is_stuffed(ip))
 		goto do_alloc;
 	ret = lookup_metapath(ip, &mp);
 	if (ret < 0)
 		goto out;
-	if (ret != ip->i_height)
+	if (mp.mp_aheight != ip->i_height)
 		goto do_alloc;
 	ptr = metapointer(ip->i_height - 1, &mp);
 	if (*ptr == 0)
@@ -673,7 +682,15 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 	}
 
 	/* At this point ret is the tree depth of already allocated blocks */
-	ret = gfs2_bmap_alloc(inode, lblock, bh_map, &mp, ret, height, maxlen);
+	if (buffer_zeronew(bh_map))
+		zero_new = true;
+	ret = gfs2_bmap_alloc(inode, lblock, zero_new, &mp, maxlen, &dblock,
+			      &dblks);
+	if (ret == 0) {
+		map_bh(bh_map, inode->i_sb, dblock);
+		bh_map->b_size = dblks << inode->i_blkbits;
+		set_buffer_new(bh_map);
+	}
 	goto out;
 }
 
-- 
2.7.4




* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-10-28 19:29 [Cluster-devel] [PATCH v2 0/3] Implement iomap for fiemap in GFS2 Bob Peterson
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 1/3] GFS2: Make height info part of metapath Bob Peterson
@ 2016-10-28 19:29 ` Bob Peterson
  2016-10-29  9:24   ` Steven Whitehouse
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 3/3] GFS2: Switch fiemap implementation to use iomap Bob Peterson
  2 siblings, 1 reply; 13+ messages in thread
From: Bob Peterson @ 2016-10-28 19:29 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch implements iomap for block mapping, and switches the
block_map function to use it under the covers.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/bmap.c | 248 ++++++++++++++++++++++++++++++++++++++++++---------------
 fs/gfs2/bmap.h |   4 +
 2 files changed, 189 insertions(+), 63 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 59b45b8..a654f67 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -13,6 +13,7 @@
 #include <linux/blkdev.h>
 #include <linux/gfs2_ondisk.h>
 #include <linux/crc32.h>
+#include <linux/iomap.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -454,10 +455,8 @@ enum alloc_state {
  * Returns: errno on error
  */
 
-static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
-			   bool zero_new, struct metapath *mp,
-			   const size_t maxlen, sector_t *dblock,
-			   unsigned *dblks)
+static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap,
+			    unsigned flags, struct metapath *mp)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
@@ -465,6 +464,7 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 	struct buffer_head *dibh = mp->mp_bh[0];
 	u64 bn;
 	unsigned n, i, blks, alloced = 0, iblks = 0, branch_start = 0;
+	unsigned dblks = 0;
 	unsigned ptrs_per_blk;
 	const unsigned end_of_metadata = mp->mp_fheight - 1;
 	int ret;
@@ -472,12 +472,11 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 	enum alloc_state state;
 	__be64 *ptr;
 	__be64 zero_bn = 0;
+	size_t maxlen = iomap->length >> inode->i_blkbits;
 
 	BUG_ON(mp->mp_aheight < 1);
 	BUG_ON(dibh == NULL);
 
-	*dblock = 0;
-	*dblks = 0;
 	gfs2_trans_add_meta(ip->i_gl, dibh);
 
 	if (mp->mp_fheight == mp->mp_aheight) {
@@ -485,16 +484,16 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 		/* Bottom indirect block exists, find unalloced extent size */
 		ptr = metapointer(end_of_metadata, mp);
 		bh = mp->mp_bh[end_of_metadata];
-		*dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
-					    maxlen, &eob);
-		BUG_ON(*dblks < 1);
+		dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
+					   maxlen, &eob);
+		BUG_ON(dblks < 1);
 		state = ALLOC_DATA;
 	} else {
 		/* Need to allocate indirect blocks */
 		ptrs_per_blk = mp->mp_fheight > 1 ? sdp->sd_inptrs :
 			sdp->sd_diptrs;
-		*dblks = min(maxlen, (size_t)(ptrs_per_blk -
-					      mp->mp_list[end_of_metadata]));
+		dblks = min(maxlen, (size_t)(ptrs_per_blk -
+					     mp->mp_list[end_of_metadata]));
 		if (mp->mp_fheight == ip->i_height) {
 			/* Writing into existing tree, extend tree down */
 			iblks = mp->mp_fheight - mp->mp_aheight;
@@ -510,7 +509,7 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 
 	/* start of the second part of the function (state machine) */
 
-	blks = *dblks + iblks;
+	blks = dblks + iblks;
 	i = mp->mp_aheight;
 	do {
 		int error;
@@ -567,26 +566,28 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 				break;
 		/* Tree complete, adding data blocks */
 		case ALLOC_DATA:
-			BUG_ON(n > *dblks);
+			BUG_ON(n > dblks);
 			BUG_ON(mp->mp_bh[end_of_metadata] == NULL);
 			gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[end_of_metadata]);
-			*dblks = n;
+			dblks = n;
 			ptr = metapointer(end_of_metadata, mp);
-			*dblock = bn;
+			iomap->blkno = bn;
 			while (n-- > 0)
 				*ptr++ = cpu_to_be64(bn++);
-			if (zero_new) {
-				ret = sb_issue_zeroout(sb, *dblock, *dblks,
-						       GFP_NOFS);
+			if (flags & IOMAP_ZERO) {
+				ret = sb_issue_zeroout(sb, iomap->blkno,
+						       dblks, GFP_NOFS);
 				if (ret) {
 					fs_err(sdp,
 					       "Failed to zero data buffers\n");
+					flags &= ~IOMAP_ZERO;
 				}
 			}
 			break;
 		}
-	} while ((state != ALLOC_DATA) || !(*dblock));
+	} while (iomap->blkno == IOMAP_NULL_BLOCK);
 
+	iomap->length = dblks << inode->i_blkbits;
 	ip->i_height = mp->mp_fheight;
 	gfs2_add_inode_blocks(&ip->i_inode, alloced);
 	gfs2_dinode_out(ip, mp->mp_bh[0]->b_data);
@@ -594,47 +595,103 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 }
 
 /**
- * gfs2_block_map - Map a block from an inode to a disk block
- * @inode: The inode
- * @lblock: The logical block number
- * @bh_map: The bh to be mapped
- * @create: True if its ok to alloc blocks to satify the request
+ * hole_size - figure out the size of a hole
+ * @ip: The inode
+ * @lblock: The logical starting block number
+ * @mp: The metapath
  *
- * Sets buffer_mapped() if successful, sets buffer_boundary() if a
- * read of metadata will be required before the next block can be
- * mapped. Sets buffer_new() if new blocks were allocated.
+ * Returns: The hole size in bytes
  *
- * Returns: errno
  */
+static u64 hole_size(struct inode *inode, sector_t lblock, struct metapath *mp)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	struct metapath mp_eof;
+	unsigned int end_of_metadata = ip->i_height - 1;
+	u64 factor = 1;
+	int hgt = end_of_metadata;
+	u64 holesz = 0, holestep;
+	const __be64 *first, *end, *ptr;
+	const struct buffer_head *bh;
+	u64 isize_blks = (i_size_read(inode) - 1) >> inode->i_blkbits;
+	int zeroptrs;
+	bool done = false;
+
+	/* Get another metapath, to the very last byte */
+	find_metapath(sdp, isize_blks, &mp_eof, ip->i_height);
+	for (hgt = end_of_metadata; hgt >= 0 && !done; hgt--) {
+		bh = mp->mp_bh[hgt];
+		if (bh) {
+			zeroptrs = 0;
+			first = metapointer(hgt, mp);
+			end = (const __be64 *)(bh->b_data + bh->b_size);
+
+			for (ptr = first; ptr < end; ptr++) {
+				if (*ptr) {
+					done = true;
+					break;
+				} else {
+					zeroptrs++;
+				}
+			}
+		} else {
+			zeroptrs = sdp->sd_inptrs;
+		}
+		holestep = min(factor * zeroptrs,
+			       isize_blks - (lblock + (zeroptrs * holesz)));
+		holesz += holestep;
+		if (lblock + holesz >= isize_blks) {
+			holesz = isize_blks - lblock;
+			break;
+		}
 
-int gfs2_block_map(struct inode *inode, sector_t lblock,
-		   struct buffer_head *bh_map, int create)
+		factor *= sdp->sd_inptrs;
+		if (hgt && (mp->mp_list[hgt - 1] < mp_eof.mp_list[hgt - 1]))
+			(mp->mp_list[hgt - 1])++;
+	}
+	return holesz << inode->i_blkbits;
+}
+
+/**
+ * _gfs2_get_iomap - Map blocks from an inode to disk blocks
+ * @mapping: The address space
+ * @pos: Starting position in bytes
+ * @length: Length to map, in bytes
+ * @iomap: The iomap structure
+ * @mp: An uninitialized metapath structure
+ * @release_mpath: if true, we need to release the metapath when done
+ *
+ * Returns: errno
+ */
+static int _gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length,
+			   struct iomap *iomap, unsigned flags)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	struct metapath mp = { .mp_aheight = 1, };
 	unsigned int bsize = sdp->sd_sb.sb_bsize;
-	const size_t maxlen = bh_map->b_size >> inode->i_blkbits;
+	const size_t maxlen = length  >> inode->i_blkbits;
 	const u64 *arr = sdp->sd_heightsize;
 	__be64 *ptr;
-	u64 size;
-	struct metapath mp;
+	sector_t lblock = pos >> sdp->sd_sb.sb_bsize_shift;
 	int ret;
 	int eob;
 	unsigned int len;
 	struct buffer_head *bh;
 	u8 height;
-	bool zero_new = false;
-	sector_t dblock = 0;
-	unsigned dblks;
+	loff_t isize = i_size_read(inode);
 
-	BUG_ON(maxlen == 0);
+	if (maxlen == 0)
+		return -EINVAL;
+
+	iomap->offset = pos;
+	iomap->blkno = IOMAP_NULL_BLOCK;
+	iomap->type = IOMAP_HOLE;
+	iomap->length = length;
+	iomap->flags = 0;
+	bmap_lock(ip, 0);
 
-	memset(mp.mp_bh, 0, sizeof(mp.mp_bh));
-	bmap_lock(ip, create);
-	clear_buffer_mapped(bh_map);
-	clear_buffer_new(bh_map);
-	clear_buffer_boundary(bh_map);
-	trace_gfs2_bmap(ip, bh_map, lblock, create, 1);
 	if (gfs2_is_dir(ip)) {
 		bsize = sdp->sd_jbsize;
 		arr = sdp->sd_jheightsize;
@@ -645,53 +702,118 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 		goto out;
 
 	height = ip->i_height;
-	size = (lblock + 1) * bsize;
-	while (size > arr[height])
+	while ((lblock + 1) * bsize > arr[height])
 		height++;
 	find_metapath(sdp, lblock, &mp, height);
-	mp.mp_aheight = 1;
 	if (height > ip->i_height || gfs2_is_stuffed(ip))
 		goto do_alloc;
+
 	ret = lookup_metapath(ip, &mp);
 	if (ret < 0)
 		goto out;
+
 	if (mp.mp_aheight != ip->i_height)
 		goto do_alloc;
+
 	ptr = metapointer(ip->i_height - 1, &mp);
 	if (*ptr == 0)
 		goto do_alloc;
-	map_bh(bh_map, inode->i_sb, be64_to_cpu(*ptr));
+
+	iomap->type = IOMAP_MAPPED;
+	iomap->blkno = be64_to_cpu(*ptr);
+
 	bh = mp.mp_bh[ip->i_height - 1];
 	len = gfs2_extent_length(bh->b_data, bh->b_size, ptr, maxlen, &eob);
-	bh_map->b_size = (len << inode->i_blkbits);
-	if (eob)
-		set_buffer_boundary(bh_map);
+	iomap->length = (len << inode->i_blkbits);
+
+	if (iomap->offset + iomap->length >= isize)
+		iomap->length = isize - iomap->offset;
+
 	ret = 0;
+
 out:
 	release_metapath(&mp);
-	trace_gfs2_bmap(ip, bh_map, lblock, create, ret);
-	bmap_unlock(ip, create);
+	bmap_unlock(ip, 0);
 	return ret;
 
 do_alloc:
-	/* All allocations are done here, firstly check create flag */
-	if (!create) {
+	if (!(flags & IOMAP_WRITE)) {
+		if (pos >= isize) {
+			ret = -ENOENT;
+			goto out;
+		}
 		BUG_ON(gfs2_is_stuffed(ip));
 		ret = 0;
+		iomap->length = hole_size(inode, lblock, &mp);
 		goto out;
 	}
 
-	/* At this point ret is the tree depth of already allocated blocks */
+	ret = gfs2_iomap_alloc(inode, iomap, flags, &mp);
+	if (ret == 0)
+		iomap->flags = IOMAP_WRITE;
+	goto out;
+}
+
+int gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length,
+		   struct iomap *iomap)
+{
+	return _gfs2_get_iomap(inode, pos, length, iomap, 0);
+}
+
+/**
+ * gfs2_block_map - Map a block from an inode to a disk block
+ * @inode: The inode
+ * @lblock: The logical block number
+ * @bh_map: The bh to be mapped
+ * @create: True if its ok to alloc blocks to satify the request
+ *
+ * Sets buffer_mapped() if successful, sets buffer_boundary() if a
+ * read of metadata will be required before the next block can be
+ * mapped. Sets buffer_new() if new blocks were allocated.
+ *
+ * Returns: errno
+ */
+
+int gfs2_block_map(struct inode *inode, sector_t lblock,
+		   struct buffer_head *bh_map, int create)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	const size_t maxlen = bh_map->b_size >> inode->i_blkbits;
+	struct iomap iomap;
+	int ret, flags = 0;
+
+	BUG_ON(maxlen == 0);
+
+	clear_buffer_mapped(bh_map);
+	clear_buffer_new(bh_map);
+	clear_buffer_boundary(bh_map);
+	trace_gfs2_bmap(ip, bh_map, lblock, create, 1);
+
+	if (create)
+		flags |= IOMAP_WRITE;
 	if (buffer_zeronew(bh_map))
-		zero_new = true;
-	ret = gfs2_bmap_alloc(inode, lblock, zero_new, &mp, maxlen, &dblock,
-			      &dblks);
-	if (ret == 0) {
-		map_bh(bh_map, inode->i_sb, dblock);
-		bh_map->b_size = dblks << inode->i_blkbits;
+		flags |= IOMAP_ZERO;
+	ret = _gfs2_get_iomap(inode, lblock << sdp->sd_sb.sb_bsize_shift,
+			      bh_map->b_size, &iomap, flags);
+
+	iomap.length = round_up(iomap.length, sdp->sd_sb.sb_bsize);
+	bh_map->b_size = iomap.length;
+	if (maxlen >= iomap.length)
+		set_buffer_boundary(bh_map);
+
+	if (ret)
+		goto out;
+
+	if (iomap.blkno != IOMAP_NULL_BLOCK)
+		map_bh(bh_map, inode->i_sb, iomap.blkno);
+
+	bh_map->b_size = iomap.length;
+	if (iomap.flags & IOMAP_WRITE)
 		set_buffer_new(bh_map);
-	}
-	goto out;
+out:
+	trace_gfs2_bmap(ip, bh_map, lblock, create, ret);
+	return ret;
 }
 
 /*
diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
index 81ded5e..8da2429 100644
--- a/fs/gfs2/bmap.h
+++ b/fs/gfs2/bmap.h
@@ -10,6 +10,8 @@
 #ifndef __BMAP_DOT_H__
 #define __BMAP_DOT_H__
 
+#include <linux/iomap.h>
+
 #include "inode.h"
 
 struct inode;
@@ -47,6 +49,8 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip,
 extern int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page);
 extern int gfs2_block_map(struct inode *inode, sector_t lblock,
 			  struct buffer_head *bh, int create);
+extern int gfs2_get_iomap(struct inode *inode, loff_t pos,
+			  ssize_t length, struct iomap *iomap);
 extern int gfs2_extent_map(struct inode *inode, u64 lblock, int *new,
 			   u64 *dblock, unsigned *extlen);
 extern int gfs2_setattr_size(struct inode *inode, u64 size);
-- 
2.7.4




* [Cluster-devel] [PATCH v2 3/3] GFS2: Switch fiemap implementation to use iomap
  2016-10-28 19:29 [Cluster-devel] [PATCH v2 0/3] Implement iomap for fiemap in GFS2 Bob Peterson
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 1/3] GFS2: Make height info part of metapath Bob Peterson
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map Bob Peterson
@ 2016-10-28 19:29 ` Bob Peterson
  2016-10-29  9:33   ` Steven Whitehouse
  2 siblings, 1 reply; 13+ messages in thread
From: Bob Peterson @ 2016-10-28 19:29 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch switches GFS2's implementation of fiemap from the old
block_map code to the new iomap interface.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/Kconfig       |  1 +
 fs/gfs2/inode.c       | 60 ++++++++++++++++++++++++++++++++++++++++-----------
 include/linux/iomap.h |  1 +
 3 files changed, 50 insertions(+), 12 deletions(-)

diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
index 90c6a8f..43c827a 100644
--- a/fs/gfs2/Kconfig
+++ b/fs/gfs2/Kconfig
@@ -4,6 +4,7 @@ config GFS2_FS
 	select FS_POSIX_ACL
 	select CRC32
 	select QUOTACTL
+	select FS_IOMAP
 	help
 	  A cluster filesystem.
 
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index fe3f849..322d178 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -17,7 +17,7 @@
 #include <linux/posix_acl.h>
 #include <linux/gfs2_ondisk.h>
 #include <linux/crc32.h>
-#include <linux/fiemap.h>
+#include <linux/iomap.h>
 #include <linux/security.h>
 #include <asm/uaccess.h>
 
@@ -1994,28 +1994,64 @@ static int gfs2_getattr(struct vfsmount *mnt, struct dentry *dentry,
 	return 0;
 }
 
+static int gfs2_iomap_fiemap_begin(struct inode *inode, loff_t offset,
+				   loff_t length, unsigned flags,
+				   struct iomap *iomap)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_holder *gh;
+	int ret;
+
+	gh = kzalloc(sizeof(struct gfs2_holder), GFP_NOFS);
+	if (!gh)
+		return -ENOMEM;
+
+	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, gh);
+	if (ret) {
+		kfree(gh);
+		return ret;
+	}
+	iomap->private = gh;
+	ret = gfs2_get_iomap(inode, offset, length, iomap);
+	if (ret)
+		gfs2_glock_dq_uninit(gh);
+	return ret;
+}
+
+static int gfs2_iomap_fiemap_end(struct inode *inode, loff_t offset,
+				 loff_t length, ssize_t written,
+				 unsigned flags, struct iomap *iomap)
+{
+	struct gfs2_holder *gh = iomap->private;
+
+	BUG_ON(gh == NULL);
+	gfs2_glock_dq_uninit(gh);
+	return 0;
+}
+
 static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		       u64 start, u64 len)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_holder gh;
 	int ret;
 
-	ret = fiemap_check_flags(fieinfo, FIEMAP_FLAG_SYNC);
-	if (ret)
-		return ret;
+	struct iomap_ops gfs2_iomap_fiemap_ops = {
+		.iomap_begin = gfs2_iomap_fiemap_begin,
+		.iomap_end = gfs2_iomap_fiemap_end,
+	};
 
 	inode_lock(inode);
 
-	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	if (ret)
-		goto out;
-
 	if (gfs2_is_stuffed(ip)) {
+		struct gfs2_holder gh;
 		u64 phys = ip->i_no_addr << inode->i_blkbits;
 		u64 size = i_size_read(inode);
 		u32 flags = FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_NOT_ALIGNED|
 			    FIEMAP_EXTENT_DATA_INLINE;
+		ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+		if (ret)
+			goto out;
+
 		phys += sizeof(struct gfs2_dinode);
 		phys += start;
 		if (start + len > size)
@@ -2025,12 +2061,12 @@ static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 						      len, flags);
 		if (ret == 1)
 			ret = 0;
+		gfs2_glock_dq_uninit(&gh);
 	} else {
-		ret = __generic_block_fiemap(inode, fieinfo, start, len,
-					     gfs2_block_map);
+		ret = iomap_fiemap(inode, fieinfo, start, len,
+				   &gfs2_iomap_fiemap_ops);
 	}
 
-	gfs2_glock_dq_uninit(&gh);
 out:
 	inode_unlock(inode);
 	return ret;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e63e288..0f177d3 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -37,6 +37,7 @@ struct iomap {
 	u16			type;	/* type of mapping */
 	u16			flags;	/* flags for mapping */
 	struct block_device	*bdev;	/* block device for I/O */
+	void			*private; /* private value for fs use */
 };
 
 /*
-- 
2.7.4




* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map Bob Peterson
@ 2016-10-29  9:24   ` Steven Whitehouse
  2016-10-31 12:07     ` Bob Peterson
  2016-10-31 20:07     ` Dave Chinner
  0 siblings, 2 replies; 13+ messages in thread
From: Steven Whitehouse @ 2016-10-29  9:24 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Is there a missing patch here? I can see 0/3, 2/3 and 3/3, but not 1/3.

Overall this is looking a lot cleaner, I think, but some comments below...

On 28/10/16 20:29, Bob Peterson wrote:
> This patch implements iomap for block mapping, and switches the
> block_map function to use it under the covers.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/bmap.c | 248 ++++++++++++++++++++++++++++++++++++++++++---------------
>   fs/gfs2/bmap.h |   4 +
>   2 files changed, 189 insertions(+), 63 deletions(-)
>
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index 59b45b8..a654f67 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -13,6 +13,7 @@
>   #include <linux/blkdev.h>
>   #include <linux/gfs2_ondisk.h>
>   #include <linux/crc32.h>
> +#include <linux/iomap.h>
>   
>   #include "gfs2.h"
>   #include "incore.h"
> @@ -454,10 +455,8 @@ enum alloc_state {
>    * Returns: errno on error
>    */
>   
> -static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
> -			   bool zero_new, struct metapath *mp,
> -			   const size_t maxlen, sector_t *dblock,
> -			   unsigned *dblks)
> +static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap,
> +			    unsigned flags, struct metapath *mp)
>   {
>   	struct gfs2_inode *ip = GFS2_I(inode);
>   	struct gfs2_sbd *sdp = GFS2_SB(inode);
> @@ -465,6 +464,7 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>   	struct buffer_head *dibh = mp->mp_bh[0];
>   	u64 bn;
>   	unsigned n, i, blks, alloced = 0, iblks = 0, branch_start = 0;
> +	unsigned dblks = 0;
>   	unsigned ptrs_per_blk;
>   	const unsigned end_of_metadata = mp->mp_fheight - 1;
>   	int ret;
> @@ -472,12 +472,11 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>   	enum alloc_state state;
>   	__be64 *ptr;
>   	__be64 zero_bn = 0;
> +	size_t maxlen = iomap->length >> inode->i_blkbits;
>   
>   	BUG_ON(mp->mp_aheight < 1);
>   	BUG_ON(dibh == NULL);
>   
> -	*dblock = 0;
> -	*dblks = 0;
>   	gfs2_trans_add_meta(ip->i_gl, dibh);
>   
>   	if (mp->mp_fheight == mp->mp_aheight) {
> @@ -485,16 +484,16 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>   		/* Bottom indirect block exists, find unalloced extent size */
>   		ptr = metapointer(end_of_metadata, mp);
>   		bh = mp->mp_bh[end_of_metadata];
> -		*dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
> -					    maxlen, &eob);
> -		BUG_ON(*dblks < 1);
> +		dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
> +					   maxlen, &eob);
> +		BUG_ON(dblks < 1);
>   		state = ALLOC_DATA;
>   	} else {
>   		/* Need to allocate indirect blocks */
>   		ptrs_per_blk = mp->mp_fheight > 1 ? sdp->sd_inptrs :
>   			sdp->sd_diptrs;
> -		*dblks = min(maxlen, (size_t)(ptrs_per_blk -
> -					      mp->mp_list[end_of_metadata]));
> +		dblks = min(maxlen, (size_t)(ptrs_per_blk -
> +					     mp->mp_list[end_of_metadata]));
>   		if (mp->mp_fheight == ip->i_height) {
>   			/* Writing into existing tree, extend tree down */
>   			iblks = mp->mp_fheight - mp->mp_aheight;
> @@ -510,7 +509,7 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>   
>   	/* start of the second part of the function (state machine) */
>   
> -	blks = *dblks + iblks;
> +	blks = dblks + iblks;
>   	i = mp->mp_aheight;
>   	do {
>   		int error;
> @@ -567,26 +566,28 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>   				break;
>   		/* Tree complete, adding data blocks */
>   		case ALLOC_DATA:
> -			BUG_ON(n > *dblks);
> +			BUG_ON(n > dblks);
>   			BUG_ON(mp->mp_bh[end_of_metadata] == NULL);
>   			gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[end_of_metadata]);
> -			*dblks = n;
> +			dblks = n;
>   			ptr = metapointer(end_of_metadata, mp);
> -			*dblock = bn;
> +			iomap->blkno = bn;
>   			while (n-- > 0)
>   				*ptr++ = cpu_to_be64(bn++);
> -			if (zero_new) {
> -				ret = sb_issue_zeroout(sb, *dblock, *dblks,
> -						       GFP_NOFS);
> +			if (flags & IOMAP_ZERO) {
> +				ret = sb_issue_zeroout(sb, iomap->blkno,
> +						       dblks, GFP_NOFS);
>   				if (ret) {
>   					fs_err(sdp,
>   					       "Failed to zero data buffers\n");
> +					flags &= ~IOMAP_ZERO;
>   				}
>   			}
>   			break;
>   		}
> -	} while ((state != ALLOC_DATA) || !(*dblock));
> +	} while (iomap->blkno == IOMAP_NULL_BLOCK);
>   
> +	iomap->length = dblks << inode->i_blkbits;
>   	ip->i_height = mp->mp_fheight;
>   	gfs2_add_inode_blocks(&ip->i_inode, alloced);
>   	gfs2_dinode_out(ip, mp->mp_bh[0]->b_data);
> @@ -594,47 +595,103 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>   }
>   
>   /**
> - * gfs2_block_map - Map a block from an inode to a disk block
> - * @inode: The inode
> - * @lblock: The logical block number
> - * @bh_map: The bh to be mapped
> - * @create: True if its ok to alloc blocks to satify the request
> + * hole_size - figure out the size of a hole
> + * @ip: The inode
> + * @lblock: The logical starting block number
> + * @mp: The metapath
>    *
> - * Sets buffer_mapped() if successful, sets buffer_boundary() if a
> - * read of metadata will be required before the next block can be
> - * mapped. Sets buffer_new() if new blocks were allocated.
> + * Returns: The hole size in bytes
>    *
> - * Returns: errno
>    */
> +static u64 hole_size(struct inode *inode, sector_t lblock, struct metapath *mp)
> +{
> +	struct gfs2_inode *ip = GFS2_I(inode);
> +	struct gfs2_sbd *sdp = GFS2_SB(inode);
> +	struct metapath mp_eof;
> +	unsigned int end_of_metadata = ip->i_height - 1;
> +	u64 factor = 1;
> +	int hgt = end_of_metadata;
> +	u64 holesz = 0, holestep;
> +	const __be64 *first, *end, *ptr;
> +	const struct buffer_head *bh;
> +	u64 isize_blks = (i_size_read(inode) - 1) >> inode->i_blkbits;
> +	int zeroptrs;
> +	bool done = false;
> +
> +	/* Get another metapath, to the very last byte */
> +	find_metapath(sdp, isize_blks, &mp_eof, ip->i_height);
> +	for (hgt = end_of_metadata; hgt >= 0 && !done; hgt--) {
> +		bh = mp->mp_bh[hgt];
> +		if (bh) {
> +			zeroptrs = 0;
> +			first = metapointer(hgt, mp);
> +			end = (const __be64 *)(bh->b_data + bh->b_size);
> +
> +			for (ptr = first; ptr < end; ptr++) {
> +				if (*ptr) {
> +					done = true;
> +					break;
> +				} else {
> +					zeroptrs++;
> +				}
> +			}
> +		} else {
> +			zeroptrs = sdp->sd_inptrs;
> +		}
> +		holestep = min(factor * zeroptrs,
> +			       isize_blks - (lblock + (zeroptrs * holesz)));
> +		holesz += holestep;
> +		if (lblock + holesz >= isize_blks) {
> +			holesz = isize_blks - lblock;
> +			break;
> +		}
>   
> -int gfs2_block_map(struct inode *inode, sector_t lblock,
> -		   struct buffer_head *bh_map, int create)
> +		factor *= sdp->sd_inptrs;
> +		if (hgt && (mp->mp_list[hgt - 1] < mp_eof.mp_list[hgt - 1]))
> +			(mp->mp_list[hgt - 1])++;
> +	}
> +	return holesz << inode->i_blkbits;
> +}
> +
> +/**
> + * _gfs2_get_iomap - Map blocks from an inode to disk blocks
> + * @mapping: The address space
> + * @pos: Starting position in bytes
> + * @length: Length to map, in bytes
> + * @iomap: The iomap structure
> + * @mp: An uninitialized metapath structure
> + * @release_mpath: if true, we need to release the metapath when done
> + *
> + * Returns: errno
> + */
> +static int _gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length,
> +			   struct iomap *iomap, unsigned flags)
>   {
>   	struct gfs2_inode *ip = GFS2_I(inode);
>   	struct gfs2_sbd *sdp = GFS2_SB(inode);
> +	struct metapath mp = { .mp_aheight = 1, };
>   	unsigned int bsize = sdp->sd_sb.sb_bsize;
> -	const size_t maxlen = bh_map->b_size >> inode->i_blkbits;
> +	const size_t maxlen = length  >> inode->i_blkbits;
>   	const u64 *arr = sdp->sd_heightsize;
>   	__be64 *ptr;
> -	u64 size;
> -	struct metapath mp;
> +	sector_t lblock = pos >> sdp->sd_sb.sb_bsize_shift;
>   	int ret;
>   	int eob;
>   	unsigned int len;
>   	struct buffer_head *bh;
>   	u8 height;
> -	bool zero_new = false;
> -	sector_t dblock = 0;
> -	unsigned dblks;
> +	loff_t isize = i_size_read(inode);
>   
> -	BUG_ON(maxlen == 0);
> +	if (maxlen == 0)
> +		return -EINVAL;
> +
> +	iomap->offset = pos;
> +	iomap->blkno = IOMAP_NULL_BLOCK;
> +	iomap->type = IOMAP_HOLE;
> +	iomap->length = length;
> +	iomap->flags = 0;
> +	bmap_lock(ip, 0);
>   
> -	memset(mp.mp_bh, 0, sizeof(mp.mp_bh));
> -	bmap_lock(ip, create);
> -	clear_buffer_mapped(bh_map);
> -	clear_buffer_new(bh_map);
> -	clear_buffer_boundary(bh_map);
> -	trace_gfs2_bmap(ip, bh_map, lblock, create, 1);
>   	if (gfs2_is_dir(ip)) {
>   		bsize = sdp->sd_jbsize;
>   		arr = sdp->sd_jheightsize;
> @@ -645,53 +702,118 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
>   		goto out;
>   
>   	height = ip->i_height;
> -	size = (lblock + 1) * bsize;
> -	while (size > arr[height])
> +	while ((lblock + 1) * bsize > arr[height])
>   		height++;
>   	find_metapath(sdp, lblock, &mp, height);
> -	mp.mp_aheight = 1;
>   	if (height > ip->i_height || gfs2_is_stuffed(ip))
>   		goto do_alloc;
> +
>   	ret = lookup_metapath(ip, &mp);
>   	if (ret < 0)
>   		goto out;
> +
>   	if (mp.mp_aheight != ip->i_height)
>   		goto do_alloc;
> +
>   	ptr = metapointer(ip->i_height - 1, &mp);
>   	if (*ptr == 0)
>   		goto do_alloc;
> -	map_bh(bh_map, inode->i_sb, be64_to_cpu(*ptr));
> +
> +	iomap->type = IOMAP_MAPPED;
> +	iomap->blkno = be64_to_cpu(*ptr);
> +
>   	bh = mp.mp_bh[ip->i_height - 1];
>   	len = gfs2_extent_length(bh->b_data, bh->b_size, ptr, maxlen, &eob);
> -	bh_map->b_size = (len << inode->i_blkbits);
> -	if (eob)
> -		set_buffer_boundary(bh_map);
> +	iomap->length = (len << inode->i_blkbits);
> +
> +	if (iomap->offset + iomap->length >= isize)
> +		iomap->length = isize - iomap->offset;
> +
>   	ret = 0;
> +
>   out:
>   	release_metapath(&mp);
> -	trace_gfs2_bmap(ip, bh_map, lblock, create, ret);
> -	bmap_unlock(ip, create);
> +	bmap_unlock(ip, 0);
>   	return ret;
>   
>   do_alloc:
> -	/* All allocations are done here, firstly check create flag */
> -	if (!create) {
> +	if (!(flags & IOMAP_WRITE)) {
> +		if (pos >= isize) {
> +			ret = -ENOENT;
> +			goto out;
> +		}
>   		BUG_ON(gfs2_is_stuffed(ip));
>   		ret = 0;
> +		iomap->length = hole_size(inode, lblock, &mp);
>   		goto out;
>   	}
>   
> -	/* At this point ret is the tree depth of already allocated blocks */
> +	ret = gfs2_iomap_alloc(inode, iomap, flags, &mp);
> +	if (ret == 0)
> +		iomap->flags = IOMAP_WRITE;
> +	goto out;
> +}
> +
> +int gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length,
> +		   struct iomap *iomap)
> +{
> +	return _gfs2_get_iomap(inode, pos, length, iomap, 0);
> +}
> +
> +/**
> + * gfs2_block_map - Map a block from an inode to a disk block
> + * @inode: The inode
> + * @lblock: The logical block number
> + * @bh_map: The bh to be mapped
> + * @create: True if its ok to alloc blocks to satify the request
> + *
> + * Sets buffer_mapped() if successful, sets buffer_boundary() if a
> + * read of metadata will be required before the next block can be
> + * mapped. Sets buffer_new() if new blocks were allocated.
> + *
> + * Returns: errno
> + */
> +
> +int gfs2_block_map(struct inode *inode, sector_t lblock,
> +		   struct buffer_head *bh_map, int create)
> +{
> +	struct gfs2_inode *ip = GFS2_I(inode);
> +	struct gfs2_sbd *sdp = GFS2_SB(inode);
> +	const size_t maxlen = bh_map->b_size >> inode->i_blkbits;
> +	struct iomap iomap;
> +	int ret, flags = 0;
> +
> +	BUG_ON(maxlen == 0);
> +
> +	clear_buffer_mapped(bh_map);
> +	clear_buffer_new(bh_map);
> +	clear_buffer_boundary(bh_map);
> +	trace_gfs2_bmap(ip, bh_map, lblock, create, 1);
> +
We will need the ability to trace the iomap operation, since that is now 
the important one. Since this tracepoint exposes the bh's internal flags, 
I don't think we can just move this one, so it is probably best to add a 
new trace_gfs2_iomap() tracepoint in _gfs2_get_iomap(). We will also need 
to let Paul know so he can update the PCP pmda in due course.
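
A minimal sketch of the call site (the tracepoint name and arguments are
purely illustrative here; the actual TRACE_EVENT definition would live in
trace_gfs2.h):

        /* e.g. just before returning from _gfs2_get_iomap(): */
        trace_gfs2_iomap(ip, iomap, flags, ret);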

> +	if (create)
> +		flags |= IOMAP_WRITE;
Hmm, I wonder why IOMAP_WRITE and IOMAP_ZERO are separate flags from the 
iomap.flags field... Christoph, was there a specific reason for that? If 
we could just use iomap.flags here it would save having the 
gfs2_get_iomap() wrapper function for _gfs2_get_iomap(), and in our case 
we need to update the zeroing flag based on the iomap result anyway. So 
we'd just have additional flags IOMAP_F_ZERO and IOMAP_F_CREATE (or 
IOMAP_F_WRITE) depending on which name is preferred.

>   	if (buffer_zeronew(bh_map))
> -		zero_new = true;
> -	ret = gfs2_bmap_alloc(inode, lblock, zero_new, &mp, maxlen, &dblock,
> -			      &dblks);
> -	if (ret == 0) {
> -		map_bh(bh_map, inode->i_sb, dblock);
> -		bh_map->b_size = dblks << inode->i_blkbits;
> +		flags |= IOMAP_ZERO;
> +	ret = _gfs2_get_iomap(inode, lblock << sdp->sd_sb.sb_bsize_shift,
> +			      bh_map->b_size, &iomap, flags);
> +
> +	iomap.length = round_up(iomap.length, sdp->sd_sb.sb_bsize);
> +	bh_map->b_size = iomap.length;
> +	if (maxlen >= iomap.length)
> +		set_buffer_boundary(bh_map);
That seems to imply setting the boundary on every single mapping, since 
the mapped length will always be equal to or less than the max length we 
requested. Either we need to add a new IOMAP_F_BOUNDARY flag from which 
to set the bh boundary flag, or, if we are not getting any performance 
benefit from it, we could just drop support for this feature. Either way 
we definitely don't want to set it for every mapping.
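
For reference, the first option might look like this in gfs2_block_map(),
assuming such a flag were added (IOMAP_F_BOUNDARY is hypothetical here):

        if (iomap.flags & IOMAP_F_BOUNDARY)
                set_buffer_boundary(bh_map);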

> +
> +	if (ret)
> +		goto out;
> +
> +	if (iomap.blkno != IOMAP_NULL_BLOCK)
> +		map_bh(bh_map, inode->i_sb, iomap.blkno);
> +
> +	bh_map->b_size = iomap.length;
> +	if (iomap.flags & IOMAP_WRITE)
>   		set_buffer_new(bh_map);
This should be:
                 if (iomap.flags & IOMAP_F_NEW)
                         set_buffer_new(bh_map);
and gfs2_get_iomap() should be setting that flag when the extent is 
newly allocated. We should not be setting buffer new for every write, 
but only when we've allocated new blocks. Also missing is something 
along the lines of:
                 clear_buffer_zeronew(bh_map);
                 if (iomap.flags & IOMAP_F_ZERO)
                         set_buffer_zeronew(bh_map);
The zeronew flag is used both to request zeroing of a new extent and 
also to report errors (i.e. did the zeroing request succeed) so please 
check that fallocate() is still working correctly with the new 
bmap/iomap code.

Steve.

> -	}
> -	goto out;
> +out:
> +	trace_gfs2_bmap(ip, bh_map, lblock, create, ret);
> +	return ret;
>   }
>   
>   /*
> diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
> index 81ded5e..8da2429 100644
> --- a/fs/gfs2/bmap.h
> +++ b/fs/gfs2/bmap.h
> @@ -10,6 +10,8 @@
>   #ifndef __BMAP_DOT_H__
>   #define __BMAP_DOT_H__
>   
> +#include <linux/iomap.h>
> +
>   #include "inode.h"
>   
>   struct inode;
> @@ -47,6 +49,8 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip,
>   extern int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page);
>   extern int gfs2_block_map(struct inode *inode, sector_t lblock,
>   			  struct buffer_head *bh, int create);
> +extern int gfs2_get_iomap(struct inode *inode, loff_t pos,
> +			  ssize_t length, struct iomap *iomap);
>   extern int gfs2_extent_map(struct inode *inode, u64 lblock, int *new,
>   			   u64 *dblock, unsigned *extlen);
>   extern int gfs2_setattr_size(struct inode *inode, u64 size);




* [Cluster-devel] [PATCH v2 3/3] GFS2: Switch fiemap implementation to use iomap
  2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 3/3] GFS2: Switch fiemap implementation to use iomap Bob Peterson
@ 2016-10-29  9:33   ` Steven Whitehouse
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Whitehouse @ 2016-10-29  9:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 28/10/16 20:29, Bob Peterson wrote:
> This patch switches GFS2's implementation of fiemap from the old
> block_map code to the new iomap interface.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/Kconfig       |  1 +
>   fs/gfs2/inode.c       | 60 ++++++++++++++++++++++++++++++++++++++++-----------
>   include/linux/iomap.h |  1 +
>   3 files changed, 50 insertions(+), 12 deletions(-)
>
> diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
> index 90c6a8f..43c827a 100644
> --- a/fs/gfs2/Kconfig
> +++ b/fs/gfs2/Kconfig
> @@ -4,6 +4,7 @@ config GFS2_FS
>   	select FS_POSIX_ACL
>   	select CRC32
>   	select QUOTACTL
> +	select FS_IOMAP
>   	help
>   	  A cluster filesystem.
>   
> diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> index fe3f849..322d178 100644
> --- a/fs/gfs2/inode.c
> +++ b/fs/gfs2/inode.c
> @@ -17,7 +17,7 @@
>   #include <linux/posix_acl.h>
>   #include <linux/gfs2_ondisk.h>
>   #include <linux/crc32.h>
> -#include <linux/fiemap.h>
> +#include <linux/iomap.h>
>   #include <linux/security.h>
>   #include <asm/uaccess.h>
>   
> @@ -1994,28 +1994,64 @@ static int gfs2_getattr(struct vfsmount *mnt, struct dentry *dentry,
>   	return 0;
>   }
>   
> +static int gfs2_iomap_fiemap_begin(struct inode *inode, loff_t offset,
> +				   loff_t length, unsigned flags,
> +				   struct iomap *iomap)
> +{
> +	struct gfs2_inode *ip = GFS2_I(inode);
> +	struct gfs2_holder *gh;
> +	int ret;
> +
> +	gh = kzalloc(sizeof(struct gfs2_holder), GFP_NOFS);
> +	if (!gh)
> +		return -ENOMEM;
> +
> +	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, gh);
> +	if (ret) {
> +		kfree(gh);
> +		return ret;
> +	}
> +	iomap->private = gh;
> +	ret = gfs2_get_iomap(inode, offset, length, iomap);
> +	if (ret)
> +		gfs2_glock_dq_uninit(gh);
> +	return ret;
> +}
> +
> +static int gfs2_iomap_fiemap_end(struct inode *inode, loff_t offset,
> +				 loff_t length, ssize_t written,
> +				 unsigned flags, struct iomap *iomap)
> +{
> +	struct gfs2_holder *gh = iomap->private;
> +
> +	BUG_ON(gh == NULL);
> +	gfs2_glock_dq_uninit(gh);
> +	return 0;
> +}
> +
>   static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>   		       u64 start, u64 len)
>   {
>   	struct gfs2_inode *ip = GFS2_I(inode);
> -	struct gfs2_holder gh;
>   	int ret;
>   
> -	ret = fiemap_check_flags(fieinfo, FIEMAP_FLAG_SYNC);
> -	if (ret)
> -		return ret;
> +	struct iomap_ops gfs2_iomap_fiemap_ops = {
> +		.iomap_begin = gfs2_iomap_fiemap_begin,
> +		.iomap_end = gfs2_iomap_fiemap_end,
> +	};
>   
>   	inode_lock(inode);
>   
> -	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
> -	if (ret)
> -		goto out;
> -
>   	if (gfs2_is_stuffed(ip)) {
> +		struct gfs2_holder gh;
>   		u64 phys = ip->i_no_addr << inode->i_blkbits;
>   		u64 size = i_size_read(inode);
>   		u32 flags = FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_NOT_ALIGNED|
>   			    FIEMAP_EXTENT_DATA_INLINE;
> +		ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
> +		if (ret)
> +			goto out;
> +
>   		phys += sizeof(struct gfs2_dinode);
>   		phys += start;
>   		if (start + len > size)
> @@ -2025,12 +2061,12 @@ static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>   						      len, flags);
>   		if (ret == 1)
>   			ret = 0;
> +		gfs2_glock_dq_uninit(&gh);
>   	} else {
> -		ret = __generic_block_fiemap(inode, fieinfo, start, len,
> -					     gfs2_block_map);

> +		ret = iomap_fiemap(inode, fieinfo, start, len,
> +				   &gfs2_iomap_fiemap_ops);
Is there a reason that we don't simply wrap this function with the 
glock? That would simplify gfs2_iomap_fiemap_ops and reduce the locking 
overhead for multi-extent files a lot.
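
Roughly like this, for instance (a sketch only; gh would go back to being
a stack-allocated struct gfs2_holder in gfs2_fiemap(), and the begin/end
hooks would then not need to take the glock themselves):

        ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
        if (ret)
                goto out;
        ret = iomap_fiemap(inode, fieinfo, start, len,
                           &gfs2_iomap_fiemap_ops);
        gfs2_glock_dq_uninit(&gh);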

Steve.

>   	}
>   
> -	gfs2_glock_dq_uninit(&gh);
>   out:
>   	inode_unlock(inode);
>   	return ret;
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index e63e288..0f177d3 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -37,6 +37,7 @@ struct iomap {
>   	u16			type;	/* type of mapping */
>   	u16			flags;	/* flags for mapping */
>   	struct block_device	*bdev;	/* block device for I/O */
> +	void			*private; /* private value for fs use */
>   };
>   
>   /*




* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-10-29  9:24   ` Steven Whitehouse
@ 2016-10-31 12:07     ` Bob Peterson
  2016-10-31 20:07     ` Dave Chinner
  1 sibling, 0 replies; 13+ messages in thread
From: Bob Peterson @ 2016-10-31 12:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| Hi,
| 
| Is there a missing patch here? I can see 0/3, 2/3 and 3/3, but not a 1/3
| 
| Overall this is looking a lot cleaner I think, but some comments below....

Hm. I sent the whole patch set at the same time with git send-email, so
I'm not sure why this happened. Patch 1/3 showed up yesterday on my
email. Perhaps it was filtered or was slow to be released by the moderators?

Bob Peterson
Red Hat File Systems




* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-10-29  9:24   ` Steven Whitehouse
  2016-10-31 12:07     ` Bob Peterson
@ 2016-10-31 20:07     ` Dave Chinner
  2016-11-02  9:37       ` Steven Whitehouse
  1 sibling, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2016-10-31 20:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Sat, Oct 29, 2016 at 10:24:45AM +0100, Steven Whitehouse wrote:
> On 28/10/16 20:29, Bob Peterson wrote:
> >+	if (create)
> >+		flags |= IOMAP_WRITE;
> Hmm, I wonder why IOMAP_WRITE and IOMAP_ZERO are separate flags from
> the iomap.flags field... Christoph, was there a specific reason for
> that?

They are different actions. IOMAP_WRITE requires allocation over
holes and conversion of unwritten extents to allow writing of
user data into the range. IOMAP_ZERO is for zeroing a range of a
file via iomap_zero_range() and it does not require allocation - it
skips holes and unwritten regions as they are already guaranteed to
contain zeros.

One *could* allocate blocks with IOMAP_ZERO if desired (i.e.
implement IOMAP_ZERO as though it implied IOMAP_WRITE) and
iomap_zero_range_actor() will zero the allocated regions
appropriately, but it's not necessary to do if it is already known
what ranges of the file contain zeros...
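
Concretely, a filesystem's ->iomap_begin() might treat the two actions
roughly like this (a sketch only; the helper names are made up and this
is not GFS2 code):

        static int example_iomap_begin(struct inode *inode, loff_t pos,
                                       loff_t length, unsigned flags,
                                       struct iomap *iomap)
        {
                if (flags & IOMAP_WRITE)
                        /* allocate over holes, convert unwritten extents */
                        return example_map_allocating(inode, pos, length, iomap);

                /* IOMAP_ZERO (and plain lookups): existing holes and unwritten
                 * extents already read back as zeros, so just describe them */
                return example_map_existing(inode, pos, length, iomap);
        }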

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com




* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-10-31 20:07     ` Dave Chinner
@ 2016-11-02  9:37       ` Steven Whitehouse
  2016-11-02 21:01         ` Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Whitehouse @ 2016-11-02  9:37 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 31/10/16 20:07, Dave Chinner wrote:
> On Sat, Oct 29, 2016 at 10:24:45AM +0100, Steven Whitehouse wrote:
>> On 28/10/16 20:29, Bob Peterson wrote:
>>> +	if (create)
>>> +		flags |= IOMAP_WRITE;
>> Hmm, I wonder why IOMAP_WRITE and IOMAP_ZERO are separate flags from
>> the iomap.flags field... Christoph, was there a specific reason for
>> that?
> They are different actions. IOMAP_WRITE requires allocation over
> holes and conversion of unwritten extents to allow writing of
> user data into the range. IOMAP_ZERO is for zeroing a range of a
> file via iomap_zero_range() and it does not require allocation - it
> skips holes and unwritten regions as they are already guaranteed to
> contain zeros.
>
> One *could* allocate blocks with IOMAP_ZERO if desired (i.e.
> implement IOMAP_ZERO as though it implied IOMAP_WRITE) and
> iomap_zero_range_actor() will zero the allocated regions
> appropriately, but it's not necessary to do if it is already known
> what ranges of the file contain zeros...
>
> Cheers,
>
> Dave.
That wasn't quite what I was getting at... we have two sets of flags. 
IOMAP_ZERO, IOMAP_WRITE and IOMAP_REPORT form one set that is passed as 
an argument to ->iomap_begin() and ->iomap_end(), whereas we also have 
the IOMAP_F_* flags, which are set in the iomap.flags field. I guess the 
former are intended as input flags to the functions, whereas the latter 
are output flags from ->iomap_begin()? In that case I would expect 
->iomap_end() to view iomap.flags as read-only.

Just trying to get my head around the intention here and then we can 
figure out the right thing to do gfs2-wise based on that,

Steve.




* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-11-02  9:37       ` Steven Whitehouse
@ 2016-11-02 21:01         ` Dave Chinner
  2016-11-03  9:45           ` Steven Whitehouse
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2016-11-02 21:01 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Nov 02, 2016 at 09:37:00AM +0000, Steven Whitehouse wrote:
> Hi,
> 
> On 31/10/16 20:07, Dave Chinner wrote:
> >On Sat, Oct 29, 2016 at 10:24:45AM +0100, Steven Whitehouse wrote:
> >>On 28/10/16 20:29, Bob Peterson wrote:
> >>>+	if (create)
> >>>+		flags |= IOMAP_WRITE;
> >>Hmm, I wonder why IOMAP_WRITE and IOMAP_ZERO are separate flags from
> >>the iomap.flags field... Christoph, was there a specific reason for
> >>that?
> >They are different actions. IOMAP_WRITE requires allocation over
> >holes and conversion of unwritten extents to allow writing of
> >user data into the range. IOMAP_ZERO is for zeroing a range of a
> >file via iomap_zero_range() and it does not require allocation - it
> >skips holes and unwritten regions as they are already guaranteed to
> >contain zeros.
> >
> >One *could* allocate blocks with IOMAP_ZERO if desired (i.e.
> >implement IOMAP_ZERO as though it implied IOMAP_WRITE) and
> >iomap_zero_range_actor() will zero the allocated regions
> >appropriately, but it's not necessary to do if it is already known
> >what ranges of the file contain zeros...
> >
> >Cheers,
> >
> >Dave.
> That wasn't quite what I was getting at... we have two sets of
> flags. IOMAP_ZERO, IOMAP_WRITE and IOMAP_REPORT form one set that
> are passed as an argument to ->iomap_begin() and ->iomap_end()

Yes, those are mapping operation control flags that determine the
mapping operation to be done.

> whereas we also have IOMAP_F_* which are set in the iomap.flags

And those are per-map state flags that apply to the extent being
returned.

> field. I guess perhaps the former is intended as the input flags to
> the functions, where as the latter are output flags from
> ->iomap_begin()?

Yes, I thought that was obvious - it didn't occur to me that there'd
be any confusion there.  What can we add to the iomap.h definitions
to make this clearer?

> In that case I would expect that ->iomap_end()
> would view the iomap.flags as read only.

iomap.flags, like the rest of the struct iomap that is returned from
->iomap_begin(), is readonly for all users. Only the filesystem
itself can change extent mappings or state.....
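
Summarising the two namespaces (a rough list; the exact set of flags
depends on the kernel version):

        /* operation control flags, passed in to ->iomap_begin()/->iomap_end() */
        IOMAP_WRITE, IOMAP_ZERO, IOMAP_REPORT

        /* extent state flags, returned in iomap->flags by ->iomap_begin() */
        IOMAP_F_NEW, IOMAP_F_MERGED, IOMAP_F_SHARED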

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com




* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-11-02 21:01         ` Dave Chinner
@ 2016-11-03  9:45           ` Steven Whitehouse
  2016-11-04 13:44             ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Whitehouse @ 2016-11-03  9:45 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 02/11/16 21:01, Dave Chinner wrote:
> On Wed, Nov 02, 2016 at 09:37:00AM +0000, Steven Whitehouse wrote:
>> Hi,
>>
>> On 31/10/16 20:07, Dave Chinner wrote:
>>> On Sat, Oct 29, 2016 at 10:24:45AM +0100, Steven Whitehouse wrote:
>>>> On 28/10/16 20:29, Bob Peterson wrote:
>>>>> +	if (create)
>>>>> +		flags |= IOMAP_WRITE;
>>>> Hmm, I wonder why IOMAP_WRITE and IOMAP_ZERO are separate flags from
>>>> the iomap.flags field... Christoph, was there a specific reason for
>>>> that?
>>> They are different actions. IOMAP_WRITE requires allocation over
>>> holes and conversion of unwritten extents to allow writing of
>>> user data into the range. IOMAP_ZERO is for zeroing a range of a
>>> file via iomap_zero_range() and it does not require allocation - it
>>> skips holes and unwritten regions as they are already guaranteed to
>>> contain zeros.
>>>
>>> One *could* allocate blocks with IOMAP_ZERO if desired (i.e.
>>> implement IOMAP_ZERO as though it implied IOMAP_WRITE) and
>>> iomap_zero_range_actor() will zero the allocated regions
>>> appropriately, but it's not necessary to do if it is already known
>>> what ranges of the file contain zeros...
>>>
>>> Cheers,
>>>
>>> Dave.
>> That wasn't quite what I was getting at... we have two sets of
>> flags. IOMAP_ZERO, IOMAP_WRITE and IOMAP_REPORT form one set that
>> are passed as an argument to ->iomap_begin() and ->iomap_end()
> Yes, those are mapping operation control flags that determine the
> mapping operation to be done.
>
>> whereas we also have IOMAP_F_* which are set in the iomap.flags
> And those are per-map state flags that apply to the extent being
> returned.
>
>> field. I guess perhaps the former is intended as the input flags to
>> the functions, whereas the latter are output flags from
>> ->iomap_begin()?
> Yes, I thought that was obvious - it didn't occur to me that there'd
> be any confusion there.  What can we add to the iomap.h definitions
> to make this clearer?
Yes, probably more to do with me not having as much time to look at the 
code as I used to :( We could maybe update the comment in iomap.h next 
to iomap.flags to say /* Flags set in ->iomap_begin() */. The other 
thought is to rename the flags argument to ->iomap_begin and ->iomap_end 
so that we don't end up with two things both called flags, but I'm not 
sure what a better name would be ... "inflags" or "action" or something 
like that?

Could the iomap that is passed to ->iomap_end() possibly become const? I 
guess the only thing that might need to be updated by ->iomap_end() is 
iomap.flags, but if it truly is read-only for ->iomap_end() then perhaps 
it could be const in that case.
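
As a rough sketch of both ideas (the "op" rename and the comment wording
are only suggestions, and whether ->iomap_end() can really take a const
iomap is exactly the open question):

struct iomap_ops {
	/*
	 * "op" (currently called "flags") selects the operation being
	 * performed: IOMAP_WRITE, IOMAP_ZERO or IOMAP_REPORT; no flag
	 * means a plain read mapping.  The IOMAP_F_* flags, by contrast,
	 * live in iomap->flags and are set by ->iomap_begin() to describe
	 * the extent being returned.
	 */
	int (*iomap_begin)(struct inode *inode, loff_t pos, loff_t length,
			   unsigned op, struct iomap *iomap);

	/* If the iomap really is read only here, it could become const: */
	int (*iomap_end)(struct inode *inode, loff_t pos, loff_t length,
			 ssize_t written, unsigned op,
			 const struct iomap *iomap);
};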

There may well be better solutions here too, so very open to better 
suggestions.

For GFS2 then we want to use the IOMAP_* flags as they are at the 
moment, but for the function that maps back from iomap to 
gfs2_block_map() we'd need to add an IOMAP_F_ZEROED or something like 
that from which to set the BH_zeronew() flag, which is used as both a 
request and result flag for the zeroing operation. That flag would very 
likely go away in due course when the final iomap migration is done for 
GFS2, so would just be a temporary thing in the meantime.
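
Very roughly, the translation I have in mind looks something like this
(the helper name and IOMAP_F_ZEROED are invented here, and the real patch
may well end up looking different):

static void gfs2_iomap_to_bh(struct inode *inode, struct buffer_head *bh,
			     const struct iomap *iomap)
{
	if (iomap->type == IOMAP_HOLE)
		return;		/* leave the bh unmapped */

	/* iomap->blkno is in 512-byte sectors; convert to fs blocks. */
	map_bh(bh, inode->i_sb, iomap->blkno >> (inode->i_blkbits - 9));

	if (iomap->flags & IOMAP_F_NEW)
		set_buffer_new(bh);
	/* A hypothetical IOMAP_F_ZEROED would drive the zero-new state in
	 * the same way, until the full iomap conversion removes it. */
}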

>> In that case I would expect that ->iomap_end()
>> would view the iomap.flags as read only.
> iomap.flags, like the rest of the struct iomap that is returned from
> ->iomap_begin(), is readonly for all users. Only the filesystem
> itself can change extent mappings or state.....
>
> Cheers,
>
> Dave.

Yes - and in this initial GFS2 case things are a bit odd in that we 
don't yet have the full iomap_begin() / iomap_end() implementation that 
would be used for "normal" buffered I/O, only the subset of that 
functionality which is needed for FIEMAP, while at the same time 
retaining backwards compatibility with gfs2_block_map(). The intent is 
eventually to do all the mapping with iomap of course, but that will be 
a later patch set.

I also wondered whether it would be possible to write a generic 
implementation of SEEK_HOLE/SEEK_DATA for iomap-supporting filesystems at 
the VFS level. That would get GFS2 support for those two operations and 
be a nice clean up too. I've added an item on our todo list for that, 
but it may be a little while before we get around to it,
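
Something along these lines is roughly what I have in mind, assuming the
iomap_apply() walker in fs/iomap.c could be reused with IOMAP_REPORT
(names are invented, and a real version would also need to look at the
page cache over unwritten extents):

static loff_t
iomap_seek_hole_actor(struct inode *inode, loff_t pos, loff_t length,
		      void *data, struct iomap *iomap)
{
	if (iomap->type == IOMAP_HOLE) {
		*(loff_t *)data = pos;	/* remember where the hole starts */
		return 0;		/* stop the walk */
	}
	return length;			/* mapped extent, keep looking */
}

loff_t iomap_seek_hole(struct inode *inode, loff_t pos, struct iomap_ops *ops)
{
	loff_t size = i_size_read(inode);
	loff_t hole = size;		/* implicit hole at EOF */
	loff_t ret;

	if (pos >= size)
		return -ENXIO;

	while (pos < size) {
		ret = iomap_apply(inode, pos, size - pos, IOMAP_REPORT,
				  ops, &hole, iomap_seek_hole_actor);
		if (ret < 0)
			return ret;
		if (ret == 0)
			break;		/* the actor found a hole */
		pos += ret;
	}
	return hole;
}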

Steve



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-11-03  9:45           ` Steven Whitehouse
@ 2016-11-04 13:44             ` Christoph Hellwig
  2016-11-07 12:03               ` Steven Whitehouse
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-04 13:44 UTC (permalink / raw)
  To: cluster-devel.redhat.com

> > > That wasn't quite what I was getting at... we have two sets of
> > > flags. IOMAP_ZERO, IOMAP_WRITE and IOMAP_REPORT form one set that
> > > are passed as an argument to ->iomap_begin() and ->iomap_end()

They actually are the type of operation we perform, so maybe we should
rename them to "op" or similar.

> For GFS2 then we want to use the IOMAP_* flags as they are at the moment,
> but for the function that maps back from iomap to gfs2_block_map() we'd need
> to add an IOMAP_F_ZEROED or something like that from which to set the
> BH_zeronew() flag, which is used as both a request and result flag for the
> zeroing operation. That flag would very likely go away in due course when
> the final iomap migration is done for GFS2, so would just be a temporary
> thing in the meantime.

That would be the existing IOMAP_F_NEW I guess?

> I also wondered whether it would be possible to write a generic
> implementation of SEEK_HOLE/SEEK_DATA for iomap-supporting filesystems at the
> VFS level. That would get GFS2 support for those two operations and be a
> nice clean up too. I've added an item on our todo list for that, but it may
> be a little while before we get around to it,

Yes, that should be doable.  Would be great if you could look at it,
otherwise I'll add it to my todo list.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map
  2016-11-04 13:44             ` Christoph Hellwig
@ 2016-11-07 12:03               ` Steven Whitehouse
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Whitehouse @ 2016-11-07 12:03 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 04/11/16 13:44, Christoph Hellwig wrote:
>>>> That wasn't quite what I was getting at... we have two sets of
>>>> flags. IOMAP_ZERO, IOMAP_WRITE and IOMAP_REPORT form one set that
>>>> are passed as an argument to ->iomap_begin() and ->iomap_end()
> They actually are the type of operation we perform, so maybe we should
> rename them to "op" or similar.
>
>> For GFS2 then we want to use the IOMAP_* flags as they are at the moment,
>> but for the function that maps back from iomap to gfs2_block_map() we'd need
>> to add an IOMAP_F_ZEROED or something like that from which to set the
>> BH_zeronew() flag, which is used as both a request and result flag for the
>> zeroing operation. That flag would very likely go away in due course when
>> the final iomap migration is done for GFS2, so would just be a temporary
>> thing in the meantime.
> That would be the existing IOMAP_F_NEW I guess?
Well, maybe, maybe not... currently it differentiates between failure of 
allocation and failure of zeroing. That is probably not really required, 
except that there is no rollback of the allocation if the zeroing fails. 
Something to be fixed in due course, I think.

>> I also wondered whether it would be possible to write a generic
>> implementation of SEEK_HOLE/SEEK_DATA for iomap-supporting filesystems at the
>> VFS level. That would get GFS2 support for those two operations and be a
>> nice clean up too. I've added an item on our todo list for that, but it may
>> be a little while before we get around to it,
> Yes, that should be doable.  Would be great if you could look at it,
> otherwise I'll add it to my todo list.

Well, it is on our list, but we'll see who gets there first. We have 
quite a lot of other things that need doing too, so it is not top of the 
list right now,

Steve.



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-11-07 12:03 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-28 19:29 [Cluster-devel] [PATCH v2 0/3] Implement iomap for fiemap in GFS2 Bob Peterson
2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 1/3] GFS2: Make height info part of metapath Bob Peterson
2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 2/3] GFS2: Implement iomap for block_map Bob Peterson
2016-10-29  9:24   ` Steven Whitehouse
2016-10-31 12:07     ` Bob Peterson
2016-10-31 20:07     ` Dave Chinner
2016-11-02  9:37       ` Steven Whitehouse
2016-11-02 21:01         ` Dave Chinner
2016-11-03  9:45           ` Steven Whitehouse
2016-11-04 13:44             ` Christoph Hellwig
2016-11-07 12:03               ` Steven Whitehouse
2016-10-28 19:29 ` [Cluster-devel] [PATCH v2 3/3] GFS2: Switch fiemap implementation to use iomap Bob Peterson
2016-10-29  9:33   ` Steven Whitehouse
