All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [PATCH 00/12] gfs2: punch hole
@ 2017-12-22 14:34 Andreas Gruenbacher
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 01/12] gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use Andreas Gruenbacher
                   ` (12 more replies)
  0 siblings, 13 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:34 UTC (permalink / raw)
  To: cluster-devel.redhat.com

These patches fix a few minor issues in the non-recursive delete
implementation, add upper bound parameters so that these functions can
be used for punching holes in addition to truncating files at a given
position, and implement the fallocate hole punching operation base don
that.

These patches pass xfstests, but they haven't been tested very heavily
beyond that.

Andreas Gruenbacher (11):
  gfs2: Remove pointless BUG_ON
  gfs2: Clean up trunc_start error path
  gfs2: truncate: Remove unnecessary oldsize parameters
  gfs2: Remove minor gfs2_journaled_truncate inefficiencies
  gfs2: Clean up {lookup,fillup}_metapath
  gfs2: Fix metadata read-ahead during truncate
  gfs2: Improve non-recursive delete algorithm
  Turn gfs2_block_truncate_page into gfs2_block_zero_range
  gfs2: Generalize truncate code
  gfs2: Turn trunc_dealloc into punch_hole
  gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE)

Steven Whitehouse (1):
  gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use

 fs/gfs2/bmap.c  | 571 +++++++++++++++++++++++++++++++++++++++-----------------
 fs/gfs2/bmap.h  |   1 +
 fs/gfs2/file.c  |  19 +-
 fs/gfs2/rgrp.c  |   7 +
 fs/gfs2/trans.c |   1 -
 5 files changed, 418 insertions(+), 181 deletions(-)

-- 
2.14.3



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 01/12] gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
@ 2017-12-22 14:34 ` Andreas Gruenbacher
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 02/12] gfs2: Remove pointless BUG_ON Andreas Gruenbacher
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:34 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Steven Whitehouse <swhiteho@redhat.com>

Document when to use gfs2_blk2rgrpd for "inexact" resource group
matching.  Based on that, fix an incorrect use of gfs2_blk2rgrpd in
sweep_bh_for_rgrps.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 2 +-
 fs/gfs2/rgrp.c | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index d5f0d96169c5..8b993e4d80b2 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1133,7 +1133,7 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 			gfs2_assert_withdraw(sdp,
 				     gfs2_glock_is_locked_by_me(rd_gh->gh_gl));
 		} else {
-			rgd = gfs2_blk2rgrpd(sdp, bn, false);
+			rgd = gfs2_blk2rgrpd(sdp, bn, true);
 			ret = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
 						 0, rd_gh);
 			if (ret)
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 95b2a57ded33..211d7a5fa10f 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -489,6 +489,13 @@ void gfs2_rgrp_verify(struct gfs2_rgrpd *rgd)
  * @blk: The data block number
  * @exact: True if this needs to be an exact match
  *
+ * The @exact argument should be set to true by most callers. The exception
+ * is when we need to match blocks which are not represented by the rgrp
+ * bitmap, but which are part of the rgrp (i.e. padding blocks) which are
+ * there for alignment purposes. Another way of looking at it is that @exact
+ * matches only valid data/metadata blocks, but with @exact false, it will
+ * match any block within the extent of the rgrp.
+ *
  * Returns: The resource group, or NULL if not found
  */
 
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 02/12] gfs2: Remove pointless BUG_ON
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 01/12] gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use Andreas Gruenbacher
@ 2017-12-22 14:34 ` Andreas Gruenbacher
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 03/12] gfs2: Clean up trunc_start error path Andreas Gruenbacher
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:34 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The current transaction is being dereferenced before asserting that is
not NULL; that isn't going to help.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/trans.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index ca8b72d0a831..b95ebd166cac 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -92,7 +92,6 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
 	s64 nbuf;
 	int alloced = test_bit(TR_ALLOCED, &tr->tr_flags);
 
-	BUG_ON(!tr);
 	current->journal_info = NULL;
 
 	if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 03/12] gfs2: Clean up trunc_start error path
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 01/12] gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use Andreas Gruenbacher
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 02/12] gfs2: Remove pointless BUG_ON Andreas Gruenbacher
@ 2017-12-22 14:34 ` Andreas Gruenbacher
  2018-01-08 20:12   ` Bob Peterson
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 04/12] gfs2: truncate: Remove unnecessary oldsize parameters Andreas Gruenbacher
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:34 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 8b993e4d80b2..0ad6d812c78b 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1022,7 +1022,7 @@ static int trunc_start(struct inode *inode, u64 oldsize, u64 newsize)
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct address_space *mapping = inode->i_mapping;
-	struct buffer_head *dibh;
+	struct buffer_head *dibh = NULL;
 	int journaled = gfs2_is_jdata(ip);
 	int error;
 
@@ -1045,7 +1045,7 @@ static int trunc_start(struct inode *inode, u64 oldsize, u64 newsize)
 		if (newsize & (u64)(sdp->sd_sb.sb_bsize - 1)) {
 			error = gfs2_block_truncate_page(mapping, newsize);
 			if (error)
-				goto out_brelse;
+				goto out;
 		}
 		ip->i_diskflags |= GFS2_DIF_TRUNC_IN_PROG;
 	}
@@ -1059,15 +1059,10 @@ static int trunc_start(struct inode *inode, u64 oldsize, u64 newsize)
 	else
 		truncate_pagecache(inode, newsize);
 
-	if (error) {
-		brelse(dibh);
-		return error;
-	}
-
-out_brelse:
-	brelse(dibh);
 out:
-	gfs2_trans_end(sdp);
+	brelse(dibh);
+	if (current->journal_info)
+		gfs2_trans_end(sdp);
 	return error;
 }
 
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 04/12] gfs2: truncate: Remove unnecessary oldsize parameters
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (2 preceding siblings ...)
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 03/12] gfs2: Clean up trunc_start error path Andreas Gruenbacher
@ 2017-12-22 14:34 ` Andreas Gruenbacher
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 05/12] gfs2: Remove minor gfs2_journaled_truncate inefficiencies Andreas Gruenbacher
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:34 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 0ad6d812c78b..963117f704bf 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1017,13 +1017,14 @@ static int gfs2_journaled_truncate(struct inode *inode, u64 oldsize, u64 newsize
 	return 0;
 }
 
-static int trunc_start(struct inode *inode, u64 oldsize, u64 newsize)
+static int trunc_start(struct inode *inode, u64 newsize)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct address_space *mapping = inode->i_mapping;
 	struct buffer_head *dibh = NULL;
 	int journaled = gfs2_is_jdata(ip);
+	u64 oldsize = inode->i_size;
 	int error;
 
 	if (journaled)
@@ -1519,7 +1520,6 @@ static int trunc_end(struct gfs2_inode *ip)
 /**
  * do_shrink - make a file smaller
  * @inode: the inode
- * @oldsize: the current inode size
  * @newsize: the size to make the file
  *
  * Called with an exclusive lock on @inode. The @size must
@@ -1528,12 +1528,12 @@ static int trunc_end(struct gfs2_inode *ip)
  * Returns: errno
  */
 
-static int do_shrink(struct inode *inode, u64 oldsize, u64 newsize)
+static int do_shrink(struct inode *inode, u64 newsize)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	int error;
 
-	error = trunc_start(inode, oldsize, newsize);
+	error = trunc_start(inode, newsize);
 	if (error < 0)
 		return error;
 	if (gfs2_is_stuffed(ip))
@@ -1548,10 +1548,9 @@ static int do_shrink(struct inode *inode, u64 oldsize, u64 newsize)
 
 void gfs2_trim_blocks(struct inode *inode)
 {
-	u64 size = inode->i_size;
 	int ret;
 
-	ret = do_shrink(inode, size, size);
+	ret = do_shrink(inode, inode->i_size);
 	WARN_ON(ret != 0);
 }
 
@@ -1645,7 +1644,6 @@ int gfs2_setattr_size(struct inode *inode, u64 newsize)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	int ret;
-	u64 oldsize;
 
 	BUG_ON(!S_ISREG(inode->i_mode));
 
@@ -1659,13 +1657,12 @@ int gfs2_setattr_size(struct inode *inode, u64 newsize)
 	if (ret)
 		goto out;
 
-	oldsize = inode->i_size;
-	if (newsize >= oldsize) {
+	if (newsize >= inode->i_size) {
 		ret = do_grow(inode, newsize);
 		goto out;
 	}
 
-	ret = do_shrink(inode, oldsize, newsize);
+	ret = do_shrink(inode, newsize);
 out:
 	gfs2_rsqa_delete(ip, NULL);
 	return ret;
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 05/12] gfs2: Remove minor gfs2_journaled_truncate inefficiencies
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (3 preceding siblings ...)
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 04/12] gfs2: truncate: Remove unnecessary oldsize parameters Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2018-01-09 13:53   ` Bob Peterson
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up {lookup, fillup}_metapath Andreas Gruenbacher
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

First, this function truncates the file in chunks.  When the original
file size isn't block aligned, each chunk that is truncated will remain
be misaligned.  This is inefficient.

Second, this function doesn't recognize where holes are, so it loops
through them.  For each chunk of a hole, it creates a new transaction.
At least avoid creating another transactions whe the current one is
still empty.  (An better fix would be to skip large holes, of course.)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 963117f704bf..921bf33faa96 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1003,11 +1003,25 @@ static int gfs2_journaled_truncate(struct inode *inode, u64 oldsize, u64 newsize
 	int error;
 
 	while (oldsize != newsize) {
+		struct gfs2_trans *tr;
+		unsigned int offs;
+
 		chunk = oldsize - newsize;
 		if (chunk > max_chunk)
 			chunk = max_chunk;
+
+		offs = oldsize & ~PAGE_MASK;
+		if (offs && chunk > PAGE_SIZE)
+			chunk = offs + ((chunk - offs) & PAGE_MASK);
+
 		truncate_pagecache(inode, oldsize - chunk);
 		oldsize -= chunk;
+
+		/* XXX Is this check sufficient? */
+		tr = current->journal_info;
+		if (!tr->tr_num_revoke)
+			continue;
+
 		gfs2_trans_end(sdp);
 		error = gfs2_trans_begin(sdp, RES_DINODE, GFS2_JTRUNC_REVOKES);
 		if (error)
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 06/12] gfs2: Clean up {lookup, fillup}_metapath
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (4 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 05/12] gfs2: Remove minor gfs2_journaled_truncate inefficiencies Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 07/12] gfs2: Fix metadata read-ahead during truncate Andreas Gruenbacher
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Split out the entire lookup loop from lookup_metapath and
fillup_metapath.  Make both functions return the actual height in
mp->mp_aheight, and return 0 on success.  Handle lookup errors properly
in trunc_dealloc.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 74 ++++++++++++++++++++++++----------------------------------
 1 file changed, 30 insertions(+), 44 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 921bf33faa96..a756d3d097ca 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -305,21 +305,22 @@ static void gfs2_metapath_ra(struct gfs2_glock *gl,
 	}
 }
 
-/**
- * lookup_mp_height - helper function for lookup_metapath
- * @ip: the inode
- * @mp: the metapath
- * @h: the height which needs looking up
- */
-static int lookup_mp_height(struct gfs2_inode *ip, struct metapath *mp, int h)
+static int __fillup_metapath(struct gfs2_inode *ip, struct metapath *mp,
+			     unsigned int x, unsigned int h)
 {
-	__be64 *ptr = metapointer(h, mp);
-	u64 dblock = be64_to_cpu(*ptr);
+	for (; x < h; x++) {
+		__be64 *ptr = metapointer(x, mp);
+		u64 dblock = be64_to_cpu(*ptr);
+		int ret;
 
-	if (!dblock)
-		return h + 1;
-
-	return gfs2_meta_indirect_buffer(ip, h + 1, dblock, &mp->mp_bh[h + 1]);
+		if (!dblock)
+			break;
+		ret = gfs2_meta_indirect_buffer(ip, x + 1, dblock, &mp->mp_bh[x + 1]);
+		if (ret)
+			return ret;
+	}
+	mp->mp_aheight = x + 1;
+	return 0;
 }
 
 /**
@@ -336,25 +337,12 @@ static int lookup_mp_height(struct gfs2_inode *ip, struct metapath *mp, int h)
  * at which it found the unallocated block. Blocks which are found are
  * added to the mp->mp_bh[] list.
  *
- * Returns: error or height of metadata tree
+ * Returns: error
  */
 
 static int lookup_metapath(struct gfs2_inode *ip, struct metapath *mp)
 {
-	unsigned int end_of_metadata = ip->i_height - 1;
-	unsigned int x;
-	int ret;
-
-	for (x = 0; x < end_of_metadata; x++) {
-		ret = lookup_mp_height(ip, mp, x);
-		if (ret)
-			goto out;
-	}
-
-	ret = ip->i_height;
-out:
-	mp->mp_aheight = ret;
-	return ret;
+	return __fillup_metapath(ip, mp, 0, ip->i_height - 1);
 }
 
 /**
@@ -365,25 +353,21 @@ static int lookup_metapath(struct gfs2_inode *ip, struct metapath *mp)
  *
  * Similar to lookup_metapath, but does lookups for a range of heights
  *
- * Returns: error or height of metadata tree
+ * Returns: error
  */
 
 static int fillup_metapath(struct gfs2_inode *ip, struct metapath *mp, int h)
 {
-	unsigned int start_h = h - 1;
-	int ret;
+	unsigned int x = 0;
 
 	if (h) {
 		/* find the first buffer we need to look up. */
-		while (start_h > 0 && mp->mp_bh[start_h] == NULL)
-			start_h--;
-		for (; start_h < h; start_h++) {
-			ret = lookup_mp_height(ip, mp, start_h);
-			if (ret)
-				return ret;
+		for (x = h - 1; x > 0; x--) {
+			if (mp->mp_bh[x])
+				break;
 		}
 	}
-	return ip->i_height;
+	return __fillup_metapath(ip, mp, x, h);
 }
 
 static inline void release_metapath(struct metapath *mp)
@@ -788,7 +772,7 @@ int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
 		goto do_alloc;
 
 	ret = lookup_metapath(ip, &mp);
-	if (ret < 0)
+	if (ret)
 		goto out_release;
 
 	if (mp.mp_aheight != ip->i_height)
@@ -1346,7 +1330,9 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 
 	mp.mp_bh[0] = dibh;
 	ret = lookup_metapath(ip, &mp);
-	if (ret == ip->i_height)
+	if (ret)
+		goto out_metapath;
+	if (mp.mp_aheight == ip->i_height)
 		state = DEALLOC_MP_FULL; /* We have a complete metapath */
 	else
 		state = DEALLOC_FILL_MP; /* deal with partial metapath */
@@ -1442,16 +1428,16 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 		case DEALLOC_FILL_MP:
 			/* Fill the buffers out to the current height. */
 			ret = fillup_metapath(ip, &mp, mp_h);
-			if (ret < 0)
+			if (ret)
 				goto out;
 
 			/* If buffers found for the entire strip height */
-			if ((ret == ip->i_height) && (mp_h == strip_h)) {
+			if (mp.mp_aheight - 1 == strip_h) {
 				state = DEALLOC_MP_FULL;
 				break;
 			}
-			if (ret < ip->i_height) /* We have a partial height */
-				mp_h = ret - 1;
+			if (mp.mp_aheight < ip->i_height) /* We have a partial height */
+				mp_h = mp.mp_aheight - 1;
 
 			/* If we find a non-null block pointer, crawl a bit
 			   higher up in the metapath and try again, otherwise
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 07/12] gfs2: Fix metadata read-ahead during truncate
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (5 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up {lookup, fillup}_metapath Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 08/12] gfs2: Improve non-recursive delete algorithm Andreas Gruenbacher
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The metadata read-ahead algorithm broke when switching from recursive to
non-recursive delete: the current algorithm reads ahead blocks at height
N - 1 while deallocating the blocks at hight N.  However, deallocating
the blocks at height N requires a complete walk of the metadata tree,
not only down to height N - 1.  Consequently, all blocks below height
N - 1 will be accessed without read-ahead.

Fix this by issuing read-aheads as early as possible, after each
metapath lookup.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index a756d3d097ca..57e9e1e65272 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -279,14 +279,17 @@ static inline __be64 *metapointer(unsigned int height, const struct metapath *mp
 	return p + mp->mp_list[height];
 }
 
-static void gfs2_metapath_ra(struct gfs2_glock *gl,
-			     const struct buffer_head *bh, const __be64 *pos)
+static void gfs2_metapath_ra(struct gfs2_glock *gl, struct metapath *mp,
+			     unsigned int height)
 {
-	struct buffer_head *rabh;
+	struct buffer_head *bh = mp->mp_bh[height];
+	const __be64 *pos = metapointer(height, mp);
 	const __be64 *endp = (const __be64 *)(bh->b_data + bh->b_size);
 	const __be64 *t;
 
 	for (t = pos; t < endp; t++) {
+		struct buffer_head *rabh;
+
 		if (!*t)
 			continue;
 
@@ -353,12 +356,13 @@ static int lookup_metapath(struct gfs2_inode *ip, struct metapath *mp)
  *
  * Similar to lookup_metapath, but does lookups for a range of heights
  *
- * Returns: error
+ * Returns: error or the number of buffers filled
  */
 
 static int fillup_metapath(struct gfs2_inode *ip, struct metapath *mp, int h)
 {
 	unsigned int x = 0;
+	int ret;
 
 	if (h) {
 		/* find the first buffer we need to look up. */
@@ -367,7 +371,10 @@ static int fillup_metapath(struct gfs2_inode *ip, struct metapath *mp, int h)
 				break;
 		}
 	}
-	return __fillup_metapath(ip, mp, x, h);
+	ret = __fillup_metapath(ip, mp, x, h);
+	if (ret)
+		return ret;
+	return mp->mp_aheight - x - 1;
 }
 
 static inline void release_metapath(struct metapath *mp)
@@ -1310,7 +1317,6 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 	u32 btotal = 0;
 	int ret, state;
 	int mp_h; /* metapath buffers are read in to this height */
-	sector_t last_ra = 0;
 	u64 prev_bnr = 0;
 	bool preserve1; /* need to preserve the first meta pointer? */
 
@@ -1332,6 +1338,11 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 	ret = lookup_metapath(ip, &mp);
 	if (ret)
 		goto out_metapath;
+
+	/* issue read-ahead on metadata */
+	for (mp_h = 0; mp_h < mp.mp_aheight - 1; mp_h++)
+		gfs2_metapath_ra(ip->i_gl, &mp, mp_h);
+
 	if (mp.mp_aheight == ip->i_height)
 		state = DEALLOC_MP_FULL; /* We have a complete metapath */
 	else
@@ -1353,16 +1364,6 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 		/* Truncate a full metapath at the given strip height.
 		 * Note that strip_h == mp_h in order to be in this state. */
 		case DEALLOC_MP_FULL:
-			if (mp_h > 0) { /* issue read-ahead on metadata */
-				__be64 *top;
-
-				bh = mp.mp_bh[mp_h - 1];
-				if (bh->b_blocknr != last_ra) {
-					last_ra = bh->b_blocknr;
-					top = metaptr1(mp_h - 1, &mp);
-					gfs2_metapath_ra(ip->i_gl, bh, top);
-				}
-			}
 			/* If we're truncating to a non-zero size and the mp is
 			   at the beginning of file for the strip height, we
 			   need to preserve the first metadata pointer. */
@@ -1428,9 +1429,16 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 		case DEALLOC_FILL_MP:
 			/* Fill the buffers out to the current height. */
 			ret = fillup_metapath(ip, &mp, mp_h);
-			if (ret)
+			if (ret < 0)
 				goto out;
 
+			/* issue read-ahead on metadata */
+			if (mp.mp_aheight > 1) {
+				for (; ret > 1; ret--)
+					gfs2_metapath_ra(ip->i_gl, &mp,
+						mp.mp_aheight - ret);
+			}
+
 			/* If buffers found for the entire strip height */
 			if (mp.mp_aheight - 1 == strip_h) {
 				state = DEALLOC_MP_FULL;
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 08/12] gfs2: Improve non-recursive delete algorithm
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (6 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 07/12] gfs2: Fix metadata read-ahead during truncate Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 09/12] Turn gfs2_block_truncate_page into gfs2_block_zero_range Andreas Gruenbacher
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In rare cases, the current non-recursive delete algorithm doesn't
deallocate empty intermediary indirect blocks.  This should have very
little practical effect, but deallocating all blocks correctly should
still be preferable as it is cleaner and easier to validate.

The fix consists of using the first block to deallocate to compute the
start marker of the truncate point instead of the last block that needs
to be kept.  With that change, computing which indirect blocks are still
needed becomes relatively easy.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 51 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 57e9e1e65272..8a9079823fa3 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1079,7 +1079,7 @@ static int trunc_start(struct inode *inode, u64 newsize)
  * @mp: current metapath fully populated with buffers
  * @btotal: place to keep count of total blocks freed
  * @hgt: height we're processing
- * @first: true if this is the first call to this function for this height
+ * @keep_start: preserve the first meta pointer
  *
  * We sweep a metadata buffer (provided by the metapath) for blocks we need to
  * free, and free them all. However, we do it one rgrp at a time. If this
@@ -1095,7 +1095,7 @@ static int trunc_start(struct inode *inode, u64 newsize)
  */
 static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 			      const struct metapath *mp, u32 *btotal, int hgt,
-			      bool preserve1)
+			      bool keep_start)
 {
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
 	struct gfs2_rgrpd *rgd;
@@ -1120,7 +1120,7 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 	top = metapointer(hgt, mp); /* first ptr from metapath */
 	/* If we're keeping some data at the truncation point, we've got to
 	   preserve the metadata tree by adding 1 to the starting metapath. */
-	if (preserve1)
+	if (keep_start)
 		top++;
 
 	bottom = (__be64 *)(bh->b_data + bh->b_size);
@@ -1287,9 +1287,9 @@ enum dealloc_states {
 	DEALLOC_DONE = 3,       /* process complete */
 };
 
-static bool mp_eq_to_hgt(struct metapath *mp, __u16 *nbof, unsigned int h)
+static bool mp_eq_to_hgt(struct metapath *mp, __u16 *list, unsigned int h)
 {
-	if (memcmp(mp->mp_list, nbof, h * sizeof(mp->mp_list[0])))
+	if (memcmp(mp->mp_list, list, h * sizeof(mp->mp_list[0])))
 		return false;
 	return true;
 }
@@ -1311,24 +1311,35 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 	struct metapath mp;
 	struct buffer_head *dibh, *bh;
 	struct gfs2_holder rd_gh;
-	u64 lblock;
-	__u16 nbof[GFS2_MAX_META_HEIGHT]; /* new beginning of truncation */
+	unsigned int bsize_shift = sdp->sd_sb.sb_bsize_shift;
+	u64 lblock = (newsize + (1 << bsize_shift) - 1) >> bsize_shift;
+	__u16 start_list[GFS2_MAX_META_HEIGHT]; /* new beginning of truncation */
+	unsigned int start_aligned;
 	unsigned int strip_h = ip->i_height - 1;
 	u32 btotal = 0;
 	int ret, state;
 	int mp_h; /* metapath buffers are read in to this height */
 	u64 prev_bnr = 0;
-	bool preserve1; /* need to preserve the first meta pointer? */
-
-	if (!newsize)
-		lblock = 0;
-	else
-		lblock = (newsize - 1) >> sdp->sd_sb.sb_bsize_shift;
+	bool keep_start; /* need to preserve the first meta pointer? */
 
 	memset(&mp, 0, sizeof(mp));
 	find_metapath(sdp, lblock, &mp, ip->i_height);
 
-	memcpy(&nbof, &mp.mp_list, sizeof(nbof));
+	memcpy(start_list, mp.mp_list, sizeof(start_list));
+
+	/*
+	 * Set start_aligned to the metadata height up to which the truncate
+	 * point is aligned to the metadata tree (i.e., the truncate point is a
+	 * multiple of the granularity at the height above).  This determines
+	 * at which heights an additional meta pointer needs to be preserved:
+	 * an additional meta pointer is needed at a given height if
+	 * height < start_aligned.
+	 */
+	for (mp_h = ip->i_height - 1; mp_h > 0; mp_h--) {
+		if (start_list[mp_h])
+			break;
+	}
+	start_aligned = mp_h;
 
 	ret = gfs2_meta_inode_buffer(ip, &dibh);
 	if (ret)
@@ -1364,10 +1375,6 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 		/* Truncate a full metapath at the given strip height.
 		 * Note that strip_h == mp_h in order to be in this state. */
 		case DEALLOC_MP_FULL:
-			/* If we're truncating to a non-zero size and the mp is
-			   at the beginning of file for the strip height, we
-			   need to preserve the first metadata pointer. */
-			preserve1 = (newsize && mp_eq_to_hgt(&mp, nbof, mp_h));
 			bh = mp.mp_bh[mp_h];
 			gfs2_assert_withdraw(sdp, bh);
 			if (gfs2_assert_withdraw(sdp,
@@ -1379,8 +1386,12 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 				       prev_bnr, ip->i_height, strip_h, mp_h);
 			}
 			prev_bnr = bh->b_blocknr;
+
+			keep_start = hp_h < start_aligned &&
+				     mp_eq_to_hgt(&mp, start_list, mp_h);
+
 			ret = sweep_bh_for_rgrps(ip, &rd_gh, &mp, &btotal,
-						 mp_h, preserve1);
+						 mp_h, keep_start);
 			/* If we hit an error or just swept dinode buffer,
 			   just exit. */
 			if (ret || !mp_h) {
@@ -1404,7 +1415,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 			   stripping the previous level of metadata. */
 			if (mp_h == 0) {
 				strip_h--;
-				memcpy(&mp.mp_list, &nbof, sizeof(nbof));
+				memcpy(mp.mp_list, start_list, sizeof(start_list));
 				mp_h = strip_h;
 				state = DEALLOC_FILL_MP;
 				break;
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 09/12] Turn gfs2_block_truncate_page into gfs2_block_zero_range
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (7 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 08/12] gfs2: Improve non-recursive delete algorithm Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 10/12] gfs2: Generalize truncate code Andreas Gruenbacher
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Turn gfs2_block_truncate_page into a function that zeroes a range within
a block rather than only the end of a block.  This will be used for
cleaning the end of the first partial block and the start of the last
partial block when punching a hole in a file.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 8a9079823fa3..1e789bbd24f5 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -904,17 +904,18 @@ int gfs2_extent_map(struct inode *inode, u64 lblock, int *new, u64 *dblock, unsi
 }
 
 /**
- * gfs2_block_truncate_page - Deal with zeroing out data for truncate
+ * gfs2_block_zero_range - Deal with zeroing out data
  *
  * This is partly borrowed from ext3.
  */
-static int gfs2_block_truncate_page(struct address_space *mapping, loff_t from)
+static int gfs2_block_zero_range(struct inode *inode, loff_t from,
+				 unsigned int length)
 {
-	struct inode *inode = mapping->host;
+	struct address_space *mapping = inode->i_mapping;
 	struct gfs2_inode *ip = GFS2_I(inode);
 	unsigned long index = from >> PAGE_SHIFT;
 	unsigned offset = from & (PAGE_SIZE-1);
-	unsigned blocksize, iblock, length, pos;
+	unsigned blocksize, iblock, pos;
 	struct buffer_head *bh;
 	struct page *page;
 	int err;
@@ -924,7 +925,6 @@ static int gfs2_block_truncate_page(struct address_space *mapping, loff_t from)
 		return 0;
 
 	blocksize = inode->i_sb->s_blocksize;
-	length = blocksize - (offset & (blocksize - 1));
 	iblock = index << (PAGE_SHIFT - inode->i_sb->s_blocksize_bits);
 
 	if (!page_has_buffers(page))
@@ -1026,7 +1026,6 @@ static int trunc_start(struct inode *inode, u64 newsize)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
-	struct address_space *mapping = inode->i_mapping;
 	struct buffer_head *dibh = NULL;
 	int journaled = gfs2_is_jdata(ip);
 	u64 oldsize = inode->i_size;
@@ -1048,8 +1047,11 @@ static int trunc_start(struct inode *inode, u64 newsize)
 	if (gfs2_is_stuffed(ip)) {
 		gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode) + newsize);
 	} else {
-		if (newsize & (u64)(sdp->sd_sb.sb_bsize - 1)) {
-			error = gfs2_block_truncate_page(mapping, newsize);
+		unsigned int blocksize = i_blocksize(inode);
+		unsigned int offs = newsize & (blocksize - 1);
+		if (offs) {
+			error = gfs2_block_zero_range(inode, newsize,
+						      blocksize - offs);
 			if (error)
 				goto out;
 		}
@@ -1387,7 +1389,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 			}
 			prev_bnr = bh->b_blocknr;
 
-			keep_start = hp_h < start_aligned &&
+			keep_start = mp_h < start_aligned &&
 				     mp_eq_to_hgt(&mp, start_list, mp_h);
 
 			ret = sweep_bh_for_rgrps(ip, &rd_gh, &mp, &btotal,
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 10/12] gfs2: Generalize truncate code
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (8 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 09/12] Turn gfs2_block_truncate_page into gfs2_block_zero_range Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2018-01-09 15:45   ` Bob Peterson
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 11/12] gfs2: Turn trunc_dealloc into punch_hole Andreas Gruenbacher
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Pull the code for computing the range of metapointers to iterate out of
gfs2_metapath_ra (for readahead), sweep_bh_for_rgrps (for deallocating
metapointers within a block), and trunc_dealloc (for walking the
metadata tree).

In sweep_bh_for_rgrps, move the code for looking up the resource group
descriptor of the current resource group out of the inner loop.  The
metatype check moves to trunc_dealloc.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 118 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 71 insertions(+), 47 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 1e789bbd24f5..984e27715a2e 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -279,15 +279,11 @@ static inline __be64 *metapointer(unsigned int height, const struct metapath *mp
 	return p + mp->mp_list[height];
 }
 
-static void gfs2_metapath_ra(struct gfs2_glock *gl, struct metapath *mp,
-			     unsigned int height)
+static void gfs2_metapath_ra(struct gfs2_glock *gl, __be64 *start, __be64 *end)
 {
-	struct buffer_head *bh = mp->mp_bh[height];
-	const __be64 *pos = metapointer(height, mp);
-	const __be64 *endp = (const __be64 *)(bh->b_data + bh->b_size);
 	const __be64 *t;
 
-	for (t = pos; t < endp; t++) {
+	for (t = start; t < end; t++) {
 		struct buffer_head *rabh;
 
 		if (!*t)
@@ -1078,10 +1074,11 @@ static int trunc_start(struct inode *inode, u64 newsize)
  * sweep_bh_for_rgrps - find an rgrp in a meta buffer and free blocks therein
  * @ip: inode
  * @rg_gh: holder of resource group glock
- * @mp: current metapath fully populated with buffers
+ * @bh: buffer head to sweep
+ * @start: starting point in bh
+ * @end: end point in bh
+ * @meta: true if bh points to metadata (rather than data)
  * @btotal: place to keep count of total blocks freed
- * @hgt: height we're processing
- * @keep_start: preserve the first meta pointer
  *
  * We sweep a metadata buffer (provided by the metapath) for blocks we need to
  * free, and free them all. However, we do it one rgrp@a time. If this
@@ -1096,45 +1093,40 @@ static int trunc_start(struct inode *inode, u64 newsize)
  *          *btotal has the total number of blocks freed
  */
 static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
-			      const struct metapath *mp, u32 *btotal, int hgt,
-			      bool keep_start)
+			      struct buffer_head *bh, __be64 *start, __be64 *end,
+			      bool meta, u32 *btotal)
 {
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
 	struct gfs2_rgrpd *rgd;
 	struct gfs2_trans *tr;
-	struct buffer_head *bh = mp->mp_bh[hgt];
-	__be64 *top, *bottom, *p;
+	__be64 *p;
 	int blks_outside_rgrp;
 	u64 bn, bstart, isize_blks;
 	s64 blen; /* needs to be s64 or gfs2_add_inode_blocks breaks */
-	int meta = ((hgt != ip->i_height - 1) ? 1 : 0);
 	int ret = 0;
 	bool buf_in_tr = false; /* buffer was added to transaction */
 
-	if (gfs2_metatype_check(sdp, bh,
-				(hgt ? GFS2_METATYPE_IN : GFS2_METATYPE_DI)))
-		return -EIO;
-
 more_rgrps:
+	rgd = NULL;
+	if (gfs2_holder_initialized(rd_gh)) {
+		rgd = gfs2_glock2rgrp(rd_gh->gh_gl);
+		gfs2_assert_withdraw(sdp,
+			     gfs2_glock_is_locked_by_me(rd_gh->gh_gl));
+	}
 	blks_outside_rgrp = 0;
 	bstart = 0;
 	blen = 0;
-	top = metapointer(hgt, mp); /* first ptr from metapath */
-	/* If we're keeping some data at the truncation point, we've got to
-	   preserve the metadata tree by adding 1 to the starting metapath. */
-	if (keep_start)
-		top++;
-
-	bottom = (__be64 *)(bh->b_data + bh->b_size);
 
-	for (p = top; p < bottom; p++) {
+	for (p = start; p < end; p++) {
 		if (!*p)
 			continue;
 		bn = be64_to_cpu(*p);
-		if (gfs2_holder_initialized(rd_gh)) {
-			rgd = gfs2_glock2rgrp(rd_gh->gh_gl);
-			gfs2_assert_withdraw(sdp,
-				     gfs2_glock_is_locked_by_me(rd_gh->gh_gl));
+
+		if (rgd) {
+			if (!rgrp_contains_block(rgd, bn)) {
+				blks_outside_rgrp++;
+				continue;
+			}
 		} else {
 			rgd = gfs2_blk2rgrpd(sdp, bn, true);
 			ret = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE,
@@ -1148,11 +1140,6 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 				gfs2_rs_deltree(&ip->i_res);
 		}
 
-		if (!rgrp_contains_block(rgd, bn)) {
-			blks_outside_rgrp++;
-			continue;
-		}
-
 		/* The size of our transactions will be unknown until we
 		   actually process all the metadata blocks that relate to
 		   the rgrp. So we estimate. We know it can't be more than
@@ -1171,7 +1158,7 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 				jblocks_rqsted += isize_blks;
 			revokes = jblocks_rqsted;
 			if (meta)
-				revokes += hptrs(sdp, hgt);
+				revokes += end - start;
 			else if (ip->i_depth)
 				revokes += sdp->sd_inptrs;
 			ret = gfs2_trans_begin(sdp, jblocks_rqsted, revokes);
@@ -1229,7 +1216,11 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 					    outside the rgrp we just processed,
 					    do it all over again. */
 		if (current->journal_info) {
-			struct buffer_head *dibh = mp->mp_bh[0];
+			struct buffer_head *dibh;
+
+			ret = gfs2_meta_inode_buffer(ip, &dibh);
+			if (ret)
+				goto out;
 
 			/* Every transaction boundary, we rewrite the dinode
 			   to keep its di_blocks current in case of failure. */
@@ -1237,6 +1228,7 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 				current_time(&ip->i_inode);
 			gfs2_trans_add_meta(ip->i_gl, dibh);
 			gfs2_dinode_out(ip, dibh->b_data);
+			brelse(dibh);
 			up_write(&ip->i_rw_mutex);
 			gfs2_trans_end(sdp);
 		}
@@ -1296,6 +1288,23 @@ static bool mp_eq_to_hgt(struct metapath *mp, __u16 *list, unsigned int h)
 	return true;
 }
 
+static inline void
+metapointer_range(struct metapath *mp, int height,
+		  __u16 *start_list, unsigned int start_aligned,
+		  __be64 **start, __be64 **end)
+{
+	struct buffer_head *bh = mp->mp_bh[height];
+	__be64 *first;
+
+	first = metaptr1(height, mp);
+	*start = first;
+	if (mp_eq_to_hgt(mp, start_list, height)) {
+		bool keep_start = height < start_aligned;
+		*start = first + start_list[height] + keep_start;
+	}
+	*end = (__be64 *)(bh->b_data + bh->b_size);
+}
+
 /**
  * trunc_dealloc - truncate a file down to a desired size
  * @ip: inode to truncate
@@ -1322,7 +1331,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 	int ret, state;
 	int mp_h; /* metapath buffers are read in to this height */
 	u64 prev_bnr = 0;
-	bool keep_start; /* need to preserve the first meta pointer? */
+	__be64 *start, *end;
 
 	memset(&mp, 0, sizeof(mp));
 	find_metapath(sdp, lblock, &mp, ip->i_height);
@@ -1353,8 +1362,11 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 		goto out_metapath;
 
 	/* issue read-ahead on metadata */
-	for (mp_h = 0; mp_h < mp.mp_aheight - 1; mp_h++)
-		gfs2_metapath_ra(ip->i_gl, &mp, mp_h);
+	for (mp_h = 0; mp_h < mp.mp_aheight - 1; mp_h++) {
+		metapointer_range(&mp, mp_h, start_list, start_aligned,
+				  &start, &end);
+		gfs2_metapath_ra(ip->i_gl, start, end);
+	}
 
 	if (mp.mp_aheight == ip->i_height)
 		state = DEALLOC_MP_FULL; /* We have a complete metapath */
@@ -1389,11 +1401,20 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 			}
 			prev_bnr = bh->b_blocknr;
 
-			keep_start = mp_h < start_aligned &&
-				     mp_eq_to_hgt(&mp, start_list, mp_h);
+			if (gfs2_metatype_check(sdp, bh,
+						(mp_h ? GFS2_METATYPE_IN :
+							GFS2_METATYPE_DI))) {
+				ret = -EIO;
+				goto out;
+			}
+
+			metapointer_range(&mp, mp_h, start_list, start_aligned,
+					  &start, &end);
+			ret = sweep_bh_for_rgrps(ip, &rd_gh, mp.mp_bh[mp_h],
+						 start, end,
+						 mp_h != ip->i_height - 1,
+						 &btotal);
 
-			ret = sweep_bh_for_rgrps(ip, &rd_gh, &mp, &btotal,
-						 mp_h, keep_start);
 			/* If we hit an error or just swept dinode buffer,
 			   just exit. */
 			if (ret || !mp_h) {
@@ -1447,9 +1468,12 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 
 			/* issue read-ahead on metadata */
 			if (mp.mp_aheight > 1) {
-				for (; ret > 1; ret--)
-					gfs2_metapath_ra(ip->i_gl, &mp,
-						mp.mp_aheight - ret);
+				for (; ret > 1; ret--) {
+					metapointer_range(&mp, mp.mp_aheight - ret,
+							  start_list, start_aligned,
+							  &start, &end);
+					gfs2_metapath_ra(ip->i_gl, start, end);
+				}
 			}
 
 			/* If buffers found for the entire strip height */
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 11/12] gfs2: Turn trunc_dealloc into punch_hole
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (9 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 10/12] gfs2: Generalize truncate code Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2018-01-12 17:38   ` Andreas Gruenbacher
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 12/12] gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE) Andreas Gruenbacher
  2018-01-16 17:51 ` [Cluster-devel] [PATCH 00/12] gfs2: punch hole Bob Peterson
  12 siblings, 1 reply; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Add an upper bound to the range of blocks to deallocate blocks to
function trunc_dealloc so that this function can be used for truncating
a file as well as for punching a hole into a file.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 176 +++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 120 insertions(+), 56 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 984e27715a2e..da7e04b9973e 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -461,13 +461,6 @@ enum alloc_state {
 	/* ALLOC_UNSTUFF = 3,   TBD and rather complicated */
 };
 
-static inline unsigned int hptrs(struct gfs2_sbd *sdp, const unsigned int hgt)
-{
-	if (hgt)
-		return sdp->sd_inptrs;
-	return sdp->sd_diptrs;
-}
-
 /**
  * gfs2_bmap_alloc - Build a metadata tree of the requested height
  * @inode: The GFS2 inode
@@ -1240,6 +1233,13 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
 	return ret;
 }
 
+static bool mp_eq_to_hgt(struct metapath *mp, __u16 *list, unsigned int h)
+{
+	if (memcmp(mp->mp_list, list, h * sizeof(mp->mp_list[0])))
+		return false;
+	return true;
+}
+
 /**
  * find_nonnull_ptr - find a non-null pointer given a metapath and height
  * assumes the metapath is valid (with buffers) out to height h
@@ -1250,28 +1250,34 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
  *          false if all remaining pointers are NULL in the buffer
  */
 static bool find_nonnull_ptr(struct gfs2_sbd *sdp, struct metapath *mp,
-			     unsigned int h)
+			     unsigned int h,
+			     __u16 *end_list, unsigned int end_aligned)
 {
-	__be64 *ptr;
-	unsigned int ptrs = hptrs(sdp, h) - 1;
+	struct buffer_head *bh = mp->mp_bh[h];
+	__be64 *first, *ptr, *end;
+
+	first = metaptr1(h, mp);
+	ptr = first + mp->mp_list[h];
+	end = (__be64 *)(bh->b_data + bh->b_size);
+	if (end_list && mp_eq_to_hgt(mp, end_list, h)) {
+		bool keep_end = h < end_aligned;
+		end = first + end_list[h] + keep_end;
+	}
 
-	while (true) {
-		ptr = metapointer(h, mp);
+	while (ptr < end) {
 		if (*ptr) { /* if we have a non-null pointer */
 			/* Now zero the metapath after the current height. */
+			mp->mp_list[h] = ptr - first;
+
 			h++;
 			if (h < GFS2_MAX_META_HEIGHT)
-				memset(&mp->mp_list[h], 0,
-				       (GFS2_MAX_META_HEIGHT - h) *
-				       sizeof(mp->mp_list[0]));
+				mp->mp_list[h + 1] = 0;
 			return true;
 		}
 
-		if (mp->mp_list[h] < ptrs)
-			mp->mp_list[h]++;
-		else
-			return false; /* no more pointers in this buffer */
+		ptr++;
 	}
+	return false;
 }
 
 enum dealloc_states {
@@ -1281,16 +1287,10 @@ enum dealloc_states {
 	DEALLOC_DONE = 3,       /* process complete */
 };
 
-static bool mp_eq_to_hgt(struct metapath *mp, __u16 *list, unsigned int h)
-{
-	if (memcmp(mp->mp_list, list, h * sizeof(mp->mp_list[0])))
-		return false;
-	return true;
-}
-
 static inline void
 metapointer_range(struct metapath *mp, int height,
 		  __u16 *start_list, unsigned int start_aligned,
+		  __u16 *end_list, unsigned int end_aligned,
 		  __be64 **start, __be64 **end)
 {
 	struct buffer_head *bh = mp->mp_bh[height];
@@ -1303,29 +1303,55 @@ metapointer_range(struct metapath *mp, int height,
 		*start = first + start_list[height] + keep_start;
 	}
 	*end = (__be64 *)(bh->b_data + bh->b_size);
+	if (end_list && mp_eq_to_hgt(mp, end_list, height)) {
+		bool keep_end = height < end_aligned;
+		*end = first + end_list[height] + keep_end;
+	}
+}
+
+static inline bool walk_done(struct gfs2_sbd *sdp,
+			     struct metapath *mp, int height,
+			     __u16 *end_list, unsigned int end_aligned)
+{
+	__u16 end;
+
+	if (end_list) {
+		bool keep_end = height < end_aligned;
+		if (!mp_eq_to_hgt(mp, end_list, height))
+			return false;
+		end = end_list[height] + keep_end;
+	} else
+		end = (height > 0) ? sdp->sd_inptrs : sdp->sd_diptrs;
+	return mp->mp_list[height] >= end;
 }
 
 /**
- * trunc_dealloc - truncate a file down to a desired size
+ * punch_hole - deallocate blocks in a file
  * @ip: inode to truncate
- * @newsize: The desired size of the file
+ * @offset: the start of the hole
+ * @length: the size of the hole
+ *
+ * Punch a hole into a file or truncate a file at a given position.  This
+ * function operates in whole blocks (@offset and @length are rounded
+ * accordingly); partially filled blocks must be cleared otherwise.
  *
- * This function truncates a file to newsize. It works from the
- * bottom up, and from the right to the left. In other words, it strips off
- * the highest layer (data) before stripping any of the metadata. Doing it
- * this way is best in case the operation is interrupted by power failure, etc.
- * The dinode is rewritten in every transaction to guarantee integrity.
+ * This function works from the bottom up, and from the right to the left. In
+ * other words, it strips off the highest layer (data) before stripping any of
+ * the metadata. Doing it this way is best in case the operation is interrupted
+ * by power failure, etc.  The dinode is rewritten in every transaction to
+ * guarantee integrity.
  */
-static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
+static int punch_hole(struct gfs2_inode *ip, u64 offset, u64 length)
 {
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
-	struct metapath mp;
+	struct metapath mp = {};
 	struct buffer_head *dibh, *bh;
 	struct gfs2_holder rd_gh;
 	unsigned int bsize_shift = sdp->sd_sb.sb_bsize_shift;
-	u64 lblock = (newsize + (1 << bsize_shift) - 1) >> bsize_shift;
-	__u16 start_list[GFS2_MAX_META_HEIGHT]; /* new beginning of truncation */
-	unsigned int start_aligned;
+	u64 lblock = (offset + (1 << bsize_shift) - 1) >> bsize_shift;
+	__u16 start_list[GFS2_MAX_META_HEIGHT];
+	__u16 __end_list[GFS2_MAX_META_HEIGHT], *end_list = NULL;
+	unsigned int start_aligned, end_aligned;
 	unsigned int strip_h = ip->i_height - 1;
 	u32 btotal = 0;
 	int ret, state;
@@ -1333,19 +1359,49 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 	u64 prev_bnr = 0;
 	__be64 *start, *end;
 
-	memset(&mp, 0, sizeof(mp));
-	find_metapath(sdp, lblock, &mp, ip->i_height);
+	/*
+	 * The start position of the hole is defined by lblock, start_list, and
+	 * start_aligned.  The end position of the hole is defined by lend,
+	 * end_list, and end_aligned.
+	 *
+	 * start_aligned and end_aligned define down to which height the start
+	 * and end positions are aligned to the metadata tree (i.e., the
+	 * position is a multiple of the metadata granularity at the height
+	 * above).  This determines at which heights additional meta pointers
+	 * needs to be preserved for the remaining data.
+	 */
+
+	if (length) {
+		u64 maxsize = sdp->sd_heightsize[ip->i_height];
+		u64 end_offset = offset + length;
+		u64 lend;
+
+		/*
+		 * Clip the end at the maximum file size for the given height:
+		 * that's how far the metadata goes; files bigger than that
+		 * will have additional layers of indirection.
+		 */
+		if (end_offset > maxsize)
+			end_offset = maxsize;
+		lend = end_offset >> bsize_shift;
+
+		if (lblock >= lend)
+			return 0;
 
+		find_metapath(sdp, lend, &mp, ip->i_height);
+		end_list = __end_list;
+		memcpy(end_list, mp.mp_list, sizeof(mp.mp_list));
+
+		for (mp_h = ip->i_height - 1; mp_h > 0; mp_h--) {
+			if (end_list[mp_h])
+				break;
+		}
+		end_aligned = mp_h;
+	}
+
+	find_metapath(sdp, lblock, &mp, ip->i_height);
 	memcpy(start_list, mp.mp_list, sizeof(start_list));
 
-	/*
-	 * Set start_aligned to the metadata height up to which the truncate
-	 * point is aligned to the metadata tree (i.e., the truncate point is a
-	 * multiple of the granularity at the height above).  This determines
-	 * at which heights an additional meta pointer needs to be preserved:
-	 * an additional meta pointer is needed at a given height if
-	 * height < start_aligned.
-	 */
 	for (mp_h = ip->i_height - 1; mp_h > 0; mp_h--) {
 		if (start_list[mp_h])
 			break;
@@ -1364,7 +1420,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 	/* issue read-ahead on metadata */
 	for (mp_h = 0; mp_h < mp.mp_aheight - 1; mp_h++) {
 		metapointer_range(&mp, mp_h, start_list, start_aligned,
-				  &start, &end);
+				  end_list, end_aligned, &start, &end);
 		gfs2_metapath_ra(ip->i_gl, start, end);
 	}
 
@@ -1408,7 +1464,14 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 				goto out;
 			}
 
+			/*
+			 * Below, passing end_aligned as 0 gives us the
+			 * metapointer range excluding the end point: the end
+			 * point is the first metapath we must not deallocate!
+			 */
+
 			metapointer_range(&mp, mp_h, start_list, start_aligned,
+					  end_list, 0 /* end_aligned */,
 					  &start, &end);
 			ret = sweep_bh_for_rgrps(ip, &rd_gh, mp.mp_bh[mp_h],
 						 start, end,
@@ -1445,13 +1508,13 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 			}
 			mp.mp_list[mp_h] = 0;
 			mp_h--; /* search one metadata height down */
-			if (mp.mp_list[mp_h] >= hptrs(sdp, mp_h) - 1)
-				break; /* loop around in the same state */
 			mp.mp_list[mp_h]++;
+			if (walk_done(sdp, &mp, mp_h, end_list, end_aligned))
+				break;
 			/* Here we've found a part of the metapath that is not
 			 * allocated. We need to search at that height for the
 			 * next non-null pointer. */
-			if (find_nonnull_ptr(sdp, &mp, mp_h)) {
+			if (find_nonnull_ptr(sdp, &mp, mp_h, end_list, end_aligned)) {
 				state = DEALLOC_FILL_MP;
 				mp_h++;
 			}
@@ -1471,6 +1534,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 				for (; ret > 1; ret--) {
 					metapointer_range(&mp, mp.mp_aheight - ret,
 							  start_list, start_aligned,
+							  end_list, end_aligned,
 							  &start, &end);
 					gfs2_metapath_ra(ip->i_gl, start, end);
 				}
@@ -1487,7 +1551,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
 			/* If we find a non-null block pointer, crawl a bit
 			   higher up in the metapath and try again, otherwise
 			   we need to look lower for a new starting point. */
-			if (find_nonnull_ptr(sdp, &mp, mp_h))
+			if (find_nonnull_ptr(sdp, &mp, mp_h, end_list, end_aligned))
 				mp_h++;
 			else
 				state = DEALLOC_MP_LOWER;
@@ -1584,7 +1648,7 @@ static int do_shrink(struct inode *inode, u64 newsize)
 	if (gfs2_is_stuffed(ip))
 		return 0;
 
-	error = trunc_dealloc(ip, newsize);
+	error = punch_hole(ip, newsize, 0);
 	if (error == 0)
 		error = trunc_end(ip);
 
@@ -1716,7 +1780,7 @@ int gfs2_setattr_size(struct inode *inode, u64 newsize)
 int gfs2_truncatei_resume(struct gfs2_inode *ip)
 {
 	int error;
-	error = trunc_dealloc(ip, i_size_read(&ip->i_inode));
+	error = punch_hole(ip, i_size_read(&ip->i_inode), 0);
 	if (!error)
 		error = trunc_end(ip);
 	return error;
@@ -1724,7 +1788,7 @@ int gfs2_truncatei_resume(struct gfs2_inode *ip)
 
 int gfs2_file_dealloc(struct gfs2_inode *ip)
 {
-	return trunc_dealloc(ip, 0);
+	return punch_hole(ip, 0, 0);
 }
 
 /**
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 12/12] gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE)
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (10 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 11/12] gfs2: Turn trunc_dealloc into punch_hole Andreas Gruenbacher
@ 2017-12-22 14:35 ` Andreas Gruenbacher
  2018-01-16 17:51 ` [Cluster-devel] [PATCH 00/12] gfs2: punch hole Bob Peterson
  12 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2017-12-22 14:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Implement the top-level bits of punching a hole into a file.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 124 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/gfs2/bmap.h |   1 +
 fs/gfs2/file.c |  19 +++++----
 3 files changed, 137 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index da7e04b9973e..819e86a73560 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1956,3 +1956,127 @@ int gfs2_write_alloc_required(struct gfs2_inode *ip, u64 offset,
 	return 0;
 }
 
+static int stuffed_zero_range(struct inode *inode, loff_t offset, loff_t length)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct buffer_head *dibh;
+	int error;
+
+	if (offset >= inode->i_size)
+		return 0;
+	if (offset + length > inode->i_size)
+		length = inode->i_size - offset;
+
+	error = gfs2_meta_inode_buffer(ip, &dibh);
+	if (error)
+		return error;
+	gfs2_trans_add_meta(ip->i_gl, dibh);
+	memset(dibh->b_data + sizeof(struct gfs2_dinode) + offset, 0,
+	       length);
+	brelse(dibh);
+	return 0;
+}
+
+static int gfs2_journaled_truncate_range(struct inode *inode, loff_t offset,
+					 loff_t length)
+{
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	loff_t max_chunk = GFS2_JTRUNC_REVOKES * sdp->sd_vfs->s_blocksize;
+	int error;
+
+	while (length) {
+		struct gfs2_trans *tr;
+		loff_t chunk;
+		unsigned int offs;
+
+		chunk = length;
+		if (chunk > max_chunk)
+			chunk = max_chunk;
+
+		offs = offset & ~PAGE_MASK;
+		if (offs && chunk > PAGE_SIZE)
+			chunk = offs + ((chunk - offs) & PAGE_MASK);
+
+		truncate_pagecache_range(inode, offset, chunk);
+		offset += chunk;
+		length -= chunk;
+
+		/* XXX Is this check sufficient? */
+		tr = current->journal_info;
+		if (!tr->tr_num_revoke)
+			continue;
+
+		gfs2_trans_end(sdp);
+		error = gfs2_trans_begin(sdp, RES_DINODE, GFS2_JTRUNC_REVOKES);
+		if (error)
+			return error;
+	}
+	return 0;
+}
+
+int __gfs2_punch_hole(struct file *file, loff_t offset, loff_t length)
+{
+	struct inode *inode = file_inode(file);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	int error;
+
+	if (gfs2_is_jdata(ip))
+		error = gfs2_trans_begin(sdp, RES_DINODE + 2 * RES_JDATA,
+					 GFS2_JTRUNC_REVOKES);
+	else
+		error = gfs2_trans_begin(sdp, RES_DINODE, 0);
+	if (error)
+		return error;
+
+	if (gfs2_is_stuffed(ip)) {
+		error = stuffed_zero_range(inode, offset, length);
+		if (error)
+			goto out;
+	} else {
+		unsigned int start_off, end_off, blocksize;
+
+		blocksize = i_blocksize(inode);
+		start_off = offset & (blocksize - 1);
+		end_off = (offset + length) & (blocksize - 1);
+		if (start_off) {
+			unsigned int len = length;
+			if (length > blocksize - start_off)
+				len = blocksize - start_off;
+			error = gfs2_block_zero_range(inode, offset, len);
+			if (error)
+				goto out;
+			if (start_off + length < blocksize)
+				end_off = 0;
+		}
+		if (end_off) {
+			error = gfs2_block_zero_range(inode,
+				offset + length - end_off, end_off);
+			if (error)
+				goto out;
+		}
+	}
+
+	if (gfs2_is_jdata(ip)) {
+		BUG_ON(!current->journal_info);
+		gfs2_journaled_truncate_range(inode, offset, length);
+	} else
+		truncate_pagecache_range(inode, offset, offset + length - 1);
+
+	file_update_time(file);
+	mark_inode_dirty(inode);
+
+	if (current->journal_info)
+		gfs2_trans_end(sdp);
+
+	if (!gfs2_is_stuffed(ip)) {
+		error = punch_hole(ip, offset, length);
+		if (error)
+			goto out;
+	}
+
+out:
+	if (current->journal_info)
+		gfs2_trans_end(sdp);
+	return error;
+}
diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
index 443cc182cf18..c3402fe00653 100644
--- a/fs/gfs2/bmap.h
+++ b/fs/gfs2/bmap.h
@@ -61,5 +61,6 @@ extern int gfs2_write_alloc_required(struct gfs2_inode *ip, u64 offset,
 				     unsigned int len);
 extern int gfs2_map_journal_extents(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd);
 extern void gfs2_free_journal_extents(struct gfs2_jdesc *jd);
+extern int __gfs2_punch_hole(struct file *file, loff_t offset, loff_t length);
 
 #endif /* __BMAP_DOT_H__ */
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 58705ef8643a..bd60dc682676 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -924,7 +924,7 @@ static long gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t le
 	struct gfs2_holder gh;
 	int ret;
 
-	if (mode & ~FALLOC_FL_KEEP_SIZE)
+	if (mode & ~(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE))
 		return -EOPNOTSUPP;
 	/* fallocate is needed by gfs2_grow to reserve space in the rindex */
 	if (gfs2_is_jdata(ip) && inode != sdp->sd_rindex)
@@ -948,13 +948,18 @@ static long gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t le
 	if (ret)
 		goto out_unlock;
 
-	ret = gfs2_rsqa_alloc(ip);
-	if (ret)
-		goto out_putw;
+	if (mode & FALLOC_FL_PUNCH_HOLE) {
+		ret = __gfs2_punch_hole(file, offset, len);
+	} else {
+		ret = gfs2_rsqa_alloc(ip);
+		if (ret)
+			goto out_putw;
 
-	ret = __gfs2_fallocate(file, mode, offset, len);
-	if (ret)
-		gfs2_rs_deltree(&ip->i_res);
+		ret = __gfs2_fallocate(file, mode, offset, len);
+
+		if (ret)
+			gfs2_rs_deltree(&ip->i_res);
+	}
 
 out_putw:
 	put_write_access(inode);
-- 
2.14.3



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 03/12] gfs2: Clean up trunc_start error path
  2017-12-22 14:34 ` [Cluster-devel] [PATCH 03/12] gfs2: Clean up trunc_start error path Andreas Gruenbacher
@ 2018-01-08 20:12   ` Bob Peterson
  2018-01-08 20:50     ` Bob Peterson
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Peterson @ 2018-01-08 20:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| ---
|  fs/gfs2/bmap.c | 15 +++++----------
|  1 file changed, 5 insertions(+), 10 deletions(-)
| 
| diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
| index 8b993e4d80b2..0ad6d812c78b 100644
| --- a/fs/gfs2/bmap.c
| +++ b/fs/gfs2/bmap.c
| @@ -1022,7 +1022,7 @@ static int trunc_start(struct inode *inode, u64
| oldsize, u64 newsize)
|  	struct gfs2_inode *ip = GFS2_I(inode);
|  	struct gfs2_sbd *sdp = GFS2_SB(inode);
|  	struct address_space *mapping = inode->i_mapping;
| -	struct buffer_head *dibh;
| +	struct buffer_head *dibh = NULL;
|  	int journaled = gfs2_is_jdata(ip);
|  	int error;
|  
| @@ -1045,7 +1045,7 @@ static int trunc_start(struct inode *inode, u64
| oldsize, u64 newsize)
|  		if (newsize & (u64)(sdp->sd_sb.sb_bsize - 1)) {
|  			error = gfs2_block_truncate_page(mapping, newsize);
|  			if (error)
| -				goto out_brelse;
| +				goto out;
|  		}
|  		ip->i_diskflags |= GFS2_DIF_TRUNC_IN_PROG;
|  	}
| @@ -1059,15 +1059,10 @@ static int trunc_start(struct inode *inode, u64
| oldsize, u64 newsize)
|  	else
|  		truncate_pagecache(inode, newsize);
|  
| -	if (error) {
| -		brelse(dibh);
| -		return error;
| -	}
| -
| -out_brelse:
| -	brelse(dibh);
|  out:
| -	gfs2_trans_end(sdp);
| +	brelse(dibh);
| +	if (current->journal_info)
| +		gfs2_trans_end(sdp);
|  	return error;
|  }

Hi,

This isn't quite right.

	error = gfs2_meta_inode_buffer(ip, &dibh);
	if (error)
		goto out;
...
out:
	brelse(dibh);

If this returns an error, we'll do double brelse in out.
If gfs2_meta_inode_buffer returns an error, it has released the
buffer_head already.

Regards,

Bob Peterson
Red Hat File Systems



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 03/12] gfs2: Clean up trunc_start error path
  2018-01-08 20:12   ` Bob Peterson
@ 2018-01-08 20:50     ` Bob Peterson
  0 siblings, 0 replies; 21+ messages in thread
From: Bob Peterson @ 2018-01-08 20:50 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| Hi,
| 
| This isn't quite right.
| 
| 	error = gfs2_meta_inode_buffer(ip, &dibh);
| 	if (error)
| 		goto out;
| ...
| out:
| 	brelse(dibh);
| 
| If this returns an error, we'll do double brelse in out.
| If gfs2_meta_inode_buffer returns an error, it has released the
| buffer_head already.
| 
| Regards,
| 
| Bob Peterson
| Red Hat File Systems

Never mind. It's not a double-brelse because dibh will still be NULL,
and brelse(NULL) is a no-op.

Bob Peterson



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 05/12] gfs2: Remove minor gfs2_journaled_truncate inefficiencies
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 05/12] gfs2: Remove minor gfs2_journaled_truncate inefficiencies Andreas Gruenbacher
@ 2018-01-09 13:53   ` Bob Peterson
  2018-01-09 14:21     ` Andreas Gruenbacher
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Peterson @ 2018-01-09 13:53 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

----- Original Message -----
| First, this function truncates the file in chunks.  When the original
| file size isn't block aligned, each chunk that is truncated will remain
| be misaligned.  This is inefficient.
| 
| Second, this function doesn't recognize where holes are, so it loops
| through them.  For each chunk of a hole, it creates a new transaction.
| At least avoid creating another transactions whe the current one is
| still empty.  (An better fix would be to skip large holes, of course.)
| 
| Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| ---
(snip)
| +		/* XXX Is this check sufficient? */
| +		tr = current->journal_info;
| +		if (!tr->tr_num_revoke)
| +			continue;

Maybe I'm missing your intent here, but perhaps you can use this:

	if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {

like gfs2_trans_end does for determining whether the transaction was
actually used.

Regards,

Bob Peterson
Red Hat File Systems



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 05/12] gfs2: Remove minor gfs2_journaled_truncate inefficiencies
  2018-01-09 13:53   ` Bob Peterson
@ 2018-01-09 14:21     ` Andreas Gruenbacher
  0 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2018-01-09 14:21 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 9 January 2018 at 14:53, Bob Peterson <rpeterso@redhat.com> wrote:
> Hi,
>
> ----- Original Message -----
> | First, this function truncates the file in chunks.  When the original
> | file size isn't block aligned, each chunk that is truncated will remain
> | be misaligned.  This is inefficient.
> |
> | Second, this function doesn't recognize where holes are, so it loops
> | through them.  For each chunk of a hole, it creates a new transaction.
> | At least avoid creating another transactions whe the current one is
> | still empty.  (An better fix would be to skip large holes, of course.)
> |
> | Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> | ---
> (snip)
> | +             /* XXX Is this check sufficient? */
> | +             tr = current->journal_info;
> | +             if (!tr->tr_num_revoke)
> | +                     continue;
>
> Maybe I'm missing your intent here, but perhaps you can use this:
>
>         if (!test_bit(TR_TOUCHED, &tr->tr_flags)) {
>
> like gfs2_trans_end does for determining whether the transaction was
> actually used.

Yes, that should work.

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 10/12] gfs2: Generalize truncate code
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 10/12] gfs2: Generalize truncate code Andreas Gruenbacher
@ 2018-01-09 15:45   ` Bob Peterson
  2018-01-09 17:14     ` Andreas Gruenbacher
  0 siblings, 1 reply; 21+ messages in thread
From: Bob Peterson @ 2018-01-09 15:45 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

----- Original Message -----
| Pull the code for computing the range of metapointers to iterate out of
| gfs2_metapath_ra (for readahead), sweep_bh_for_rgrps (for deallocating
| metapointers within a block), and trunc_dealloc (for walking the
| metadata tree).
| 
| In sweep_bh_for_rgrps, move the code for looking up the resource group
| descriptor of the current resource group out of the inner loop.  The
| metatype check moves to trunc_dealloc.
| 
| Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| ---
(snip)
| @@ -1229,7 +1216,11 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip,
| struct gfs2_holder *rd_gh,
|  					    outside the rgrp we just processed,
|  					    do it all over again. */
|  		if (current->journal_info) {
| -			struct buffer_head *dibh = mp->mp_bh[0];
| +			struct buffer_head *dibh;
| +
| +			ret = gfs2_meta_inode_buffer(ip, &dibh);
| +			if (ret)
| +				goto out;
(snip)...
| +			brelse(dibh);

I don't understand why you prefer to use gfs2_meta_inode_buffer here
instead of just using the copy we already have in the metapath.
It seems to me this method would be a lot less efficient.
Can you explain?

Regards,

Bob Peterson
Red Hat File Systems



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 10/12] gfs2: Generalize truncate code
  2018-01-09 15:45   ` Bob Peterson
@ 2018-01-09 17:14     ` Andreas Gruenbacher
  0 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2018-01-09 17:14 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 9 January 2018 at 16:45, Bob Peterson <rpeterso@redhat.com> wrote:
> Hi,
>
> ----- Original Message -----
> | Pull the code for computing the range of metapointers to iterate out of
> | gfs2_metapath_ra (for readahead), sweep_bh_for_rgrps (for deallocating
> | metapointers within a block), and trunc_dealloc (for walking the
> | metadata tree).
> |
> | In sweep_bh_for_rgrps, move the code for looking up the resource group
> | descriptor of the current resource group out of the inner loop.  The
> | metatype check moves to trunc_dealloc.
> |
> | Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> | ---
> (snip)
> | @@ -1229,7 +1216,11 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip,
> | struct gfs2_holder *rd_gh,
> |                                           outside the rgrp we just processed,
> |                                           do it all over again. */
> |               if (current->journal_info) {
> | -                     struct buffer_head *dibh = mp->mp_bh[0];
> | +                     struct buffer_head *dibh;
> | +
> | +                     ret = gfs2_meta_inode_buffer(ip, &dibh);
> | +                     if (ret)
> | +                             goto out;
> (snip)...
> | +                     brelse(dibh);
>
> I don't understand why you prefer to use gfs2_meta_inode_buffer here
> instead of just using the copy we already have in the metapath.
> It seems to me this method would be a lot less efficient.
> Can you explain?

This patch removes the metapath parameter of sweep_bh_for_rgrps as
well. We could pass in the dibh in a separate parameter, but we only
need it when an indirect block spans multiple resource groups, which
should be relatively rare. Even then, we'll also be switching resource
group glocks, and so an additional bh lookup won't have a measurable
impact.

Thanks,
Andreas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 11/12] gfs2: Turn trunc_dealloc into punch_hole
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 11/12] gfs2: Turn trunc_dealloc into punch_hole Andreas Gruenbacher
@ 2018-01-12 17:38   ` Andreas Gruenbacher
  0 siblings, 0 replies; 21+ messages in thread
From: Andreas Gruenbacher @ 2018-01-12 17:38 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 22 December 2017 at 15:35, Andreas Gruenbacher <agruenba@redhat.com> wrote:
> Add an upper bound to the range of blocks to deallocate blocks to
> function trunc_dealloc so that this function can be used for truncating
> a file as well as for punching a hole into a file.
>
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> ---
>  fs/gfs2/bmap.c | 176 +++++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 120 insertions(+), 56 deletions(-)
>
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index 984e27715a2e..da7e04b9973e 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -461,13 +461,6 @@ enum alloc_state {
>         /* ALLOC_UNSTUFF = 3,   TBD and rather complicated */
>  };
>
> -static inline unsigned int hptrs(struct gfs2_sbd *sdp, const unsigned int hgt)
> -{
> -       if (hgt)
> -               return sdp->sd_inptrs;
> -       return sdp->sd_diptrs;
> -}
> -
>  /**
>   * gfs2_bmap_alloc - Build a metadata tree of the requested height
>   * @inode: The GFS2 inode
> @@ -1240,6 +1233,13 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
>         return ret;
>  }
>
> +static bool mp_eq_to_hgt(struct metapath *mp, __u16 *list, unsigned int h)
> +{
> +       if (memcmp(mp->mp_list, list, h * sizeof(mp->mp_list[0])))
> +               return false;
> +       return true;
> +}
> +
>  /**
>   * find_nonnull_ptr - find a non-null pointer given a metapath and height
>   * assumes the metapath is valid (with buffers) out to height h
> @@ -1250,28 +1250,34 @@ static int sweep_bh_for_rgrps(struct gfs2_inode *ip, struct gfs2_holder *rd_gh,
>   *          false if all remaining pointers are NULL in the buffer
>   */
>  static bool find_nonnull_ptr(struct gfs2_sbd *sdp, struct metapath *mp,
> -                            unsigned int h)
> +                            unsigned int h,
> +                            __u16 *end_list, unsigned int end_aligned)
>  {
> -       __be64 *ptr;
> -       unsigned int ptrs = hptrs(sdp, h) - 1;
> +       struct buffer_head *bh = mp->mp_bh[h];
> +       __be64 *first, *ptr, *end;
> +
> +       first = metaptr1(h, mp);
> +       ptr = first + mp->mp_list[h];
> +       end = (__be64 *)(bh->b_data + bh->b_size);
> +       if (end_list && mp_eq_to_hgt(mp, end_list, h)) {
> +               bool keep_end = h < end_aligned;
> +               end = first + end_list[h] + keep_end;
> +       }
>
> -       while (true) {
> -               ptr = metapointer(h, mp);
> +       while (ptr < end) {
>                 if (*ptr) { /* if we have a non-null pointer */
>                         /* Now zero the metapath after the current height. */
> +                       mp->mp_list[h] = ptr - first;
> +
>                         h++;
>                         if (h < GFS2_MAX_META_HEIGHT)
> -                               memset(&mp->mp_list[h], 0,
> -                                      (GFS2_MAX_META_HEIGHT - h) *
> -                                      sizeof(mp->mp_list[0]));
> +                               mp->mp_list[h + 1] = 0;

h is already incremented here, so this should be mp->mp_list[h] = 0;

>                         return true;
>                 }
>
> -               if (mp->mp_list[h] < ptrs)
> -                       mp->mp_list[h]++;
> -               else
> -                       return false; /* no more pointers in this buffer */
> +               ptr++;
>         }
> +       return false;
>  }
>
>  enum dealloc_states {
> @@ -1281,16 +1287,10 @@ enum dealloc_states {
>         DEALLOC_DONE = 3,       /* process complete */
>  };
>
> -static bool mp_eq_to_hgt(struct metapath *mp, __u16 *list, unsigned int h)
> -{
> -       if (memcmp(mp->mp_list, list, h * sizeof(mp->mp_list[0])))
> -               return false;
> -       return true;
> -}
> -
>  static inline void
>  metapointer_range(struct metapath *mp, int height,
>                   __u16 *start_list, unsigned int start_aligned,
> +                 __u16 *end_list, unsigned int end_aligned,
>                   __be64 **start, __be64 **end)
>  {
>         struct buffer_head *bh = mp->mp_bh[height];
> @@ -1303,29 +1303,55 @@ metapointer_range(struct metapath *mp, int height,
>                 *start = first + start_list[height] + keep_start;
>         }
>         *end = (__be64 *)(bh->b_data + bh->b_size);
> +       if (end_list && mp_eq_to_hgt(mp, end_list, height)) {
> +               bool keep_end = height < end_aligned;
> +               *end = first + end_list[height] + keep_end;
> +       }
> +}
> +
> +static inline bool walk_done(struct gfs2_sbd *sdp,
> +                            struct metapath *mp, int height,
> +                            __u16 *end_list, unsigned int end_aligned)
> +{
> +       __u16 end;
> +
> +       if (end_list) {
> +               bool keep_end = height < end_aligned;
> +               if (!mp_eq_to_hgt(mp, end_list, height))
> +                       return false;
> +               end = end_list[height] + keep_end;
> +       } else
> +               end = (height > 0) ? sdp->sd_inptrs : sdp->sd_diptrs;
> +       return mp->mp_list[height] >= end;
>  }
>
>  /**
> - * trunc_dealloc - truncate a file down to a desired size
> + * punch_hole - deallocate blocks in a file
>   * @ip: inode to truncate
> - * @newsize: The desired size of the file
> + * @offset: the start of the hole
> + * @length: the size of the hole
> + *
> + * Punch a hole into a file or truncate a file at a given position.  This
> + * function operates in whole blocks (@offset and @length are rounded
> + * accordingly); partially filled blocks must be cleared otherwise.
>   *
> - * This function truncates a file to newsize. It works from the
> - * bottom up, and from the right to the left. In other words, it strips off
> - * the highest layer (data) before stripping any of the metadata. Doing it
> - * this way is best in case the operation is interrupted by power failure, etc.
> - * The dinode is rewritten in every transaction to guarantee integrity.
> + * This function works from the bottom up, and from the right to the left. In
> + * other words, it strips off the highest layer (data) before stripping any of
> + * the metadata. Doing it this way is best in case the operation is interrupted
> + * by power failure, etc.  The dinode is rewritten in every transaction to
> + * guarantee integrity.
>   */
> -static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
> +static int punch_hole(struct gfs2_inode *ip, u64 offset, u64 length)
>  {
>         struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
> -       struct metapath mp;
> +       struct metapath mp = {};
>         struct buffer_head *dibh, *bh;
>         struct gfs2_holder rd_gh;
>         unsigned int bsize_shift = sdp->sd_sb.sb_bsize_shift;
> -       u64 lblock = (newsize + (1 << bsize_shift) - 1) >> bsize_shift;
> -       __u16 start_list[GFS2_MAX_META_HEIGHT]; /* new beginning of truncation */
> -       unsigned int start_aligned;
> +       u64 lblock = (offset + (1 << bsize_shift) - 1) >> bsize_shift;
> +       __u16 start_list[GFS2_MAX_META_HEIGHT];
> +       __u16 __end_list[GFS2_MAX_META_HEIGHT], *end_list = NULL;
> +       unsigned int start_aligned, end_aligned;
>         unsigned int strip_h = ip->i_height - 1;
>         u32 btotal = 0;
>         int ret, state;
> @@ -1333,19 +1359,49 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
>         u64 prev_bnr = 0;
>         __be64 *start, *end;
>
> -       memset(&mp, 0, sizeof(mp));
> -       find_metapath(sdp, lblock, &mp, ip->i_height);
> +       /*
> +        * The start position of the hole is defined by lblock, start_list, and
> +        * start_aligned.  The end position of the hole is defined by lend,
> +        * end_list, and end_aligned.
> +        *
> +        * start_aligned and end_aligned define down to which height the start
> +        * and end positions are aligned to the metadata tree (i.e., the
> +        * position is a multiple of the metadata granularity at the height
> +        * above).  This determines at which heights additional meta pointers
> +        * needs to be preserved for the remaining data.
> +        */
> +
> +       if (length) {
> +               u64 maxsize = sdp->sd_heightsize[ip->i_height];
> +               u64 end_offset = offset + length;
> +               u64 lend;
> +
> +               /*
> +                * Clip the end at the maximum file size for the given height:
> +                * that's how far the metadata goes; files bigger than that
> +                * will have additional layers of indirection.
> +                */
> +               if (end_offset > maxsize)
> +                       end_offset = maxsize;
> +               lend = end_offset >> bsize_shift;
> +
> +               if (lblock >= lend)
> +                       return 0;
>
> +               find_metapath(sdp, lend, &mp, ip->i_height);
> +               end_list = __end_list;
> +               memcpy(end_list, mp.mp_list, sizeof(mp.mp_list));
> +
> +               for (mp_h = ip->i_height - 1; mp_h > 0; mp_h--) {
> +                       if (end_list[mp_h])
> +                               break;
> +               }
> +               end_aligned = mp_h;
> +       }
> +
> +       find_metapath(sdp, lblock, &mp, ip->i_height);
>         memcpy(start_list, mp.mp_list, sizeof(start_list));
>
> -       /*
> -        * Set start_aligned to the metadata height up to which the truncate
> -        * point is aligned to the metadata tree (i.e., the truncate point is a
> -        * multiple of the granularity at the height above).  This determines
> -        * at which heights an additional meta pointer needs to be preserved:
> -        * an additional meta pointer is needed at a given height if
> -        * height < start_aligned.
> -        */
>         for (mp_h = ip->i_height - 1; mp_h > 0; mp_h--) {
>                 if (start_list[mp_h])
>                         break;
> @@ -1364,7 +1420,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
>         /* issue read-ahead on metadata */
>         for (mp_h = 0; mp_h < mp.mp_aheight - 1; mp_h++) {
>                 metapointer_range(&mp, mp_h, start_list, start_aligned,
> -                                 &start, &end);
> +                                 end_list, end_aligned, &start, &end);
>                 gfs2_metapath_ra(ip->i_gl, start, end);
>         }
>
> @@ -1408,7 +1464,14 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
>                                 goto out;
>                         }
>
> +                       /*
> +                        * Below, passing end_aligned as 0 gives us the
> +                        * metapointer range excluding the end point: the end
> +                        * point is the first metapath we must not deallocate!
> +                        */
> +
>                         metapointer_range(&mp, mp_h, start_list, start_aligned,
> +                                         end_list, 0 /* end_aligned */,
>                                           &start, &end);
>                         ret = sweep_bh_for_rgrps(ip, &rd_gh, mp.mp_bh[mp_h],
>                                                  start, end,
> @@ -1445,13 +1508,13 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
>                         }
>                         mp.mp_list[mp_h] = 0;
>                         mp_h--; /* search one metadata height down */
> -                       if (mp.mp_list[mp_h] >= hptrs(sdp, mp_h) - 1)
> -                               break; /* loop around in the same state */
>                         mp.mp_list[mp_h]++;
> +                       if (walk_done(sdp, &mp, mp_h, end_list, end_aligned))
> +                               break;
>                         /* Here we've found a part of the metapath that is not
>                          * allocated. We need to search at that height for the
>                          * next non-null pointer. */
> -                       if (find_nonnull_ptr(sdp, &mp, mp_h)) {
> +                       if (find_nonnull_ptr(sdp, &mp, mp_h, end_list, end_aligned)) {
>                                 state = DEALLOC_FILL_MP;
>                                 mp_h++;
>                         }
> @@ -1471,6 +1534,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
>                                 for (; ret > 1; ret--) {
>                                         metapointer_range(&mp, mp.mp_aheight - ret,
>                                                           start_list, start_aligned,
> +                                                         end_list, end_aligned,
>                                                           &start, &end);
>                                         gfs2_metapath_ra(ip->i_gl, start, end);
>                                 }
> @@ -1487,7 +1551,7 @@ static int trunc_dealloc(struct gfs2_inode *ip, u64 newsize)
>                         /* If we find a non-null block pointer, crawl a bit
>                            higher up in the metapath and try again, otherwise
>                            we need to look lower for a new starting point. */
> -                       if (find_nonnull_ptr(sdp, &mp, mp_h))
> +                       if (find_nonnull_ptr(sdp, &mp, mp_h, end_list, end_aligned))
>                                 mp_h++;
>                         else
>                                 state = DEALLOC_MP_LOWER;
> @@ -1584,7 +1648,7 @@ static int do_shrink(struct inode *inode, u64 newsize)
>         if (gfs2_is_stuffed(ip))
>                 return 0;
>
> -       error = trunc_dealloc(ip, newsize);
> +       error = punch_hole(ip, newsize, 0);
>         if (error == 0)
>                 error = trunc_end(ip);
>
> @@ -1716,7 +1780,7 @@ int gfs2_setattr_size(struct inode *inode, u64 newsize)
>  int gfs2_truncatei_resume(struct gfs2_inode *ip)
>  {
>         int error;
> -       error = trunc_dealloc(ip, i_size_read(&ip->i_inode));
> +       error = punch_hole(ip, i_size_read(&ip->i_inode), 0);
>         if (!error)
>                 error = trunc_end(ip);
>         return error;
> @@ -1724,7 +1788,7 @@ int gfs2_truncatei_resume(struct gfs2_inode *ip)
>
>  int gfs2_file_dealloc(struct gfs2_inode *ip)
>  {
> -       return trunc_dealloc(ip, 0);
> +       return punch_hole(ip, 0, 0);
>  }
>
>  /**
> --
> 2.14.3
>

Andreas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Cluster-devel] [PATCH 00/12] gfs2: punch hole
  2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
                   ` (11 preceding siblings ...)
  2017-12-22 14:35 ` [Cluster-devel] [PATCH 12/12] gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE) Andreas Gruenbacher
@ 2018-01-16 17:51 ` Bob Peterson
  12 siblings, 0 replies; 21+ messages in thread
From: Bob Peterson @ 2018-01-16 17:51 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| These patches fix a few minor issues in the non-recursive delete
| implementation, add upper bound parameters so that these functions can
| be used for punching holes in addition to truncating files at a given
| position, and implement the fallocate hole punching operation base don
| that.
| 
| These patches pass xfstests, but they haven't been tested very heavily
| beyond that.
| 
| Andreas Gruenbacher (11):
|   gfs2: Remove pointless BUG_ON
|   gfs2: Clean up trunc_start error path
|   gfs2: truncate: Remove unnecessary oldsize parameters
|   gfs2: Remove minor gfs2_journaled_truncate inefficiencies
|   gfs2: Clean up {lookup,fillup}_metapath
|   gfs2: Fix metadata read-ahead during truncate
|   gfs2: Improve non-recursive delete algorithm
|   Turn gfs2_block_truncate_page into gfs2_block_zero_range
|   gfs2: Generalize truncate code
|   gfs2: Turn trunc_dealloc into punch_hole
|   gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE)
| 
| Steven Whitehouse (1):
|   gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use
| 
|  fs/gfs2/bmap.c  | 571
|  +++++++++++++++++++++++++++++++++++++++-----------------
|  fs/gfs2/bmap.h  |   1 +
|  fs/gfs2/file.c  |  19 +-
|  fs/gfs2/rgrp.c  |   7 +
|  fs/gfs2/trans.c |   1 -
|  5 files changed, 418 insertions(+), 181 deletions(-)
| 
| --
| 2.14.3

Hi,

Thanks. I pushed the (now-revised) punch-hole patch set to the for-next
branch of linux-gfs2 repository. I also threw a bunch of my truncate
and delete tests at them, and it all seemed to work properly.

7d2040199855 Steven Whitehouse   gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use
ccd77a7f6cad Andreas Gruenbacher gfs2: Remove pointless BUG_ON
e6bf7ff650bd Andreas Gruenbacher gfs2: Clean up trunc_start error path
afbf9be543a6 Andreas Gruenbacher gfs2: truncate: Remove unnecessary oldsize parameters
fb898e411939 Andreas Gruenbacher gfs2: Remove minor gfs2_journaled_truncate inefficiencies
5824d54d8ad3 Andreas Gruenbacher gfs2: Clean up {lookup,fillup}_metapath
019cb01e5d94 Andreas Gruenbacher gfs2: Fix metadata read-ahead during truncate
11935f6d990d Andreas Gruenbacher gfs2: Improve non-recursive delete algorithm
2d63ef4b89ba Andreas Gruenbacher Turn gfs2_block_truncate_page into gfs2_block_zero_range
2a220ac9e3f7 Andreas Gruenbacher gfs2: Generalize truncate code
60788120c6c2 Andreas Gruenbacher gfs2: Turn trunc_dealloc into punch_hole
9da8249e14da Andreas Gruenbacher gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE)

Regards,

Bob Peterson



^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-01-16 17:51 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-22 14:34 [Cluster-devel] [PATCH 00/12] gfs2: punch hole Andreas Gruenbacher
2017-12-22 14:34 ` [Cluster-devel] [PATCH 01/12] gfs2: Add gfs2_blk2rgrpd comment and fix incorrect use Andreas Gruenbacher
2017-12-22 14:34 ` [Cluster-devel] [PATCH 02/12] gfs2: Remove pointless BUG_ON Andreas Gruenbacher
2017-12-22 14:34 ` [Cluster-devel] [PATCH 03/12] gfs2: Clean up trunc_start error path Andreas Gruenbacher
2018-01-08 20:12   ` Bob Peterson
2018-01-08 20:50     ` Bob Peterson
2017-12-22 14:34 ` [Cluster-devel] [PATCH 04/12] gfs2: truncate: Remove unnecessary oldsize parameters Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 05/12] gfs2: Remove minor gfs2_journaled_truncate inefficiencies Andreas Gruenbacher
2018-01-09 13:53   ` Bob Peterson
2018-01-09 14:21     ` Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 06/12] gfs2: Clean up {lookup, fillup}_metapath Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 07/12] gfs2: Fix metadata read-ahead during truncate Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 08/12] gfs2: Improve non-recursive delete algorithm Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 09/12] Turn gfs2_block_truncate_page into gfs2_block_zero_range Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 10/12] gfs2: Generalize truncate code Andreas Gruenbacher
2018-01-09 15:45   ` Bob Peterson
2018-01-09 17:14     ` Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 11/12] gfs2: Turn trunc_dealloc into punch_hole Andreas Gruenbacher
2018-01-12 17:38   ` Andreas Gruenbacher
2017-12-22 14:35 ` [Cluster-devel] [PATCH 12/12] gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE) Andreas Gruenbacher
2018-01-16 17:51 ` [Cluster-devel] [PATCH 00/12] gfs2: punch hole Bob Peterson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.