All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2
@ 2011-10-20 21:08 Dmitry Monakhov
  2011-10-20 21:08 ` [PATCH 1/6] ext4: cleanup ext4_ext_grow_indepth code Dmitry Monakhov
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-20 21:08 UTC (permalink / raw)
  To: linux-ext4, tytso, achender; +Cc: Dmitry Monakhov

Changes from V1
 - patch queue prepared against ext4.git dcf2d804ed6ffe
 - add cleanup for ext_rm_leaf
                                                                                                 
LOG:                                                                                            
#Following two patches are mostly cleanup 
ext4-cleanup-ext4_ext_grow_indepth-code.patch
ext4-move-inode-indepth-shrink-logic-to-didicated-fu.patch
# Later patches are fixes for real issues.
ext4-Do-not-clear-EOFBLOCKS_FL-too-soon.patch
ext4-remove-messy-logic-from-ext4_ext_rm_leaf.patch
ext4-fix-punch_hole-extend-handler.patch
ext4-update-EOFBLOCKS-flag-on-fallocate-properly.patch

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/6] ext4: cleanup ext4_ext_grow_indepth code
  2011-10-20 21:08 [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2 Dmitry Monakhov
@ 2011-10-20 21:08 ` Dmitry Monakhov
  2011-10-22  5:27   ` Ted Ts'o
  2011-10-20 21:08 ` [PATCH 2/6] ext4: move inode indepth shrink logic to didicated function Dmitry Monakhov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-20 21:08 UTC (permalink / raw)
  To: linux-ext4, tytso, achender; +Cc: Dmitry Monakhov

Currently code make an impression what grow procedure is very complicated
and some mythical paths, blocks are involved. But in fact grow in depth
it relatively simple procedure:
 1) Just create new meta block and copy roots content to it
 2) Convert root from extent to index if old depth == 0
 3) update root block pointer

This patch does:
 - Reorganize code to make it more self explanatory
 - Do not pass path parameter to new_meta_block() in order to
   provoke allocation from inode's group because top-level block
   should site closer to it's inode, but not to leaf data block.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/extents.c |   41 +++++++++++++++--------------------------
 1 files changed, 15 insertions(+), 26 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2759357..8eb3656 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1045,16 +1045,14 @@ cleanup:
  */
 static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
 				 unsigned int flags,
-				 struct ext4_ext_path *path,
 				 struct ext4_extent *newext)
 {
-	struct ext4_ext_path *curp = path;
 	struct ext4_extent_header *neh;
 	struct buffer_head *bh;
 	ext4_fsblk_t newblock;
 	int err = 0;
 
-	newblock = ext4_ext_new_meta_block(handle, inode, path,
+	newblock = ext4_ext_new_meta_block(handle, inode, NULL,
 		newext, &err, flags);
 	if (newblock == 0)
 		return err;
@@ -1074,7 +1072,8 @@ static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
 	}
 
 	/* move top-level index/leaf into new block */
-	memmove(bh->b_data, curp->p_hdr, sizeof(EXT4_I(inode)->i_data));
+	memmove(bh->b_data, EXT4_I(inode)->i_data,
+		sizeof(EXT4_I(inode)->i_data));
 
 	/* set size of new block */
 	neh = ext_block_hdr(bh);
@@ -1092,32 +1091,23 @@ static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
 	if (err)
 		goto out;
 
-	/* create index in new top-level index: num,max,pointer */
-	err = ext4_ext_get_access(handle, inode, curp);
-	if (err)
-		goto out;
-
-	curp->p_hdr->eh_magic = EXT4_EXT_MAGIC;
-	curp->p_hdr->eh_max = cpu_to_le16(ext4_ext_space_root_idx(inode, 0));
-	curp->p_hdr->eh_entries = cpu_to_le16(1);
-	curp->p_idx = EXT_FIRST_INDEX(curp->p_hdr);
-
-	if (path[0].p_hdr->eh_depth)
-		curp->p_idx->ei_block =
-			EXT_FIRST_INDEX(path[0].p_hdr)->ei_block;
-	else
-		curp->p_idx->ei_block =
-			EXT_FIRST_EXTENT(path[0].p_hdr)->ee_block;
-	ext4_idx_store_pblock(curp->p_idx, newblock);
-
+	/* Update top-level index: num,max,pointer */
 	neh = ext_inode_hdr(inode);
+	neh->eh_entries = cpu_to_le16(1);
+	ext4_idx_store_pblock(EXT_FIRST_INDEX(neh), newblock);
+	if (neh->eh_depth == 0) {
+		/* Root extent block becomes index block */
+		neh->eh_max = cpu_to_le16(ext4_ext_space_root_idx(inode, 0));
+		EXT_FIRST_INDEX(neh)->ei_block =
+			EXT_FIRST_EXTENT(neh)->ee_block;
+	}
 	ext_debug("new root: num %d(%d), lblock %d, ptr %llu\n",
 		  le16_to_cpu(neh->eh_entries), le16_to_cpu(neh->eh_max),
 		  le32_to_cpu(EXT_FIRST_INDEX(neh)->ei_block),
 		  ext4_idx_pblock(EXT_FIRST_INDEX(neh)));
 
-	neh->eh_depth = cpu_to_le16(path->p_depth + 1);
-	err = ext4_ext_dirty(handle, inode, curp);
+	neh->eh_depth = cpu_to_le16(neh->eh_depth + 1);
+	ext4_mark_inode_dirty(handle, inode);
 out:
 	brelse(bh);
 
@@ -1165,8 +1155,7 @@ repeat:
 			err = PTR_ERR(path);
 	} else {
 		/* tree is full, time to grow in depth */
-		err = ext4_ext_grow_indepth(handle, inode, flags,
-					    path, newext);
+		err = ext4_ext_grow_indepth(handle, inode, flags, newext);
 		if (err)
 			goto out;
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/6] ext4: move inode indepth shrink logic to didicated function
  2011-10-20 21:08 [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2 Dmitry Monakhov
  2011-10-20 21:08 ` [PATCH 1/6] ext4: cleanup ext4_ext_grow_indepth code Dmitry Monakhov
@ 2011-10-20 21:08 ` Dmitry Monakhov
  2011-10-25  8:01   ` Ted Ts'o
  2011-10-20 21:08 ` [PATCH 3/6] ext4: Do not clear EOFBLOCKS_FL too soon Dmitry Monakhov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-20 21:08 UTC (permalink / raw)
  To: linux-ext4, tytso, achender; +Cc: Dmitry Monakhov

- add ext4_ext_try_shrink helper
- ext4_mark_inode_dirty() called externally in order to allow
  caller to butch several inode updates in to one mark_dirty call.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/extents.c |   59 ++++++++++++++++++++++------------------------------
 1 files changed, 25 insertions(+), 34 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 8eb3656..315775e 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2570,6 +2570,22 @@ ext4_ext_more_to_rm(struct ext4_ext_path *path)
 	return 1;
 }
 
+static int ext4_ext_try_shrink(handle_t *handle, struct inode *inode)
+{
+	/* TODO: flexible tree reduction should be here */
+	if (ext_depth(inode) && ext_inode_hdr(inode)->eh_entries == 0) {
+		/*
+		 * truncate to zero freed all the tree,
+		 * so we need to correct eh_depth
+		 */
+		ext_inode_hdr(inode)->eh_depth = 0;
+		ext_inode_hdr(inode)->eh_max =
+			cpu_to_le16(ext4_ext_space_root(inode, 0));
+		return 1;
+	}
+	return 0;
+}
+
 static int ext4_ext_remove_space(struct inode *inode, ext4_lblk_t start)
 {
 	struct super_block *sb = inode->i_sb;
@@ -2577,7 +2593,7 @@ static int ext4_ext_remove_space(struct inode *inode, ext4_lblk_t start)
 	struct ext4_ext_path *path;
 	ext4_fsblk_t partial_cluster = 0;
 	handle_t *handle;
-	int i, err;
+	int i, err, err2;
 
 	ext_debug("truncate since %u\n", start);
 
@@ -2703,29 +2719,18 @@ again:
 				 EXT4_SB(sb)->s_cluster_ratio, flags);
 		partial_cluster = 0;
 	}
-
-	/* TODO: flexible tree reduction should be here */
-	if (path->p_hdr->eh_entries == 0) {
-		/*
-		 * truncate to zero freed all the tree,
-		 * so we need to correct eh_depth
-		 */
-		err = ext4_ext_get_access(handle, inode, path);
-		if (err == 0) {
-			ext_inode_hdr(inode)->eh_depth = 0;
-			ext_inode_hdr(inode)->eh_max =
-				cpu_to_le16(ext4_ext_space_root(inode, 0));
-			err = ext4_ext_dirty(handle, inode, path);
-		}
-	}
+	if(ext4_ext_try_shrink(handle, inode))
+		err2 = ext4_mark_inode_dirty(handle, inode);
+	if (!err)
+		err = err2;
 out:
 	ext4_ext_drop_refs(path);
 	kfree(path);
 	if (err == -EAGAIN)
 		goto again;
-	ext4_journal_stop(handle);
+	err2 = ext4_journal_stop(handle);
 
-	return err;
+	return err ? err : err2;
 }
 
 /*
@@ -3965,22 +3970,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 					       &partial_cluster, map->m_lblk,
 					       map->m_lblk + punched_out);
 
-			if (!err && path->p_hdr->eh_entries == 0) {
-				/*
-				 * Punch hole freed all of this sub tree,
-				 * so we need to correct eh_depth
-				 */
-				err = ext4_ext_get_access(handle, inode, path);
-				if (err == 0) {
-					ext_inode_hdr(inode)->eh_depth = 0;
-					ext_inode_hdr(inode)->eh_max =
-					cpu_to_le16(ext4_ext_space_root(
-						inode, 0));

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/6] ext4: Do not clear EOFBLOCKS_FL too soon
  2011-10-20 21:08 [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2 Dmitry Monakhov
  2011-10-20 21:08 ` [PATCH 1/6] ext4: cleanup ext4_ext_grow_indepth code Dmitry Monakhov
  2011-10-20 21:08 ` [PATCH 2/6] ext4: move inode indepth shrink logic to didicated function Dmitry Monakhov
@ 2011-10-20 21:08 ` Dmitry Monakhov
  2011-10-25  8:18   ` Ted Ts'o
  2011-10-20 21:08 ` [PATCH 4/6] ext4: remove messy logic from ext4_ext_rm_leaf Dmitry Monakhov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-20 21:08 UTC (permalink / raw)
  To: linux-ext4, tytso, achender; +Cc: Dmitry Monakhov

ext4_ext_insert_extent() may fail due to number of reasons (ENOSPC)
so let's update eof flag only after extent was successfully inserted.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/extents.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 315775e..d81085e 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4115,10 +4115,7 @@ got_allocated_blocks:
 			map->m_flags |= EXT4_MAP_UNINIT;
 	}
 
-	err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
-	if (!err)
-		err = ext4_ext_insert_extent(handle, inode, path,
-					     &newex, flags);
+	err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
 	if (err && free_on_err) {
 		int fb_flags = flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE ?
 			EXT4_FREE_BLOCKS_NO_QUOT_UPDATE : 0;
@@ -4130,6 +4127,11 @@ got_allocated_blocks:
 				 ext4_ext_get_actual_len(&newex), fb_flags);
 		goto out2;
 	}
+	err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
+	if (err) {
+		ext4_discard_preallocations(inode);
+		goto out2;
+	}
 
 	/* previous routine could use block we allocated */
 	newblock = ext4_ext_pblock(&newex);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/6] ext4: remove messy logic from ext4_ext_rm_leaf
  2011-10-20 21:08 [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2 Dmitry Monakhov
                   ` (2 preceding siblings ...)
  2011-10-20 21:08 ` [PATCH 3/6] ext4: Do not clear EOFBLOCKS_FL too soon Dmitry Monakhov
@ 2011-10-20 21:08 ` Dmitry Monakhov
  2011-10-25 11:44   ` Ted Ts'o
  2011-10-20 21:08 ` [PATCH 5/6] ext4: fix punch_hole extend handler Dmitry Monakhov
  2011-10-20 21:08 ` [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly Dmitry Monakhov
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-20 21:08 UTC (permalink / raw)
  To: linux-ext4, tytso, achender; +Cc: Dmitry Monakhov

- Both callers(truncate and punch_hole) already aligned left end point
  so we no longer need split logic here.
- Remove dead duplicated code.
- Call ext4_ext_dirty only after we have updated eh_entries, otherwise
  we'll loose entries update. Regression caused by d583fb87a3ff0
  266'th testcase in xfstests (http://patchwork.ozlabs.org/patch/120872)

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/extents.c |  103 ++++++----------------------------------------------
 1 files changed, 12 insertions(+), 91 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index d81085e..963f883 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2306,13 +2306,12 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 	int err = 0, correct_index = 0;
 	int depth = ext_depth(inode), credits;
 	struct ext4_extent_header *eh;
-	ext4_lblk_t a, b, block;
+	ext4_lblk_t a, b;
 	unsigned num;
 	ext4_lblk_t ex_ee_block;
 	unsigned short ex_ee_len;
 	unsigned uninitialized = 0;
 	struct ext4_extent *ex;
-	struct ext4_map_blocks map;
 
 	/* the header must be checked already in ext4_ext_remove_space() */
 	ext_debug("truncate since %u in leaf\n", start);
@@ -2355,86 +2354,18 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 			ex_ee_block = le32_to_cpu(ex->ee_block);
 			ex_ee_len = ext4_ext_get_actual_len(ex);
 			continue;
-		} else if (a != ex_ee_block &&
-			b != ex_ee_block + ex_ee_len - 1) {
-			/*
-			 * If this is a truncate, then this condition should
-			 * never happen because at least one of the end points
-			 * needs to be on the edge of the extent.
-			 */
-			if (end == EXT_MAX_BLOCKS - 1) {
-				ext_debug("  bad truncate %u:%u\n",
-						start, end);
-				block = 0;
-				num = 0;
-				err = -EIO;
-				goto out;
-			}
-			/*
-			 * else this is a hole punch, so the extent needs to
-			 * be split since neither edge of the hole is on the
-			 * extent edge
-			 */
-			else{
-				map.m_pblk = ext4_ext_pblock(ex);
-				map.m_lblk = ex_ee_block;
-				map.m_len = b - ex_ee_block;
-
-				err = ext4_split_extent(handle,
-					inode, path, &map, 0,
-					EXT4_GET_BLOCKS_PUNCH_OUT_EXT |
-					EXT4_GET_BLOCKS_PRE_IO);
-
-				if (err < 0)
-					goto out;
-
-				ex_ee_len = ext4_ext_get_actual_len(ex);
-
-				b = ex_ee_block+ex_ee_len - 1 < end ?
-					ex_ee_block+ex_ee_len - 1 : end;
-
-				/* Then remove tail of this extent */
-				block = ex_ee_block;
-				num = a - block;
-			}
+		} else if (b != ex_ee_block + ex_ee_len - 1) {
+			EXT4_ERROR_INODE(inode,"  bad truncate %u:%u\n",
+					 start, end);
+			err = -EIO;
+			goto out;
 		} else if (a != ex_ee_block) {
 			/* remove tail of the extent */
-			block = ex_ee_block;
-			num = a - block;
-		} else if (b != ex_ee_block + ex_ee_len - 1) {
-			/* remove head of the extent */
-			block = b;
-			num =  ex_ee_block + ex_ee_len - b;
-
-			/*
-			 * If this is a truncate, this condition
-			 * should never happen
-			 */
-			if (end == EXT_MAX_BLOCKS - 1) {
-				ext_debug("  bad truncate %u:%u\n",
-					start, end);
-				err = -EIO;
-				goto out;
-			}
+			num = a - ex_ee_block;
 		} else {
 			/* remove whole extent: excellent! */
-			block = ex_ee_block;
 			num = 0;
-			if (a != ex_ee_block) {
-				ext_debug("  bad truncate %u:%u\n",
-					start, end);
-				err = -EIO;
-				goto out;
-			}
-
-			if (b != ex_ee_block + ex_ee_len - 1) {
-				ext_debug("  bad truncate %u:%u\n",
-					start, end);
-				err = -EIO;
-				goto out;
-			}
 		}
-
 		/*
 		 * 3 for leaf, sb, and inode plus 2 (bmap and group
 		 * descriptor) for each block group; assume two block
@@ -2464,19 +2395,10 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 				goto out;
 		}
 
-		if (num == 0) {
+		if (num == 0)
 			/* this extent is removed; mark slot entirely unused */
 			ext4_ext_store_pblock(ex, 0);
-		} else if (block != ex_ee_block) {
-			/*
-			 * If this was a head removal, then we need to update
-			 * the physical block since it is now at a different
-			 * location
-			 */
-			ext4_ext_store_pblock(ex, ext4_ext_pblock(ex) + (b-a));
-		}
 
-		ex->ee_block = cpu_to_le32(block);
 		ex->ee_len = cpu_to_le16(num);
 		/*
 		 * Do not mark uninitialized if all the blocks in the
@@ -2484,11 +2406,6 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
 		 */
 		if (uninitialized && num)
 			ext4_ext_mark_uninitialized(ex);
-
-		err = ext4_ext_dirty(handle, inode, path + depth);
-		if (err)
-			goto out;

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/6] ext4: fix punch_hole extend handler
  2011-10-20 21:08 [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2 Dmitry Monakhov
                   ` (3 preceding siblings ...)
  2011-10-20 21:08 ` [PATCH 4/6] ext4: remove messy logic from ext4_ext_rm_leaf Dmitry Monakhov
@ 2011-10-20 21:08 ` Dmitry Monakhov
  2011-10-25 12:05   ` Ted Ts'o
  2011-10-20 21:08 ` [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly Dmitry Monakhov
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-20 21:08 UTC (permalink / raw)
  To: linux-ext4, tytso, achender; +Cc: Dmitry Monakhov

Current implementation has following issues:
 - EOFBLOCK does not changed if necessery
 - ext4_ext_rm_leaf() may return -EAGAIN due to transaction restart
 - punched extent converted to uninitialized incorrectly, i dont understand
   why do we ever have to do that conversion, but once we do it let's do it
   in a right way.
 - fsync() logic is broken because ei->i_sync_tid was not updated
 - Last but not the least all: punch hole logic is sited directly
   in ext4_ext_map_blocks() on 3rd level of control, IMHO one can
   easily screw-up his eyes while invastigating that code. We have
   nothing to hide aren't we?

This patch performs following changes:
 - Move punch hole logic to didicated function similar to uninitialized
   extent handlers.
 - Clear EOFBLOCK if necessery, unfortunately we can not reuse
   check_eofblock_fl() function because it purpose to handle file
   expansion, but in our case we have to recheck base invariant that:
      clear_eof_flag = (eof_block >= last_allocated_block)
 - Repeat punch hole after transaction restart.
 - Update inode sync transaction id on exit.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/extents.c |  208 +++++++++++++++++++++++++++++++++--------------------
 1 files changed, 130 insertions(+), 78 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 963f883..877e61b 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3219,6 +3219,126 @@ static void unmap_underlying_metadata_blocks(struct block_device *bdev,
                 unmap_underlying_metadata(bdev, block + i);
 }
 
+static int
+ext4_ext_handle_punched_extent(handle_t *handle, struct inode *inode,
+			struct ext4_map_blocks *map,
+			struct ext4_ext_path *path)
+{
+	struct ext4_extent *ex = path[path->p_depth].p_ext;
+	ext4_lblk_t ee_block =  ext4_ext_get_actual_len(ex);
+	unsigned short ee_len = le32_to_cpu(ex->ee_block);
+	struct ext4_map_blocks punch_map;
+	ext4_fsblk_t partial_cluster = 0;
+	unsigned int punched_out = 0;
+   	int err, inode_dirty = 0;
+
+	/* Punch out the map length, but only to the end of the extent */
+	punched_out = ee_len - (map->m_lblk - ee_block);
+	if (punched_out > map->m_len)
+		punched_out = map->m_len;
+	/*
+	 * Sense extents need to be converted to uninitialized, they must
+	 * fit in an uninitialized extent
+	 */
+	if (punched_out > EXT_UNINIT_MAX_LEN)
+		punched_out = EXT_UNINIT_MAX_LEN;
+
+	punch_map.m_lblk = map->m_lblk;
+	punch_map.m_pblk = map->m_lblk - ee_block + ext4_ext_pblock(ex);;
+	punch_map.m_len = punched_out;
+	punch_map.m_flags = 0;
+
+	/* Check to see if the extent needs to be split */
+	if (punch_map.m_len != ee_len || punch_map.m_lblk != ee_block) {
+		err = ext4_split_extent(handle, inode,	path, &punch_map, 0,
+					EXT4_GET_BLOCKS_PUNCH_OUT_EXT |
+					EXT4_GET_BLOCKS_PRE_IO);
+		if (err < 0) {
+			goto out;
+		}
+		/* find extent for the block at the start of the hole */
+		ext4_ext_drop_refs(path);
+		kfree(path);
+
+		path = ext4_ext_find_extent(inode, map->m_lblk, NULL);
+		if (IS_ERR(path)) {
+			err = PTR_ERR(path);
+			path = NULL;
+			goto out;
+		}
+		ex = path[path->p_depth].p_ext;
+		ee_len = ext4_ext_get_actual_len(ex);
+		ee_block = le32_to_cpu(ex->ee_block);
+	}
+
+	err = ext4_ext_get_access(handle, inode, path + path->p_depth);
+	if (err)
+		return err;
+	ext4_ext_mark_uninitialized(ex);
+	err = ext4_ext_dirty(handle, inode, path + path->p_depth);
+	if (err)
+		goto out;
+	ext4_ext_invalidate_cache(inode);
+	err = ext4_ext_rm_leaf(handle, inode, path, &partial_cluster,
+			       map->m_lblk, map->m_lblk + punched_out);
+	if (err)
+		goto out;
+
+	inode_dirty = ext4_ext_try_shrink(handle, inode);
+	/* We have punched out an extent, if it was the only extent beyond
+	 * i_size  eofblocks flag should be cleared.*/
+	if (ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS)) {
+		ext4_fsblk_t eof_block =
+			(inode->i_size + (1 << inode->i_blkbits) - 1) >>
+				inode->i_blkbits;
+		/* find the latest extent */
+		ext4_ext_drop_refs(path);
+		kfree(path);
+		path = ext4_ext_find_extent(inode, EXT_MAX_BLOCKS -1, NULL);
+		if (IS_ERR(path)) {
+			err = PTR_ERR(path);
+			path = NULL;
+			goto out;
+		}
+		ex = path[path->p_depth].p_ext;
+		if (ex) {
+			ee_len = ext4_ext_get_actual_len(ex);
+			ee_block = le32_to_cpu(ex->ee_block);
+		} else {
+			/* Inode is empty */
+			ee_block = ee_len = 0;
+		}
+		if (eof_block >= ee_block + ee_len) {
+			ext4_clear_inode_flag(inode, EXT4_INODE_EOFBLOCKS);
+			inode_dirty = 1;
+		} else if (!ext4_ext_is_uninitialized(ex)) {
+			EXT4_ERROR_INODE(inode, "initialized extent beyond "
+					"EOF i_size: %lld, ex[%u:%u] "
+					"depth: %d pblock %lld",
+					inode->i_size, ee_block, ee_len,
+					path->p_depth,
+					path[path->p_depth].p_block);
+			err = -EIO;
+			/* Continue, because inode shrink should be
+			 * accomplished regardless to staled eof blocks */
+		}
+	}
+	if (inode_dirty) {
+		int err2 = ext4_mark_inode_dirty(handle, inode);
+		if (!err)
+			err = err2;
+	}
+out:
+	ext4_update_inode_fsync_trans(handle, inode, 0);
+	if (path) {
+		ext4_ext_drop_refs(path);
+		kfree(path);
+	}
+	return err ? err : punched_out;
+}
+
+
+
 /*
  * Handle EOFBLOCKS_FL flag, clearing it if necessary
  */
@@ -3715,24 +3835,24 @@ static int get_implied_cluster_alloc(struct super_block *sb,
 int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 			struct ext4_map_blocks *map, int flags)
 {
-	struct ext4_ext_path *path = NULL;
+	struct ext4_ext_path *path;
 	struct ext4_extent newex, *ex, *ex2;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	ext4_fsblk_t newblock = 0;
 	int free_on_err = 0, err = 0, depth, ret;
 	unsigned int allocated = 0, offset = 0;
 	unsigned int allocated_clusters = 0, reserved_clusters = 0;
-	unsigned int punched_out = 0;
-	unsigned int result = 0;
 	struct ext4_allocation_request ar;
 	ext4_io_end_t *io = EXT4_I(inode)->cur_aio_dio;
 	ext4_lblk_t cluster_offset;
-	struct ext4_map_blocks punch_map;
 
 	ext_debug("blocks %u/%u requested for inode %lu\n",
 		  map->m_lblk, map->m_len, inode->i_ino);
 	trace_ext4_ext_map_blocks_enter(inode, map->m_lblk, map->m_len, flags);
 
+again:
+	path = NULL;
+
 	/* check in cache */
 	if (!(flags & EXT4_GET_BLOCKS_PUNCH_OUT_EXT) &&
 		ext4_ext_in_cache(inode, map->m_lblk, &newex)) {
@@ -3803,8 +3923,6 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 
 		/* if found extent covers block, simply return it */
 		if (in_range(map->m_lblk, ee_block, ee_len)) {
-			ext4_fsblk_t partial_cluster = 0;
-
 			newblock = map->m_lblk - ee_block + ee_start;
 			/* number of remaining blocks in the extent */
 			allocated = ee_len - (map->m_lblk - ee_block);
@@ -3826,74 +3944,11 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 					allocated, newblock);
 				return ret;
 			}
-
-			/*
-			 * Punch out the map length, but only to the
-			 * end of the extent
-			 */
-			punched_out = allocated < map->m_len ?
-				allocated : map->m_len;
-
-			/*
-			 * Sense extents need to be converted to
-			 * uninitialized, they must fit in an
-			 * uninitialized extent
-			 */
-			if (punched_out > EXT_UNINIT_MAX_LEN)
-				punched_out = EXT_UNINIT_MAX_LEN;
-
-			punch_map.m_lblk = map->m_lblk;
-			punch_map.m_pblk = newblock;
-			punch_map.m_len = punched_out;
-			punch_map.m_flags = 0;
-
-			/* Check to see if the extent needs to be split */
-			if (punch_map.m_len != ee_len ||
-				punch_map.m_lblk != ee_block) {
-
-				ret = ext4_split_extent(handle, inode,
-				path, &punch_map, 0,
-				EXT4_GET_BLOCKS_PUNCH_OUT_EXT |
-				EXT4_GET_BLOCKS_PRE_IO);
-
-				if (ret < 0) {
-					err = ret;
-					goto out2;
-				}
-				/*
-				 * find extent for the block at
-				 * the start of the hole
-				 */
-				ext4_ext_drop_refs(path);
-				kfree(path);
-
-				path = ext4_ext_find_extent(inode,
-				map->m_lblk, NULL);
-				if (IS_ERR(path)) {
-					err = PTR_ERR(path);
-					path = NULL;
-					goto out2;
-				}
-
-				depth = ext_depth(inode);
-				ex = path[depth].p_ext;
-				ee_len = ext4_ext_get_actual_len(ex);
-				ee_block = le32_to_cpu(ex->ee_block);
-				ee_start = ext4_ext_pblock(ex);
-
-			}
-
-			ext4_ext_mark_uninitialized(ex);
-
-			ext4_ext_invalidate_cache(inode);
-
-			err = ext4_ext_rm_leaf(handle, inode, path,
-					       &partial_cluster, map->m_lblk,
-					       map->m_lblk + punched_out);
-
-			if (!err && ext4_ext_try_shrink(handle, inode))
-				err = ext4_mark_inode_dirty(handle, inode);
-
+			ret = ext4_ext_handle_punched_extent(handle, inode,
+							map, path);
+			if (ret == -EAGAIN)
+				goto again;
+			return ret;
 			goto out2;
 		}
 	}
@@ -4165,10 +4220,7 @@ out2:
 	trace_ext4_ext_map_blocks_exit(inode, map->m_lblk,
 		newblock, map->m_len, err ? err : allocated);
 
-	result = (flags & EXT4_GET_BLOCKS_PUNCH_OUT_EXT) ?
-			punched_out : allocated;

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly
  2011-10-20 21:08 [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2 Dmitry Monakhov
                   ` (4 preceding siblings ...)
  2011-10-20 21:08 ` [PATCH 5/6] ext4: fix punch_hole extend handler Dmitry Monakhov
@ 2011-10-20 21:08 ` Dmitry Monakhov
  2011-10-25 12:22   ` Ted Ts'o
  2011-10-28 21:40   ` Andreas Dilger
  5 siblings, 2 replies; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-20 21:08 UTC (permalink / raw)
  To: linux-ext4, tytso, achender; +Cc: Dmitry Monakhov

EOFBLOCK_FL should be updated if called w/o FALLOCATE_FL_KEEP_SIZE
Currently it happens only if new extent was allocated
TESTCASE:
fallocate test_file -n -l4096
fallocate test_file -l4096
Last fallocate cmd has updated size, but keept EOFBLOCK_FL set. And fsck
will complain about that.

Also remove ping pong in ext4_fallocate() in case of new extents, where
ext4_ext_map_blocks() clear EOFBLOCKS bit, and later
ext4_falloc_update_inode() restore it again.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/ext4/ext4.h    |    2 ++
 fs/ext4/extents.c |   34 +++++++++++++++++++---------------
 fs/ext4/inode.c   |    6 ++++--
 3 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 5655582..60c6a76 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -542,6 +542,8 @@ struct ext4_new_group_data {
 #define EXT4_GET_BLOCKS_PUNCH_OUT_EXT		0x0020
 	/* Don't normalize allocation size (used for fallocate) */
 #define EXT4_GET_BLOCKS_NO_NORMALIZE		0x0040
+	/* Request will not result in inode size update (user for fallocate) */
+#define EXT4_GET_BLOCKS_KEEP_SIZE		0x0080
 
 /*
  * Flags used by ext4_free_blocks
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 877e61b..8fab23b 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3647,14 +3647,8 @@ ext4_ext_handle_uninitialized_extents(handle_t *handle, struct inode *inode,
 
 	/* buffered write, writepage time, convert*/
 	ret = ext4_ext_convert_to_initialized(handle, inode, map, path);
-	if (ret >= 0) {
+	if (ret >= 0)
 		ext4_update_inode_fsync_trans(handle, inode, 1);
-		err = check_eofblocks_fl(handle, inode, map->m_lblk, path,
-					 map->m_len);
-		if (err < 0)
-			goto out2;
-	}
-
 out:
 	if (ret <= 0) {
 		err = ret;
@@ -3695,6 +3689,12 @@ out:
 
 map_out:
 	map->m_flags |= EXT4_MAP_MAPPED;
+	if ((flags & EXT4_GET_BLOCKS_KEEP_SIZE) == 0) {
+		err = check_eofblocks_fl(handle, inode, map->m_lblk, path,
+					 map->m_len);
+		if (err < 0)
+			goto out2;
+	}
 out1:
 	if (allocated > map->m_len)
 		allocated = map->m_len;
@@ -4103,12 +4103,13 @@ got_allocated_blocks:
 				 ext4_ext_get_actual_len(&newex), fb_flags);
 		goto out2;
 	}
-	err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
-	if (err) {
-		ext4_discard_preallocations(inode);
-		goto out2;
+	if ((flags & EXT4_GET_BLOCKS_KEEP_SIZE) == 0) {
+		err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
+		if (err) {
+			ext4_discard_preallocations(inode);
+			goto out2;
+		}
 	}
-
 	/* previous routine could use block we allocated */
 	newblock = ext4_ext_pblock(&newex);
 	allocated = ext4_ext_get_actual_len(&newex);
@@ -4349,6 +4350,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	int ret = 0;
 	int ret2 = 0;
 	int retries = 0;
+	int flags;
 	struct ext4_map_blocks map;
 	unsigned int credits, blkbits = inode->i_blkbits;
 
@@ -4385,6 +4387,10 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 		trace_ext4_fallocate_exit(inode, offset, max_blocks, ret);
 		return ret;
 	}
+	flags = EXT4_GET_BLOCKS_CREATE_UNINIT_EXT |
+		EXT4_GET_BLOCKS_NO_NORMALIZE;
+	if (mode & FALLOC_FL_KEEP_SIZE)
+		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
 retry:
 	while (ret >= 0 && ret < max_blocks) {
 		map.m_lblk = map.m_lblk + ret;
@@ -4394,9 +4400,7 @@ retry:
 			ret = PTR_ERR(handle);
 			break;
 		}
-		ret = ext4_map_blocks(handle, inode, &map,
-				      EXT4_GET_BLOCKS_CREATE_UNINIT_EXT |
-				      EXT4_GET_BLOCKS_NO_NORMALIZE);
+		ret = ext4_map_blocks(handle, inode, &map, flags);
 		if (ret <= 0) {
 #ifdef EXT4FS_DEBUG
 			WARN_ON(ret <= 0);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1380cd2..8efb3ab 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -477,9 +477,11 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 	 */
 	down_read((&EXT4_I(inode)->i_data_sem));
 	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
-		retval = ext4_ext_map_blocks(handle, inode, map, 0);
+		retval = ext4_ext_map_blocks(handle, inode, map, flags &
+					     EXT4_GET_BLOCKS_KEEP_SIZE);
 	} else {
-		retval = ext4_ind_map_blocks(handle, inode, map, 0);
+		retval = ext4_ind_map_blocks(handle, inode, map, flags &
+					     EXT4_GET_BLOCKS_KEEP_SIZE);
 	}
 	up_read((&EXT4_I(inode)->i_data_sem));
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/6] ext4: cleanup ext4_ext_grow_indepth code
  2011-10-20 21:08 ` [PATCH 1/6] ext4: cleanup ext4_ext_grow_indepth code Dmitry Monakhov
@ 2011-10-22  5:27   ` Ted Ts'o
  0 siblings, 0 replies; 18+ messages in thread
From: Ted Ts'o @ 2011-10-22  5:27 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, achender

On Fri, Oct 21, 2011 at 01:08:54AM +0400, Dmitry Monakhov wrote:
> Currently code make an impression what grow procedure is very complicated
> and some mythical paths, blocks are involved. But in fact grow in depth
> it relatively simple procedure:
>  1) Just create new meta block and copy roots content to it
>  2) Convert root from extent to index if old depth == 0
>  3) update root block pointer
> 

Applied, thanks.

BTW: 

>  - Do not pass path parameter to new_meta_block() in order to
>    provoke allocation from inode's group because top-level block
>    should site closer to it's inode, but not to leaf data block.

This happens anyway, due to logic in mballoc; as a result the path
parameter in ext4_ext_new_meta_block() is really pointless, and should
be dropped.

						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] ext4: move inode indepth shrink logic to didicated function
  2011-10-20 21:08 ` [PATCH 2/6] ext4: move inode indepth shrink logic to didicated function Dmitry Monakhov
@ 2011-10-25  8:01   ` Ted Ts'o
  2011-10-28 10:47     ` Dmitry Monakhov
  0 siblings, 1 reply; 18+ messages in thread
From: Ted Ts'o @ 2011-10-25  8:01 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, achender

On Fri, Oct 21, 2011 at 01:08:55AM +0400, Dmitry Monakhov wrote:
> - add ext4_ext_try_shrink helper
> - ext4_mark_inode_dirty() called externally in order to allow
>   caller to butch several inode updates in to one mark_dirty call.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>

This patch is broken in two ways:

1) It drops the absolutely necessary calls to ext4_ext_get_access()
    and ext4_ext_dirty().  If you don't do this you will get file
    system corruptions.

2) Some of the callers to the new ext4_ext_try_shrink() helper depends
   on it return 0 or 1 depending on whether the tree was shrunk, but
   others assumed that it would return an error code.  Which is OK,
   since the error codes should be negative, but that means it's
   critical that the callers check to see whether return code is
   really an error before returning it.

Since this is just a cleanup, I'm going to skip this for now.  Dmitry,
could you fix this up and resubmit?   Thanks!!

						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] ext4: Do not clear EOFBLOCKS_FL too soon
  2011-10-20 21:08 ` [PATCH 3/6] ext4: Do not clear EOFBLOCKS_FL too soon Dmitry Monakhov
@ 2011-10-25  8:18   ` Ted Ts'o
  2011-10-25 11:57     ` Dmitry Monakhov
  0 siblings, 1 reply; 18+ messages in thread
From: Ted Ts'o @ 2011-10-25  8:18 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, achender

On Fri, Oct 21, 2011 at 01:08:56AM +0400, Dmitry Monakhov wrote:
> ext4_ext_insert_extent() may fail due to number of reasons (ENOSPC)
> so let's update eof flag only after extent was successfully inserted.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>

The problem with this patch is that if the check_eofblocks_fl() fails,
the patch jumps to out: and doesn't undo the block allocation and
extent tree manipulation.

I suspect a better way of solving this problem is to keep the existing
order, but to save the state of eofblocks flag, and if
ext4_ext_insert_extent() fails, to restore the state of the eofblocks
flag before jumping to the exit routine.

					- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] ext4: remove messy logic from ext4_ext_rm_leaf
  2011-10-20 21:08 ` [PATCH 4/6] ext4: remove messy logic from ext4_ext_rm_leaf Dmitry Monakhov
@ 2011-10-25 11:44   ` Ted Ts'o
  0 siblings, 0 replies; 18+ messages in thread
From: Ted Ts'o @ 2011-10-25 11:44 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, achender

On Fri, Oct 21, 2011 at 01:08:57AM +0400, Dmitry Monakhov wrote:
> - Both callers(truncate and punch_hole) already aligned left end point
>   so we no longer need split logic here.
> - Remove dead duplicated code.
> - Call ext4_ext_dirty only after we have updated eh_entries, otherwise
>   we'll loose entries update. Regression caused by d583fb87a3ff0
>   266'th testcase in xfstests (http://patchwork.ozlabs.org/patch/120872)
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>

Thanks, applied

					- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] ext4: Do not clear EOFBLOCKS_FL too soon
  2011-10-25  8:18   ` Ted Ts'o
@ 2011-10-25 11:57     ` Dmitry Monakhov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-25 11:57 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: linux-ext4, achender

On Tue, 25 Oct 2011 04:18:34 -0400, Ted Ts'o <tytso@mit.edu> wrote:
> On Fri, Oct 21, 2011 at 01:08:56AM +0400, Dmitry Monakhov wrote:
> > ext4_ext_insert_extent() may fail due to number of reasons (ENOSPC)
> > so let's update eof flag only after extent was successfully inserted.
> > 
> > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> The problem with this patch is that if the check_eofblocks_fl() fails,
> the patch jumps to out: and doesn't undo the block allocation and
> extent tree manipulation.
check_eofblocks_fl may fail only due to ext4_journal_get_write_access
or ext4_mark_iloc_dirty, this means that something serious happen,
so it is very unlikely we can undo block allocation.
> 
> I suspect a better way of solving this problem is to keep the existing
> order, but to save the state of eofblocks flag, and if
> ext4_ext_insert_extent() fails, to restore the state of the eofblocks
> flag before jumping to the exit routine.
Yes. This is definitely looks better, will redo.
> 
> 					- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] ext4: fix punch_hole extend handler
  2011-10-20 21:08 ` [PATCH 5/6] ext4: fix punch_hole extend handler Dmitry Monakhov
@ 2011-10-25 12:05   ` Ted Ts'o
  2011-10-25 12:35     ` Dmitry Monakhov
  0 siblings, 1 reply; 18+ messages in thread
From: Ted Ts'o @ 2011-10-25 12:05 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, achender

On Fri, Oct 21, 2011 at 01:08:58AM +0400, Dmitry Monakhov wrote:
> Current implementation has following issues:
>  - EOFBLOCK does not changed if necessery
>  - ext4_ext_rm_leaf() may return -EAGAIN due to transaction restart
>  - punched extent converted to uninitialized incorrectly, i dont understand
>    why do we ever have to do that conversion, but once we do it let's do it
>    in a right way.
>  - fsync() logic is broken because ei->i_sync_tid was not updated
>  - Last but not the least all: punch hole logic is sited directly
>    in ext4_ext_map_blocks() on 3rd level of control, IMHO one can
>    easily screw-up his eyes while invastigating that code. We have
>    nothing to hide aren't we?
> 
> This patch performs following changes:
>  - Move punch hole logic to didicated function similar to uninitialized
>    extent handlers.
>  - Clear EOFBLOCK if necessery, unfortunately we can not reuse
>    check_eofblock_fl() function because it purpose to handle file
>    expansion, but in our case we have to recheck base invariant that:
>       clear_eof_flag = (eof_block >= last_allocated_block)
>  - Repeat punch hole after transaction restart.
>  - Update inode sync transaction id on exit.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>

I'm really sorry, but this was too complicated for me to review for
this merge window.  Maybe Allison or some of the people who worked on
the punch code could review this?  Or maybe this could be broken up
into smaller patches?

Or I can look at this again for the next merge window, but this is too
complicated for me to understand (and the punch code is too new for me
to have an deep understanding for quick reviews).  Sorry I didn't
review this sooner; between trying to get e2fsprogs 1.42 ready for
release and kernel.org recovery efforts, I got beind on my reviews.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly
  2011-10-20 21:08 ` [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly Dmitry Monakhov
@ 2011-10-25 12:22   ` Ted Ts'o
  2011-10-28 21:40   ` Andreas Dilger
  1 sibling, 0 replies; 18+ messages in thread
From: Ted Ts'o @ 2011-10-25 12:22 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, achender

On Fri, Oct 21, 2011 at 01:08:59AM +0400, Dmitry Monakhov wrote:
> EOFBLOCK_FL should be updated if called w/o FALLOCATE_FL_KEEP_SIZE
> Currently it happens only if new extent was allocated
> TESTCASE:
> fallocate test_file -n -l4096
> fallocate test_file -l4096
> Last fallocate cmd has updated size, but keept EOFBLOCK_FL set. And fsck
> will complain about that.
> 
> Also remove ping pong in ext4_fallocate() in case of new extents, where
> ext4_ext_map_blocks() clear EOFBLOCKS bit, and later
> ext4_falloc_update_inode() restore it again.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>

Applied, thanks.

					- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] ext4: fix punch_hole extend handler
  2011-10-25 12:05   ` Ted Ts'o
@ 2011-10-25 12:35     ` Dmitry Monakhov
  2011-10-26 13:38       ` Ted Ts'o
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-25 12:35 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: linux-ext4, achender

On Tue, 25 Oct 2011 08:05:53 -0400, "Ted Ts'o" <tytso@mit.edu> wrote:
> On Fri, Oct 21, 2011 at 01:08:58AM +0400, Dmitry Monakhov wrote:
> > Current implementation has following issues:
> >  - EOFBLOCK does not changed if necessery
> >  - ext4_ext_rm_leaf() may return -EAGAIN due to transaction restart
> >  - punched extent converted to uninitialized incorrectly, i dont understand
> >    why do we ever have to do that conversion, but once we do it let's do it
> >    in a right way.
> >  - fsync() logic is broken because ei->i_sync_tid was not updated
> >  - Last but not the least all: punch hole logic is sited directly
> >    in ext4_ext_map_blocks() on 3rd level of control, IMHO one can
> >    easily screw-up his eyes while invastigating that code. We have
> >    nothing to hide aren't we?
> > 
> > This patch performs following changes:
> >  - Move punch hole logic to didicated function similar to uninitialized
> >    extent handlers.
> >  - Clear EOFBLOCK if necessery, unfortunately we can not reuse
> >    check_eofblock_fl() function because it purpose to handle file
> >    expansion, but in our case we have to recheck base invariant that:
> >       clear_eof_flag = (eof_block >= last_allocated_block)
> >  - Repeat punch hole after transaction restart.
> >  - Update inode sync transaction id on exit.
> > 
> > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> I'm really sorry, but this was too complicated for me to review for
> this merge window.  Maybe Allison or some of the people who worked on
> the punch code could review this?  Or maybe this could be broken up
> into smaller patches?
Yes. ;( you right, it is kind of non obvious patch.
i'll split this like follows:
1) move punch_hole handler from ext4_ext_map_blocks to dedicated
function as is (90% the original patch)
2) fix other issues one by one.
Will do
> 
> Or I can look at this again for the next merge window, but this is too
> complicated for me to understand (and the punch code is too new for me
> to have an deep understanding for quick reviews).  Sorry I didn't
> review this sooner; between trying to get e2fsprogs 1.42 ready for
> release and kernel.org recovery efforts, I got beind on my reviews.
BTW can you please highlight your plans to make libquota in to working
shape, right now it simply rewrites old quota file so fsck each time
result int fs modifications. I have some patches in my queue which add
quota check capabilities in libquota/e2fsck. 

> 
> Regards,
> 
> 						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/6] ext4: fix punch_hole extend handler
  2011-10-25 12:35     ` Dmitry Monakhov
@ 2011-10-26 13:38       ` Ted Ts'o
  0 siblings, 0 replies; 18+ messages in thread
From: Ted Ts'o @ 2011-10-26 13:38 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, achender

On Tue, Oct 25, 2011 at 04:35:45PM +0400, Dmitry Monakhov wrote:
> BTW can you please highlight your plans to make libquota in to working
> shape, right now it simply rewrites old quota file so fsck each time
> result int fs modifications. I have some patches in my queue which add
> quota check capabilities in libquota/e2fsck. 

I've requested that Aditya make changes to libquota so it doesn't
always rewrite the quota file at each fsck; I consider this a bug.
Unfortunately he hasn't had a chance to get to this yet.  If you have
patches which implement part of this, please feel free to submit your
patches.

I don't believe libquota is going to make sense the way it is
currently written to be an exported library.  It's very tightly
dependent on ext2fs, and it's not clear it would be useful to external
programs.  It would require major changes to the API so it could be
useful for other file systems.  Given that there are some other file
systems (like ocfs2) which are using the quota formats as internal
file, perhaps it would be worthwhile to abstract away the ext2fs
specific functionality.  But I'm not entirely convinced, and perhaps
we will just rename libquota to libinternal, assume that it will
always be built statically, and then move things like profile.c into
libinternal.a.

						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] ext4: move inode indepth shrink logic to didicated function
  2011-10-25  8:01   ` Ted Ts'o
@ 2011-10-28 10:47     ` Dmitry Monakhov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Monakhov @ 2011-10-28 10:47 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: linux-ext4, achender

On Tue, 25 Oct 2011 04:01:35 -0400, "Ted Ts'o" <tytso@mit.edu> wrote:
> On Fri, Oct 21, 2011 at 01:08:55AM +0400, Dmitry Monakhov wrote:
> > - add ext4_ext_try_shrink helper
> > - ext4_mark_inode_dirty() called externally in order to allow
> >   caller to butch several inode updates in to one mark_dirty call.
> > 
> > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> This patch is broken in two ways:
> 
> 1) It drops the absolutely necessary calls to ext4_ext_get_access()
>     and ext4_ext_dirty().  If you don't do this you will get file
>     system corruptions.
You almost caught me.. When i read this comment for the first time, i've
thought that you right, and this patch is crap. Patch was written two
month ago so i almost forget exact details. But still it was strange
because whole series was tested very hard, and never saw any corruption,
or complains. From other point of view usually meta data modification
w/o prior get_write_access call result in complain from JBD thread about dirty
bufs in transaction.c:warn_dirty_buffer(). This never happen in my case.

Answer is simple: ext4_ext_get_access(path[0]) is noop in case of inode body
because it has no bh attached. And ext4_ext_dirty() turns in to
ext4_mark_inode_dirty(). That's why patch is correct..
> 
> 2) Some of the callers to the new ext4_ext_try_shrink() helper depends
>    on it return 0 or 1 depending on whether the tree was shrunk, but
>    others assumed that it would return an error code.  Which is OK,
>    since the error codes should be negative, but that means it's
>    critical that the callers check to see whether return code is
>    really an error before returning it.
and {0:1} retcode is sufficient. Situation will change after flexible
tree reduction appear, but this happens in perfect future.
> 
> Since this is just a cleanup, I'm going to skip this for now.  Dmitry,
> could you fix this up and resubmit?   Thanks!!
> 
> 						- Ted

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly
  2011-10-20 21:08 ` [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly Dmitry Monakhov
  2011-10-25 12:22   ` Ted Ts'o
@ 2011-10-28 21:40   ` Andreas Dilger
  1 sibling, 0 replies; 18+ messages in thread
From: Andreas Dilger @ 2011-10-28 21:40 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-ext4, tytso, achender

On 2011-10-20, at 3:08 PM, Dmitry Monakhov wrote:
> EOFBLOCK_FL should be updated if called w/o FALLOCATE_FL_KEEP_SIZE
> Currently it happens only if new extent was allocated
> Last fallocate cmd has updated size, but kept EOFBLOCK_FL set. And fsck
> will complain about that.
> 
> Also remove ping pong in ext4_fallocate() in case of new extents, where
> ext4_ext_map_blocks() clear EOFBLOCKS bit, and later
> ext4_falloc_update_inode() restore it again.

Looking into this code, I see some strange behaviour that you added in
an earlier commit (21ca087a3q) to both ext4_ext_convert_to_initialized()
and ext4_split_unwritten_extents(), but I'm trying to figure out if it
is actually needed today.

>    ext4: Do not zero out uninitialized extents beyond i_size
> 
>    The extents code will sometimes zero out blocks and mark them as
>    initialized instead of splitting an extent into several smaller ones.
>    This optimization however, causes problems if the extent is beyond
>    i_size because fsck will complain if there are uninitialized blocks

Should this be "... if there are initialized blocks after i_size"?

>    after i_size as this can not be distinguished from an inode that has
>    an incorrect i_size field.

Shouldn't it be possible to just mark the file with EXT4_EOFBLOCK_FL
so that the whole extent can be zeroed out and avoid the overhead of
splitting the extent and having to zero out the blocks beyond EOF on
the next file IO?

Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.






Cheers, Andreas






^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2011-10-28 21:40 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-20 21:08 [PATCH 0/6] ext4: cleanups and regression fixes for grow/shrink logic V2 Dmitry Monakhov
2011-10-20 21:08 ` [PATCH 1/6] ext4: cleanup ext4_ext_grow_indepth code Dmitry Monakhov
2011-10-22  5:27   ` Ted Ts'o
2011-10-20 21:08 ` [PATCH 2/6] ext4: move inode indepth shrink logic to didicated function Dmitry Monakhov
2011-10-25  8:01   ` Ted Ts'o
2011-10-28 10:47     ` Dmitry Monakhov
2011-10-20 21:08 ` [PATCH 3/6] ext4: Do not clear EOFBLOCKS_FL too soon Dmitry Monakhov
2011-10-25  8:18   ` Ted Ts'o
2011-10-25 11:57     ` Dmitry Monakhov
2011-10-20 21:08 ` [PATCH 4/6] ext4: remove messy logic from ext4_ext_rm_leaf Dmitry Monakhov
2011-10-25 11:44   ` Ted Ts'o
2011-10-20 21:08 ` [PATCH 5/6] ext4: fix punch_hole extend handler Dmitry Monakhov
2011-10-25 12:05   ` Ted Ts'o
2011-10-25 12:35     ` Dmitry Monakhov
2011-10-26 13:38       ` Ted Ts'o
2011-10-20 21:08 ` [PATCH 6/6] ext4: update EOFBLOCKS flag on fallocate properly Dmitry Monakhov
2011-10-25 12:22   ` Ted Ts'o
2011-10-28 21:40   ` Andreas Dilger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.