linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/10] ext4: support adding multi-delalloc blocks
@ 2024-05-08  6:12 Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 01/10] ext4: factor out a common helper to query extent map Zhang Yi
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Changes since v2:
 - Improve the commit message in patch 2,4,6 as Ritesh and Jan
   suggested, makes the changes more clear.
 - Add patch 3, add a warning if the delalloc counters are still not
   zero on inactive.
 - In patch 6, add a WARN in ext4_es_insert_delayed_extent(), strictly
   requires the end_allocated parameter to be set to false if the
   inserting extent belongs to one cluster.
 - In patch 9, modify the reserve blocks math formula as Jan suggested,
   prevent the count going to be negative.
 - In patch 10, update the stale ext4_da_map_blocks() function comments.

Hello!

This patch series is the part 2 prepartory changes of the buffered IO
iomap conversion, I picked them out from my buffered IO iomap conversion
RFC series v3[1], add a fix for an issue found in current ext4 code, and
also add bigalloc feature support. Please look the following patches for
details.

The first 3 patches fix an incorrect delalloc reserved blocks count
issue and add a warning to make it easy to detect, the second 6 patches
make ext4_insert_delayed_block() call path support inserting
multi-delalloc blocks once a time, and the last patch makes
ext4_da_map_blocks() buffer_head unaware, prepared for iomap.

This patch set has been passed 'kvm-xfstests -g auto' tests, I hope it
could be reviewed and merged first.

[1] https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/

Thanks,
Yi.

---
v2: https://lore.kernel.org/linux-ext4/20240410034203.2188357-1-yi.zhang@huaweicloud.com/

Zhang Yi (10):
  ext4: factor out a common helper to query extent map
  ext4: check the extent status again before inserting delalloc block
  ext4: warn if delalloc counters are not zero on inactive
  ext4: trim delalloc extent
  ext4: drop iblock parameter
  ext4: make ext4_es_insert_delayed_block() insert multi-blocks
  ext4: make ext4_da_reserve_space() reserve multi-clusters
  ext4: factor out check for whether a cluster is allocated
  ext4: make ext4_insert_delayed_block() insert multi-blocks
  ext4: make ext4_da_map_blocks() buffer_head unaware

 fs/ext4/extents_status.c    |  70 +++++++---
 fs/ext4/extents_status.h    |   5 +-
 fs/ext4/inode.c             | 248 +++++++++++++++++++++++-------------
 fs/ext4/super.c             |   6 +-
 include/trace/events/ext4.h |  26 ++--
 5 files changed, 231 insertions(+), 124 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 01/10] ext4: factor out a common helper to query extent map
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 02/10] ext4: check the extent status again before inserting delalloc block Zhang Yi
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Factor out a new common helper ext4_map_query_blocks() from the
ext4_da_map_blocks(), it query and return the extent map status on the
inode's extent path, no logic changes.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/inode.c | 57 +++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 25 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 537803250ca9..6a41172c06e1 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -453,6 +453,35 @@ static void ext4_map_blocks_es_recheck(handle_t *handle,
 }
 #endif /* ES_AGGRESSIVE_TEST */
 
+static int ext4_map_query_blocks(handle_t *handle, struct inode *inode,
+				 struct ext4_map_blocks *map)
+{
+	unsigned int status;
+	int retval;
+
+	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
+		retval = ext4_ext_map_blocks(handle, inode, map, 0);
+	else
+		retval = ext4_ind_map_blocks(handle, inode, map, 0);
+
+	if (retval <= 0)
+		return retval;
+
+	if (unlikely(retval != map->m_len)) {
+		ext4_warning(inode->i_sb,
+			     "ES len assertion failed for inode "
+			     "%lu: retval %d != map->m_len %d",
+			     inode->i_ino, retval, map->m_len);
+		WARN_ON(1);
+	}
+
+	status = map->m_flags & EXT4_MAP_UNWRITTEN ?
+			EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
+	ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
+			      map->m_pblk, status);
+	return retval;
+}
+
 /*
  * The ext4_map_blocks() function tries to look up the requested blocks,
  * and returns if the blocks are already mapped.
@@ -1744,33 +1773,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 	down_read(&EXT4_I(inode)->i_data_sem);
 	if (ext4_has_inline_data(inode))
 		retval = 0;
-	else if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
-		retval = ext4_ext_map_blocks(NULL, inode, map, 0);
 	else
-		retval = ext4_ind_map_blocks(NULL, inode, map, 0);
-	if (retval < 0) {
-		up_read(&EXT4_I(inode)->i_data_sem);
-		return retval;
-	}
-	if (retval > 0) {
-		unsigned int status;
-
-		if (unlikely(retval != map->m_len)) {
-			ext4_warning(inode->i_sb,
-				     "ES len assertion failed for inode "
-				     "%lu: retval %d != map->m_len %d",
-				     inode->i_ino, retval, map->m_len);
-			WARN_ON(1);
-		}
-
-		status = map->m_flags & EXT4_MAP_UNWRITTEN ?
-				EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
-		ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
-				      map->m_pblk, status);
-		up_read(&EXT4_I(inode)->i_data_sem);
-		return retval;
-	}
+		retval = ext4_map_query_blocks(NULL, inode, map);
 	up_read(&EXT4_I(inode)->i_data_sem);
+	if (retval)
+		return retval;
 
 add_delayed:
 	down_write(&EXT4_I(inode)->i_data_sem);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 02/10] ext4: check the extent status again before inserting delalloc block
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 01/10] ext4: factor out a common helper to query extent map Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-08 15:02   ` Markus Elfring
  2024-05-08  6:12 ` [PATCH v3 03/10] ext4: warn if delalloc counters are not zero on inactive Zhang Yi
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).

If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.

Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
   folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
   lock.

But this can race between ext4_page_mkwrite() & ext4 fallocate path

ext4_page_mkwrite()             ext4_fallocate()
 block_page_mkwrite()
  ext4_da_map_blocks()
   //find hole in extent status tree
                                 ext4_alloc_file_blocks()
                                  ext4_map_blocks()
                                   //allocate block and unwritten extent
   ext4_insert_delayed_block()
    ext4_da_reserve_space()
     //reserve one more block
    ext4_es_insert_delayed_block()
     //drop unwritten extent and add delayed extent by mistake

Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:

 EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!

This patch fixes the problem by looking up extent status tree again
while the i_data_sem is held in write mode. If it still can't find
any entry, then we insert a new da entry into the extent status tree.

Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6a41172c06e1..6114ca79f464 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 		if (ext4_es_is_hole(&es))
 			goto add_delayed;
 
+found:
 		/*
 		 * Delayed extent could be allocated by fallocate.
 		 * So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 
 add_delayed:
 	down_write(&EXT4_I(inode)->i_data_sem);
+	/*
+	 * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+	 * and fallocate path (no folio lock) can race. Make sure we
+	 * lookup the extent status tree here again while i_data_sem
+	 * is held in write mode, before inserting a new da entry in
+	 * the extent status tree.
+	 */
+	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+		if (!ext4_es_is_hole(&es)) {
+			up_write(&EXT4_I(inode)->i_data_sem);
+			goto found;
+		}
+	} else if (!ext4_has_inline_data(inode)) {
+		retval = ext4_map_query_blocks(NULL, inode, map);
+		if (retval) {
+			up_write(&EXT4_I(inode)->i_data_sem);
+			return retval;
+		}
+	}
+
 	retval = ext4_insert_delayed_block(inode, map->m_lblk);
 	up_write(&EXT4_I(inode)->i_data_sem);
 	if (retval)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 03/10] ext4: warn if delalloc counters are not zero on inactive
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 01/10] ext4: factor out a common helper to query extent map Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 02/10] ext4: check the extent status again before inserting delalloc block Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-12 15:10   ` Jan Kara
  2024-05-08  6:12 ` [PATCH v3 04/10] ext4: trim delalloc extent Zhang Yi
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

The per-inode i_reserved_data_blocks count the reserved delalloc blocks
in a regular file, it should be zero when destroying the file. The
per-fs s_dirtyclusters_counter count all reserved delalloc blocks in a
filesystem, it also should be zero when umounting the filesystem. Now we
have only an error message if the i_reserved_data_blocks is not zero,
which is unable to be simply captured, so add WARN_ON_ONCE to make it
more visable.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/super.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 044135796f2b..440dd54eea25 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1343,6 +1343,9 @@ static void ext4_put_super(struct super_block *sb)
 
 	ext4_group_desc_free(sbi);
 	ext4_flex_groups_free(sbi);
+
+	WARN_ON_ONCE(!ext4_forced_shutdown(sb) &&
+		     percpu_counter_sum(&sbi->s_dirtyclusters_counter));
 	ext4_percpu_param_destroy(sbi);
 #ifdef CONFIG_QUOTA
 	for (int i = 0; i < EXT4_MAXQUOTAS; i++)
@@ -1473,7 +1476,8 @@ static void ext4_destroy_inode(struct inode *inode)
 		dump_stack();
 	}
 
-	if (EXT4_I(inode)->i_reserved_data_blocks)
+	if (!ext4_forced_shutdown(inode->i_sb) &&
+	    WARN_ON_ONCE(EXT4_I(inode)->i_reserved_data_blocks))
 		ext4_msg(inode->i_sb, KERN_ERR,
 			 "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
 			 inode->i_ino, EXT4_I(inode),
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 04/10] ext4: trim delalloc extent
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
                   ` (2 preceding siblings ...)
  2024-05-08  6:12 ` [PATCH v3 03/10] ext4: warn if delalloc counters are not zero on inactive Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-08 15:21   ` Markus Elfring
  2024-05-08  6:12 ` [PATCH v3 05/10] ext4: drop iblock parameter Zhang Yi
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

In ext4_da_map_blocks(), we could found four kind of extents in the
extent status tree: hole, unwritten, written and delayed extent. Now we
only trim the map len if we found an unwritten extent or a written
extent. This is okay now since map->m_len is always set to one and we
always insert one delayed block at a time. But this will become isn't
okay for other two cases if ext4_insert_delayed_block() and
ext4_da_map_blocks() support inserting multiple map->len blocks later.

1. If we found a hole in the extent status tree which es->es_len is
   shorter than the length we want to write, we should trim the
   map->m_len to prevent adding extra delay more blocks than we
   expected. For example, assume we write data [A, C) to a file that
   contains a hole extent [A, B) and a written extent [B, D) in the
   cache.

                         A     B  C  D
   before da write:   ...hhhhhh|wwwwww....

   Then we will get extent [A, B), we should trim map->m_len to B-A
   before inserting new delalloc blocks, if not, the range [B, C) will
   be duplicated.

2. If we found a delayed extent in the extent status tree which
   es->es_len is shorter than the length we want to write, we should
   trim the map->m_len to es->es_len and return directly since the front
   part of this map has been delayed, we can't insert the delalloc
   extent that contains the latter part in this round, we should return
   the delayed length and the caller should increase the position and
   call ext4_da_map_blocks() again. For example, assume we write data
   [A, C) to a file that contains a delayed extent [A, B) in the cache.

                         A     B  C
   before da write:   ...dddddd|hhh....

   Then we will get delayed extent [A, B), we should also trim map->m_len
   to B-A and return, if not, we will incorrectly assume that the write
   is complete and won't insert [B, C).

So we need to always trim the map->m_len if the found es->es_len in the
extent status tree is shorter than the map->m_len, prearing for
inserting a extent with multiple delalloc blocks. This patch only does a
pre-fix, the handle is crude and ext4_da_map_blocks() deserve a cleanup,
we will do that later.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6114ca79f464..fb76016e2ab7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1734,6 +1734,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 
 	/* Lookup extent status tree firstly */
 	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+		retval = es.es_len - (iblock - es.es_lblk);
+		if (retval > map->m_len)
+			retval = map->m_len;
+		map->m_len = retval;
+
 		if (ext4_es_is_hole(&es))
 			goto add_delayed;
 
@@ -1750,10 +1755,6 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 		}
 
 		map->m_pblk = ext4_es_pblock(&es) + iblock - es.es_lblk;
-		retval = es.es_len - (iblock - es.es_lblk);
-		if (retval > map->m_len)
-			retval = map->m_len;
-		map->m_len = retval;
 		if (ext4_es_is_written(&es))
 			map->m_flags |= EXT4_MAP_MAPPED;
 		else if (ext4_es_is_unwritten(&es))
@@ -1790,6 +1791,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 	 * the extent status tree.
 	 */
 	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+		retval = es.es_len - (iblock - es.es_lblk);
+		if (retval > map->m_len)
+			retval = map->m_len;
+		map->m_len = retval;
+
 		if (!ext4_es_is_hole(&es)) {
 			up_write(&EXT4_I(inode)->i_data_sem);
 			goto found;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 05/10] ext4: drop iblock parameter
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
                   ` (3 preceding siblings ...)
  2024-05-08  6:12 ` [PATCH v3 04/10] ext4: trim delalloc extent Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 06/10] ext4: make ext4_es_insert_delayed_block() insert multi-blocks Zhang Yi
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

The start block of the delalloc extent to be inserted is equal to
map->m_lblk, just drop the duplicate iblock input parameter.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/inode.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fb76016e2ab7..de157aebc306 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1712,8 +1712,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
  * time. This function looks up the requested blocks and sets the
  * buffer delay bit under the protection of i_data_sem.
  */
-static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
-			      struct ext4_map_blocks *map,
+static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
 			      struct buffer_head *bh)
 {
 	struct extent_status es;
@@ -1733,8 +1732,8 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 		  (unsigned long) map->m_lblk);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
-		retval = es.es_len - (iblock - es.es_lblk);
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+		retval = es.es_len - (map->m_lblk - es.es_lblk);
 		if (retval > map->m_len)
 			retval = map->m_len;
 		map->m_len = retval;
@@ -1754,7 +1753,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 			return 0;
 		}
 
-		map->m_pblk = ext4_es_pblock(&es) + iblock - es.es_lblk;
+		map->m_pblk = ext4_es_pblock(&es) + map->m_lblk - es.es_lblk;
 		if (ext4_es_is_written(&es))
 			map->m_flags |= EXT4_MAP_MAPPED;
 		else if (ext4_es_is_unwritten(&es))
@@ -1790,8 +1789,8 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 	 * is held in write mode, before inserting a new da entry in
 	 * the extent status tree.
 	 */
-	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
-		retval = es.es_len - (iblock - es.es_lblk);
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+		retval = es.es_len - (map->m_lblk - es.es_lblk);
 		if (retval > map->m_len)
 			retval = map->m_len;
 		map->m_len = retval;
@@ -1848,7 +1847,7 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 	 * preallocated blocks are unmapped but should treated
 	 * the same as allocated blocks.
 	 */
-	ret = ext4_da_map_blocks(inode, iblock, &map, bh);
+	ret = ext4_da_map_blocks(inode, &map, bh);
 	if (ret <= 0)
 		return ret;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 06/10] ext4: make ext4_es_insert_delayed_block() insert multi-blocks
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
                   ` (4 preceding siblings ...)
  2024-05-08  6:12 ` [PATCH v3 05/10] ext4: drop iblock parameter Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-12 15:19   ` Jan Kara
  2024-05-08  6:12 ` [PATCH v3 07/10] ext4: make ext4_da_reserve_space() reserve multi-clusters Zhang Yi
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Rename ext4_es_insert_delayed_block() to ext4_es_insert_delayed_extent()
and pass length parameter to make it insert multiple delalloc blocks at
a time. For the case of bigalloc, split the allocated parameter to
lclu_allocated and end_allocated. lclu_allocated indicates the
allocation state of the cluster which is containing the lblk,
end_allocated indicates the allocation state of the extent end, clusters
in the middle of delay allocated extent must be unallocated.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/extents_status.c    | 70 ++++++++++++++++++++++++++-----------
 fs/ext4/extents_status.h    |  5 +--
 fs/ext4/inode.c             |  2 +-
 include/trace/events/ext4.h | 16 +++++----
 4 files changed, 62 insertions(+), 31 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 4a00e2f019d9..23caf1f028b0 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -2052,34 +2052,49 @@ bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk)
 }
 
 /*
- * ext4_es_insert_delayed_block - adds a delayed block to the extents status
- *                                tree, adding a pending reservation where
- *                                needed
+ * ext4_es_insert_delayed_extent - adds some delayed blocks to the extents
+ *                                 status tree, adding a pending reservation
+ *                                 where needed
  *
  * @inode - file containing the newly added block
- * @lblk - logical block to be added
- * @allocated - indicates whether a physical cluster has been allocated for
- *              the logical cluster that contains the block
+ * @lblk - start logical block to be added
+ * @len - length of blocks to be added
+ * @lclu_allocated/end_allocated - indicates whether a physical cluster has
+ *                                 been allocated for the logical cluster
+ *                                 that contains the start/end block. Note that
+ *                                 end_allocated should always be set to false
+ *                                 if the start and the end block are in the
+ *                                 same cluster
  */
-void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
-				  bool allocated)
+void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
+				   ext4_lblk_t len, bool lclu_allocated,
+				   bool end_allocated)
 {
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct extent_status newes;
+	ext4_lblk_t end = lblk + len - 1;
 	int err1 = 0, err2 = 0, err3 = 0;
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
-	struct pending_reservation *pr = NULL;
+	struct pending_reservation *pr1 = NULL;
+	struct pending_reservation *pr2 = NULL;
 
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
 		return;
 
-	es_debug("add [%u/1) delayed to extent status tree of inode %lu\n",
-		 lblk, inode->i_ino);
+	es_debug("add [%u/%u) delayed to extent status tree of inode %lu\n",
+		 lblk, len, inode->i_ino);
+	if (!len)
+		return;
+
+	WARN_ON_ONCE((EXT4_B2C(sbi, lblk) == EXT4_B2C(sbi, end)) &&
+		     end_allocated);
 
 	newes.es_lblk = lblk;
-	newes.es_len = 1;
+	newes.es_len = len;
 	ext4_es_store_pblock_status(&newes, ~0, EXTENT_STATUS_DELAYED);
-	trace_ext4_es_insert_delayed_block(inode, &newes, allocated);
+	trace_ext4_es_insert_delayed_extent(inode, &newes, lclu_allocated,
+					    end_allocated);
 
 	ext4_es_insert_extent_check(inode, &newes);
 
@@ -2088,11 +2103,15 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 		es1 = __es_alloc_extent(true);
 	if ((err1 || err2) && !es2)
 		es2 = __es_alloc_extent(true);
-	if ((err1 || err2 || err3) && allocated && !pr)
-		pr = __alloc_pending(true);
+	if (err1 || err2 || err3) {
+		if (lclu_allocated && !pr1)
+			pr1 = __alloc_pending(true);
+		if (end_allocated && !pr2)
+			pr2 = __alloc_pending(true);
+	}
 	write_lock(&EXT4_I(inode)->i_es_lock);
 
-	err1 = __es_remove_extent(inode, lblk, lblk, NULL, es1);
+	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
 	if (err1 != 0)
 		goto error;
 	/* Free preallocated extent if it didn't get used. */
@@ -2112,13 +2131,22 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 		es2 = NULL;
 	}
 
-	if (allocated) {
-		err3 = __insert_pending(inode, lblk, &pr);
+	if (lclu_allocated) {
+		err3 = __insert_pending(inode, lblk, &pr1);
 		if (err3 != 0)
 			goto error;
-		if (pr) {
-			__free_pending(pr);
-			pr = NULL;
+		if (pr1) {
+			__free_pending(pr1);
+			pr1 = NULL;
+		}
+	}
+	if (end_allocated) {
+		err3 = __insert_pending(inode, end, &pr2);
+		if (err3 != 0)
+			goto error;
+		if (pr2) {
+			__free_pending(pr2);
+			pr2 = NULL;
 		}
 	}
 error:
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index d9847a4a25db..3c8e2edee5d5 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -249,8 +249,9 @@ extern void ext4_exit_pending(void);
 extern void ext4_init_pending_tree(struct ext4_pending_tree *tree);
 extern void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk);
 extern bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk);
-extern void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
-					 bool allocated);
+extern void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
+					  ext4_lblk_t len, bool lclu_allocated,
+					  bool end_allocated);
 extern unsigned int ext4_es_delayed_clu(struct inode *inode, ext4_lblk_t lblk,
 					ext4_lblk_t len);
 extern void ext4_clear_inode_es(struct inode *inode);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index de157aebc306..f64fe8b873ce 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1702,7 +1702,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 		}
 	}
 
-	ext4_es_insert_delayed_block(inode, lblk, allocated);
+	ext4_es_insert_delayed_extent(inode, lblk, 1, allocated, false);
 	return 0;
 }
 
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index a697f4b77162..6b41ac61310f 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2478,11 +2478,11 @@ TRACE_EVENT(ext4_es_shrink,
 		  __entry->scan_time, __entry->nr_skipped, __entry->retried)
 );
 
-TRACE_EVENT(ext4_es_insert_delayed_block,
+TRACE_EVENT(ext4_es_insert_delayed_extent,
 	TP_PROTO(struct inode *inode, struct extent_status *es,
-		 bool allocated),
+		 bool lclu_allocated, bool end_allocated),
 
-	TP_ARGS(inode, es, allocated),
+	TP_ARGS(inode, es, lclu_allocated, end_allocated),
 
 	TP_STRUCT__entry(
 		__field(	dev_t,		dev		)
@@ -2491,7 +2491,8 @@ TRACE_EVENT(ext4_es_insert_delayed_block,
 		__field(	ext4_lblk_t,	len		)
 		__field(	ext4_fsblk_t,	pblk		)
 		__field(	char,		status		)
-		__field(	bool,		allocated	)
+		__field(	bool,		lclu_allocated	)
+		__field(	bool,		end_allocated	)
 	),
 
 	TP_fast_assign(
@@ -2501,16 +2502,17 @@ TRACE_EVENT(ext4_es_insert_delayed_block,
 		__entry->len		= es->es_len;
 		__entry->pblk		= ext4_es_show_pblock(es);
 		__entry->status		= ext4_es_status(es);
-		__entry->allocated	= allocated;
+		__entry->lclu_allocated	= lclu_allocated;
+		__entry->end_allocated	= end_allocated;
 	),
 
 	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s "
-		  "allocated %d",
+		  "allocated %d %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->lblk, __entry->len,
 		  __entry->pblk, show_extent_status(__entry->status),
-		  __entry->allocated)
+		  __entry->lclu_allocated, __entry->end_allocated)
 );
 
 /* fsmap traces */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 07/10] ext4: make ext4_da_reserve_space() reserve multi-clusters
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
                   ` (5 preceding siblings ...)
  2024-05-08  6:12 ` [PATCH v3 06/10] ext4: make ext4_es_insert_delayed_block() insert multi-blocks Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 08/10] ext4: factor out check for whether a cluster is allocated Zhang Yi
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Add 'nr_resv' parameter to ext4_da_reserve_space(), which indicates the
number of clusters wants to reserve, make it reserve multiple clusters
at a time.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c             | 18 +++++++++---------
 include/trace/events/ext4.h | 10 ++++++----
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f64fe8b873ce..0c52969654ac 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1479,9 +1479,9 @@ static int ext4_journalled_write_end(struct file *file,
 }
 
 /*
- * Reserve space for a single cluster
+ * Reserve space for 'nr_resv' clusters
  */
-static int ext4_da_reserve_space(struct inode *inode)
+static int ext4_da_reserve_space(struct inode *inode, int nr_resv)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
@@ -1492,18 +1492,18 @@ static int ext4_da_reserve_space(struct inode *inode)
 	 * us from metadata over-estimation, though we may go over by
 	 * a small amount in the end.  Here we just reserve for data.
 	 */
-	ret = dquot_reserve_block(inode, EXT4_C2B(sbi, 1));
+	ret = dquot_reserve_block(inode, EXT4_C2B(sbi, nr_resv));
 	if (ret)
 		return ret;
 
 	spin_lock(&ei->i_block_reservation_lock);
-	if (ext4_claim_free_clusters(sbi, 1, 0)) {
+	if (ext4_claim_free_clusters(sbi, nr_resv, 0)) {
 		spin_unlock(&ei->i_block_reservation_lock);
-		dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1));
+		dquot_release_reservation_block(inode, EXT4_C2B(sbi, nr_resv));
 		return -ENOSPC;
 	}
-	ei->i_reserved_data_blocks++;
-	trace_ext4_da_reserve_space(inode);
+	ei->i_reserved_data_blocks += nr_resv;
+	trace_ext4_da_reserve_space(inode, nr_resv);
 	spin_unlock(&ei->i_block_reservation_lock);
 
 	return 0;       /* success */
@@ -1678,7 +1678,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 	 * extents status tree doesn't get a match.
 	 */
 	if (sbi->s_cluster_ratio == 1) {
-		ret = ext4_da_reserve_space(inode);
+		ret = ext4_da_reserve_space(inode, 1);
 		if (ret != 0)   /* ENOSPC */
 			return ret;
 	} else {   /* bigalloc */
@@ -1690,7 +1690,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 				if (ret < 0)
 					return ret;
 				if (ret == 0) {
-					ret = ext4_da_reserve_space(inode);
+					ret = ext4_da_reserve_space(inode, 1);
 					if (ret != 0)   /* ENOSPC */
 						return ret;
 				} else {
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 6b41ac61310f..cc5e9b7b2b44 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1246,14 +1246,15 @@ TRACE_EVENT(ext4_da_update_reserve_space,
 );
 
 TRACE_EVENT(ext4_da_reserve_space,
-	TP_PROTO(struct inode *inode),
+	TP_PROTO(struct inode *inode, int nr_resv),
 
-	TP_ARGS(inode),
+	TP_ARGS(inode, nr_resv),
 
 	TP_STRUCT__entry(
 		__field(	dev_t,	dev			)
 		__field(	ino_t,	ino			)
 		__field(	__u64,	i_blocks		)
+		__field(	int,	reserve_blocks		)
 		__field(	int,	reserved_data_blocks	)
 		__field(	__u16,  mode			)
 	),
@@ -1262,16 +1263,17 @@ TRACE_EVENT(ext4_da_reserve_space,
 		__entry->dev	= inode->i_sb->s_dev;
 		__entry->ino	= inode->i_ino;
 		__entry->i_blocks = inode->i_blocks;
+		__entry->reserve_blocks = nr_resv;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->mode	= inode->i_mode;
 	),
 
-	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu "
+	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu reserve_blocks %d"
 		  "reserved_data_blocks %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->mode, __entry->i_blocks,
-		  __entry->reserved_data_blocks)
+		  __entry->reserve_blocks, __entry->reserved_data_blocks)
 );
 
 TRACE_EVENT(ext4_da_release_space,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 08/10] ext4: factor out check for whether a cluster is allocated
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
                   ` (6 preceding siblings ...)
  2024-05-08  6:12 ` [PATCH v3 07/10] ext4: make ext4_da_reserve_space() reserve multi-clusters Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-12 15:40   ` Jan Kara
  2024-05-08  6:12 ` [PATCH v3 09/10] ext4: make ext4_insert_delayed_block() insert multi-blocks Zhang Yi
  2024-05-08  6:12 ` [PATCH v3 10/10] ext4: make ext4_da_map_blocks() buffer_head unaware Zhang Yi
  9 siblings, 1 reply; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Factor out a common helper ext4_da_check_clu_allocated(), check whether
the cluster containing a delalloc block to be added has been delayed or
allocated, no logic changes.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c | 52 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0c52969654ac..6e418d3f8e87 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1649,6 +1649,34 @@ static void ext4_print_free_blocks(struct inode *inode)
 	return;
 }
 
+/*
+ * Check whether the cluster containing lblk has been delayed or allocated,
+ * if not, it means we should reserve a cluster when add delalloc, return 1,
+ * otherwise return 0 or error code.
+ */
+static int ext4_da_check_clu_allocated(struct inode *inode, ext4_lblk_t lblk,
+				       bool *allocated)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+	int ret;
+
+	*allocated = false;
+	if (ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk))
+		return 0;
+
+	if (ext4_es_scan_clu(inode, &ext4_es_is_mapped, lblk))
+		goto allocated;
+
+	ret = ext4_clu_mapped(inode, EXT4_B2C(sbi, lblk));
+	if (ret < 0)
+		return ret;
+	if (ret == 0)
+		return 1;
+allocated:
+	*allocated = true;
+	return 0;
+}
+
 /*
  * ext4_insert_delayed_block - adds a delayed block to the extents status
  *                             tree, incrementing the reserved cluster/block
@@ -1682,23 +1710,13 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 		if (ret != 0)   /* ENOSPC */
 			return ret;
 	} else {   /* bigalloc */
-		if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) {
-			if (!ext4_es_scan_clu(inode,
-					      &ext4_es_is_mapped, lblk)) {
-				ret = ext4_clu_mapped(inode,
-						      EXT4_B2C(sbi, lblk));
-				if (ret < 0)
-					return ret;
-				if (ret == 0) {
-					ret = ext4_da_reserve_space(inode, 1);
-					if (ret != 0)   /* ENOSPC */
-						return ret;
-				} else {
-					allocated = true;
-				}
-			} else {
-				allocated = true;
-			}
+		ret = ext4_da_check_clu_allocated(inode, lblk, &allocated);
+		if (ret < 0)
+			return ret;
+		if (ret > 0) {
+			ret = ext4_da_reserve_space(inode, 1);
+			if (ret != 0)   /* ENOSPC */
+				return ret;
 		}
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 09/10] ext4: make ext4_insert_delayed_block() insert multi-blocks
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
                   ` (7 preceding siblings ...)
  2024-05-08  6:12 ` [PATCH v3 08/10] ext4: factor out check for whether a cluster is allocated Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-12 21:47   ` Jan Kara
  2024-05-08  6:12 ` [PATCH v3 10/10] ext4: make ext4_da_map_blocks() buffer_head unaware Zhang Yi
  9 siblings, 1 reply; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Rename ext4_insert_delayed_block() to ext4_insert_delayed_blocks(),
pass length parameter to make it insert multiple delalloc blocks at a
time. For non-bigalloc case, just reserve len blocks and insert delalloc
extent. For bigalloc case, we can ensure that the clusters in the middle
of a extent must be unallocated, we only need to check whether the start
and end clusters are delayed/allocated. We should subtract the space for
the start and/or end block(s) if they are allocated.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6e418d3f8e87..c56386d1b10d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1678,24 +1678,29 @@ static int ext4_da_check_clu_allocated(struct inode *inode, ext4_lblk_t lblk,
 }
 
 /*
- * ext4_insert_delayed_block - adds a delayed block to the extents status
- *                             tree, incrementing the reserved cluster/block
- *                             count or making a pending reservation
- *                             where needed
+ * ext4_insert_delayed_blocks - adds a multiple delayed blocks to the extents
+ *                              status tree, incrementing the reserved
+ *                              cluster/block count or making pending
+ *                              reservations where needed
  *
  * @inode - file containing the newly added block
- * @lblk - logical block to be added
+ * @lblk - start logical block to be added
+ * @len - length of blocks to be added
  *
  * Returns 0 on success, negative error code on failure.
  */
-static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
+static int ext4_insert_delayed_blocks(struct inode *inode, ext4_lblk_t lblk,
+				      ext4_lblk_t len)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	int ret;
-	bool allocated = false;
+	bool lclu_allocated = false;
+	bool end_allocated = false;
+	ext4_lblk_t resv_clu;
+	ext4_lblk_t end = lblk + len - 1;
 
 	/*
-	 * If the cluster containing lblk is shared with a delayed,
+	 * If the cluster containing lblk or end is shared with a delayed,
 	 * written, or unwritten extent in a bigalloc file system, it's
 	 * already been accounted for and does not need to be reserved.
 	 * A pending reservation must be made for the cluster if it's
@@ -1706,21 +1711,36 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 	 * extents status tree doesn't get a match.
 	 */
 	if (sbi->s_cluster_ratio == 1) {
-		ret = ext4_da_reserve_space(inode, 1);
+		ret = ext4_da_reserve_space(inode, len);
 		if (ret != 0)   /* ENOSPC */
 			return ret;
 	} else {   /* bigalloc */
-		ret = ext4_da_check_clu_allocated(inode, lblk, &allocated);
+		resv_clu = EXT4_B2C(sbi, end) - EXT4_B2C(sbi, lblk) + 1;
+
+		ret = ext4_da_check_clu_allocated(inode, lblk, &lclu_allocated);
 		if (ret < 0)
 			return ret;
-		if (ret > 0) {
-			ret = ext4_da_reserve_space(inode, 1);
+		if (ret == 0)
+			resv_clu--;
+
+		if (EXT4_B2C(sbi, lblk) != EXT4_B2C(sbi, end)) {
+			ret = ext4_da_check_clu_allocated(inode, end,
+							  &end_allocated);
+			if (ret < 0)
+				return ret;
+			if (ret == 0)
+				resv_clu--;
+		}
+
+		if (resv_clu) {
+			ret = ext4_da_reserve_space(inode, resv_clu);
 			if (ret != 0)   /* ENOSPC */
 				return ret;
 		}
 	}
 
-	ext4_es_insert_delayed_extent(inode, lblk, 1, allocated, false);
+	ext4_es_insert_delayed_extent(inode, lblk, len, lclu_allocated,
+				      end_allocated);
 	return 0;
 }
 
@@ -1825,7 +1845,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
 		}
 	}
 
-	retval = ext4_insert_delayed_block(inode, map->m_lblk);
+	retval = ext4_insert_delayed_blocks(inode, map->m_lblk, map->m_len);
 	up_write(&EXT4_I(inode)->i_data_sem);
 	if (retval)
 		return retval;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 10/10] ext4: make ext4_da_map_blocks() buffer_head unaware
  2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
                   ` (8 preceding siblings ...)
  2024-05-08  6:12 ` [PATCH v3 09/10] ext4: make ext4_insert_delayed_block() insert multi-blocks Zhang Yi
@ 2024-05-08  6:12 ` Zhang Yi
  2024-05-12 21:51   ` Jan Kara
  9 siblings, 1 reply; 22+ messages in thread
From: Zhang Yi @ 2024-05-08  6:12 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

After calling the ext4_da_map_blocks(), a delalloc extent state could
be identified through the EXT4_MAP_DELAYED flag in map. So factor out
buffer_head related handles in ext4_da_map_blocks(), make this function
buffer_head unaware and becomes a common helper, and also update the
stale function commtents, preparing for the iomap da write path in the
future.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c | 63 ++++++++++++++++++++++++-------------------------
 1 file changed, 31 insertions(+), 32 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c56386d1b10d..1dba5337382a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1745,36 +1745,32 @@ static int ext4_insert_delayed_blocks(struct inode *inode, ext4_lblk_t lblk,
 }
 
 /*
- * This function is grabs code from the very beginning of
- * ext4_map_blocks, but assumes that the caller is from delayed write
- * time. This function looks up the requested blocks and sets the
- * buffer delay bit under the protection of i_data_sem.
+ * Looks up the requested blocks and sets the delalloc extent map.
+ * First try to look up for the extent entry that contains the requested
+ * blocks in the extent status tree without i_data_sem, then try to look
+ * up for the ondisk extent mapping with i_data_sem in read mode,
+ * finally hold i_data_sem in write mode, looks up again and add a
+ * delalloc extent entry if it still couldn't find any extent. Pass out
+ * the mapped extent through @map and return 0 on success.
  */
-static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
-			      struct buffer_head *bh)
+static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 {
 	struct extent_status es;
 	int retval;
-	sector_t invalid_block = ~((sector_t) 0xffff);
 #ifdef ES_AGGRESSIVE_TEST
 	struct ext4_map_blocks orig_map;
 
 	memcpy(&orig_map, map, sizeof(*map));
 #endif
 
-	if (invalid_block < ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es))
-		invalid_block = ~0;
-
 	map->m_flags = 0;
 	ext_debug(inode, "max_blocks %u, logical block %lu\n", map->m_len,
 		  (unsigned long) map->m_lblk);
 
 	/* Lookup extent status tree firstly */
 	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
-		retval = es.es_len - (map->m_lblk - es.es_lblk);
-		if (retval > map->m_len)
-			retval = map->m_len;
-		map->m_len = retval;
+		map->m_len = min_t(unsigned int, map->m_len,
+				   es.es_len - (map->m_lblk - es.es_lblk));
 
 		if (ext4_es_is_hole(&es))
 			goto add_delayed;
@@ -1784,10 +1780,8 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
 		 * Delayed extent could be allocated by fallocate.
 		 * So we need to check it.
 		 */
-		if (ext4_es_is_delayed(&es) && !ext4_es_is_unwritten(&es)) {
-			map_bh(bh, inode->i_sb, invalid_block);
-			set_buffer_new(bh);
-			set_buffer_delay(bh);
+		if (ext4_es_is_delonly(&es)) {
+			map->m_flags |= EXT4_MAP_DELAYED;
 			return 0;
 		}
 
@@ -1802,7 +1796,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
 #ifdef ES_AGGRESSIVE_TEST
 		ext4_map_blocks_es_recheck(NULL, inode, map, &orig_map, 0);
 #endif
-		return retval;
+		return 0;
 	}
 
 	/*
@@ -1816,7 +1810,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
 		retval = ext4_map_query_blocks(NULL, inode, map);
 	up_read(&EXT4_I(inode)->i_data_sem);
 	if (retval)
-		return retval;
+		return retval < 0 ? retval : 0;
 
 add_delayed:
 	down_write(&EXT4_I(inode)->i_data_sem);
@@ -1828,10 +1822,8 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
 	 * the extent status tree.
 	 */
 	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
-		retval = es.es_len - (map->m_lblk - es.es_lblk);
-		if (retval > map->m_len)
-			retval = map->m_len;
-		map->m_len = retval;
+		map->m_len = min_t(unsigned int, map->m_len,
+				   es.es_len - (map->m_lblk - es.es_lblk));
 
 		if (!ext4_es_is_hole(&es)) {
 			up_write(&EXT4_I(inode)->i_data_sem);
@@ -1841,18 +1833,14 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
 		retval = ext4_map_query_blocks(NULL, inode, map);
 		if (retval) {
 			up_write(&EXT4_I(inode)->i_data_sem);
-			return retval;
+			return retval < 0 ? retval : 0;
 		}
 	}
 
+	map->m_flags |= EXT4_MAP_DELAYED;
 	retval = ext4_insert_delayed_blocks(inode, map->m_lblk, map->m_len);
 	up_write(&EXT4_I(inode)->i_data_sem);
-	if (retval)
-		return retval;
 
-	map_bh(bh, inode->i_sb, invalid_block);
-	set_buffer_new(bh);
-	set_buffer_delay(bh);
 	return retval;
 }
 
@@ -1872,11 +1860,15 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 			   struct buffer_head *bh, int create)
 {
 	struct ext4_map_blocks map;
+	sector_t invalid_block = ~((sector_t) 0xffff);
 	int ret = 0;
 
 	BUG_ON(create == 0);
 	BUG_ON(bh->b_size != inode->i_sb->s_blocksize);
 
+	if (invalid_block < ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es))
+		invalid_block = ~0;
+
 	map.m_lblk = iblock;
 	map.m_len = 1;
 
@@ -1885,10 +1877,17 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 	 * preallocated blocks are unmapped but should treated
 	 * the same as allocated blocks.
 	 */
-	ret = ext4_da_map_blocks(inode, &map, bh);
-	if (ret <= 0)
+	ret = ext4_da_map_blocks(inode, &map);
+	if (ret < 0)
 		return ret;
 
+	if (map.m_flags & EXT4_MAP_DELAYED) {
+		map_bh(bh, inode->i_sb, invalid_block);
+		set_buffer_new(bh);
+		set_buffer_delay(bh);
+		return 0;
+	}
+
 	map_bh(bh, inode->i_sb, map.m_pblk);
 	ext4_update_bh_state(bh, map.m_flags);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 02/10] ext4: check the extent status again before inserting delalloc block
  2024-05-08  6:12 ` [PATCH v3 02/10] ext4: check the extent status again before inserting delalloc block Zhang Yi
@ 2024-05-08 15:02   ` Markus Elfring
  2024-05-09  8:26     ` Zhang Yi
  0 siblings, 1 reply; 22+ messages in thread
From: Markus Elfring @ 2024-05-08 15:02 UTC (permalink / raw)
  To: Zhang Yi, linux-ext4, linux-fsdevel, kernel-janitors
  Cc: LKML, Andreas Dilger, Jan Kara, Ritesh Harjani,
	Theodore Ts'o, Yu Kuai, Zhang Yi, Zhihao Cheng

…
> This patch fixes the problem by looking …

Will corresponding imperative wordings be desirable for an improved change description?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9-rc7#n94

Regards,
Markus

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 04/10] ext4: trim delalloc extent
  2024-05-08  6:12 ` [PATCH v3 04/10] ext4: trim delalloc extent Zhang Yi
@ 2024-05-08 15:21   ` Markus Elfring
  2024-05-09  8:27     ` Zhang Yi
  0 siblings, 1 reply; 22+ messages in thread
From: Markus Elfring @ 2024-05-08 15:21 UTC (permalink / raw)
  To: Zhang Yi, linux-ext4, linux-fsdevel, kernel-janitors
  Cc: LKML, Andreas Dilger, Jan Kara, Ritesh Harjani,
	Theodore Ts'o, Yu Kuai, Zhang Yi, Zhihao Cheng

> In ext4_da_map_blocks(), we could found four kind of extents …

                                    find?

Regards,
Markus

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 02/10] ext4: check the extent status again before inserting delalloc block
  2024-05-08 15:02   ` Markus Elfring
@ 2024-05-09  8:26     ` Zhang Yi
  0 siblings, 0 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-09  8:26 UTC (permalink / raw)
  To: Markus Elfring, Zhang Yi, linux-ext4, linux-fsdevel, kernel-janitors
  Cc: LKML, Andreas Dilger, Jan Kara, Ritesh Harjani,
	Theodore Ts'o, Yu Kuai, Zhihao Cheng

On 2024/5/8 23:02, Markus Elfring wrote:
> …
>> This patch fixes the problem by looking …
> 
> Will corresponding imperative wordings be desirable for an improved change description?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9-rc7#n94
> 

Yes, it would be helpful.

Thanks,
Yi.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 04/10] ext4: trim delalloc extent
  2024-05-08 15:21   ` Markus Elfring
@ 2024-05-09  8:27     ` Zhang Yi
  0 siblings, 0 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-09  8:27 UTC (permalink / raw)
  To: Markus Elfring, Zhang Yi, linux-ext4, linux-fsdevel, kernel-janitors
  Cc: LKML, Andreas Dilger, Jan Kara, Ritesh Harjani,
	Theodore Ts'o, Yu Kuai, Zhihao Cheng

On 2024/5/8 23:21, Markus Elfring wrote:
>> In ext4_da_map_blocks(), we could found four kind of extents …
> 
>                                     find?
> 

Sure.

Thanks,
Yi.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 03/10] ext4: warn if delalloc counters are not zero on inactive
  2024-05-08  6:12 ` [PATCH v3 03/10] ext4: warn if delalloc counters are not zero on inactive Zhang Yi
@ 2024-05-12 15:10   ` Jan Kara
  2024-05-13 14:17     ` Zhang Yi
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Kara @ 2024-05-12 15:10 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	jack, ritesh.list, yi.zhang, chengzhihao1, yukuai3

On Wed 08-05-24 14:12:13, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> The per-inode i_reserved_data_blocks count the reserved delalloc blocks
> in a regular file, it should be zero when destroying the file. The
> per-fs s_dirtyclusters_counter count all reserved delalloc blocks in a
> filesystem, it also should be zero when umounting the filesystem. Now we
> have only an error message if the i_reserved_data_blocks is not zero,
> which is unable to be simply captured, so add WARN_ON_ONCE to make it
> more visable.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Well, maybe the warnings could be guarded by !(EXT4_SB(sb)->s_mount_state &
EXT4_ERROR_FS)? Because the warning isn't very interesting when the
filesystem was corrupted and if somebody runs with errors=continue we would
still possibly hit this warning although we don't really care...

								Honza

> ---
>  fs/ext4/super.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 044135796f2b..440dd54eea25 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1343,6 +1343,9 @@ static void ext4_put_super(struct super_block *sb)
>  
>  	ext4_group_desc_free(sbi);
>  	ext4_flex_groups_free(sbi);
> +
> +	WARN_ON_ONCE(!ext4_forced_shutdown(sb) &&
> +		     percpu_counter_sum(&sbi->s_dirtyclusters_counter));
>  	ext4_percpu_param_destroy(sbi);
>  #ifdef CONFIG_QUOTA
>  	for (int i = 0; i < EXT4_MAXQUOTAS; i++)
> @@ -1473,7 +1476,8 @@ static void ext4_destroy_inode(struct inode *inode)
>  		dump_stack();
>  	}
>  
> -	if (EXT4_I(inode)->i_reserved_data_blocks)
> +	if (!ext4_forced_shutdown(inode->i_sb) &&
> +	    WARN_ON_ONCE(EXT4_I(inode)->i_reserved_data_blocks))
>  		ext4_msg(inode->i_sb, KERN_ERR,
>  			 "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
>  			 inode->i_ino, EXT4_I(inode),
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 06/10] ext4: make ext4_es_insert_delayed_block() insert multi-blocks
  2024-05-08  6:12 ` [PATCH v3 06/10] ext4: make ext4_es_insert_delayed_block() insert multi-blocks Zhang Yi
@ 2024-05-12 15:19   ` Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2024-05-12 15:19 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	jack, ritesh.list, yi.zhang, chengzhihao1, yukuai3

On Wed 08-05-24 14:12:16, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> Rename ext4_es_insert_delayed_block() to ext4_es_insert_delayed_extent()
> and pass length parameter to make it insert multiple delalloc blocks at
> a time. For the case of bigalloc, split the allocated parameter to
> lclu_allocated and end_allocated. lclu_allocated indicates the
> allocation state of the cluster which is containing the lblk,
> end_allocated indicates the allocation state of the extent end, clusters
> in the middle of delay allocated extent must be unallocated.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/extents_status.c    | 70 ++++++++++++++++++++++++++-----------
>  fs/ext4/extents_status.h    |  5 +--
>  fs/ext4/inode.c             |  2 +-
>  include/trace/events/ext4.h | 16 +++++----
>  4 files changed, 62 insertions(+), 31 deletions(-)
> 
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 4a00e2f019d9..23caf1f028b0 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -2052,34 +2052,49 @@ bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk)
>  }
>  
>  /*
> - * ext4_es_insert_delayed_block - adds a delayed block to the extents status
> - *                                tree, adding a pending reservation where
> - *                                needed
> + * ext4_es_insert_delayed_extent - adds some delayed blocks to the extents
> + *                                 status tree, adding a pending reservation
> + *                                 where needed
>   *
>   * @inode - file containing the newly added block
> - * @lblk - logical block to be added
> - * @allocated - indicates whether a physical cluster has been allocated for
> - *              the logical cluster that contains the block
> + * @lblk - start logical block to be added
> + * @len - length of blocks to be added
> + * @lclu_allocated/end_allocated - indicates whether a physical cluster has
> + *                                 been allocated for the logical cluster
> + *                                 that contains the start/end block. Note that
> + *                                 end_allocated should always be set to false
> + *                                 if the start and the end block are in the
> + *                                 same cluster
>   */
> -void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
> -				  bool allocated)
> +void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
> +				   ext4_lblk_t len, bool lclu_allocated,
> +				   bool end_allocated)
>  {
> +	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>  	struct extent_status newes;
> +	ext4_lblk_t end = lblk + len - 1;
>  	int err1 = 0, err2 = 0, err3 = 0;
>  	struct extent_status *es1 = NULL;
>  	struct extent_status *es2 = NULL;
> -	struct pending_reservation *pr = NULL;
> +	struct pending_reservation *pr1 = NULL;
> +	struct pending_reservation *pr2 = NULL;
>  
>  	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>  		return;
>  
> -	es_debug("add [%u/1) delayed to extent status tree of inode %lu\n",
> -		 lblk, inode->i_ino);
> +	es_debug("add [%u/%u) delayed to extent status tree of inode %lu\n",
> +		 lblk, len, inode->i_ino);
> +	if (!len)
> +		return;
> +
> +	WARN_ON_ONCE((EXT4_B2C(sbi, lblk) == EXT4_B2C(sbi, end)) &&
> +		     end_allocated);
>  
>  	newes.es_lblk = lblk;
> -	newes.es_len = 1;
> +	newes.es_len = len;
>  	ext4_es_store_pblock_status(&newes, ~0, EXTENT_STATUS_DELAYED);
> -	trace_ext4_es_insert_delayed_block(inode, &newes, allocated);
> +	trace_ext4_es_insert_delayed_extent(inode, &newes, lclu_allocated,
> +					    end_allocated);
>  
>  	ext4_es_insert_extent_check(inode, &newes);
>  
> @@ -2088,11 +2103,15 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
>  		es1 = __es_alloc_extent(true);
>  	if ((err1 || err2) && !es2)
>  		es2 = __es_alloc_extent(true);
> -	if ((err1 || err2 || err3) && allocated && !pr)
> -		pr = __alloc_pending(true);
> +	if (err1 || err2 || err3) {
> +		if (lclu_allocated && !pr1)
> +			pr1 = __alloc_pending(true);
> +		if (end_allocated && !pr2)
> +			pr2 = __alloc_pending(true);
> +	}
>  	write_lock(&EXT4_I(inode)->i_es_lock);
>  
> -	err1 = __es_remove_extent(inode, lblk, lblk, NULL, es1);
> +	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
>  	if (err1 != 0)
>  		goto error;
>  	/* Free preallocated extent if it didn't get used. */
> @@ -2112,13 +2131,22 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
>  		es2 = NULL;
>  	}
>  
> -	if (allocated) {
> -		err3 = __insert_pending(inode, lblk, &pr);
> +	if (lclu_allocated) {
> +		err3 = __insert_pending(inode, lblk, &pr1);
>  		if (err3 != 0)
>  			goto error;
> -		if (pr) {
> -			__free_pending(pr);
> -			pr = NULL;
> +		if (pr1) {
> +			__free_pending(pr1);
> +			pr1 = NULL;
> +		}
> +	}
> +	if (end_allocated) {
> +		err3 = __insert_pending(inode, end, &pr2);
> +		if (err3 != 0)
> +			goto error;
> +		if (pr2) {
> +			__free_pending(pr2);
> +			pr2 = NULL;
>  		}
>  	}
>  error:
> diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
> index d9847a4a25db..3c8e2edee5d5 100644
> --- a/fs/ext4/extents_status.h
> +++ b/fs/ext4/extents_status.h
> @@ -249,8 +249,9 @@ extern void ext4_exit_pending(void);
>  extern void ext4_init_pending_tree(struct ext4_pending_tree *tree);
>  extern void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk);
>  extern bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk);
> -extern void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
> -					 bool allocated);
> +extern void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
> +					  ext4_lblk_t len, bool lclu_allocated,
> +					  bool end_allocated);
>  extern unsigned int ext4_es_delayed_clu(struct inode *inode, ext4_lblk_t lblk,
>  					ext4_lblk_t len);
>  extern void ext4_clear_inode_es(struct inode *inode);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index de157aebc306..f64fe8b873ce 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1702,7 +1702,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
>  		}
>  	}
>  
> -	ext4_es_insert_delayed_block(inode, lblk, allocated);
> +	ext4_es_insert_delayed_extent(inode, lblk, 1, allocated, false);
>  	return 0;
>  }
>  
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index a697f4b77162..6b41ac61310f 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -2478,11 +2478,11 @@ TRACE_EVENT(ext4_es_shrink,
>  		  __entry->scan_time, __entry->nr_skipped, __entry->retried)
>  );
>  
> -TRACE_EVENT(ext4_es_insert_delayed_block,
> +TRACE_EVENT(ext4_es_insert_delayed_extent,
>  	TP_PROTO(struct inode *inode, struct extent_status *es,
> -		 bool allocated),
> +		 bool lclu_allocated, bool end_allocated),
>  
> -	TP_ARGS(inode, es, allocated),
> +	TP_ARGS(inode, es, lclu_allocated, end_allocated),
>  
>  	TP_STRUCT__entry(
>  		__field(	dev_t,		dev		)
> @@ -2491,7 +2491,8 @@ TRACE_EVENT(ext4_es_insert_delayed_block,
>  		__field(	ext4_lblk_t,	len		)
>  		__field(	ext4_fsblk_t,	pblk		)
>  		__field(	char,		status		)
> -		__field(	bool,		allocated	)
> +		__field(	bool,		lclu_allocated	)
> +		__field(	bool,		end_allocated	)
>  	),
>  
>  	TP_fast_assign(
> @@ -2501,16 +2502,17 @@ TRACE_EVENT(ext4_es_insert_delayed_block,
>  		__entry->len		= es->es_len;
>  		__entry->pblk		= ext4_es_show_pblock(es);
>  		__entry->status		= ext4_es_status(es);
> -		__entry->allocated	= allocated;
> +		__entry->lclu_allocated	= lclu_allocated;
> +		__entry->end_allocated	= end_allocated;
>  	),
>  
>  	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s "
> -		  "allocated %d",
> +		  "allocated %d %d",
>  		  MAJOR(__entry->dev), MINOR(__entry->dev),
>  		  (unsigned long) __entry->ino,
>  		  __entry->lblk, __entry->len,
>  		  __entry->pblk, show_extent_status(__entry->status),
> -		  __entry->allocated)
> +		  __entry->lclu_allocated, __entry->end_allocated)
>  );
>  
>  /* fsmap traces */
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 08/10] ext4: factor out check for whether a cluster is allocated
  2024-05-08  6:12 ` [PATCH v3 08/10] ext4: factor out check for whether a cluster is allocated Zhang Yi
@ 2024-05-12 15:40   ` Jan Kara
  2024-05-14  2:37     ` Zhang Yi
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Kara @ 2024-05-12 15:40 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	jack, ritesh.list, yi.zhang, chengzhihao1, yukuai3

On Wed 08-05-24 14:12:18, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> Factor out a common helper ext4_da_check_clu_allocated(), check whether
> the cluster containing a delalloc block to be added has been delayed or
> allocated, no logic changes.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

I have one suggestion for improvement here.

> +/*
> + * Check whether the cluster containing lblk has been delayed or allocated,
> + * if not, it means we should reserve a cluster when add delalloc, return 1,
> + * otherwise return 0 or error code.
> + */
> +static int ext4_da_check_clu_allocated(struct inode *inode, ext4_lblk_t lblk,
> +				       bool *allocated)

The name of the function does not quite match what it is returning and that
is confusing. Essentially we have three states here:

a) cluster allocated
b) cluster has delalloc reservation
c) cluster doesn't have either

So maybe we could call the function ext4_clu_alloc_state() and return 0 /
1 / 2 based on the state?

								Honza

> +{
> +	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +	int ret;
> +
> +	*allocated = false;
> +	if (ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk))
> +		return 0;
> +
> +	if (ext4_es_scan_clu(inode, &ext4_es_is_mapped, lblk))
> +		goto allocated;
> +
> +	ret = ext4_clu_mapped(inode, EXT4_B2C(sbi, lblk));
> +	if (ret < 0)
> +		return ret;
> +	if (ret == 0)
> +		return 1;
> +allocated:
> +	*allocated = true;
> +	return 0;
> +}
> +
>  /*
>   * ext4_insert_delayed_block - adds a delayed block to the extents status
>   *                             tree, incrementing the reserved cluster/block
> @@ -1682,23 +1710,13 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
>  		if (ret != 0)   /* ENOSPC */
>  			return ret;
>  	} else {   /* bigalloc */
> -		if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) {
> -			if (!ext4_es_scan_clu(inode,
> -					      &ext4_es_is_mapped, lblk)) {
> -				ret = ext4_clu_mapped(inode,
> -						      EXT4_B2C(sbi, lblk));
> -				if (ret < 0)
> -					return ret;
> -				if (ret == 0) {
> -					ret = ext4_da_reserve_space(inode, 1);
> -					if (ret != 0)   /* ENOSPC */
> -						return ret;
> -				} else {
> -					allocated = true;
> -				}
> -			} else {
> -				allocated = true;
> -			}
> +		ret = ext4_da_check_clu_allocated(inode, lblk, &allocated);
> +		if (ret < 0)
> +			return ret;
> +		if (ret > 0) {
> +			ret = ext4_da_reserve_space(inode, 1);
> +			if (ret != 0)   /* ENOSPC */
> +				return ret;
>  		}
>  	}
>  
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 09/10] ext4: make ext4_insert_delayed_block() insert multi-blocks
  2024-05-08  6:12 ` [PATCH v3 09/10] ext4: make ext4_insert_delayed_block() insert multi-blocks Zhang Yi
@ 2024-05-12 21:47   ` Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2024-05-12 21:47 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	jack, ritesh.list, yi.zhang, chengzhihao1, yukuai3

On Wed 08-05-24 14:12:19, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> Rename ext4_insert_delayed_block() to ext4_insert_delayed_blocks(),
> pass length parameter to make it insert multiple delalloc blocks at a
> time. For non-bigalloc case, just reserve len blocks and insert delalloc
> extent. For bigalloc case, we can ensure that the clusters in the middle
> of a extent must be unallocated, we only need to check whether the start
> and end clusters are delayed/allocated. We should subtract the space for
> the start and/or end block(s) if they are allocated.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inode.c | 48 ++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 6e418d3f8e87..c56386d1b10d 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1678,24 +1678,29 @@ static int ext4_da_check_clu_allocated(struct inode *inode, ext4_lblk_t lblk,
>  }
>  
>  /*
> - * ext4_insert_delayed_block - adds a delayed block to the extents status
> - *                             tree, incrementing the reserved cluster/block
> - *                             count or making a pending reservation
> - *                             where needed
> + * ext4_insert_delayed_blocks - adds a multiple delayed blocks to the extents
> + *                              status tree, incrementing the reserved
> + *                              cluster/block count or making pending
> + *                              reservations where needed
>   *
>   * @inode - file containing the newly added block
> - * @lblk - logical block to be added
> + * @lblk - start logical block to be added
> + * @len - length of blocks to be added
>   *
>   * Returns 0 on success, negative error code on failure.
>   */
> -static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
> +static int ext4_insert_delayed_blocks(struct inode *inode, ext4_lblk_t lblk,
> +				      ext4_lblk_t len)
>  {
>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>  	int ret;
> -	bool allocated = false;
> +	bool lclu_allocated = false;
> +	bool end_allocated = false;
> +	ext4_lblk_t resv_clu;
> +	ext4_lblk_t end = lblk + len - 1;
>  
>  	/*
> -	 * If the cluster containing lblk is shared with a delayed,
> +	 * If the cluster containing lblk or end is shared with a delayed,
>  	 * written, or unwritten extent in a bigalloc file system, it's
>  	 * already been accounted for and does not need to be reserved.
>  	 * A pending reservation must be made for the cluster if it's
> @@ -1706,21 +1711,36 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
>  	 * extents status tree doesn't get a match.
>  	 */
>  	if (sbi->s_cluster_ratio == 1) {
> -		ret = ext4_da_reserve_space(inode, 1);
> +		ret = ext4_da_reserve_space(inode, len);
>  		if (ret != 0)   /* ENOSPC */
>  			return ret;
>  	} else {   /* bigalloc */
> -		ret = ext4_da_check_clu_allocated(inode, lblk, &allocated);
> +		resv_clu = EXT4_B2C(sbi, end) - EXT4_B2C(sbi, lblk) + 1;
> +
> +		ret = ext4_da_check_clu_allocated(inode, lblk, &lclu_allocated);
>  		if (ret < 0)
>  			return ret;
> -		if (ret > 0) {
> -			ret = ext4_da_reserve_space(inode, 1);
> +		if (ret == 0)
> +			resv_clu--;
> +
> +		if (EXT4_B2C(sbi, lblk) != EXT4_B2C(sbi, end)) {
> +			ret = ext4_da_check_clu_allocated(inode, end,
> +							  &end_allocated);
> +			if (ret < 0)
> +				return ret;
> +			if (ret == 0)
> +				resv_clu--;
> +		}
> +
> +		if (resv_clu) {
> +			ret = ext4_da_reserve_space(inode, resv_clu);
>  			if (ret != 0)   /* ENOSPC */
>  				return ret;
>  		}
>  	}
>  
> -	ext4_es_insert_delayed_extent(inode, lblk, 1, allocated, false);
> +	ext4_es_insert_delayed_extent(inode, lblk, len, lclu_allocated,
> +				      end_allocated);
>  	return 0;
>  }
>  
> @@ -1825,7 +1845,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
>  		}
>  	}
>  
> -	retval = ext4_insert_delayed_block(inode, map->m_lblk);
> +	retval = ext4_insert_delayed_blocks(inode, map->m_lblk, map->m_len);
>  	up_write(&EXT4_I(inode)->i_data_sem);
>  	if (retval)
>  		return retval;
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 10/10] ext4: make ext4_da_map_blocks() buffer_head unaware
  2024-05-08  6:12 ` [PATCH v3 10/10] ext4: make ext4_da_map_blocks() buffer_head unaware Zhang Yi
@ 2024-05-12 21:51   ` Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2024-05-12 21:51 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	jack, ritesh.list, yi.zhang, chengzhihao1, yukuai3

On Wed 08-05-24 14:12:20, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> After calling the ext4_da_map_blocks(), a delalloc extent state could
> be identified through the EXT4_MAP_DELAYED flag in map. So factor out
> buffer_head related handles in ext4_da_map_blocks(), make this function
> buffer_head unaware and becomes a common helper, and also update the
> stale function commtents, preparing for the iomap da write path in the
> future.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inode.c | 63 ++++++++++++++++++++++++-------------------------
>  1 file changed, 31 insertions(+), 32 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index c56386d1b10d..1dba5337382a 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1745,36 +1745,32 @@ static int ext4_insert_delayed_blocks(struct inode *inode, ext4_lblk_t lblk,
>  }
>  
>  /*
> - * This function is grabs code from the very beginning of
> - * ext4_map_blocks, but assumes that the caller is from delayed write
> - * time. This function looks up the requested blocks and sets the
> - * buffer delay bit under the protection of i_data_sem.
> + * Looks up the requested blocks and sets the delalloc extent map.
> + * First try to look up for the extent entry that contains the requested
> + * blocks in the extent status tree without i_data_sem, then try to look
> + * up for the ondisk extent mapping with i_data_sem in read mode,
> + * finally hold i_data_sem in write mode, looks up again and add a
> + * delalloc extent entry if it still couldn't find any extent. Pass out
> + * the mapped extent through @map and return 0 on success.
>   */
> -static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
> -			      struct buffer_head *bh)
> +static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
>  {
>  	struct extent_status es;
>  	int retval;
> -	sector_t invalid_block = ~((sector_t) 0xffff);
>  #ifdef ES_AGGRESSIVE_TEST
>  	struct ext4_map_blocks orig_map;
>  
>  	memcpy(&orig_map, map, sizeof(*map));
>  #endif
>  
> -	if (invalid_block < ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es))
> -		invalid_block = ~0;
> -
>  	map->m_flags = 0;
>  	ext_debug(inode, "max_blocks %u, logical block %lu\n", map->m_len,
>  		  (unsigned long) map->m_lblk);
>  
>  	/* Lookup extent status tree firstly */
>  	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
> -		retval = es.es_len - (map->m_lblk - es.es_lblk);
> -		if (retval > map->m_len)
> -			retval = map->m_len;
> -		map->m_len = retval;
> +		map->m_len = min_t(unsigned int, map->m_len,
> +				   es.es_len - (map->m_lblk - es.es_lblk));
>  
>  		if (ext4_es_is_hole(&es))
>  			goto add_delayed;
> @@ -1784,10 +1780,8 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
>  		 * Delayed extent could be allocated by fallocate.
>  		 * So we need to check it.
>  		 */
> -		if (ext4_es_is_delayed(&es) && !ext4_es_is_unwritten(&es)) {
> -			map_bh(bh, inode->i_sb, invalid_block);
> -			set_buffer_new(bh);
> -			set_buffer_delay(bh);
> +		if (ext4_es_is_delonly(&es)) {
> +			map->m_flags |= EXT4_MAP_DELAYED;
>  			return 0;
>  		}
>  
> @@ -1802,7 +1796,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
>  #ifdef ES_AGGRESSIVE_TEST
>  		ext4_map_blocks_es_recheck(NULL, inode, map, &orig_map, 0);
>  #endif
> -		return retval;
> +		return 0;
>  	}
>  
>  	/*
> @@ -1816,7 +1810,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
>  		retval = ext4_map_query_blocks(NULL, inode, map);
>  	up_read(&EXT4_I(inode)->i_data_sem);
>  	if (retval)
> -		return retval;
> +		return retval < 0 ? retval : 0;
>  
>  add_delayed:
>  	down_write(&EXT4_I(inode)->i_data_sem);
> @@ -1828,10 +1822,8 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
>  	 * the extent status tree.
>  	 */
>  	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
> -		retval = es.es_len - (map->m_lblk - es.es_lblk);
> -		if (retval > map->m_len)
> -			retval = map->m_len;
> -		map->m_len = retval;
> +		map->m_len = min_t(unsigned int, map->m_len,
> +				   es.es_len - (map->m_lblk - es.es_lblk));
>  
>  		if (!ext4_es_is_hole(&es)) {
>  			up_write(&EXT4_I(inode)->i_data_sem);
> @@ -1841,18 +1833,14 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map,
>  		retval = ext4_map_query_blocks(NULL, inode, map);
>  		if (retval) {
>  			up_write(&EXT4_I(inode)->i_data_sem);
> -			return retval;
> +			return retval < 0 ? retval : 0;
>  		}
>  	}
>  
> +	map->m_flags |= EXT4_MAP_DELAYED;
>  	retval = ext4_insert_delayed_blocks(inode, map->m_lblk, map->m_len);
>  	up_write(&EXT4_I(inode)->i_data_sem);
> -	if (retval)
> -		return retval;
>  
> -	map_bh(bh, inode->i_sb, invalid_block);
> -	set_buffer_new(bh);
> -	set_buffer_delay(bh);
>  	return retval;
>  }
>  
> @@ -1872,11 +1860,15 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
>  			   struct buffer_head *bh, int create)
>  {
>  	struct ext4_map_blocks map;
> +	sector_t invalid_block = ~((sector_t) 0xffff);
>  	int ret = 0;
>  
>  	BUG_ON(create == 0);
>  	BUG_ON(bh->b_size != inode->i_sb->s_blocksize);
>  
> +	if (invalid_block < ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es))
> +		invalid_block = ~0;
> +
>  	map.m_lblk = iblock;
>  	map.m_len = 1;
>  
> @@ -1885,10 +1877,17 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
>  	 * preallocated blocks are unmapped but should treated
>  	 * the same as allocated blocks.
>  	 */
> -	ret = ext4_da_map_blocks(inode, &map, bh);
> -	if (ret <= 0)
> +	ret = ext4_da_map_blocks(inode, &map);
> +	if (ret < 0)
>  		return ret;
>  
> +	if (map.m_flags & EXT4_MAP_DELAYED) {
> +		map_bh(bh, inode->i_sb, invalid_block);
> +		set_buffer_new(bh);
> +		set_buffer_delay(bh);
> +		return 0;
> +	}
> +
>  	map_bh(bh, inode->i_sb, map.m_pblk);
>  	ext4_update_bh_state(bh, map.m_flags);
>  
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 03/10] ext4: warn if delalloc counters are not zero on inactive
  2024-05-12 15:10   ` Jan Kara
@ 2024-05-13 14:17     ` Zhang Yi
  0 siblings, 0 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-13 14:17 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	ritesh.list, yi.zhang, chengzhihao1, yukuai3

On 2024/5/12 23:10, Jan Kara wrote:
> On Wed 08-05-24 14:12:13, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> The per-inode i_reserved_data_blocks count the reserved delalloc blocks
>> in a regular file, it should be zero when destroying the file. The
>> per-fs s_dirtyclusters_counter count all reserved delalloc blocks in a
>> filesystem, it also should be zero when umounting the filesystem. Now we
>> have only an error message if the i_reserved_data_blocks is not zero,
>> which is unable to be simply captured, so add WARN_ON_ONCE to make it
>> more visable.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> 
> Well, maybe the warnings could be guarded by !(EXT4_SB(sb)->s_mount_state &
> EXT4_ERROR_FS)? Because the warning isn't very interesting when the
> filesystem was corrupted and if somebody runs with errors=continue we would
> still possibly hit this warning although we don't really care...
> 

Make sense, I missed the errors=continue mode.

Thanks,
Yi.

> 
>> ---
>>  fs/ext4/super.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 044135796f2b..440dd54eea25 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -1343,6 +1343,9 @@ static void ext4_put_super(struct super_block *sb)
>>  
>>  	ext4_group_desc_free(sbi);
>>  	ext4_flex_groups_free(sbi);
>> +
>> +	WARN_ON_ONCE(!ext4_forced_shutdown(sb) &&
>> +		     percpu_counter_sum(&sbi->s_dirtyclusters_counter));
>>  	ext4_percpu_param_destroy(sbi);
>>  #ifdef CONFIG_QUOTA
>>  	for (int i = 0; i < EXT4_MAXQUOTAS; i++)
>> @@ -1473,7 +1476,8 @@ static void ext4_destroy_inode(struct inode *inode)
>>  		dump_stack();
>>  	}
>>  
>> -	if (EXT4_I(inode)->i_reserved_data_blocks)
>> +	if (!ext4_forced_shutdown(inode->i_sb) &&
>> +	    WARN_ON_ONCE(EXT4_I(inode)->i_reserved_data_blocks))
>>  		ext4_msg(inode->i_sb, KERN_ERR,
>>  			 "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
>>  			 inode->i_ino, EXT4_I(inode),
>> -- 
>> 2.39.2
>>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 08/10] ext4: factor out check for whether a cluster is allocated
  2024-05-12 15:40   ` Jan Kara
@ 2024-05-14  2:37     ` Zhang Yi
  0 siblings, 0 replies; 22+ messages in thread
From: Zhang Yi @ 2024-05-14  2:37 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	ritesh.list, yi.zhang, chengzhihao1, yukuai3

On 2024/5/12 23:40, Jan Kara wrote:
> On Wed 08-05-24 14:12:18, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> Factor out a common helper ext4_da_check_clu_allocated(), check whether
>> the cluster containing a delalloc block to be added has been delayed or
>> allocated, no logic changes.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> 
> I have one suggestion for improvement here.
> 
>> +/*
>> + * Check whether the cluster containing lblk has been delayed or allocated,
>> + * if not, it means we should reserve a cluster when add delalloc, return 1,
>> + * otherwise return 0 or error code.
>> + */
>> +static int ext4_da_check_clu_allocated(struct inode *inode, ext4_lblk_t lblk,
>> +				       bool *allocated)
> 
> The name of the function does not quite match what it is returning and that
> is confusing. Essentially we have three states here:
> 
> a) cluster allocated
> b) cluster has delalloc reservation
> c) cluster doesn't have either
> 
> So maybe we could call the function ext4_clu_alloc_state() and return 0 /
> 1 / 2 based on the state?
> 
> 								Honza

Sure, thanks for the suggestion, it looks better.

Thanks,
Yi.

> 
>> +{
>> +	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>> +	int ret;
>> +
>> +	*allocated = false;
>> +	if (ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk))
>> +		return 0;
>> +
>> +	if (ext4_es_scan_clu(inode, &ext4_es_is_mapped, lblk))
>> +		goto allocated;
>> +
>> +	ret = ext4_clu_mapped(inode, EXT4_B2C(sbi, lblk));
>> +	if (ret < 0)
>> +		return ret;
>> +	if (ret == 0)
>> +		return 1;
>> +allocated:
>> +	*allocated = true;
>> +	return 0;
>> +}
>> +
>>  /*
>>   * ext4_insert_delayed_block - adds a delayed block to the extents status
>>   *                             tree, incrementing the reserved cluster/block
>> @@ -1682,23 +1710,13 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
>>  		if (ret != 0)   /* ENOSPC */
>>  			return ret;
>>  	} else {   /* bigalloc */
>> -		if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) {
>> -			if (!ext4_es_scan_clu(inode,
>> -					      &ext4_es_is_mapped, lblk)) {
>> -				ret = ext4_clu_mapped(inode,
>> -						      EXT4_B2C(sbi, lblk));
>> -				if (ret < 0)
>> -					return ret;
>> -				if (ret == 0) {
>> -					ret = ext4_da_reserve_space(inode, 1);
>> -					if (ret != 0)   /* ENOSPC */
>> -						return ret;
>> -				} else {
>> -					allocated = true;
>> -				}
>> -			} else {
>> -				allocated = true;
>> -			}
>> +		ret = ext4_da_check_clu_allocated(inode, lblk, &allocated);
>> +		if (ret < 0)
>> +			return ret;
>> +		if (ret > 0) {
>> +			ret = ext4_da_reserve_space(inode, 1);
>> +			if (ret != 0)   /* ENOSPC */
>> +				return ret;
>>  		}
>>  	}
>>  
>> -- 
>> 2.39.2
>>


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-05-14  2:37 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-08  6:12 [PATCH v3 00/10] ext4: support adding multi-delalloc blocks Zhang Yi
2024-05-08  6:12 ` [PATCH v3 01/10] ext4: factor out a common helper to query extent map Zhang Yi
2024-05-08  6:12 ` [PATCH v3 02/10] ext4: check the extent status again before inserting delalloc block Zhang Yi
2024-05-08 15:02   ` Markus Elfring
2024-05-09  8:26     ` Zhang Yi
2024-05-08  6:12 ` [PATCH v3 03/10] ext4: warn if delalloc counters are not zero on inactive Zhang Yi
2024-05-12 15:10   ` Jan Kara
2024-05-13 14:17     ` Zhang Yi
2024-05-08  6:12 ` [PATCH v3 04/10] ext4: trim delalloc extent Zhang Yi
2024-05-08 15:21   ` Markus Elfring
2024-05-09  8:27     ` Zhang Yi
2024-05-08  6:12 ` [PATCH v3 05/10] ext4: drop iblock parameter Zhang Yi
2024-05-08  6:12 ` [PATCH v3 06/10] ext4: make ext4_es_insert_delayed_block() insert multi-blocks Zhang Yi
2024-05-12 15:19   ` Jan Kara
2024-05-08  6:12 ` [PATCH v3 07/10] ext4: make ext4_da_reserve_space() reserve multi-clusters Zhang Yi
2024-05-08  6:12 ` [PATCH v3 08/10] ext4: factor out check for whether a cluster is allocated Zhang Yi
2024-05-12 15:40   ` Jan Kara
2024-05-14  2:37     ` Zhang Yi
2024-05-08  6:12 ` [PATCH v3 09/10] ext4: make ext4_insert_delayed_block() insert multi-blocks Zhang Yi
2024-05-12 21:47   ` Jan Kara
2024-05-08  6:12 ` [PATCH v3 10/10] ext4: make ext4_da_map_blocks() buffer_head unaware Zhang Yi
2024-05-12 21:51   ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).