Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCHv3 1/6] ext4: Add IOMAP_F_MERGED for non-extent based mapping
       [not found] <cover.1582702693.git.riteshh@linux.ibm.com>
@ 2020-02-26  9:57 ` Ritesh Harjani
  2020-02-26 12:26   ` Jan Kara
  2020-02-26  9:57 ` [PATCHv3 2/6] ext4: Optimize ext4_ext_precache for 0 depth Ritesh Harjani
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26  9:57 UTC (permalink / raw)
  To: jack, tytso, linux-ext4
  Cc: adilger.kernel, linux-fsdevel, darrick.wong, hch, cmaiolino,
	Ritesh Harjani

IOMAP_F_MERGED needs to be set in case of non-extent based mapping.
This is needed in later patches for conversion of ext4_fiemap to use iomap.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/inode.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d035acab5b2a..3b4230cf0bc2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3335,6 +3335,10 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
 	iomap->offset = (u64) map->m_lblk << blkbits;
 	iomap->length = (u64) map->m_len << blkbits;
 
+	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
+		!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
+		iomap->flags |= IOMAP_F_MERGED;
+
 	/*
 	 * Flags passed to ext4_map_blocks() for direct I/O writes can result
 	 * in m_flags having both EXT4_MAP_MAPPED and EXT4_MAP_UNWRITTEN bits
-- 
2.21.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCHv3 2/6] ext4: Optimize ext4_ext_precache for 0 depth
       [not found] <cover.1582702693.git.riteshh@linux.ibm.com>
  2020-02-26  9:57 ` [PATCHv3 1/6] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
@ 2020-02-26  9:57 ` Ritesh Harjani
  2020-02-26 12:28   ` Jan Kara
  2020-02-26  9:57 ` [PATCHv3 3/6] ext4: Move ext4 bmap to use iomap infrastructure Ritesh Harjani
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26  9:57 UTC (permalink / raw)
  To: jack, tytso, linux-ext4
  Cc: adilger.kernel, linux-fsdevel, darrick.wong, hch, cmaiolino,
	Ritesh Harjani

This patch avoids the memory alloc & free path when depth is 0, 
since anyway there is no extra caching done in that case.
So on checking depth 0, simply return early.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/extents.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ee83fe7c98aa..0de548bb3c90 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -594,6 +594,12 @@ int ext4_ext_precache(struct inode *inode)
 	down_read(&ei->i_data_sem);
 	depth = ext_depth(inode);
 
+	/* Don't cache anything if there are no external extent blocks */
+	if (!depth) {
+		up_read(&ei->i_data_sem);
+		return ret;
+	}
+
 	path = kcalloc(depth + 1, sizeof(struct ext4_ext_path),
 		       GFP_NOFS);
 	if (path == NULL) {
@@ -601,9 +607,6 @@ int ext4_ext_precache(struct inode *inode)
 		return -ENOMEM;
 	}
 
-	/* Don't cache anything if there are no external extent blocks */
-	if (depth == 0)
-		goto out;
 	path[0].p_hdr = ext_inode_hdr(inode);
 	ret = ext4_ext_check(inode, path[0].p_hdr, depth, 0);
 	if (ret)
-- 
2.21.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCHv3 3/6] ext4: Move ext4 bmap to use iomap infrastructure.
       [not found] <cover.1582702693.git.riteshh@linux.ibm.com>
  2020-02-26  9:57 ` [PATCHv3 1/6] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
  2020-02-26  9:57 ` [PATCHv3 2/6] ext4: Optimize ext4_ext_precache for 0 depth Ritesh Harjani
@ 2020-02-26  9:57 ` Ritesh Harjani
  2020-02-26 12:29   ` Jan Kara
  2020-02-26  9:57 ` [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap Ritesh Harjani
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26  9:57 UTC (permalink / raw)
  To: jack, tytso, linux-ext4
  Cc: adilger.kernel, linux-fsdevel, darrick.wong, hch, cmaiolino,
	Ritesh Harjani

ext4_iomap_begin is already implemented which provides ext4_map_blocks,
so just move the API from generic_block_bmap to iomap_bmap for iomap
conversion.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3b4230cf0bc2..0f8a196d8a61 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3214,7 +3214,7 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
 			return 0;
 	}
 
-	return generic_block_bmap(mapping, block, ext4_get_block);
+	return iomap_bmap(mapping, block, &ext4_iomap_ops);
 }
 
 static int ext4_readpage(struct file *file, struct page *page)
-- 
2.21.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap
       [not found] <cover.1582702693.git.riteshh@linux.ibm.com>
                   ` (2 preceding siblings ...)
  2020-02-26  9:57 ` [PATCHv3 3/6] ext4: Move ext4 bmap to use iomap infrastructure Ritesh Harjani
@ 2020-02-26  9:57 ` Ritesh Harjani
  2020-02-26 12:39   ` Jan Kara
  2020-02-26 16:11   ` Darrick J. Wong
  2020-02-26  9:57 ` [PATCHv3 5/6] ext4: Move ext4_fiemap to use iomap framework Ritesh Harjani
  2020-02-26  9:57 ` [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST Ritesh Harjani
  5 siblings, 2 replies; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26  9:57 UTC (permalink / raw)
  To: jack, tytso, linux-ext4
  Cc: adilger.kernel, linux-fsdevel, darrick.wong, hch, cmaiolino,
	Ritesh Harjani

For indirect block mapping if the i_block > max supported block in inode
then ext4_ind_map_blocks may return a -EIO error. But in case of fiemap
this could be a valid query to ext4_map_blocks.
So in case if !create then return 0. This also makes ext4_warning to
ext4_debug in ext4_block_to_path() for the same reason.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/indirect.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index 3a4ab70fe9e0..e1ab495dd900 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -102,7 +102,11 @@ static int ext4_block_to_path(struct inode *inode,
 		offsets[n++] = i_block & (ptrs - 1);
 		final = ptrs;
 	} else {
-		ext4_warning(inode->i_sb, "block %lu > max in inode %lu",
+		/*
+		 * It's not yet an error to just query beyond max
+		 * block in inode. Fiemap callers may do so.
+		 */
+		ext4_debug("block %lu > max in inode %lu",
 			     i_block + direct_blocks +
 			     indirect_blocks + double_blocks, inode->i_ino);
 	}
@@ -537,8 +541,11 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
 	depth = ext4_block_to_path(inode, map->m_lblk, offsets,
 				   &blocks_to_boundary);
 
-	if (depth == 0)
+	if (depth == 0) {
+		if (!(flags & EXT4_GET_BLOCKS_CREATE))
+			err = 0;
 		goto out;
+	}
 
 	partial = ext4_get_branch(inode, depth, offsets, chain, &err);
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCHv3 5/6] ext4: Move ext4_fiemap to use iomap framework.
       [not found] <cover.1582702693.git.riteshh@linux.ibm.com>
                   ` (3 preceding siblings ...)
  2020-02-26  9:57 ` [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap Ritesh Harjani
@ 2020-02-26  9:57 ` Ritesh Harjani
  2020-02-26 13:27   ` Jan Kara
  2020-02-26  9:57 ` [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST Ritesh Harjani
  5 siblings, 1 reply; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26  9:57 UTC (permalink / raw)
  To: jack, tytso, linux-ext4
  Cc: adilger.kernel, linux-fsdevel, darrick.wong, hch, cmaiolino,
	Ritesh Harjani

This patch moves ext4_fiemap to use iomap framework.
For xattr a new 'ext4_iomap_xattr_ops' is added.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/extents.c | 281 +++++++---------------------------------------
 fs/ext4/inline.c  |  41 -------
 2 files changed, 41 insertions(+), 281 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 0de548bb3c90..d5a692d1c3c3 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -28,6 +28,7 @@
 #include <linux/uaccess.h>
 #include <linux/fiemap.h>
 #include <linux/backing-dev.h>
+#include <linux/iomap.h>
 #include "ext4_jbd2.h"
 #include "ext4_extents.h"
 #include "xattr.h"
@@ -97,9 +98,6 @@ static int ext4_split_extent_at(handle_t *handle,
 			     int split_flag,
 			     int flags);
 
-static int ext4_find_delayed_extent(struct inode *inode,
-				    struct extent_status *newes);
-
 static int ext4_ext_trunc_restart_fn(struct inode *inode, int *dropped)
 {
 	/*
@@ -2176,155 +2174,6 @@ int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
 	return err;
 }
 
-static int ext4_fill_fiemap_extents(struct inode *inode,
-				    ext4_lblk_t block, ext4_lblk_t num,
-				    struct fiemap_extent_info *fieinfo)
-{
-	struct ext4_ext_path *path = NULL;
-	struct ext4_extent *ex;
-	struct extent_status es;
-	ext4_lblk_t next, next_del, start = 0, end = 0;
-	ext4_lblk_t last = block + num;
-	int exists, depth = 0, err = 0;
-	unsigned int flags = 0;
-	unsigned char blksize_bits = inode->i_sb->s_blocksize_bits;
-
-	while (block < last && block != EXT_MAX_BLOCKS) {
-		num = last - block;
-		/* find extent for this block */
-		down_read(&EXT4_I(inode)->i_data_sem);
-
-		path = ext4_find_extent(inode, block, &path, 0);
-		if (IS_ERR(path)) {
-			up_read(&EXT4_I(inode)->i_data_sem);
-			err = PTR_ERR(path);
-			path = NULL;
-			break;
-		}
-
-		depth = ext_depth(inode);
-		if (unlikely(path[depth].p_hdr == NULL)) {
-			up_read(&EXT4_I(inode)->i_data_sem);
-			EXT4_ERROR_INODE(inode, "path[%d].p_hdr == NULL", depth);
-			err = -EFSCORRUPTED;
-			break;
-		}
-		ex = path[depth].p_ext;
-		next = ext4_ext_next_allocated_block(path);
-
-		flags = 0;
-		exists = 0;
-		if (!ex) {
-			/* there is no extent yet, so try to allocate
-			 * all requested space */
-			start = block;
-			end = block + num;
-		} else if (le32_to_cpu(ex->ee_block) > block) {
-			/* need to allocate space before found extent */
-			start = block;
-			end = le32_to_cpu(ex->ee_block);
-			if (block + num < end)
-				end = block + num;
-		} else if (block >= le32_to_cpu(ex->ee_block)
-					+ ext4_ext_get_actual_len(ex)) {
-			/* need to allocate space after found extent */
-			start = block;
-			end = block + num;
-			if (end >= next)
-				end = next;
-		} else if (block >= le32_to_cpu(ex->ee_block)) {
-			/*
-			 * some part of requested space is covered
-			 * by found extent
-			 */
-			start = block;
-			end = le32_to_cpu(ex->ee_block)
-				+ ext4_ext_get_actual_len(ex);
-			if (block + num < end)
-				end = block + num;
-			exists = 1;
-		} else {
-			BUG();
-		}
-		BUG_ON(end <= start);
-
-		if (!exists) {
-			es.es_lblk = start;
-			es.es_len = end - start;
-			es.es_pblk = 0;
-		} else {
-			es.es_lblk = le32_to_cpu(ex->ee_block);
-			es.es_len = ext4_ext_get_actual_len(ex);
-			es.es_pblk = ext4_ext_pblock(ex);
-			if (ext4_ext_is_unwritten(ex))
-				flags |= FIEMAP_EXTENT_UNWRITTEN;
-		}
-
-		/*
-		 * Find delayed extent and update es accordingly. We call
-		 * it even in !exists case to find out whether es is the
-		 * last existing extent or not.
-		 */
-		next_del = ext4_find_delayed_extent(inode, &es);
-		if (!exists && next_del) {
-			exists = 1;
-			flags |= (FIEMAP_EXTENT_DELALLOC |
-				  FIEMAP_EXTENT_UNKNOWN);
-		}
-		up_read(&EXT4_I(inode)->i_data_sem);
-
-		if (unlikely(es.es_len == 0)) {
-			EXT4_ERROR_INODE(inode, "es.es_len == 0");
-			err = -EFSCORRUPTED;
-			break;
-		}
-
-		/*
-		 * This is possible iff next == next_del == EXT_MAX_BLOCKS.
-		 * we need to check next == EXT_MAX_BLOCKS because it is
-		 * possible that an extent is with unwritten and delayed
-		 * status due to when an extent is delayed allocated and
-		 * is allocated by fallocate status tree will track both of
-		 * them in a extent.
-		 *
-		 * So we could return a unwritten and delayed extent, and
-		 * its block is equal to 'next'.
-		 */
-		if (next == next_del && next == EXT_MAX_BLOCKS) {
-			flags |= FIEMAP_EXTENT_LAST;
-			if (unlikely(next_del != EXT_MAX_BLOCKS ||
-				     next != EXT_MAX_BLOCKS)) {
-				EXT4_ERROR_INODE(inode,
-						 "next extent == %u, next "
-						 "delalloc extent = %u",
-						 next, next_del);
-				err = -EFSCORRUPTED;
-				break;
-			}
-		}
-
-		if (exists) {
-			err = fiemap_fill_next_extent(fieinfo,
-				(__u64)es.es_lblk << blksize_bits,
-				(__u64)es.es_pblk << blksize_bits,
-				(__u64)es.es_len << blksize_bits,
-				flags);
-			if (err < 0)
-				break;
-			if (err == 1) {
-				err = 0;
-				break;
-			}
-		}
-
-		block = es.es_lblk + es.es_len;
-	}
-
-	ext4_ext_drop_refs(path);
-	kfree(path);
-	return err;
-}
-
 static int ext4_fill_es_cache_info(struct inode *inode,
 				   ext4_lblk_t block, ext4_lblk_t num,
 				   struct fiemap_extent_info *fieinfo)
@@ -5058,64 +4907,13 @@ int ext4_convert_unwritten_io_end_vec(handle_t *handle, ext4_io_end_t *io_end)
 	return ret < 0 ? ret : err;
 }
 
-/*
- * If newes is not existing extent (newes->ec_pblk equals zero) find
- * delayed extent at start of newes and update newes accordingly and
- * return start of the next delayed extent.
- *
- * If newes is existing extent (newes->ec_pblk is not equal zero)
- * return start of next delayed extent or EXT_MAX_BLOCKS if no delayed
- * extent found. Leave newes unmodified.
- */
-static int ext4_find_delayed_extent(struct inode *inode,
-				    struct extent_status *newes)
-{
-	struct extent_status es;
-	ext4_lblk_t block, next_del;
-
-	if (newes->es_pblk == 0) {
-		ext4_es_find_extent_range(inode, &ext4_es_is_delayed,
-					  newes->es_lblk,
-					  newes->es_lblk + newes->es_len - 1,
-					  &es);
-
-		/*
-		 * No extent in extent-tree contains block @newes->es_pblk,
-		 * then the block may stay in 1)a hole or 2)delayed-extent.
-		 */
-		if (es.es_len == 0)
-			/* A hole found. */
-			return 0;
-
-		if (es.es_lblk > newes->es_lblk) {
-			/* A hole found. */
-			newes->es_len = min(es.es_lblk - newes->es_lblk,
-					    newes->es_len);
-			return 0;
-		}
-
-		newes->es_len = es.es_lblk + es.es_len - newes->es_lblk;
-	}
-
-	block = newes->es_lblk + newes->es_len;
-	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, block,
-				  EXT_MAX_BLOCKS, &es);
-	if (es.es_len == 0)
-		next_del = EXT_MAX_BLOCKS;
-	else
-		next_del = es.es_lblk;
-
-	return next_del;
-}
-
-static int ext4_xattr_fiemap(struct inode *inode,
-				struct fiemap_extent_info *fieinfo)
+static int ext4_iomap_xattr_fiemap(struct inode *inode, struct iomap *iomap)
 {
 	__u64 physical = 0;
 	__u64 length;
-	__u32 flags = FIEMAP_EXTENT_LAST;
 	int blockbits = inode->i_sb->s_blocksize_bits;
 	int error = 0;
+	u16 iomap_type;
 
 	/* in-inode? */
 	if (ext4_test_inode_state(inode, EXT4_STATE_XATTR)) {
@@ -5130,40 +4928,44 @@ static int ext4_xattr_fiemap(struct inode *inode,
 				EXT4_I(inode)->i_extra_isize;
 		physical += offset;
 		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
-		flags |= FIEMAP_EXTENT_DATA_INLINE;
 		brelse(iloc.bh);
+		iomap_type = IOMAP_INLINE;
 	} else { /* external block */
 		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
 		length = inode->i_sb->s_blocksize;
+		iomap_type = IOMAP_MAPPED;
 	}
 
-	if (physical)
-		error = fiemap_fill_next_extent(fieinfo, 0, physical,
-						length, flags);
-	return (error < 0 ? error : 0);
+	iomap->addr = physical;
+	iomap->offset = 0;
+	iomap->length = length;
+	iomap->type = iomap_type;
+	iomap->flags = 0;
+	return error;
 }
 
-static int _ext4_fiemap(struct inode *inode,
-			struct fiemap_extent_info *fieinfo,
-			__u64 start, __u64 len,
-			int (*fill)(struct inode *, ext4_lblk_t,
-				    ext4_lblk_t,
-				    struct fiemap_extent_info *))
+static int ext4_iomap_xattr_begin(struct inode *inode, loff_t offset,
+				  loff_t length, unsigned flags,
+				  struct iomap *iomap, struct iomap *srcmap)
 {
-	ext4_lblk_t start_blk;
-	u32 ext4_fiemap_flags = FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR;
+	int error;
 
-	int error = 0;
-
-	if (ext4_has_inline_data(inode)) {
-		int has_inline = 1;
+	error = ext4_iomap_xattr_fiemap(inode, iomap);
+	if (error == 0 && (offset >= iomap->length))
+		error = -ENOENT;
+	return error;
+}
 
-		error = ext4_inline_data_fiemap(inode, fieinfo, &has_inline,
-						start, len);
+const struct iomap_ops ext4_iomap_xattr_ops = {
+	.iomap_begin		= ext4_iomap_xattr_begin,
+};
 
-		if (has_inline)
-			return error;
-	}
+static int _ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+			__u64 start, __u64 len, bool from_es_cache)
+{
+	ext4_lblk_t start_blk;
+	u32 ext4_fiemap_flags = FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR;
+	int error = 0;
 
 	if (fieinfo->fi_flags & FIEMAP_FLAG_CACHE) {
 		error = ext4_ext_precache(inode);
@@ -5172,19 +4974,19 @@ static int _ext4_fiemap(struct inode *inode,
 		fieinfo->fi_flags &= ~FIEMAP_FLAG_CACHE;
 	}
 
-	/* fallback to generic here if not in extents fmt */
-	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) &&
-	    fill == ext4_fill_fiemap_extents)
-		return generic_block_fiemap(inode, fieinfo, start, len,
-			ext4_get_block);
-
-	if (fill == ext4_fill_es_cache_info)
+	if (from_es_cache)
 		ext4_fiemap_flags &= FIEMAP_FLAG_XATTR;
+
 	if (fiemap_check_flags(fieinfo, ext4_fiemap_flags))
 		return -EBADR;
 
 	if (fieinfo->fi_flags & FIEMAP_FLAG_XATTR) {
-		error = ext4_xattr_fiemap(inode, fieinfo);
+		fieinfo->fi_flags &= ~FIEMAP_FLAG_XATTR;
+		error = iomap_fiemap(inode, fieinfo, start, len,
+				     &ext4_iomap_xattr_ops);
+	} else if (!from_es_cache) {
+		error = iomap_fiemap(inode, fieinfo, start, len,
+				     &ext4_iomap_report_ops);
 	} else {
 		ext4_lblk_t len_blks;
 		__u64 last_blk;
@@ -5199,7 +5001,8 @@ static int _ext4_fiemap(struct inode *inode,
 		 * Walk the extent tree gathering extent information
 		 * and pushing extents back to the user.
 		 */
-		error = fill(inode, start_blk, len_blks, fieinfo);
+		error = ext4_fill_es_cache_info(inode, start_blk, len_blks,
+						fieinfo);
 	}
 	return error;
 }
@@ -5207,8 +5010,7 @@ static int _ext4_fiemap(struct inode *inode,
 int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len)
 {
-	return _ext4_fiemap(inode, fieinfo, start, len,
-			    ext4_fill_fiemap_extents);
+	return _ext4_fiemap(inode, fieinfo, start, len, false);
 }
 
 int ext4_get_es_cache(struct inode *inode, struct fiemap_extent_info *fieinfo,
@@ -5224,8 +5026,7 @@ int ext4_get_es_cache(struct inode *inode, struct fiemap_extent_info *fieinfo,
 			return 0;
 	}
 
-	return _ext4_fiemap(inode, fieinfo, start, len,
-			    ext4_fill_es_cache_info);
+	return _ext4_fiemap(inode, fieinfo, start, len, true);
 }
 
 
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index e61603f47035..b8b99634c832 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -1857,47 +1857,6 @@ int ext4_inline_data_iomap(struct inode *inode, struct iomap *iomap)
 	return error;
 }
 
-int ext4_inline_data_fiemap(struct inode *inode,
-			    struct fiemap_extent_info *fieinfo,
-			    int *has_inline, __u64 start, __u64 len)
-{
-	__u64 physical = 0;
-	__u64 inline_len;
-	__u32 flags = FIEMAP_EXTENT_DATA_INLINE | FIEMAP_EXTENT_NOT_ALIGNED |
-		FIEMAP_EXTENT_LAST;
-	int error = 0;
-	struct ext4_iloc iloc;
-
-	down_read(&EXT4_I(inode)->xattr_sem);
-	if (!ext4_has_inline_data(inode)) {
-		*has_inline = 0;
-		goto out;
-	}
-	inline_len = min_t(size_t, ext4_get_inline_size(inode),
-			   i_size_read(inode));
-	if (start >= inline_len)
-		goto out;
-	if (start + len < inline_len)
-		inline_len = start + len;
-	inline_len -= start;
-
-	error = ext4_get_inode_loc(inode, &iloc);
-	if (error)
-		goto out;
-
-	physical = (__u64)iloc.bh->b_blocknr << inode->i_sb->s_blocksize_bits;
-	physical += (char *)ext4_raw_inode(&iloc) - iloc.bh->b_data;
-	physical += offsetof(struct ext4_inode, i_block);
-
-	brelse(iloc.bh);
-out:
-	up_read(&EXT4_I(inode)->xattr_sem);
-	if (physical)
-		error = fiemap_fill_next_extent(fieinfo, start, physical,
-						inline_len, flags);
-	return (error < 0 ? error : 0);
-}
-
 int ext4_inline_data_truncate(struct inode *inode, int *has_inline)
 {
 	handle_t *handle;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST
       [not found] <cover.1582702693.git.riteshh@linux.ibm.com>
                   ` (4 preceding siblings ...)
  2020-02-26  9:57 ` [PATCHv3 5/6] ext4: Move ext4_fiemap to use iomap framework Ritesh Harjani
@ 2020-02-26  9:57 ` Ritesh Harjani
  2020-02-26 13:05   ` Matthew Wilcox
  5 siblings, 1 reply; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26  9:57 UTC (permalink / raw)
  To: jack, tytso, linux-ext4
  Cc: adilger.kernel, linux-fsdevel, darrick.wong, hch, cmaiolino,
	Ritesh Harjani

Currently FIEMAP_EXTENT_LAST is not working consistently across
different filesystem's fiemap implementations and thus this feature
may be broken. So fix the documentation about this flag to meet the
right expectations.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 Documentation/filesystems/fiemap.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/fiemap.txt b/Documentation/filesystems/fiemap.txt
index f6d9c99103a4..7c5d22511df7 100644
--- a/Documentation/filesystems/fiemap.txt
+++ b/Documentation/filesystems/fiemap.txt
@@ -71,8 +71,7 @@ allocated is less than would be required to map the requested range,
 the maximum number of extents that can be mapped in the fm_extent[]
 array will be returned and fm_mapped_extents will be equal to
 fm_extent_count. In that case, the last extent in the array will not
-complete the requested range and will not have the FIEMAP_EXTENT_LAST
-flag set (see the next section on extent flags).
+complete the requested range.
 
 Each extent is described by a single fiemap_extent structure as
 returned in fm_extents.
@@ -96,7 +95,7 @@ block size of the file system.  With the exception of extents flagged as
 FIEMAP_EXTENT_MERGED, adjacent extents will not be merged.
 
 The fe_flags field contains flags which describe the extent returned.
-A special flag, FIEMAP_EXTENT_LAST is always set on the last extent in
+A special flag, FIEMAP_EXTENT_LAST *may be* set on the last extent in
 the file so that the process making fiemap calls can determine when no
 more extents are available, without having to call the ioctl again.
 
@@ -115,8 +114,9 @@ data. Note that the opposite is not true - it would be valid for
 FIEMAP_EXTENT_NOT_ALIGNED to appear alone.
 
 * FIEMAP_EXTENT_LAST
-This is the last extent in the file. A mapping attempt past this
-extent will return nothing.
+This is generally the last extent in the file. A mapping attempt past this
+extent may return nothing. But the user must still confirm by trying to map
+past this extent, since different filesystems implement this differently.
 
 * FIEMAP_EXTENT_UNKNOWN
 The location of this extent is currently unknown. This may indicate
-- 
2.21.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 1/6] ext4: Add IOMAP_F_MERGED for non-extent based mapping
  2020-02-26  9:57 ` [PATCHv3 1/6] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
@ 2020-02-26 12:26   ` Jan Kara
  2020-02-26 12:33     ` Ritesh Harjani
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2020-02-26 12:26 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel,
	darrick.wong, hch, cmaiolino

On Wed 26-02-20 15:27:03, Ritesh Harjani wrote:
> IOMAP_F_MERGED needs to be set in case of non-extent based mapping.
> This is needed in later patches for conversion of ext4_fiemap to use iomap.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
>  fs/ext4/inode.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d035acab5b2a..3b4230cf0bc2 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3335,6 +3335,10 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
>  	iomap->offset = (u64) map->m_lblk << blkbits;
>  	iomap->length = (u64) map->m_len << blkbits;
>  
> +	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
> +		!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
   ^^ we indent the second line of condition either like:

	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
	    !ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))

or like:

	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
			!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))

But on at the same level as following code block because that's highly
confusing at times...

Also EXT4_MAP_UNWRITTEN is never set for indirect block mapped files (that
on disk format does not support unwritten extents).

								Honza

> +		iomap->flags |= IOMAP_F_MERGED;
> +
>  	/*
>  	 * Flags passed to ext4_map_blocks() for direct I/O writes can result
>  	 * in m_flags having both EXT4_MAP_MAPPED and EXT4_MAP_UNWRITTEN bits
> -- 
> 2.21.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 2/6] ext4: Optimize ext4_ext_precache for 0 depth
  2020-02-26  9:57 ` [PATCHv3 2/6] ext4: Optimize ext4_ext_precache for 0 depth Ritesh Harjani
@ 2020-02-26 12:28   ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2020-02-26 12:28 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel,
	darrick.wong, hch, cmaiolino

On Wed 26-02-20 15:27:04, Ritesh Harjani wrote:
> This patch avoids the memory alloc & free path when depth is 0, 
> since anyway there is no extra caching done in that case.
> So on checking depth 0, simply return early.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/extents.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index ee83fe7c98aa..0de548bb3c90 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -594,6 +594,12 @@ int ext4_ext_precache(struct inode *inode)
>  	down_read(&ei->i_data_sem);
>  	depth = ext_depth(inode);
>  
> +	/* Don't cache anything if there are no external extent blocks */
> +	if (!depth) {
> +		up_read(&ei->i_data_sem);
> +		return ret;
> +	}
> +
>  	path = kcalloc(depth + 1, sizeof(struct ext4_ext_path),
>  		       GFP_NOFS);
>  	if (path == NULL) {
> @@ -601,9 +607,6 @@ int ext4_ext_precache(struct inode *inode)
>  		return -ENOMEM;
>  	}
>  
> -	/* Don't cache anything if there are no external extent blocks */
> -	if (depth == 0)
> -		goto out;
>  	path[0].p_hdr = ext_inode_hdr(inode);
>  	ret = ext4_ext_check(inode, path[0].p_hdr, depth, 0);
>  	if (ret)
> -- 
> 2.21.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 3/6] ext4: Move ext4 bmap to use iomap infrastructure.
  2020-02-26  9:57 ` [PATCHv3 3/6] ext4: Move ext4 bmap to use iomap infrastructure Ritesh Harjani
@ 2020-02-26 12:29   ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2020-02-26 12:29 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel,
	darrick.wong, hch, cmaiolino

On Wed 26-02-20 15:27:05, Ritesh Harjani wrote:
> ext4_iomap_begin is already implemented which provides ext4_map_blocks,
> so just move the API from generic_block_bmap to iomap_bmap for iomap
> conversion.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 3b4230cf0bc2..0f8a196d8a61 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3214,7 +3214,7 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
>  			return 0;
>  	}
>  
> -	return generic_block_bmap(mapping, block, ext4_get_block);
> +	return iomap_bmap(mapping, block, &ext4_iomap_ops);
>  }
>  
>  static int ext4_readpage(struct file *file, struct page *page)
> -- 
> 2.21.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 1/6] ext4: Add IOMAP_F_MERGED for non-extent based mapping
  2020-02-26 12:26   ` Jan Kara
@ 2020-02-26 12:33     ` Ritesh Harjani
  0 siblings, 0 replies; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26 12:33 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, linux-ext4, adilger.kernel, linux-fsdevel, darrick.wong,
	hch, cmaiolino



On 2/26/20 5:56 PM, Jan Kara wrote:
> On Wed 26-02-20 15:27:03, Ritesh Harjani wrote:
>> IOMAP_F_MERGED needs to be set in case of non-extent based mapping.
>> This is needed in later patches for conversion of ext4_fiemap to use iomap.
>>
>> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>> ---
>>   fs/ext4/inode.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index d035acab5b2a..3b4230cf0bc2 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -3335,6 +3335,10 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
>>   	iomap->offset = (u64) map->m_lblk << blkbits;
>>   	iomap->length = (u64) map->m_len << blkbits;
>>   
>> +	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
>> +		!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
>     ^^ we indent the second line of condition either like:
> 
> 	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
> 	    !ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))

Yes, I too prefer it this way. Will fix it.

> 
> or like:
> 
> 	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
> 			!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
> 
> But on at the same level as following code block because that's highly
> confusing at times...
> 
> Also EXT4_MAP_UNWRITTEN is never set for indirect block mapped files (that
> on disk format does not support unwritten extents).

Aah yes. Let me fix it.

> 
> 								Honza
> 
>> +		iomap->flags |= IOMAP_F_MERGED;
>> +
>>   	/*
>>   	 * Flags passed to ext4_map_blocks() for direct I/O writes can result
>>   	 * in m_flags having both EXT4_MAP_MAPPED and EXT4_MAP_UNWRITTEN bits
>> -- 
>> 2.21.0
>>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap
  2020-02-26  9:57 ` [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap Ritesh Harjani
@ 2020-02-26 12:39   ` Jan Kara
  2020-02-26 12:47     ` Ritesh Harjani
  2020-02-26 16:11   ` Darrick J. Wong
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Kara @ 2020-02-26 12:39 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel,
	darrick.wong, hch, cmaiolino

On Wed 26-02-20 15:27:06, Ritesh Harjani wrote:
> For indirect block mapping if the i_block > max supported block in inode
> then ext4_ind_map_blocks may return a -EIO error. But in case of fiemap
> this could be a valid query to ext4_map_blocks.
> So in case if !create then return 0. This also makes ext4_warning to
> ext4_debug in ext4_block_to_path() for the same reason.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Hmm, won't it be cleaner to just handle this in ext4_iomap_begin_report()?
We do trim map.m_len there anyway so it is only logical to trim it to
proper value supported by the inode on-disk format... BTW, note we have
sbi->s_bitmap_maxbytes value already computed in the superblock...

								Honza

> ---
>  fs/ext4/indirect.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index 3a4ab70fe9e0..e1ab495dd900 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -102,7 +102,11 @@ static int ext4_block_to_path(struct inode *inode,
>  		offsets[n++] = i_block & (ptrs - 1);
>  		final = ptrs;
>  	} else {
> -		ext4_warning(inode->i_sb, "block %lu > max in inode %lu",
> +		/*
> +		 * It's not yet an error to just query beyond max
> +		 * block in inode. Fiemap callers may do so.
> +		 */
> +		ext4_debug("block %lu > max in inode %lu",
>  			     i_block + direct_blocks +
>  			     indirect_blocks + double_blocks, inode->i_ino);
>  	}
> @@ -537,8 +541,11 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>  	depth = ext4_block_to_path(inode, map->m_lblk, offsets,
>  				   &blocks_to_boundary);
>  
> -	if (depth == 0)
> +	if (depth == 0) {
> +		if (!(flags & EXT4_GET_BLOCKS_CREATE))
> +			err = 0;
>  		goto out;
> +	}
>  
>  	partial = ext4_get_branch(inode, depth, offsets, chain, &err);
>  
> -- 
> 2.21.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap
  2020-02-26 12:39   ` Jan Kara
@ 2020-02-26 12:47     ` Ritesh Harjani
  0 siblings, 0 replies; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-26 12:47 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, linux-ext4, adilger.kernel, linux-fsdevel, darrick.wong,
	hch, cmaiolino



On 2/26/20 6:09 PM, Jan Kara wrote:
> On Wed 26-02-20 15:27:06, Ritesh Harjani wrote:
>> For indirect block mapping if the i_block > max supported block in inode
>> then ext4_ind_map_blocks may return a -EIO error. But in case of fiemap
>> this could be a valid query to ext4_map_blocks.
>> So in case if !create then return 0. This also makes ext4_warning to
>> ext4_debug in ext4_block_to_path() for the same reason.
>>
>> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> 
> Hmm, won't it be cleaner to just handle this in ext4_iomap_begin_report()?
> We do trim map.m_len there anyway so it is only logical to trim it to
> proper value supported by the inode on-disk format... BTW, note we have
> sbi->s_bitmap_maxbytes value already computed in the superblock...

hmm. Yes, thanks for the pointers. Let me check this again.

-ritesh


> 
> 								Honza
> 
>> ---
>>   fs/ext4/indirect.c | 11 +++++++++--
>>   1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
>> index 3a4ab70fe9e0..e1ab495dd900 100644
>> --- a/fs/ext4/indirect.c
>> +++ b/fs/ext4/indirect.c
>> @@ -102,7 +102,11 @@ static int ext4_block_to_path(struct inode *inode,
>>   		offsets[n++] = i_block & (ptrs - 1);
>>   		final = ptrs;
>>   	} else {
>> -		ext4_warning(inode->i_sb, "block %lu > max in inode %lu",
>> +		/*
>> +		 * It's not yet an error to just query beyond max
>> +		 * block in inode. Fiemap callers may do so.
>> +		 */
>> +		ext4_debug("block %lu > max in inode %lu",
>>   			     i_block + direct_blocks +
>>   			     indirect_blocks + double_blocks, inode->i_ino);
>>   	}
>> @@ -537,8 +541,11 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>>   	depth = ext4_block_to_path(inode, map->m_lblk, offsets,
>>   				   &blocks_to_boundary);
>>   
>> -	if (depth == 0)
>> +	if (depth == 0) {
>> +		if (!(flags & EXT4_GET_BLOCKS_CREATE))
>> +			err = 0;
>>   		goto out;
>> +	}
>>   
>>   	partial = ext4_get_branch(inode, depth, offsets, chain, &err);
>>   
>> -- 
>> 2.21.0
>>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST
  2020-02-26  9:57 ` [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST Ritesh Harjani
@ 2020-02-26 13:05   ` Matthew Wilcox
  2020-02-26 16:17     ` Darrick J. Wong
  0 siblings, 1 reply; 21+ messages in thread
From: Matthew Wilcox @ 2020-02-26 13:05 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel,
	darrick.wong, hch, cmaiolino

On Wed, Feb 26, 2020 at 03:27:08PM +0530, Ritesh Harjani wrote:
> Currently FIEMAP_EXTENT_LAST is not working consistently across
> different filesystem's fiemap implementations and thus this feature
> may be broken. So fix the documentation about this flag to meet the
> right expectations.

Are you saying filesystems have both false positives and false negatives?
I can understand how a filesystem might fail to set FIEMAP_EXTENT_LAST,
but not how a filesystem might set it when there's actually another
extent beyond this one.

>  * FIEMAP_EXTENT_LAST
> -This is the last extent in the file. A mapping attempt past this
> -extent will return nothing.
> +This is generally the last extent in the file. A mapping attempt past this
> +extent may return nothing. But the user must still confirm by trying to map
> +past this extent, since different filesystems implement this differently.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 5/6] ext4: Move ext4_fiemap to use iomap framework.
  2020-02-26  9:57 ` [PATCHv3 5/6] ext4: Move ext4_fiemap to use iomap framework Ritesh Harjani
@ 2020-02-26 13:27   ` Jan Kara
  2020-02-27  5:38     ` Ritesh Harjani
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2020-02-26 13:27 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel,
	darrick.wong, hch, cmaiolino

On Wed 26-02-20 15:27:07, Ritesh Harjani wrote:
> This patch moves ext4_fiemap to use iomap framework.
> For xattr a new 'ext4_iomap_xattr_ops' is added.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

...

> -static int ext4_xattr_fiemap(struct inode *inode,
> -				struct fiemap_extent_info *fieinfo)
> +static int ext4_iomap_xattr_fiemap(struct inode *inode, struct iomap *iomap)
>  {
>  	__u64 physical = 0;
>  	__u64 length;
> -	__u32 flags = FIEMAP_EXTENT_LAST;
>  	int blockbits = inode->i_sb->s_blocksize_bits;
>  	int error = 0;
> +	u16 iomap_type;
>  
>  	/* in-inode? */
>  	if (ext4_test_inode_state(inode, EXT4_STATE_XATTR)) {
> @@ -5130,40 +4928,44 @@ static int ext4_xattr_fiemap(struct inode *inode,
>  				EXT4_I(inode)->i_extra_isize;
>  		physical += offset;
>  		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
> -		flags |= FIEMAP_EXTENT_DATA_INLINE;
>  		brelse(iloc.bh);
> +		iomap_type = IOMAP_INLINE;
>  	} else { /* external block */
>  		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
>  		length = inode->i_sb->s_blocksize;
> +		iomap_type = IOMAP_MAPPED;
>  	}

If i_file_acl is 0 (i.e., no external xattr block), then I think returned
iomap should be different...

> +static int ext4_iomap_xattr_begin(struct inode *inode, loff_t offset,
> +				  loff_t length, unsigned flags,
> +				  struct iomap *iomap, struct iomap *srcmap)
>  {
> -	ext4_lblk_t start_blk;
> -	u32 ext4_fiemap_flags = FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR;
> +	int error;
>  
> -	int error = 0;
> -
> -	if (ext4_has_inline_data(inode)) {
> -		int has_inline = 1;
> +	error = ext4_iomap_xattr_fiemap(inode, iomap);
> +	if (error == 0 && (offset >= iomap->length))
> +		error = -ENOENT;

Is ENOENT really correct here? It seems strange to me...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap
  2020-02-26  9:57 ` [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap Ritesh Harjani
  2020-02-26 12:39   ` Jan Kara
@ 2020-02-26 16:11   ` Darrick J. Wong
  2020-02-27  5:27     ` Ritesh Harjani
  1 sibling, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2020-02-26 16:11 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel, hch, cmaiolino

On Wed, Feb 26, 2020 at 03:27:06PM +0530, Ritesh Harjani wrote:
> For indirect block mapping if the i_block > max supported block in inode
> then ext4_ind_map_blocks may return a -EIO error. But in case of fiemap
> this could be a valid query to ext4_map_blocks.
> So in case if !create then return 0. This also makes ext4_warning to
> ext4_debug in ext4_block_to_path() for the same reason.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
>  fs/ext4/indirect.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index 3a4ab70fe9e0..e1ab495dd900 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -102,7 +102,11 @@ static int ext4_block_to_path(struct inode *inode,
>  		offsets[n++] = i_block & (ptrs - 1);
>  		final = ptrs;
>  	} else {
> -		ext4_warning(inode->i_sb, "block %lu > max in inode %lu",
> +		/*
> +		 * It's not yet an error to just query beyond max
> +		 * block in inode. Fiemap callers may do so.
> +		 */
> +		ext4_debug("block %lu > max in inode %lu",
>  			     i_block + direct_blocks +
>  			     indirect_blocks + double_blocks, inode->i_ino);

Does that mean fiemap callers can spamflood dmesg with this message just
by setting the query start range to a huge value?

--D

>  	}
> @@ -537,8 +541,11 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>  	depth = ext4_block_to_path(inode, map->m_lblk, offsets,
>  				   &blocks_to_boundary);
>  
> -	if (depth == 0)
> +	if (depth == 0) {
> +		if (!(flags & EXT4_GET_BLOCKS_CREATE))
> +			err = 0;
>  		goto out;
> +	}
>  
>  	partial = ext4_get_branch(inode, depth, offsets, chain, &err);
>  
> -- 
> 2.21.0
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST
  2020-02-26 13:05   ` Matthew Wilcox
@ 2020-02-26 16:17     ` Darrick J. Wong
  2020-02-26 17:26       ` Jan Kara
  2020-02-27  5:17       ` Ritesh Harjani
  0 siblings, 2 replies; 21+ messages in thread
From: Darrick J. Wong @ 2020-02-26 16:17 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Ritesh Harjani, jack, tytso, linux-ext4, adilger.kernel,
	linux-fsdevel, hch, cmaiolino

On Wed, Feb 26, 2020 at 05:05:03AM -0800, Matthew Wilcox wrote:
> On Wed, Feb 26, 2020 at 03:27:08PM +0530, Ritesh Harjani wrote:
> > Currently FIEMAP_EXTENT_LAST is not working consistently across
> > different filesystem's fiemap implementations and thus this feature
> > may be broken. So fix the documentation about this flag to meet the
> > right expectations.
> 
> Are you saying filesystems have both false positives and false negatives?
> I can understand how a filesystem might fail to set FIEMAP_EXTENT_LAST,
> but not how a filesystem might set it when there's actually another
> extent beyond this one.
> 
> >  * FIEMAP_EXTENT_LAST
> > -This is the last extent in the file. A mapping attempt past this
> > -extent will return nothing.
> > +This is generally the last extent in the file. A mapping attempt past this
> > +extent may return nothing. But the user must still confirm by trying to map
> > +past this extent, since different filesystems implement this differently.

"This flag means nothing and can be set arbitrarily by the fs for the lulz."

Yuck.  I was really hoping for "This is set on the last extent record in
the dataset generated by the query parameters", particularly becaue
that's how e2fsprogs utilties interpret that flag.

--D

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST
  2020-02-26 16:17     ` Darrick J. Wong
@ 2020-02-26 17:26       ` Jan Kara
  2020-02-27  3:24         ` Dave Chinner
  2020-02-27  5:17       ` Ritesh Harjani
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Kara @ 2020-02-26 17:26 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Matthew Wilcox, Ritesh Harjani, jack, tytso, linux-ext4,
	adilger.kernel, linux-fsdevel, hch, cmaiolino

On Wed 26-02-20 08:17:42, Darrick J. Wong wrote:
> On Wed, Feb 26, 2020 at 05:05:03AM -0800, Matthew Wilcox wrote:
> > On Wed, Feb 26, 2020 at 03:27:08PM +0530, Ritesh Harjani wrote:
> > > Currently FIEMAP_EXTENT_LAST is not working consistently across
> > > different filesystem's fiemap implementations and thus this feature
> > > may be broken. So fix the documentation about this flag to meet the
> > > right expectations.
> > 
> > Are you saying filesystems have both false positives and false negatives?
> > I can understand how a filesystem might fail to set FIEMAP_EXTENT_LAST,
> > but not how a filesystem might set it when there's actually another
> > extent beyond this one.
> > 
> > >  * FIEMAP_EXTENT_LAST
> > > -This is the last extent in the file. A mapping attempt past this
> > > -extent will return nothing.
> > > +This is generally the last extent in the file. A mapping attempt past this
> > > +extent may return nothing. But the user must still confirm by trying to map
> > > +past this extent, since different filesystems implement this differently.
> 
> "This flag means nothing and can be set arbitrarily by the fs for the lulz."
> 
> Yuck.  I was really hoping for "This is set on the last extent record in
> the dataset generated by the query parameters", particularly becaue
> that's how e2fsprogs utilties interpret that flag.

The problem is that in some cases, we cannot determine whether the flag
should be set without doing work equivalent to another fiemap call. And it
just seems pointless to try to provide the flag in those cases.

But I agree with Matthew that seeing the flag without the extent really
being the last one shouldn't happen (unless someone is changing the file).

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST
  2020-02-26 17:26       ` Jan Kara
@ 2020-02-27  3:24         ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2020-02-27  3:24 UTC (permalink / raw)
  To: Jan Kara
  Cc: Darrick J. Wong, Matthew Wilcox, Ritesh Harjani, tytso,
	linux-ext4, adilger.kernel, linux-fsdevel, hch, cmaiolino

On Wed, Feb 26, 2020 at 06:26:53PM +0100, Jan Kara wrote:
> On Wed 26-02-20 08:17:42, Darrick J. Wong wrote:
> > On Wed, Feb 26, 2020 at 05:05:03AM -0800, Matthew Wilcox wrote:
> > > On Wed, Feb 26, 2020 at 03:27:08PM +0530, Ritesh Harjani wrote:
> > > > Currently FIEMAP_EXTENT_LAST is not working consistently across
> > > > different filesystem's fiemap implementations and thus this feature
> > > > may be broken. So fix the documentation about this flag to meet the
> > > > right expectations.
> > > 
> > > Are you saying filesystems have both false positives and false negatives?
> > > I can understand how a filesystem might fail to set FIEMAP_EXTENT_LAST,
> > > but not how a filesystem might set it when there's actually another
> > > extent beyond this one.
> > > 
> > > >  * FIEMAP_EXTENT_LAST
> > > > -This is the last extent in the file. A mapping attempt past this
> > > > -extent will return nothing.
> > > > +This is generally the last extent in the file. A mapping attempt past this
> > > > +extent may return nothing. But the user must still confirm by trying to map
> > > > +past this extent, since different filesystems implement this differently.
> > 
> > "This flag means nothing and can be set arbitrarily by the fs for the lulz."
> > 
> > Yuck.  I was really hoping for "This is set on the last extent record in
> > the dataset generated by the query parameters", particularly becaue
> > that's how e2fsprogs utilties interpret that flag.
> 
> The problem is that in some cases, we cannot determine whether the flag
> should be set without doing work equivalent to another fiemap call. And it
> just seems pointless to try to provide the flag in those cases.
> 
> But I agree with Matthew that seeing the flag without the extent really
> being the last one shouldn't happen (unless someone is changing the file).

Which, of course, can happen. And it can even be the kernel (e.g.
data writeback converting delalloc extents).

IOWs, the information returned to userspace is out of date before it
even gets to userspace, so userspace can't rely on the LAST flag to
be correct in any situation. That's just an extension of the fact
that userspace cannot rely on the extent map being correct and
accurate in any situation....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST
  2020-02-26 16:17     ` Darrick J. Wong
  2020-02-26 17:26       ` Jan Kara
@ 2020-02-27  5:17       ` Ritesh Harjani
  1 sibling, 0 replies; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-27  5:17 UTC (permalink / raw)
  To: Darrick J. Wong, Matthew Wilcox
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel, hch, cmaiolino



On 2/26/20 9:47 PM, Darrick J. Wong wrote:
> On Wed, Feb 26, 2020 at 05:05:03AM -0800, Matthew Wilcox wrote:
>> On Wed, Feb 26, 2020 at 03:27:08PM +0530, Ritesh Harjani wrote:
>>> Currently FIEMAP_EXTENT_LAST is not working consistently across
>>> different filesystem's fiemap implementations and thus this feature
>>> may be broken. So fix the documentation about this flag to meet the
>>> right expectations.
>>
>> Are you saying filesystems have both false positives and false negatives?
>> I can understand how a filesystem might fail to set FIEMAP_EXTENT_LAST,
>> but not how a filesystem might set it when there's actually another
>> extent beyond this one.
>>
>>>   * FIEMAP_EXTENT_LAST
>>> -This is the last extent in the file. A mapping attempt past this
>>> -extent will return nothing.
>>> +This is generally the last extent in the file. A mapping attempt past this
>>> +extent may return nothing. But the user must still confirm by trying to map
>>> +past this extent, since different filesystems implement this differently.
> 
> "This flag means nothing and can be set arbitrarily by the fs for the lulz."
> 

:) Got it. Will add more information to it.


> Yuck.  I was really hoping for "This is set on the last extent record in
> the dataset generated by the query parameters", particularly becaue
> that's how e2fsprogs utilties interpret that flag.

-extent may return nothing. But the user must still confirm by trying to map
-past this extent, since different filesystems implement this differently.
+extent may return nothing. In some implementations this flag is also set on
+the last dataset queried by the user (via fiemap->fm_length).


Let me know if above looks good.

-ritesh


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap
  2020-02-26 16:11   ` Darrick J. Wong
@ 2020-02-27  5:27     ` Ritesh Harjani
  0 siblings, 0 replies; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-27  5:27 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: jack, tytso, linux-ext4, adilger.kernel, linux-fsdevel, hch, cmaiolino



On 2/26/20 9:41 PM, Darrick J. Wong wrote:
> On Wed, Feb 26, 2020 at 03:27:06PM +0530, Ritesh Harjani wrote:
>> For indirect block mapping if the i_block > max supported block in inode
>> then ext4_ind_map_blocks may return a -EIO error. But in case of fiemap
>> this could be a valid query to ext4_map_blocks.
>> So in case if !create then return 0. This also makes ext4_warning to
>> ext4_debug in ext4_block_to_path() for the same reason.
>>
>> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>> ---
>>   fs/ext4/indirect.c | 11 +++++++++--
>>   1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
>> index 3a4ab70fe9e0..e1ab495dd900 100644
>> --- a/fs/ext4/indirect.c
>> +++ b/fs/ext4/indirect.c
>> @@ -102,7 +102,11 @@ static int ext4_block_to_path(struct inode *inode,
>>   		offsets[n++] = i_block & (ptrs - 1);
>>   		final = ptrs;
>>   	} else {
>> -		ext4_warning(inode->i_sb, "block %lu > max in inode %lu",
>> +		/*
>> +		 * It's not yet an error to just query beyond max
>> +		 * block in inode. Fiemap callers may do so.
>> +		 */
>> +		ext4_debug("block %lu > max in inode %lu",
>>   			     i_block + direct_blocks +
>>   			     indirect_blocks + double_blocks, inode->i_ino);
> 
> Does that mean fiemap callers can spamflood dmesg with this message just
> by setting the query start range to a huge value?

Not in the old implementation. But This could happen with indirect
block mapping with new implementation in iomap (as there is no check in 
place before calling ext4_map_blocks()).
Previously __generic_block_fiemap() used to not query beyond
i_size_read(), so we were safe there.

So yes now as Jan also suggested, will add a check in place in
ext4_iomap_begin_report() itself, so that this flooding wont happen.


Thanks for the review!!

-ritesh

> 
> --D
> 
>>   	}
>> @@ -537,8 +541,11 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>>   	depth = ext4_block_to_path(inode, map->m_lblk, offsets,
>>   				   &blocks_to_boundary);
>>   
>> -	if (depth == 0)
>> +	if (depth == 0) {
>> +		if (!(flags & EXT4_GET_BLOCKS_CREATE))
>> +			err = 0;
>>   		goto out;
>> +	}
>>   
>>   	partial = ext4_get_branch(inode, depth, offsets, chain, &err);
>>   
>> -- 
>> 2.21.0
>>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCHv3 5/6] ext4: Move ext4_fiemap to use iomap framework.
  2020-02-26 13:27   ` Jan Kara
@ 2020-02-27  5:38     ` Ritesh Harjani
  0 siblings, 0 replies; 21+ messages in thread
From: Ritesh Harjani @ 2020-02-27  5:38 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, linux-ext4, adilger.kernel, linux-fsdevel, darrick.wong,
	hch, cmaiolino



On 2/26/20 6:57 PM, Jan Kara wrote:
> On Wed 26-02-20 15:27:07, Ritesh Harjani wrote:
>> This patch moves ext4_fiemap to use iomap framework.
>> For xattr a new 'ext4_iomap_xattr_ops' is added.
>>
>> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> 
> ...
> 
>> -static int ext4_xattr_fiemap(struct inode *inode,
>> -				struct fiemap_extent_info *fieinfo)
>> +static int ext4_iomap_xattr_fiemap(struct inode *inode, struct iomap *iomap)
>>   {
>>   	__u64 physical = 0;
>>   	__u64 length;
>> -	__u32 flags = FIEMAP_EXTENT_LAST;
>>   	int blockbits = inode->i_sb->s_blocksize_bits;
>>   	int error = 0;
>> +	u16 iomap_type;
>>   
>>   	/* in-inode? */
>>   	if (ext4_test_inode_state(inode, EXT4_STATE_XATTR)) {
>> @@ -5130,40 +4928,44 @@ static int ext4_xattr_fiemap(struct inode *inode,
>>   				EXT4_I(inode)->i_extra_isize;
>>   		physical += offset;
>>   		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
>> -		flags |= FIEMAP_EXTENT_DATA_INLINE;
>>   		brelse(iloc.bh);
>> +		iomap_type = IOMAP_INLINE;
>>   	} else { /* external block */
>>   		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
>>   		length = inode->i_sb->s_blocksize;
>> +		iomap_type = IOMAP_MAPPED;
>>   	}
> 
> If i_file_acl is 0 (i.e., no external xattr block), then I think returned
> iomap should be different...

Yes, my bad. Let me fix this please.


> 
>> +static int ext4_iomap_xattr_begin(struct inode *inode, loff_t offset,
>> +				  loff_t length, unsigned flags,
>> +				  struct iomap *iomap, struct iomap *srcmap)
>>   {
>> -	ext4_lblk_t start_blk;
>> -	u32 ext4_fiemap_flags = FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR;
>> +	int error;
>>   
>> -	int error = 0;
>> -
>> -	if (ext4_has_inline_data(inode)) {
>> -		int has_inline = 1;
>> +	error = ext4_iomap_xattr_fiemap(inode, iomap);
>> +	if (error == 0 && (offset >= iomap->length))
>> +		error = -ENOENT;
> 
> Is ENOENT really correct here? It seems strange to me...
> 

Yes, -ENOENT means that there is no mapping beyond this and so iomap
core will stop querying further in iomap_fiemap.
I see that we do the same in case of inline mapping too in
ext4_iomap_begin_report().

-ritesh


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, back to index

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <cover.1582702693.git.riteshh@linux.ibm.com>
2020-02-26  9:57 ` [PATCHv3 1/6] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
2020-02-26 12:26   ` Jan Kara
2020-02-26 12:33     ` Ritesh Harjani
2020-02-26  9:57 ` [PATCHv3 2/6] ext4: Optimize ext4_ext_precache for 0 depth Ritesh Harjani
2020-02-26 12:28   ` Jan Kara
2020-02-26  9:57 ` [PATCHv3 3/6] ext4: Move ext4 bmap to use iomap infrastructure Ritesh Harjani
2020-02-26 12:29   ` Jan Kara
2020-02-26  9:57 ` [PATCHv3 4/6] ext4: Make ext4_ind_map_blocks work with fiemap Ritesh Harjani
2020-02-26 12:39   ` Jan Kara
2020-02-26 12:47     ` Ritesh Harjani
2020-02-26 16:11   ` Darrick J. Wong
2020-02-27  5:27     ` Ritesh Harjani
2020-02-26  9:57 ` [PATCHv3 5/6] ext4: Move ext4_fiemap to use iomap framework Ritesh Harjani
2020-02-26 13:27   ` Jan Kara
2020-02-27  5:38     ` Ritesh Harjani
2020-02-26  9:57 ` [PATCHv3 6/6] Documentation: Correct the description of FIEMAP_EXTENT_LAST Ritesh Harjani
2020-02-26 13:05   ` Matthew Wilcox
2020-02-26 16:17     ` Darrick J. Wong
2020-02-26 17:26       ` Jan Kara
2020-02-27  3:24         ` Dave Chinner
2020-02-27  5:17       ` Ritesh Harjani

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git