linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
@ 2020-01-28 10:18 Ritesh Harjani
  2020-01-28 10:18 ` [RFCv2 1/4] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-28 10:18 UTC (permalink / raw)
  To: jack, tytso, adilger.kernel, linux-ext4
  Cc: linux-fsdevel, darrick.wong, hch, cmaiolino, Ritesh Harjani

Hello All,

Background
==========
There are RFCv2 patches to move ext4 bmap & fiemap calls to use iomap APIs.
This reduces the users of ext4_get_block API and thus a step towards getting
rid of buffer_heads from ext4. Also reduces a lot of code by making use of
existing iomap_ops (except for xattr implementation).

Testing (done on ext4 master branch)
========
'xfstests -g auto' passes with default mkfs/mount configuration
(v/s which also pass with vanilla kernel without this patch). Except
generic/473 which also failes on XFS. This seems to be the test case issue
since it expects the data in slightly different way as compared to what iomap
returns.
Point 2.a below describes more about this.

Observations/Review required
============================
1. bmap related old v/s new method differences:-
	a. In case if addr > INT_MAX, it issues a warning and
	   returns 0 as the block no. While earlier it used to return the
	   truncated value with no warning.
	b. block no. is only returned in case of iomap->type is IOMAP_MAPPED,
	   but not when iomap->type is IOMAP_UNWRITTEN. While with previously
	   we used to get block no. for both of above cases.

2. Fiemap related old v/s new method differences:-
	a. iomap_fiemap returns the disk extent information in exact
	   correspondence with start of user requested logical offset till the
	   length requested by user. While in previous implementation the
	   returned information used to give the complete extent information if
	   the range requested by user lies in between the extent mapping.
	b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
	   fiemap_extent mapping range requested by the user via fm_length (
	   if that has a valid mapped extent on the disk). But if the user
	   requested for more fm_length which could not be mapped in the last
	   fiemap_extent, then the flag is not set.
	   

e.g. output for above differences 2.a & 2.b
===========================================
create a file with below cmds. 
1. fallocate -o 0 -l 8K testfile.txt
2. xfs_io -c "pwrite 8K 8K" testfile.txt
3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
4. run this binary on with and without these patches:- ./a.out (test_fiemap_diff.c) [4]

o/p of xfs_io -c "fiemap -v"
============================================
With this patch on patched kernel:-
testfile.txt:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..15]:         122802736..122802751    16 0x800
   1: [16..31]:        122687536..122687551    16   0x1

without patch on vanilla kernel (no difference):-
testfile.txt:
 EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
   0: [0..15]:         332211376..332211391    16 0x800
   1: [16..31]:        332722392..332722407    16   0x1


o/p of a.out without patch:-
================
riteshh-> ./a.out 
logical: [       0..      15] phys: 332211376..332211391 flags: 0x800 tot: 16
(0) extent flag = 2048

o/p of a.out with patch (both point 2.a & 2.b could be seen)
=======================
riteshh-> ./a.out
logical: [       0..       7] phys: 122802736..122802743 flags: 0x801 tot: 8
(0) extent flag = 2049

FYI - In test_fiemap_diff.c test we had 
a. fm_extent_count = 1
b. fm_start = 0
c. fm_length = 4K
Whereas when we change fm_extent_count = 32, then we don't see any difference.

e.g. output for above difference listed in point 1.b
====================================================

o/p without patch (block no returned for unwritten block as well)
=========Testing IOCTL FIBMAP=========
File size = 16384, blkcnt = 4, blocksize = 4096
  0   41526422
  1   41526423
  2   41590299
  3   41590300

o/p with patch (0 returned for unwritten block)
=========Testing IOCTL FIBMAP=========
File size = 16384, blkcnt = 4, blocksize = 4096
  0          0          0
  1          0          0
  2   15335942      29953
  3   15335943      29953


Summary:-
========
Due to some of the observational differences to user, listed above,
requesting to please help with a careful review in moving this to iomap.
Digging into some older threads, it looks like these differences should be fine,
since the same tools have been working fine with XFS (which uses iomap based
implementation) [1]
Also as Ted suggested in [3]: Fiemap & bmap spec could be made based on the ext4
implementation. But since all the tools also work with xfs which uses iomap
based fiemap, so we should be good there.


References of some previous discussions:
=======================================
[1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
[2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
[3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
[4]: https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c
[RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html 


Ritesh Harjani (4):
  ext4: Add IOMAP_F_MERGED for non-extent based mapping
  ext4: Optimize ext4_ext_precache for 0 depth
  ext4: Move ext4 bmap to use iomap infrastructure.
  ext4: Move ext4_fiemap to use iomap infrastructure

 fs/ext4/extents.c | 288 +++++++---------------------------------------
 fs/ext4/inline.c  |  41 -------
 fs/ext4/inode.c   |   6 +-
 3 files changed, 49 insertions(+), 286 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFCv2 1/4] ext4: Add IOMAP_F_MERGED for non-extent based mapping
  2020-01-28 10:18 [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Ritesh Harjani
@ 2020-01-28 10:18 ` Ritesh Harjani
  2020-01-28 10:18 ` [RFCv2 2/4] ext4: Optimize ext4_ext_precache for 0 depth Ritesh Harjani
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-28 10:18 UTC (permalink / raw)
  To: jack, tytso, adilger.kernel, linux-ext4
  Cc: linux-fsdevel, darrick.wong, hch, cmaiolino, Ritesh Harjani

IOMAP_F_MERGED needs to be set in case of non-extent based mapping. This
is also needed for conversion of fiemap to use iomap.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/inode.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d035acab5b2a..3b4230cf0bc2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3335,6 +3335,10 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
 	iomap->offset = (u64) map->m_lblk << blkbits;
 	iomap->length = (u64) map->m_len << blkbits;
 
+	if ((map->m_flags & (EXT4_MAP_UNWRITTEN | EXT4_MAP_MAPPED)) &&
+		!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
+		iomap->flags |= IOMAP_F_MERGED;
+
 	/*
 	 * Flags passed to ext4_map_blocks() for direct I/O writes can result
 	 * in m_flags having both EXT4_MAP_MAPPED and EXT4_MAP_UNWRITTEN bits
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFCv2 2/4] ext4: Optimize ext4_ext_precache for 0 depth
  2020-01-28 10:18 [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Ritesh Harjani
  2020-01-28 10:18 ` [RFCv2 1/4] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
@ 2020-01-28 10:18 ` Ritesh Harjani
  2020-01-28 10:18 ` [RFCv2 3/4] ext4: Move ext4 bmap to use iomap infrastructure Ritesh Harjani
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-28 10:18 UTC (permalink / raw)
  To: jack, tytso, adilger.kernel, linux-ext4
  Cc: linux-fsdevel, darrick.wong, hch, cmaiolino, Ritesh Harjani

This patch avoids the memory alloc & free path when depth is 0,
since anyway there is no extra caching done in that case.
So on checking depth 0, simply return early.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/extents.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ee83fe7c98aa..0de548bb3c90 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -594,6 +594,12 @@ int ext4_ext_precache(struct inode *inode)
 	down_read(&ei->i_data_sem);
 	depth = ext_depth(inode);
 
+	/* Don't cache anything if there are no external extent blocks */
+	if (!depth) {
+		up_read(&ei->i_data_sem);
+		return ret;
+	}
+
 	path = kcalloc(depth + 1, sizeof(struct ext4_ext_path),
 		       GFP_NOFS);
 	if (path == NULL) {
@@ -601,9 +607,6 @@ int ext4_ext_precache(struct inode *inode)
 		return -ENOMEM;
 	}
 
-	/* Don't cache anything if there are no external extent blocks */
-	if (depth == 0)
-		goto out;
 	path[0].p_hdr = ext_inode_hdr(inode);
 	ret = ext4_ext_check(inode, path[0].p_hdr, depth, 0);
 	if (ret)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFCv2 3/4] ext4: Move ext4 bmap to use iomap infrastructure.
  2020-01-28 10:18 [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Ritesh Harjani
  2020-01-28 10:18 ` [RFCv2 1/4] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
  2020-01-28 10:18 ` [RFCv2 2/4] ext4: Optimize ext4_ext_precache for 0 depth Ritesh Harjani
@ 2020-01-28 10:18 ` Ritesh Harjani
  2020-01-28 10:18 ` [RFCv2 4/4] ext4: Move ext4_fiemap " Ritesh Harjani
  2020-01-30 16:00 ` [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Darrick J. Wong
  4 siblings, 0 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-28 10:18 UTC (permalink / raw)
  To: jack, tytso, adilger.kernel, linux-ext4
  Cc: linux-fsdevel, darrick.wong, hch, cmaiolino, Ritesh Harjani

ext4_iomap_begin is already implemented which provides
ext4_map_blocks, so just move the API from
generic_block_bmap to iomap_bmap for iomap conversion.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3b4230cf0bc2..0f8a196d8a61 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3214,7 +3214,7 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
 			return 0;
 	}
 
-	return generic_block_bmap(mapping, block, ext4_get_block);
+	return iomap_bmap(mapping, block, &ext4_iomap_ops);
 }
 
 static int ext4_readpage(struct file *file, struct page *page)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFCv2 4/4] ext4: Move ext4_fiemap to use iomap infrastructure
  2020-01-28 10:18 [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Ritesh Harjani
                   ` (2 preceding siblings ...)
  2020-01-28 10:18 ` [RFCv2 3/4] ext4: Move ext4 bmap to use iomap infrastructure Ritesh Harjani
@ 2020-01-28 10:18 ` Ritesh Harjani
  2020-01-28 15:28   ` Darrick J. Wong
  2020-01-30 16:00 ` [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Darrick J. Wong
  4 siblings, 1 reply; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-28 10:18 UTC (permalink / raw)
  To: jack, tytso, adilger.kernel, linux-ext4
  Cc: linux-fsdevel, darrick.wong, hch, cmaiolino, Ritesh Harjani

Since ext4 already defines necessary iomap_ops required to move to iomap
for fiemap, so this patch makes those changes to use existing iomap_ops
for ext4_fiemap and thus removes all unwanted code.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/extents.c | 279 +++++++---------------------------------------
 fs/ext4/inline.c  |  41 -------
 2 files changed, 38 insertions(+), 282 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 0de548bb3c90..901caee2fcb1 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -28,6 +28,7 @@
 #include <linux/uaccess.h>
 #include <linux/fiemap.h>
 #include <linux/backing-dev.h>
+#include <linux/iomap.h>
 #include "ext4_jbd2.h"
 #include "ext4_extents.h"
 #include "xattr.h"
@@ -97,9 +98,6 @@ static int ext4_split_extent_at(handle_t *handle,
 			     int split_flag,
 			     int flags);
 
-static int ext4_find_delayed_extent(struct inode *inode,
-				    struct extent_status *newes);
-
 static int ext4_ext_trunc_restart_fn(struct inode *inode, int *dropped)
 {
 	/*
@@ -2176,155 +2174,6 @@ int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
 	return err;
 }
 
-static int ext4_fill_fiemap_extents(struct inode *inode,
-				    ext4_lblk_t block, ext4_lblk_t num,
-				    struct fiemap_extent_info *fieinfo)
-{
-	struct ext4_ext_path *path = NULL;
-	struct ext4_extent *ex;
-	struct extent_status es;
-	ext4_lblk_t next, next_del, start = 0, end = 0;
-	ext4_lblk_t last = block + num;
-	int exists, depth = 0, err = 0;
-	unsigned int flags = 0;
-	unsigned char blksize_bits = inode->i_sb->s_blocksize_bits;
-
-	while (block < last && block != EXT_MAX_BLOCKS) {
-		num = last - block;
-		/* find extent for this block */
-		down_read(&EXT4_I(inode)->i_data_sem);
-
-		path = ext4_find_extent(inode, block, &path, 0);
-		if (IS_ERR(path)) {
-			up_read(&EXT4_I(inode)->i_data_sem);
-			err = PTR_ERR(path);
-			path = NULL;
-			break;
-		}
-
-		depth = ext_depth(inode);
-		if (unlikely(path[depth].p_hdr == NULL)) {
-			up_read(&EXT4_I(inode)->i_data_sem);
-			EXT4_ERROR_INODE(inode, "path[%d].p_hdr == NULL", depth);
-			err = -EFSCORRUPTED;
-			break;
-		}
-		ex = path[depth].p_ext;
-		next = ext4_ext_next_allocated_block(path);
-
-		flags = 0;
-		exists = 0;
-		if (!ex) {
-			/* there is no extent yet, so try to allocate
-			 * all requested space */
-			start = block;
-			end = block + num;
-		} else if (le32_to_cpu(ex->ee_block) > block) {
-			/* need to allocate space before found extent */
-			start = block;
-			end = le32_to_cpu(ex->ee_block);
-			if (block + num < end)
-				end = block + num;
-		} else if (block >= le32_to_cpu(ex->ee_block)
-					+ ext4_ext_get_actual_len(ex)) {
-			/* need to allocate space after found extent */
-			start = block;
-			end = block + num;
-			if (end >= next)
-				end = next;
-		} else if (block >= le32_to_cpu(ex->ee_block)) {
-			/*
-			 * some part of requested space is covered
-			 * by found extent
-			 */
-			start = block;
-			end = le32_to_cpu(ex->ee_block)
-				+ ext4_ext_get_actual_len(ex);
-			if (block + num < end)
-				end = block + num;
-			exists = 1;
-		} else {
-			BUG();
-		}
-		BUG_ON(end <= start);
-
-		if (!exists) {
-			es.es_lblk = start;
-			es.es_len = end - start;
-			es.es_pblk = 0;
-		} else {
-			es.es_lblk = le32_to_cpu(ex->ee_block);
-			es.es_len = ext4_ext_get_actual_len(ex);
-			es.es_pblk = ext4_ext_pblock(ex);
-			if (ext4_ext_is_unwritten(ex))
-				flags |= FIEMAP_EXTENT_UNWRITTEN;
-		}
-
-		/*
-		 * Find delayed extent and update es accordingly. We call
-		 * it even in !exists case to find out whether es is the
-		 * last existing extent or not.
-		 */
-		next_del = ext4_find_delayed_extent(inode, &es);
-		if (!exists && next_del) {
-			exists = 1;
-			flags |= (FIEMAP_EXTENT_DELALLOC |
-				  FIEMAP_EXTENT_UNKNOWN);
-		}
-		up_read(&EXT4_I(inode)->i_data_sem);
-
-		if (unlikely(es.es_len == 0)) {
-			EXT4_ERROR_INODE(inode, "es.es_len == 0");
-			err = -EFSCORRUPTED;
-			break;
-		}
-
-		/*
-		 * This is possible iff next == next_del == EXT_MAX_BLOCKS.
-		 * we need to check next == EXT_MAX_BLOCKS because it is
-		 * possible that an extent is with unwritten and delayed
-		 * status due to when an extent is delayed allocated and
-		 * is allocated by fallocate status tree will track both of
-		 * them in a extent.
-		 *
-		 * So we could return a unwritten and delayed extent, and
-		 * its block is equal to 'next'.
-		 */
-		if (next == next_del && next == EXT_MAX_BLOCKS) {
-			flags |= FIEMAP_EXTENT_LAST;
-			if (unlikely(next_del != EXT_MAX_BLOCKS ||
-				     next != EXT_MAX_BLOCKS)) {
-				EXT4_ERROR_INODE(inode,
-						 "next extent == %u, next "
-						 "delalloc extent = %u",
-						 next, next_del);
-				err = -EFSCORRUPTED;
-				break;
-			}
-		}
-
-		if (exists) {
-			err = fiemap_fill_next_extent(fieinfo,
-				(__u64)es.es_lblk << blksize_bits,
-				(__u64)es.es_pblk << blksize_bits,
-				(__u64)es.es_len << blksize_bits,
-				flags);
-			if (err < 0)
-				break;
-			if (err == 1) {
-				err = 0;
-				break;
-			}
-		}
-
-		block = es.es_lblk + es.es_len;
-	}
-
-	ext4_ext_drop_refs(path);
-	kfree(path);
-	return err;
-}
-
 static int ext4_fill_es_cache_info(struct inode *inode,
 				   ext4_lblk_t block, ext4_lblk_t num,
 				   struct fiemap_extent_info *fieinfo)
@@ -5058,62 +4907,10 @@ int ext4_convert_unwritten_io_end_vec(handle_t *handle, ext4_io_end_t *io_end)
 	return ret < 0 ? ret : err;
 }
 
-/*
- * If newes is not existing extent (newes->ec_pblk equals zero) find
- * delayed extent at start of newes and update newes accordingly and
- * return start of the next delayed extent.
- *
- * If newes is existing extent (newes->ec_pblk is not equal zero)
- * return start of next delayed extent or EXT_MAX_BLOCKS if no delayed
- * extent found. Leave newes unmodified.
- */
-static int ext4_find_delayed_extent(struct inode *inode,
-				    struct extent_status *newes)
-{
-	struct extent_status es;
-	ext4_lblk_t block, next_del;
-
-	if (newes->es_pblk == 0) {
-		ext4_es_find_extent_range(inode, &ext4_es_is_delayed,
-					  newes->es_lblk,
-					  newes->es_lblk + newes->es_len - 1,
-					  &es);
-
-		/*
-		 * No extent in extent-tree contains block @newes->es_pblk,
-		 * then the block may stay in 1)a hole or 2)delayed-extent.
-		 */
-		if (es.es_len == 0)
-			/* A hole found. */
-			return 0;
-
-		if (es.es_lblk > newes->es_lblk) {
-			/* A hole found. */
-			newes->es_len = min(es.es_lblk - newes->es_lblk,
-					    newes->es_len);
-			return 0;
-		}
-
-		newes->es_len = es.es_lblk + es.es_len - newes->es_lblk;
-	}
-
-	block = newes->es_lblk + newes->es_len;
-	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, block,
-				  EXT_MAX_BLOCKS, &es);
-	if (es.es_len == 0)
-		next_del = EXT_MAX_BLOCKS;
-	else
-		next_del = es.es_lblk;
-
-	return next_del;
-}
-
-static int ext4_xattr_fiemap(struct inode *inode,
-				struct fiemap_extent_info *fieinfo)
+static int ext4_iomap_xattr_fiemap(struct inode *inode, struct iomap *iomap)
 {
 	__u64 physical = 0;
 	__u64 length;
-	__u32 flags = FIEMAP_EXTENT_LAST;
 	int blockbits = inode->i_sb->s_blocksize_bits;
 	int error = 0;
 
@@ -5130,40 +4927,42 @@ static int ext4_xattr_fiemap(struct inode *inode,
 				EXT4_I(inode)->i_extra_isize;
 		physical += offset;
 		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
-		flags |= FIEMAP_EXTENT_DATA_INLINE;
 		brelse(iloc.bh);
 	} else { /* external block */
 		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
 		length = inode->i_sb->s_blocksize;
 	}
 
-	if (physical)
-		error = fiemap_fill_next_extent(fieinfo, 0, physical,
-						length, flags);
-	return (error < 0 ? error : 0);
+	iomap->addr = physical;
+	iomap->offset = 0;
+	iomap->length = length;
+	iomap->type = IOMAP_INLINE;
+	iomap->flags = 0;
+	return error;
 }
 
-static int _ext4_fiemap(struct inode *inode,
-			struct fiemap_extent_info *fieinfo,
-			__u64 start, __u64 len,
-			int (*fill)(struct inode *, ext4_lblk_t,
-				    ext4_lblk_t,
-				    struct fiemap_extent_info *))
+static int ext4_iomap_xattr_begin(struct inode *inode, loff_t offset,
+				  loff_t length, unsigned flags,
+				  struct iomap *iomap, struct iomap *srcmap)
 {
-	ext4_lblk_t start_blk;
-	u32 ext4_fiemap_flags = FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR;
-
-	int error = 0;
+	int error;
 
-	if (ext4_has_inline_data(inode)) {
-		int has_inline = 1;
+	error = ext4_iomap_xattr_fiemap(inode, iomap);
+	if (error == 0 && (offset >= iomap->length))
+		error = -ENOENT;
+	return error;
+}
 
-		error = ext4_inline_data_fiemap(inode, fieinfo, &has_inline,
-						start, len);
+const struct iomap_ops ext4_iomap_xattr_ops = {
+	.iomap_begin		= ext4_iomap_xattr_begin,
+};
 
-		if (has_inline)
-			return error;
-	}
+static int _ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+			__u64 start, __u64 len, bool from_es_cache)
+{
+	ext4_lblk_t start_blk;
+	u32 ext4_fiemap_flags = FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR;
+	int error = 0;
 
 	if (fieinfo->fi_flags & FIEMAP_FLAG_CACHE) {
 		error = ext4_ext_precache(inode);
@@ -5172,19 +4971,19 @@ static int _ext4_fiemap(struct inode *inode,
 		fieinfo->fi_flags &= ~FIEMAP_FLAG_CACHE;
 	}
 
-	/* fallback to generic here if not in extents fmt */
-	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) &&
-	    fill == ext4_fill_fiemap_extents)
-		return generic_block_fiemap(inode, fieinfo, start, len,
-			ext4_get_block);
-
-	if (fill == ext4_fill_es_cache_info)
+	if (from_es_cache)
 		ext4_fiemap_flags &= FIEMAP_FLAG_XATTR;
+
 	if (fiemap_check_flags(fieinfo, ext4_fiemap_flags))
 		return -EBADR;
 
 	if (fieinfo->fi_flags & FIEMAP_FLAG_XATTR) {
-		error = ext4_xattr_fiemap(inode, fieinfo);
+		fieinfo->fi_flags &= ~FIEMAP_FLAG_XATTR;
+		error = iomap_fiemap(inode, fieinfo, start, len,
+				     &ext4_iomap_xattr_ops);
+	} else if (!from_es_cache) {
+		error = iomap_fiemap(inode, fieinfo, start, len,
+				     &ext4_iomap_report_ops);
 	} else {
 		ext4_lblk_t len_blks;
 		__u64 last_blk;
@@ -5199,7 +4998,8 @@ static int _ext4_fiemap(struct inode *inode,
 		 * Walk the extent tree gathering extent information
 		 * and pushing extents back to the user.
 		 */
-		error = fill(inode, start_blk, len_blks, fieinfo);
+		error = ext4_fill_es_cache_info(inode, start_blk, len_blks,
+						fieinfo);
 	}
 	return error;
 }
@@ -5207,8 +5007,7 @@ static int _ext4_fiemap(struct inode *inode,
 int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len)
 {
-	return _ext4_fiemap(inode, fieinfo, start, len,
-			    ext4_fill_fiemap_extents);
+	return _ext4_fiemap(inode, fieinfo, start, len, false);
 }
 
 int ext4_get_es_cache(struct inode *inode, struct fiemap_extent_info *fieinfo,
@@ -5224,11 +5023,9 @@ int ext4_get_es_cache(struct inode *inode, struct fiemap_extent_info *fieinfo,
 			return 0;
 	}
 
-	return _ext4_fiemap(inode, fieinfo, start, len,
-			    ext4_fill_es_cache_info);
+	return _ext4_fiemap(inode, fieinfo, start, len, true);
 }
 
-
 /*
  * ext4_access_path:
  * Function to access the path buffer for marking it dirty.
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index e61603f47035..b8b99634c832 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -1857,47 +1857,6 @@ int ext4_inline_data_iomap(struct inode *inode, struct iomap *iomap)
 	return error;
 }
 
-int ext4_inline_data_fiemap(struct inode *inode,
-			    struct fiemap_extent_info *fieinfo,
-			    int *has_inline, __u64 start, __u64 len)
-{
-	__u64 physical = 0;
-	__u64 inline_len;
-	__u32 flags = FIEMAP_EXTENT_DATA_INLINE | FIEMAP_EXTENT_NOT_ALIGNED |
-		FIEMAP_EXTENT_LAST;
-	int error = 0;
-	struct ext4_iloc iloc;
-
-	down_read(&EXT4_I(inode)->xattr_sem);
-	if (!ext4_has_inline_data(inode)) {
-		*has_inline = 0;
-		goto out;
-	}
-	inline_len = min_t(size_t, ext4_get_inline_size(inode),
-			   i_size_read(inode));
-	if (start >= inline_len)
-		goto out;
-	if (start + len < inline_len)
-		inline_len = start + len;
-	inline_len -= start;
-
-	error = ext4_get_inode_loc(inode, &iloc);
-	if (error)
-		goto out;
-
-	physical = (__u64)iloc.bh->b_blocknr << inode->i_sb->s_blocksize_bits;
-	physical += (char *)ext4_raw_inode(&iloc) - iloc.bh->b_data;
-	physical += offsetof(struct ext4_inode, i_block);
-
-	brelse(iloc.bh);
-out:
-	up_read(&EXT4_I(inode)->xattr_sem);
-	if (physical)
-		error = fiemap_fill_next_extent(fieinfo, start, physical,
-						inline_len, flags);
-	return (error < 0 ? error : 0);
-}
-
 int ext4_inline_data_truncate(struct inode *inode, int *has_inline)
 {
 	handle_t *handle;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFCv2 4/4] ext4: Move ext4_fiemap to use iomap infrastructure
  2020-01-28 10:18 ` [RFCv2 4/4] ext4: Move ext4_fiemap " Ritesh Harjani
@ 2020-01-28 15:28   ` Darrick J. Wong
  2020-01-29  6:19     ` Ritesh Harjani
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2020-01-28 15:28 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino

On Tue, Jan 28, 2020 at 03:48:28PM +0530, Ritesh Harjani wrote:
> Since ext4 already defines necessary iomap_ops required to move to iomap
> for fiemap, so this patch makes those changes to use existing iomap_ops
> for ext4_fiemap and thus removes all unwanted code.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
>  fs/ext4/extents.c | 279 +++++++---------------------------------------
>  fs/ext4/inline.c  |  41 -------
>  2 files changed, 38 insertions(+), 282 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 0de548bb3c90..901caee2fcb1 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c

<snip> Just a cursory glance...

> @@ -5130,40 +4927,42 @@ static int ext4_xattr_fiemap(struct inode *inode,
>  				EXT4_I(inode)->i_extra_isize;
>  		physical += offset;
>  		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
> -		flags |= FIEMAP_EXTENT_DATA_INLINE;
>  		brelse(iloc.bh);
>  	} else { /* external block */
>  		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
>  		length = inode->i_sb->s_blocksize;
>  	}
>  
> -	if (physical)
> -		error = fiemap_fill_next_extent(fieinfo, 0, physical,
> -						length, flags);
> -	return (error < 0 ? error : 0);
> +	iomap->addr = physical;
> +	iomap->offset = 0;
> +	iomap->length = length;
> +	iomap->type = IOMAP_INLINE;
> +	iomap->flags = 0;

Er... external "ACL" blocks aren't IOMAP_INLINE.

--D

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 4/4] ext4: Move ext4_fiemap to use iomap infrastructure
  2020-01-28 15:28   ` Darrick J. Wong
@ 2020-01-29  6:19     ` Ritesh Harjani
  2020-01-29 16:18       ` Darrick J. Wong
  0 siblings, 1 reply; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-29  6:19 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino

Hello Darrick,

On 1/28/20 8:58 PM, Darrick J. Wong wrote:
> On Tue, Jan 28, 2020 at 03:48:28PM +0530, Ritesh Harjani wrote:
>> Since ext4 already defines necessary iomap_ops required to move to iomap
>> for fiemap, so this patch makes those changes to use existing iomap_ops
>> for ext4_fiemap and thus removes all unwanted code.
>>
>> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>> ---
>>   fs/ext4/extents.c | 279 +++++++---------------------------------------
>>   fs/ext4/inline.c  |  41 -------
>>   2 files changed, 38 insertions(+), 282 deletions(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index 0de548bb3c90..901caee2fcb1 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
> 
> <snip> Just a cursory glance...
> 
>> @@ -5130,40 +4927,42 @@ static int ext4_xattr_fiemap(struct inode *inode,
>>   				EXT4_I(inode)->i_extra_isize;
>>   		physical += offset;
>>   		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
>> -		flags |= FIEMAP_EXTENT_DATA_INLINE;
>>   		brelse(iloc.bh);
>>   	} else { /* external block */
>>   		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
>>   		length = inode->i_sb->s_blocksize;
>>   	}
>>   
>> -	if (physical)
>> -		error = fiemap_fill_next_extent(fieinfo, 0, physical,
>> -						length, flags);
>> -	return (error < 0 ? error : 0);
>> +	iomap->addr = physical;
>> +	iomap->offset = 0;
>> +	iomap->length = length;
>> +	iomap->type = IOMAP_INLINE;
>> +	iomap->flags = 0;
> 
> Er... external "ACL" blocks aren't IOMAP_INLINE.

Sorry, I should have mentioned about this too in the cover letter.
So current patchset is mainly only converting bmap & fiemap to use iomap 
APIs. Even the original implementation does not have external ACL block
implemented for xattr_fiemap.

Let me spend some time to implement it. But I would still like to keep
that as a separate patch.

But thanks for looking into it. There's this point 2.a & 2.b mentioned 
in the cover letter where I could really use your help in understanding
if all of that is a known behavior from iomap_fiemap side
(whenever you have some time of course :) )

-ritesh


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 4/4] ext4: Move ext4_fiemap to use iomap infrastructure
  2020-01-29  6:19     ` Ritesh Harjani
@ 2020-01-29 16:18       ` Darrick J. Wong
  2020-01-29 23:54         ` Ritesh Harjani
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2020-01-29 16:18 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino

On Wed, Jan 29, 2020 at 11:49:38AM +0530, Ritesh Harjani wrote:
> Hello Darrick,
> 
> On 1/28/20 8:58 PM, Darrick J. Wong wrote:
> > On Tue, Jan 28, 2020 at 03:48:28PM +0530, Ritesh Harjani wrote:
> > > Since ext4 already defines necessary iomap_ops required to move to iomap
> > > for fiemap, so this patch makes those changes to use existing iomap_ops
> > > for ext4_fiemap and thus removes all unwanted code.
> > > 
> > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > > ---
> > >   fs/ext4/extents.c | 279 +++++++---------------------------------------
> > >   fs/ext4/inline.c  |  41 -------
> > >   2 files changed, 38 insertions(+), 282 deletions(-)
> > > 
> > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > > index 0de548bb3c90..901caee2fcb1 100644
> > > --- a/fs/ext4/extents.c
> > > +++ b/fs/ext4/extents.c
> > 
> > <snip> Just a cursory glance...
> > 
> > > @@ -5130,40 +4927,42 @@ static int ext4_xattr_fiemap(struct inode *inode,
> > >   				EXT4_I(inode)->i_extra_isize;
> > >   		physical += offset;
> > >   		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
> > > -		flags |= FIEMAP_EXTENT_DATA_INLINE;
> > >   		brelse(iloc.bh);
> > >   	} else { /* external block */
> > >   		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
> > >   		length = inode->i_sb->s_blocksize;
> > >   	}
> > > -	if (physical)
> > > -		error = fiemap_fill_next_extent(fieinfo, 0, physical,
> > > -						length, flags);
> > > -	return (error < 0 ? error : 0);
> > > +	iomap->addr = physical;
> > > +	iomap->offset = 0;
> > > +	iomap->length = length;
> > > +	iomap->type = IOMAP_INLINE;
> > > +	iomap->flags = 0;
> > 
> > Er... external "ACL" blocks aren't IOMAP_INLINE.
> 
> Sorry, I should have mentioned about this too in the cover letter.
> So current patchset is mainly only converting bmap & fiemap to use iomap
> APIs. Even the original implementation does not have external ACL block
> implemented for xattr_fiemap.

Er ... yes it did.  The "} else { /* external block */" block sets
physical to i_file_acl.

> Let me spend some time to implement it. But I would still like to keep
> that as a separate patch.
> 
> But thanks for looking into it. There's this point 2.a & 2.b mentioned in
> the cover letter where I could really use your help in understanding
> if all of that is a known behavior from iomap_fiemap side
> (whenever you have some time of course :) )

i'll go have a look.

--D

> -ritesh
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 4/4] ext4: Move ext4_fiemap to use iomap infrastructure
  2020-01-29 16:18       ` Darrick J. Wong
@ 2020-01-29 23:54         ` Ritesh Harjani
  0 siblings, 0 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-29 23:54 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino



On 1/29/20 9:48 PM, Darrick J. Wong wrote:
> On Wed, Jan 29, 2020 at 11:49:38AM +0530, Ritesh Harjani wrote:
>> Hello Darrick,
>>
>> On 1/28/20 8:58 PM, Darrick J. Wong wrote:
>>> On Tue, Jan 28, 2020 at 03:48:28PM +0530, Ritesh Harjani wrote:
>>>> Since ext4 already defines necessary iomap_ops required to move to iomap
>>>> for fiemap, so this patch makes those changes to use existing iomap_ops
>>>> for ext4_fiemap and thus removes all unwanted code.
>>>>
>>>> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>>>> ---
>>>>    fs/ext4/extents.c | 279 +++++++---------------------------------------
>>>>    fs/ext4/inline.c  |  41 -------
>>>>    2 files changed, 38 insertions(+), 282 deletions(-)
>>>>
>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>>> index 0de548bb3c90..901caee2fcb1 100644
>>>> --- a/fs/ext4/extents.c
>>>> +++ b/fs/ext4/extents.c
>>>
>>> <snip> Just a cursory glance...
>>>
>>>> @@ -5130,40 +4927,42 @@ static int ext4_xattr_fiemap(struct inode *inode,
>>>>    				EXT4_I(inode)->i_extra_isize;
>>>>    		physical += offset;
>>>>    		length = EXT4_SB(inode->i_sb)->s_inode_size - offset;
>>>> -		flags |= FIEMAP_EXTENT_DATA_INLINE;
>>>>    		brelse(iloc.bh);
>>>>    	} else { /* external block */
>>>>    		physical = (__u64)EXT4_I(inode)->i_file_acl << blockbits;
>>>>    		length = inode->i_sb->s_blocksize;
>>>>    	}
>>>> -	if (physical)
>>>> -		error = fiemap_fill_next_extent(fieinfo, 0, physical,
>>>> -						length, flags);
>>>> -	return (error < 0 ? error : 0);
>>>> +	iomap->addr = physical;
>>>> +	iomap->offset = 0;
>>>> +	iomap->length = length;
>>>> +	iomap->type = IOMAP_INLINE;
>>>> +	iomap->flags = 0;
>>>
>>> Er... external "ACL" blocks aren't IOMAP_INLINE.
>>
>> Sorry, I should have mentioned about this too in the cover letter.
>> So current patchset is mainly only converting bmap & fiemap to use iomap
>> APIs. Even the original implementation does not have external ACL block
>> implemented for xattr_fiemap.
> 
> Er ... yes it did.  The "} else { /* external block */" block sets
> physical to i_file_acl.

Oops.. my bad. I got it confused with EA inode feature.
Urghh... I should remove my bias while looking at a review comment.

so I think for i_file_acl (external block) we should set
iomap->type = IOMAP_MAPPED.

Will fix this and submit in the same thread. Thanks for catching it.

> 
>> Let me spend some time to implement it. But I would still like to keep
>> that as a separate patch.
>>
>> But thanks for looking into it. There's this point 2.a & 2.b mentioned in
>> the cover letter where I could really use your help in understanding
>> if all of that is a known behavior from iomap_fiemap side
>> (whenever you have some time of course :) )
> 
> i'll go have a look.

Thanks.


> 
> --D
> 
>> -ritesh
>>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-01-28 10:18 [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Ritesh Harjani
                   ` (3 preceding siblings ...)
  2020-01-28 10:18 ` [RFCv2 4/4] ext4: Move ext4_fiemap " Ritesh Harjani
@ 2020-01-30 16:00 ` Darrick J. Wong
  2020-01-30 17:34   ` Ritesh Harjani
  2020-02-05 12:47   ` Ritesh Harjani
  4 siblings, 2 replies; 20+ messages in thread
From: Darrick J. Wong @ 2020-01-30 16:00 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino

On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
> Hello All,
> 
> Background
> ==========
> There are RFCv2 patches to move ext4 bmap & fiemap calls to use iomap APIs.
> This reduces the users of ext4_get_block API and thus a step towards getting
> rid of buffer_heads from ext4. Also reduces a lot of code by making use of
> existing iomap_ops (except for xattr implementation).
> 
> Testing (done on ext4 master branch)
> ========
> 'xfstests -g auto' passes with default mkfs/mount configuration
> (v/s which also pass with vanilla kernel without this patch). Except
> generic/473 which also failes on XFS. This seems to be the test case issue
> since it expects the data in slightly different way as compared to what iomap
> returns.
> Point 2.a below describes more about this.
> 
> Observations/Review required
> ============================
> 1. bmap related old v/s new method differences:-
> 	a. In case if addr > INT_MAX, it issues a warning and
> 	   returns 0 as the block no. While earlier it used to return the
> 	   truncated value with no warning.

Good...

> 	b. block no. is only returned in case of iomap->type is IOMAP_MAPPED,
> 	   but not when iomap->type is IOMAP_UNWRITTEN. While with previously
> 	   we used to get block no. for both of above cases.

Assuming the only remaining usecase of bmap is to tell old bootloaders
where to find vmlinuz blocks on disk, I don't see much reason to map
unwritten blocks -- there's no data there, and if your bootloader writes
to the filesystem(!) then you can't read whatever was written there
anyway.

Uh, can we put this ioctl on the deprecation list, please? :)

> 2. Fiemap related old v/s new method differences:-
> 	a. iomap_fiemap returns the disk extent information in exact
> 	   correspondence with start of user requested logical offset till the
> 	   length requested by user. While in previous implementation the
> 	   returned information used to give the complete extent information if
> 	   the range requested by user lies in between the extent mapping.

This is a topic of much disagreement.  The FIEMAP documentation says
that the call must return data for the requested range, but *may* return
a mapping that extends beyond the requested range.

XFS (and now iomap) only return data for the requested range, whereas
ext4 has (had?) the behavior you describe.  generic/473 was an attempt
to enforce the ext4 behavior across all filesystems, but I put it in my
dead list and never run it.

So it's a behavioral change, but the new behavior isn't forbidden.

> 	b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
> 	   fiemap_extent mapping range requested by the user via fm_length (
> 	   if that has a valid mapped extent on the disk).

That sounds like a bug.  _LAST is supposed to be set on the last extent
in the file, not the last record in the queried dataset.

> But if the user
> 	   requested for more fm_length which could not be mapped in the last
> 	   fiemap_extent, then the flag is not set.

Yes... if there were more extents to map than there was space in the map
array, then _LAST should remain unset to encourage userspace to come
back for the rest of the mappings.

(Unless maybe I'm misunderstanding here...)

> e.g. output for above differences 2.a & 2.b
> ===========================================
> create a file with below cmds. 
> 1. fallocate -o 0 -l 8K testfile.txt
> 2. xfs_io -c "pwrite 8K 8K" testfile.txt
> 3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
> 4. run this binary on with and without these patches:- ./a.out (test_fiemap_diff.c) [4]
> 
> o/p of xfs_io -c "fiemap -v"
> ============================================
> With this patch on patched kernel:-
> testfile.txt:
>  EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>    0: [0..15]:         122802736..122802751    16 0x800
>    1: [16..31]:        122687536..122687551    16   0x1
> 
> without patch on vanilla kernel (no difference):-
> testfile.txt:
>  EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>    0: [0..15]:         332211376..332211391    16 0x800
>    1: [16..31]:        332722392..332722407    16   0x1
> 
> 
> o/p of a.out without patch:-
> ================
> riteshh-> ./a.out 
> logical: [       0..      15] phys: 332211376..332211391 flags: 0x800 tot: 16
> (0) extent flag = 2048
> 
> o/p of a.out with patch (both point 2.a & 2.b could be seen)
> =======================
> riteshh-> ./a.out
> logical: [       0..       7] phys: 122802736..122802743 flags: 0x801 tot: 8
> (0) extent flag = 2049
> 
> FYI - In test_fiemap_diff.c test we had 
> a. fm_extent_count = 1
> b. fm_start = 0
> c. fm_length = 4K
> Whereas when we change fm_extent_count = 32, then we don't see any difference.
> 
> e.g. output for above difference listed in point 1.b
> ====================================================
> 
> o/p without patch (block no returned for unwritten block as well)
> =========Testing IOCTL FIBMAP=========
> File size = 16384, blkcnt = 4, blocksize = 4096
>   0   41526422
>   1   41526423
>   2   41590299
>   3   41590300
> 
> o/p with patch (0 returned for unwritten block)
> =========Testing IOCTL FIBMAP=========
> File size = 16384, blkcnt = 4, blocksize = 4096
>   0          0          0
>   1          0          0
>   2   15335942      29953
>   3   15335943      29953
> 
> 
> Summary:-
> ========
> Due to some of the observational differences to user, listed above,
> requesting to please help with a careful review in moving this to iomap.
> Digging into some older threads, it looks like these differences should be fine,
> since the same tools have been working fine with XFS (which uses iomap based
> implementation) [1]
> Also as Ted suggested in [3]: Fiemap & bmap spec could be made based on the ext4
> implementation. But since all the tools also work with xfs which uses iomap
> based fiemap, so we should be good there.

<nod> Thanks for the worked example output. :)

--D

> 
> References of some previous discussions:
> =======================================
> [1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
> [2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
> [3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
> [4]: https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c
> [RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html 
> 
> 
> Ritesh Harjani (4):
>   ext4: Add IOMAP_F_MERGED for non-extent based mapping
>   ext4: Optimize ext4_ext_precache for 0 depth
>   ext4: Move ext4 bmap to use iomap infrastructure.
>   ext4: Move ext4_fiemap to use iomap infrastructure
> 
>  fs/ext4/extents.c | 288 +++++++---------------------------------------
>  fs/ext4/inline.c  |  41 -------
>  fs/ext4/inode.c   |   6 +-
>  3 files changed, 49 insertions(+), 286 deletions(-)
> 
> -- 
> 2.21.0
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-01-30 16:00 ` [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Darrick J. Wong
@ 2020-01-30 17:34   ` Ritesh Harjani
  2020-02-05 12:47   ` Ritesh Harjani
  1 sibling, 0 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-01-30 17:34 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino



On 1/30/20 9:30 PM, Darrick J. Wong wrote:
> On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
>> Hello All,
>>
>> Background
>> ==========
>> There are RFCv2 patches to move ext4 bmap & fiemap calls to use iomap APIs.
>> This reduces the users of ext4_get_block API and thus a step towards getting
>> rid of buffer_heads from ext4. Also reduces a lot of code by making use of
>> existing iomap_ops (except for xattr implementation).
>>
>> Testing (done on ext4 master branch)
>> ========
>> 'xfstests -g auto' passes with default mkfs/mount configuration
>> (v/s which also pass with vanilla kernel without this patch). Except
>> generic/473 which also failes on XFS. This seems to be the test case issue
>> since it expects the data in slightly different way as compared to what iomap
>> returns.
>> Point 2.a below describes more about this.
>>
>> Observations/Review required
>> ============================
>> 1. bmap related old v/s new method differences:-
>> 	a. In case if addr > INT_MAX, it issues a warning and
>> 	   returns 0 as the block no. While earlier it used to return the
>> 	   truncated value with no warning.
> 
> Good...
> 
>> 	b. block no. is only returned in case of iomap->type is IOMAP_MAPPED,
>> 	   but not when iomap->type is IOMAP_UNWRITTEN. While with previously
>> 	   we used to get block no. for both of above cases.
> 
> Assuming the only remaining usecase of bmap is to tell old bootloaders
> where to find vmlinuz blocks on disk, I don't see much reason to map
> unwritten blocks -- there's no data there, and if your bootloader writes
> to the filesystem(!) then you can't read whatever was written there
> anyway.

Yes, no objection there. Just wanted to get it reviewed.


> 
> Uh, can we put this ioctl on the deprecation list, please? :)
> 
>> 2. Fiemap related old v/s new method differences:-
>> 	a. iomap_fiemap returns the disk extent information in exact
>> 	   correspondence with start of user requested logical offset till the
>> 	   length requested by user. While in previous implementation the
>> 	   returned information used to give the complete extent information if
>> 	   the range requested by user lies in between the extent mapping.
> 
> This is a topic of much disagreement.  The FIEMAP documentation says
> that the call must return data for the requested range, but *may* return
> a mapping that extends beyond the requested range.
> 
> XFS (and now iomap) only return data for the requested range, whereas
> ext4 has (had?) the behavior you describe.  generic/473 was an attempt
> to enforce the ext4 behavior across all filesystems, but I put it in my
> dead list and never run it.
> 
> So it's a behavioral change, but the new behavior isn't forbidden.

Sure, thanks.

> 
>> 	b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
>> 	   fiemap_extent mapping range requested by the user via fm_length (
>> 	   if that has a valid mapped extent on the disk).
> 
> That sounds like a bug.  _LAST is supposed to be set on the last extent
> in the file, not the last record in the queried dataset.

Thought so too, sure will spend some time to try fixing it.


> 
>> But if the user
>> 	   requested for more fm_length which could not be mapped in the last
>> 	   fiemap_extent, then the flag is not set.
> 
> Yes... if there were more extents to map than there was space in the map
> array, then _LAST should remain unset to encourage userspace to come
> back for the rest of the mappings.
> 
> (Unless maybe I'm misunderstanding here...)
> 
>> e.g. output for above differences 2.a & 2.b
>> ===========================================
>> create a file with below cmds.
>> 1. fallocate -o 0 -l 8K testfile.txt
>> 2. xfs_io -c "pwrite 8K 8K" testfile.txt
>> 3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
>> 4. run this binary on with and without these patches:- ./a.out (test_fiemap_diff.c) [4]
>>
>> o/p of xfs_io -c "fiemap -v"
>> ============================================
>> With this patch on patched kernel:-
>> testfile.txt:
>>   EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>>     0: [0..15]:         122802736..122802751    16 0x800
>>     1: [16..31]:        122687536..122687551    16   0x1
>>
>> without patch on vanilla kernel (no difference):-
>> testfile.txt:
>>   EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>>     0: [0..15]:         332211376..332211391    16 0x800
>>     1: [16..31]:        332722392..332722407    16   0x1
>>
>>
>> o/p of a.out without patch:-
>> ================
>> riteshh-> ./a.out
>> logical: [       0..      15] phys: 332211376..332211391 flags: 0x800 tot: 16
>> (0) extent flag = 2048
>>
>> o/p of a.out with patch (both point 2.a & 2.b could be seen)
>> =======================
>> riteshh-> ./a.out
>> logical: [       0..       7] phys: 122802736..122802743 flags: 0x801 tot: 8
>> (0) extent flag = 2049
>>
>> FYI - In test_fiemap_diff.c test we had
>> a. fm_extent_count = 1
>> b. fm_start = 0
>> c. fm_length = 4K
>> Whereas when we change fm_extent_count = 32, then we don't see any difference.
>>
>> e.g. output for above difference listed in point 1.b
>> ====================================================
>>
>> o/p without patch (block no returned for unwritten block as well)
>> =========Testing IOCTL FIBMAP=========
>> File size = 16384, blkcnt = 4, blocksize = 4096
>>    0   41526422
>>    1   41526423
>>    2   41590299
>>    3   41590300
>>
>> o/p with patch (0 returned for unwritten block)
>> =========Testing IOCTL FIBMAP=========
>> File size = 16384, blkcnt = 4, blocksize = 4096
>>    0          0          0
>>    1          0          0
>>    2   15335942      29953
>>    3   15335943      29953
>>
>>
>> Summary:-
>> ========
>> Due to some of the observational differences to user, listed above,
>> requesting to please help with a careful review in moving this to iomap.
>> Digging into some older threads, it looks like these differences should be fine,
>> since the same tools have been working fine with XFS (which uses iomap based
>> implementation) [1]
>> Also as Ted suggested in [3]: Fiemap & bmap spec could be made based on the ext4
>> implementation. But since all the tools also work with xfs which uses iomap
>> based fiemap, so we should be good there.
> 
> <nod> Thanks for the worked example output. :)

Thanks for the review. :)

ritesh


> 
> --D
> 
>>
>> References of some previous discussions:
>> =======================================
>> [1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
>> [2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
>> [3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
>> [4]: https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c
>> [RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html
>>
>>
>> Ritesh Harjani (4):
>>    ext4: Add IOMAP_F_MERGED for non-extent based mapping
>>    ext4: Optimize ext4_ext_precache for 0 depth
>>    ext4: Move ext4 bmap to use iomap infrastructure.
>>    ext4: Move ext4_fiemap to use iomap infrastructure
>>
>>   fs/ext4/extents.c | 288 +++++++---------------------------------------
>>   fs/ext4/inline.c  |  41 -------
>>   fs/ext4/inode.c   |   6 +-
>>   3 files changed, 49 insertions(+), 286 deletions(-)
>>
>> -- 
>> 2.21.0
>>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-01-30 16:00 ` [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Darrick J. Wong
  2020-01-30 17:34   ` Ritesh Harjani
@ 2020-02-05 12:47   ` Ritesh Harjani
  2020-02-05 15:57     ` Darrick J. Wong
  1 sibling, 1 reply; 20+ messages in thread
From: Ritesh Harjani @ 2020-02-05 12:47 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino



On 1/30/20 11:04 PM, Ritesh Harjani wrote:
> 
> 
> On 1/30/20 9:30 PM, Darrick J. Wong wrote:
>> On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
>>> Hello All,
>>>
>>> Background
>>> ==========
>>> There are RFCv2 patches to move ext4 bmap & fiemap calls to use iomap 
>>> APIs.
>>> This reduces the users of ext4_get_block API and thus a step towards 
>>> getting
>>> rid of buffer_heads from ext4. Also reduces a lot of code by making 
>>> use of
>>> existing iomap_ops (except for xattr implementation).
>>>
>>> Testing (done on ext4 master branch)
>>> ========
>>> 'xfstests -g auto' passes with default mkfs/mount configuration
>>> (v/s which also pass with vanilla kernel without this patch). Except
>>> generic/473 which also failes on XFS. This seems to be the test case 
>>> issue
>>> since it expects the data in slightly different way as compared to 
>>> what iomap
>>> returns.
>>> Point 2.a below describes more about this.
>>>
>>> Observations/Review required
>>> ============================
>>> 1. bmap related old v/s new method differences:-
>>>     a. In case if addr > INT_MAX, it issues a warning and
>>>        returns 0 as the block no. While earlier it used to return the
>>>        truncated value with no warning.
>>
>> Good...
>>
>>>     b. block no. is only returned in case of iomap->type is 
>>> IOMAP_MAPPED,
>>>        but not when iomap->type is IOMAP_UNWRITTEN. While with 
>>> previously
>>>        we used to get block no. for both of above cases.
>>
>> Assuming the only remaining usecase of bmap is to tell old bootloaders
>> where to find vmlinuz blocks on disk, I don't see much reason to map
>> unwritten blocks -- there's no data there, and if your bootloader writes
>> to the filesystem(!) then you can't read whatever was written there
>> anyway.
> 
> Yes, no objection there. Just wanted to get it reviewed.
> 
> 
>>
>> Uh, can we put this ioctl on the deprecation list, please? :)
>>
>>> 2. Fiemap related old v/s new method differences:-
>>>     a. iomap_fiemap returns the disk extent information in exact
>>>        correspondence with start of user requested logical offset 
>>> till the
>>>        length requested by user. While in previous implementation the
>>>        returned information used to give the complete extent 
>>> information if
>>>        the range requested by user lies in between the extent mapping.
>>
>> This is a topic of much disagreement.  The FIEMAP documentation says
>> that the call must return data for the requested range, but *may* return
>> a mapping that extends beyond the requested range.
>>
>> XFS (and now iomap) only return data for the requested range, whereas
>> ext4 has (had?) the behavior you describe.  generic/473 was an attempt
>> to enforce the ext4 behavior across all filesystems, but I put it in my
>> dead list and never run it.
>>
>> So it's a behavioral change, but the new behavior isn't forbidden.
> 
> Sure, thanks.
> 
>>
>>>     b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
>>>        fiemap_extent mapping range requested by the user via fm_length (
>>>        if that has a valid mapped extent on the disk).
>>
>> That sounds like a bug.  _LAST is supposed to be set on the last extent
>> in the file, not the last record in the queried dataset.
> 
> Thought so too, sure will spend some time to try fixing it.

Looked into this. I think below should fix our above reported problem 
with current iomap code.
If no objection I will send send PATCHv3 with below fix as the first
patch in the series.

diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index bccf305ea9ce..ee53991810d5 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct 
fiemap_extent_info *fi,
         }

         if (ctx.prev.type != IOMAP_HOLE) {
-               ret = iomap_to_fiemap(fi, &ctx.prev, FIEMAP_EXTENT_LAST);
+               u32 flags = 0;
+               loff_t isize = i_size_read(inode);
+
+               if (ctx.prev.offset + ctx.prev.length >= isize)
+                       flags |= FIEMAP_EXTENT_LAST;
+               ret = iomap_to_fiemap(fi, &ctx.prev, flags);
                 if (ret < 0)
                         return ret;
         }


-ritesh


> 
> 
>>
>>> But if the user
>>>        requested for more fm_length which could not be mapped in the 
>>> last
>>>        fiemap_extent, then the flag is not set.
>>
>> Yes... if there were more extents to map than there was space in the map
>> array, then _LAST should remain unset to encourage userspace to come
>> back for the rest of the mappings.
>>
>> (Unless maybe I'm misunderstanding here...)
>>
>>> e.g. output for above differences 2.a & 2.b
>>> ===========================================
>>> create a file with below cmds.
>>> 1. fallocate -o 0 -l 8K testfile.txt
>>> 2. xfs_io -c "pwrite 8K 8K" testfile.txt
>>> 3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
>>> 4. run this binary on with and without these patches:- ./a.out 
>>> (test_fiemap_diff.c) [4]
>>>
>>> o/p of xfs_io -c "fiemap -v"
>>> ============================================
>>> With this patch on patched kernel:-
>>> testfile.txt:
>>>   EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>>>     0: [0..15]:         122802736..122802751    16 0x800
>>>     1: [16..31]:        122687536..122687551    16   0x1
>>>
>>> without patch on vanilla kernel (no difference):-
>>> testfile.txt:
>>>   EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>>>     0: [0..15]:         332211376..332211391    16 0x800
>>>     1: [16..31]:        332722392..332722407    16   0x1
>>>
>>>
>>> o/p of a.out without patch:-
>>> ================
>>> riteshh-> ./a.out
>>> logical: [       0..      15] phys: 332211376..332211391 flags: 0x800 
>>> tot: 16
>>> (0) extent flag = 2048
>>>
>>> o/p of a.out with patch (both point 2.a & 2.b could be seen)
>>> =======================
>>> riteshh-> ./a.out
>>> logical: [       0..       7] phys: 122802736..122802743 flags: 0x801 
>>> tot: 8
>>> (0) extent flag = 2049
>>>
>>> FYI - In test_fiemap_diff.c test we had
>>> a. fm_extent_count = 1
>>> b. fm_start = 0
>>> c. fm_length = 4K
>>> Whereas when we change fm_extent_count = 32, then we don't see any 
>>> difference.
>>>
>>> e.g. output for above difference listed in point 1.b
>>> ====================================================
>>>
>>> o/p without patch (block no returned for unwritten block as well)
>>> =========Testing IOCTL FIBMAP=========
>>> File size = 16384, blkcnt = 4, blocksize = 4096
>>>    0   41526422
>>>    1   41526423
>>>    2   41590299
>>>    3   41590300
>>>
>>> o/p with patch (0 returned for unwritten block)
>>> =========Testing IOCTL FIBMAP=========
>>> File size = 16384, blkcnt = 4, blocksize = 4096
>>>    0          0          0
>>>    1          0          0
>>>    2   15335942      29953
>>>    3   15335943      29953
>>>
>>>
>>> Summary:-
>>> ========
>>> Due to some of the observational differences to user, listed above,
>>> requesting to please help with a careful review in moving this to iomap.
>>> Digging into some older threads, it looks like these differences 
>>> should be fine,
>>> since the same tools have been working fine with XFS (which uses 
>>> iomap based
>>> implementation) [1]
>>> Also as Ted suggested in [3]: Fiemap & bmap spec could be made based 
>>> on the ext4
>>> implementation. But since all the tools also work with xfs which uses 
>>> iomap
>>> based fiemap, so we should be good there.
>>
>> <nod> Thanks for the worked example output. :)
> 
> Thanks for the review. :)
> 
> ritesh
> 
> 
>>
>> --D
>>
>>>
>>> References of some previous discussions:
>>> =======================================
>>> [1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
>>> [2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
>>> [3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
>>> [4]: 
>>> https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c 
>>>
>>> [RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html
>>>
>>>
>>> Ritesh Harjani (4):
>>>    ext4: Add IOMAP_F_MERGED for non-extent based mapping
>>>    ext4: Optimize ext4_ext_precache for 0 depth
>>>    ext4: Move ext4 bmap to use iomap infrastructure.
>>>    ext4: Move ext4_fiemap to use iomap infrastructure
>>>
>>>   fs/ext4/extents.c | 288 +++++++---------------------------------------
>>>   fs/ext4/inline.c  |  41 -------
>>>   fs/ext4/inode.c   |   6 +-
>>>   3 files changed, 49 insertions(+), 286 deletions(-)
>>>
>>> -- 
>>> 2.21.0
>>>


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-05 12:47   ` Ritesh Harjani
@ 2020-02-05 15:57     ` Darrick J. Wong
  2020-02-06  5:26       ` Ritesh Harjani
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2020-02-05 15:57 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino

On Wed, Feb 05, 2020 at 06:17:44PM +0530, Ritesh Harjani wrote:
> 
> 
> On 1/30/20 11:04 PM, Ritesh Harjani wrote:
> > 
> > 
> > On 1/30/20 9:30 PM, Darrick J. Wong wrote:
> > > On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
> > > > Hello All,
> > > > 
> > > > Background
> > > > ==========
> > > > There are RFCv2 patches to move ext4 bmap & fiemap calls to use
> > > > iomap APIs.
> > > > This reduces the users of ext4_get_block API and thus a step
> > > > towards getting
> > > > rid of buffer_heads from ext4. Also reduces a lot of code by
> > > > making use of
> > > > existing iomap_ops (except for xattr implementation).
> > > > 
> > > > Testing (done on ext4 master branch)
> > > > ========
> > > > 'xfstests -g auto' passes with default mkfs/mount configuration
> > > > (v/s which also pass with vanilla kernel without this patch). Except
> > > > generic/473 which also failes on XFS. This seems to be the test
> > > > case issue
> > > > since it expects the data in slightly different way as compared
> > > > to what iomap
> > > > returns.
> > > > Point 2.a below describes more about this.
> > > > 
> > > > Observations/Review required
> > > > ============================
> > > > 1. bmap related old v/s new method differences:-
> > > >     a. In case if addr > INT_MAX, it issues a warning and
> > > >        returns 0 as the block no. While earlier it used to return the
> > > >        truncated value with no warning.
> > > 
> > > Good...
> > > 
> > > >     b. block no. is only returned in case of iomap->type is
> > > > IOMAP_MAPPED,
> > > >        but not when iomap->type is IOMAP_UNWRITTEN. While with
> > > > previously
> > > >        we used to get block no. for both of above cases.
> > > 
> > > Assuming the only remaining usecase of bmap is to tell old bootloaders
> > > where to find vmlinuz blocks on disk, I don't see much reason to map
> > > unwritten blocks -- there's no data there, and if your bootloader writes
> > > to the filesystem(!) then you can't read whatever was written there
> > > anyway.
> > 
> > Yes, no objection there. Just wanted to get it reviewed.
> > 
> > 
> > > 
> > > Uh, can we put this ioctl on the deprecation list, please? :)
> > > 
> > > > 2. Fiemap related old v/s new method differences:-
> > > >     a. iomap_fiemap returns the disk extent information in exact
> > > >        correspondence with start of user requested logical
> > > > offset till the
> > > >        length requested by user. While in previous implementation the
> > > >        returned information used to give the complete extent
> > > > information if
> > > >        the range requested by user lies in between the extent mapping.
> > > 
> > > This is a topic of much disagreement.  The FIEMAP documentation says
> > > that the call must return data for the requested range, but *may* return
> > > a mapping that extends beyond the requested range.
> > > 
> > > XFS (and now iomap) only return data for the requested range, whereas
> > > ext4 has (had?) the behavior you describe.  generic/473 was an attempt
> > > to enforce the ext4 behavior across all filesystems, but I put it in my
> > > dead list and never run it.
> > > 
> > > So it's a behavioral change, but the new behavior isn't forbidden.
> > 
> > Sure, thanks.
> > 
> > > 
> > > >     b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
> > > >        fiemap_extent mapping range requested by the user via fm_length (
> > > >        if that has a valid mapped extent on the disk).
> > > 
> > > That sounds like a bug.  _LAST is supposed to be set on the last extent
> > > in the file, not the last record in the queried dataset.
> > 
> > Thought so too, sure will spend some time to try fixing it.
> 
> Looked into this. I think below should fix our above reported problem with
> current iomap code.
> If no objection I will send send PATCHv3 with below fix as the first
> patch in the series.
> 
> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
> index bccf305ea9ce..ee53991810d5 100644
> --- a/fs/iomap/fiemap.c
> +++ b/fs/iomap/fiemap.c
> @@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct
> fiemap_extent_info *fi,
>         }
> 
>         if (ctx.prev.type != IOMAP_HOLE) {
> -               ret = iomap_to_fiemap(fi, &ctx.prev, FIEMAP_EXTENT_LAST);
> +               u32 flags = 0;
> +               loff_t isize = i_size_read(inode);
> +
> +               if (ctx.prev.offset + ctx.prev.length >= isize)

What happens if ctx.prev actually is the last iomap in the file, but
isize extends beyond that?  e.g.,

# xfs_io -f -c 'pwrite 0 64k' /a
# truncate -s 100m /a
# filefrag -v /a

I think we need the fiemap variant of the iomap_begin functions to pass
a flag in the iomap that the fiemap implementation can pick up.

--D

> +                       flags |= FIEMAP_EXTENT_LAST;
> +               ret = iomap_to_fiemap(fi, &ctx.prev, flags);
>                 if (ret < 0)
>                         return ret;
>         }
> 
> 
> -ritesh
> 
> 
> > 
> > 
> > > 
> > > > But if the user
> > > >        requested for more fm_length which could not be mapped in
> > > > the last
> > > >        fiemap_extent, then the flag is not set.
> > > 
> > > Yes... if there were more extents to map than there was space in the map
> > > array, then _LAST should remain unset to encourage userspace to come
> > > back for the rest of the mappings.
> > > 
> > > (Unless maybe I'm misunderstanding here...)
> > > 
> > > > e.g. output for above differences 2.a & 2.b
> > > > ===========================================
> > > > create a file with below cmds.
> > > > 1. fallocate -o 0 -l 8K testfile.txt
> > > > 2. xfs_io -c "pwrite 8K 8K" testfile.txt
> > > > 3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
> > > > 4. run this binary on with and without these patches:- ./a.out
> > > > (test_fiemap_diff.c) [4]
> > > > 
> > > > o/p of xfs_io -c "fiemap -v"
> > > > ============================================
> > > > With this patch on patched kernel:-
> > > > testfile.txt:
> > > >   EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
> > > >     0: [0..15]:         122802736..122802751    16 0x800
> > > >     1: [16..31]:        122687536..122687551    16   0x1
> > > > 
> > > > without patch on vanilla kernel (no difference):-
> > > > testfile.txt:
> > > >   EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
> > > >     0: [0..15]:         332211376..332211391    16 0x800
> > > >     1: [16..31]:        332722392..332722407    16   0x1
> > > > 
> > > > 
> > > > o/p of a.out without patch:-
> > > > ================
> > > > riteshh-> ./a.out
> > > > logical: [       0..      15] phys: 332211376..332211391 flags:
> > > > 0x800 tot: 16
> > > > (0) extent flag = 2048
> > > > 
> > > > o/p of a.out with patch (both point 2.a & 2.b could be seen)
> > > > =======================
> > > > riteshh-> ./a.out
> > > > logical: [       0..       7] phys: 122802736..122802743 flags:
> > > > 0x801 tot: 8
> > > > (0) extent flag = 2049
> > > > 
> > > > FYI - In test_fiemap_diff.c test we had
> > > > a. fm_extent_count = 1
> > > > b. fm_start = 0
> > > > c. fm_length = 4K
> > > > Whereas when we change fm_extent_count = 32, then we don't see
> > > > any difference.
> > > > 
> > > > e.g. output for above difference listed in point 1.b
> > > > ====================================================
> > > > 
> > > > o/p without patch (block no returned for unwritten block as well)
> > > > =========Testing IOCTL FIBMAP=========
> > > > File size = 16384, blkcnt = 4, blocksize = 4096
> > > >    0   41526422
> > > >    1   41526423
> > > >    2   41590299
> > > >    3   41590300
> > > > 
> > > > o/p with patch (0 returned for unwritten block)
> > > > =========Testing IOCTL FIBMAP=========
> > > > File size = 16384, blkcnt = 4, blocksize = 4096
> > > >    0          0          0
> > > >    1          0          0
> > > >    2   15335942      29953
> > > >    3   15335943      29953
> > > > 
> > > > 
> > > > Summary:-
> > > > ========
> > > > Due to some of the observational differences to user, listed above,
> > > > requesting to please help with a careful review in moving this to iomap.
> > > > Digging into some older threads, it looks like these differences
> > > > should be fine,
> > > > since the same tools have been working fine with XFS (which uses
> > > > iomap based
> > > > implementation) [1]
> > > > Also as Ted suggested in [3]: Fiemap & bmap spec could be made
> > > > based on the ext4
> > > > implementation. But since all the tools also work with xfs which
> > > > uses iomap
> > > > based fiemap, so we should be good there.
> > > 
> > > <nod> Thanks for the worked example output. :)
> > 
> > Thanks for the review. :)
> > 
> > ritesh
> > 
> > 
> > > 
> > > --D
> > > 
> > > > 
> > > > References of some previous discussions:
> > > > =======================================
> > > > [1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
> > > > [2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
> > > > [3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
> > > > [4]: https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c
> > > > 
> > > > [RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html
> > > > 
> > > > 
> > > > Ritesh Harjani (4):
> > > >    ext4: Add IOMAP_F_MERGED for non-extent based mapping
> > > >    ext4: Optimize ext4_ext_precache for 0 depth
> > > >    ext4: Move ext4 bmap to use iomap infrastructure.
> > > >    ext4: Move ext4_fiemap to use iomap infrastructure
> > > > 
> > > >   fs/ext4/extents.c | 288 +++++++---------------------------------------
> > > >   fs/ext4/inline.c  |  41 -------
> > > >   fs/ext4/inode.c   |   6 +-
> > > >   3 files changed, 49 insertions(+), 286 deletions(-)
> > > > 
> > > > -- 
> > > > 2.21.0
> > > > 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-05 15:57     ` Darrick J. Wong
@ 2020-02-06  5:26       ` Ritesh Harjani
  2020-02-06 10:22         ` Jan Kara
  0 siblings, 1 reply; 20+ messages in thread
From: Ritesh Harjani @ 2020-02-06  5:26 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: jack, tytso, adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino



On 2/5/20 9:27 PM, Darrick J. Wong wrote:
> On Wed, Feb 05, 2020 at 06:17:44PM +0530, Ritesh Harjani wrote:
>>
>>
>> On 1/30/20 11:04 PM, Ritesh Harjani wrote:
>>>
>>>
>>> On 1/30/20 9:30 PM, Darrick J. Wong wrote:
>>>> On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
>>>>> Hello All,
>>>>>
>>>>> Background
>>>>> ==========
>>>>> There are RFCv2 patches to move ext4 bmap & fiemap calls to use
>>>>> iomap APIs.
>>>>> This reduces the users of ext4_get_block API and thus a step
>>>>> towards getting
>>>>> rid of buffer_heads from ext4. Also reduces a lot of code by
>>>>> making use of
>>>>> existing iomap_ops (except for xattr implementation).
>>>>>
>>>>> Testing (done on ext4 master branch)
>>>>> ========
>>>>> 'xfstests -g auto' passes with default mkfs/mount configuration
>>>>> (v/s which also pass with vanilla kernel without this patch). Except
>>>>> generic/473 which also failes on XFS. This seems to be the test
>>>>> case issue
>>>>> since it expects the data in slightly different way as compared
>>>>> to what iomap
>>>>> returns.
>>>>> Point 2.a below describes more about this.
>>>>>
>>>>> Observations/Review required
>>>>> ============================
>>>>> 1. bmap related old v/s new method differences:-
>>>>>      a. In case if addr > INT_MAX, it issues a warning and
>>>>>         returns 0 as the block no. While earlier it used to return the
>>>>>         truncated value with no warning.
>>>>
>>>> Good...
>>>>
>>>>>      b. block no. is only returned in case of iomap->type is
>>>>> IOMAP_MAPPED,
>>>>>         but not when iomap->type is IOMAP_UNWRITTEN. While with
>>>>> previously
>>>>>         we used to get block no. for both of above cases.
>>>>
>>>> Assuming the only remaining usecase of bmap is to tell old bootloaders
>>>> where to find vmlinuz blocks on disk, I don't see much reason to map
>>>> unwritten blocks -- there's no data there, and if your bootloader writes
>>>> to the filesystem(!) then you can't read whatever was written there
>>>> anyway.
>>>
>>> Yes, no objection there. Just wanted to get it reviewed.
>>>
>>>
>>>>
>>>> Uh, can we put this ioctl on the deprecation list, please? :)
>>>>
>>>>> 2. Fiemap related old v/s new method differences:-
>>>>>      a. iomap_fiemap returns the disk extent information in exact
>>>>>         correspondence with start of user requested logical
>>>>> offset till the
>>>>>         length requested by user. While in previous implementation the
>>>>>         returned information used to give the complete extent
>>>>> information if
>>>>>         the range requested by user lies in between the extent mapping.
>>>>
>>>> This is a topic of much disagreement.  The FIEMAP documentation says
>>>> that the call must return data for the requested range, but *may* return
>>>> a mapping that extends beyond the requested range.
>>>>
>>>> XFS (and now iomap) only return data for the requested range, whereas
>>>> ext4 has (had?) the behavior you describe.  generic/473 was an attempt
>>>> to enforce the ext4 behavior across all filesystems, but I put it in my
>>>> dead list and never run it.
>>>>
>>>> So it's a behavioral change, but the new behavior isn't forbidden.
>>>
>>> Sure, thanks.
>>>
>>>>
>>>>>      b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
>>>>>         fiemap_extent mapping range requested by the user via fm_length (
>>>>>         if that has a valid mapped extent on the disk).
>>>>
>>>> That sounds like a bug.  _LAST is supposed to be set on the last extent
>>>> in the file, not the last record in the queried dataset.
>>>
>>> Thought so too, sure will spend some time to try fixing it.
>>
>> Looked into this. I think below should fix our above reported problem with
>> current iomap code.
>> If no objection I will send send PATCHv3 with below fix as the first
>> patch in the series.
>>
>> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
>> index bccf305ea9ce..ee53991810d5 100644
>> --- a/fs/iomap/fiemap.c
>> +++ b/fs/iomap/fiemap.c
>> @@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct
>> fiemap_extent_info *fi,
>>          }
>>
>>          if (ctx.prev.type != IOMAP_HOLE) {
>> -               ret = iomap_to_fiemap(fi, &ctx.prev, FIEMAP_EXTENT_LAST);
>> +               u32 flags = 0;
>> +               loff_t isize = i_size_read(inode);
>> +
>> +               if (ctx.prev.offset + ctx.prev.length >= isize)
> 
> What happens if ctx.prev actually is the last iomap in the file, but
> isize extends beyond that?  e.g.,
> 
> # xfs_io -f -c 'pwrite 0 64k' /a
> # truncate -s 100m /a
> # filefrag -v /a

Err.. should never miss this truncate case.

Digging further, I see even generic_block_fiemap() does not take care of
this case if the file isize is extended by truncate.
It happens to mark _LAST only based on i_size_read(). It seems only ext*
family and hpfs filesystem seems to be using this currently.
And gfs2 was the user of this api till sometime back before it finally
moved to use iomap_fiemap() api.


> 
> I think we need the fiemap variant of the iomap_begin functions to pass
> a flag in the iomap that the fiemap implementation can pick up.

Sure, let me do some digging on this one. One challenge which I think 
would be for filesystems to tell *efficiently* on whether this is the
last extent or not (without checking on every iomap_begin call about,
if there exist a next extent on the disk by doing more I/O - that's why
*efficiently*).

If done then -
we could use IOMAP_FIEMAP as the flag to pass to iomap_begin functions
and it could return us the iomap->type marked with IOMAP_EXTENT_LAST 
which could represent that this is actually the last extent on disk for
this inode.


-ritesh

> 
> --D
> 
>> +                       flags |= FIEMAP_EXTENT_LAST;
>> +               ret = iomap_to_fiemap(fi, &ctx.prev, flags);
>>                  if (ret < 0)
>>                          return ret;
>>          }
>>
>>
>> -ritesh
>>
>>
>>>
>>>
>>>>
>>>>> But if the user
>>>>>         requested for more fm_length which could not be mapped in
>>>>> the last
>>>>>         fiemap_extent, then the flag is not set.
>>>>
>>>> Yes... if there were more extents to map than there was space in the map
>>>> array, then _LAST should remain unset to encourage userspace to come
>>>> back for the rest of the mappings.
>>>>
>>>> (Unless maybe I'm misunderstanding here...)
>>>>
>>>>> e.g. output for above differences 2.a & 2.b
>>>>> ===========================================
>>>>> create a file with below cmds.
>>>>> 1. fallocate -o 0 -l 8K testfile.txt
>>>>> 2. xfs_io -c "pwrite 8K 8K" testfile.txt
>>>>> 3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
>>>>> 4. run this binary on with and without these patches:- ./a.out
>>>>> (test_fiemap_diff.c) [4]
>>>>>
>>>>> o/p of xfs_io -c "fiemap -v"
>>>>> ============================================
>>>>> With this patch on patched kernel:-
>>>>> testfile.txt:
>>>>>    EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>>>>>      0: [0..15]:         122802736..122802751    16 0x800
>>>>>      1: [16..31]:        122687536..122687551    16   0x1
>>>>>
>>>>> without patch on vanilla kernel (no difference):-
>>>>> testfile.txt:
>>>>>    EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
>>>>>      0: [0..15]:         332211376..332211391    16 0x800
>>>>>      1: [16..31]:        332722392..332722407    16   0x1
>>>>>
>>>>>
>>>>> o/p of a.out without patch:-
>>>>> ================
>>>>> riteshh-> ./a.out
>>>>> logical: [       0..      15] phys: 332211376..332211391 flags:
>>>>> 0x800 tot: 16
>>>>> (0) extent flag = 2048
>>>>>
>>>>> o/p of a.out with patch (both point 2.a & 2.b could be seen)
>>>>> =======================
>>>>> riteshh-> ./a.out
>>>>> logical: [       0..       7] phys: 122802736..122802743 flags:
>>>>> 0x801 tot: 8
>>>>> (0) extent flag = 2049
>>>>>
>>>>> FYI - In test_fiemap_diff.c test we had
>>>>> a. fm_extent_count = 1
>>>>> b. fm_start = 0
>>>>> c. fm_length = 4K
>>>>> Whereas when we change fm_extent_count = 32, then we don't see
>>>>> any difference.
>>>>>
>>>>> e.g. output for above difference listed in point 1.b
>>>>> ====================================================
>>>>>
>>>>> o/p without patch (block no returned for unwritten block as well)
>>>>> =========Testing IOCTL FIBMAP=========
>>>>> File size = 16384, blkcnt = 4, blocksize = 4096
>>>>>     0   41526422
>>>>>     1   41526423
>>>>>     2   41590299
>>>>>     3   41590300
>>>>>
>>>>> o/p with patch (0 returned for unwritten block)
>>>>> =========Testing IOCTL FIBMAP=========
>>>>> File size = 16384, blkcnt = 4, blocksize = 4096
>>>>>     0          0          0
>>>>>     1          0          0
>>>>>     2   15335942      29953
>>>>>     3   15335943      29953
>>>>>
>>>>>
>>>>> Summary:-
>>>>> ========
>>>>> Due to some of the observational differences to user, listed above,
>>>>> requesting to please help with a careful review in moving this to iomap.
>>>>> Digging into some older threads, it looks like these differences
>>>>> should be fine,
>>>>> since the same tools have been working fine with XFS (which uses
>>>>> iomap based
>>>>> implementation) [1]
>>>>> Also as Ted suggested in [3]: Fiemap & bmap spec could be made
>>>>> based on the ext4
>>>>> implementation. But since all the tools also work with xfs which
>>>>> uses iomap
>>>>> based fiemap, so we should be good there.
>>>>
>>>> <nod> Thanks for the worked example output. :)
>>>
>>> Thanks for the review. :)
>>>
>>> ritesh
>>>
>>>
>>>>
>>>> --D
>>>>
>>>>>
>>>>> References of some previous discussions:
>>>>> =======================================
>>>>> [1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
>>>>> [2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
>>>>> [3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
>>>>> [4]: https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c
>>>>>
>>>>> [RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html
>>>>>
>>>>>
>>>>> Ritesh Harjani (4):
>>>>>     ext4: Add IOMAP_F_MERGED for non-extent based mapping
>>>>>     ext4: Optimize ext4_ext_precache for 0 depth
>>>>>     ext4: Move ext4 bmap to use iomap infrastructure.
>>>>>     ext4: Move ext4_fiemap to use iomap infrastructure
>>>>>
>>>>>    fs/ext4/extents.c | 288 +++++++---------------------------------------
>>>>>    fs/ext4/inline.c  |  41 -------
>>>>>    fs/ext4/inode.c   |   6 +-
>>>>>    3 files changed, 49 insertions(+), 286 deletions(-)
>>>>>
>>>>> -- 
>>>>> 2.21.0
>>>>>
>>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-06  5:26       ` Ritesh Harjani
@ 2020-02-06 10:22         ` Jan Kara
  2020-02-20 17:03           ` Ritesh Harjani
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kara @ 2020-02-06 10:22 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Darrick J. Wong, jack, tytso, adilger.kernel, linux-ext4,
	linux-fsdevel, hch, cmaiolino

On Thu 06-02-20 10:56:18, Ritesh Harjani wrote:
> 
> 
> On 2/5/20 9:27 PM, Darrick J. Wong wrote:
> > On Wed, Feb 05, 2020 at 06:17:44PM +0530, Ritesh Harjani wrote:
> > > 
> > > 
> > > On 1/30/20 11:04 PM, Ritesh Harjani wrote:
> > > > 
> > > > 
> > > > On 1/30/20 9:30 PM, Darrick J. Wong wrote:
> > > > > On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
> > > > > > Hello All,
> > > > > > 
> > > > > > Background
> > > > > > ==========
> > > > > > There are RFCv2 patches to move ext4 bmap & fiemap calls to use
> > > > > > iomap APIs.
> > > > > > This reduces the users of ext4_get_block API and thus a step
> > > > > > towards getting
> > > > > > rid of buffer_heads from ext4. Also reduces a lot of code by
> > > > > > making use of
> > > > > > existing iomap_ops (except for xattr implementation).
> > > > > > 
> > > > > > Testing (done on ext4 master branch)
> > > > > > ========
> > > > > > 'xfstests -g auto' passes with default mkfs/mount configuration
> > > > > > (v/s which also pass with vanilla kernel without this patch). Except
> > > > > > generic/473 which also failes on XFS. This seems to be the test
> > > > > > case issue
> > > > > > since it expects the data in slightly different way as compared
> > > > > > to what iomap
> > > > > > returns.
> > > > > > Point 2.a below describes more about this.
> > > > > > 
> > > > > > Observations/Review required
> > > > > > ============================
> > > > > > 1. bmap related old v/s new method differences:-
> > > > > >      a. In case if addr > INT_MAX, it issues a warning and
> > > > > >         returns 0 as the block no. While earlier it used to return the
> > > > > >         truncated value with no warning.
> > > > > 
> > > > > Good...
> > > > > 
> > > > > >      b. block no. is only returned in case of iomap->type is
> > > > > > IOMAP_MAPPED,
> > > > > >         but not when iomap->type is IOMAP_UNWRITTEN. While with
> > > > > > previously
> > > > > >         we used to get block no. for both of above cases.
> > > > > 
> > > > > Assuming the only remaining usecase of bmap is to tell old bootloaders
> > > > > where to find vmlinuz blocks on disk, I don't see much reason to map
> > > > > unwritten blocks -- there's no data there, and if your bootloader writes
> > > > > to the filesystem(!) then you can't read whatever was written there
> > > > > anyway.
> > > > 
> > > > Yes, no objection there. Just wanted to get it reviewed.
> > > > 
> > > > 
> > > > > 
> > > > > Uh, can we put this ioctl on the deprecation list, please? :)
> > > > > 
> > > > > > 2. Fiemap related old v/s new method differences:-
> > > > > >      a. iomap_fiemap returns the disk extent information in exact
> > > > > >         correspondence with start of user requested logical
> > > > > > offset till the
> > > > > >         length requested by user. While in previous implementation the
> > > > > >         returned information used to give the complete extent
> > > > > > information if
> > > > > >         the range requested by user lies in between the extent mapping.
> > > > > 
> > > > > This is a topic of much disagreement.  The FIEMAP documentation says
> > > > > that the call must return data for the requested range, but *may* return
> > > > > a mapping that extends beyond the requested range.
> > > > > 
> > > > > XFS (and now iomap) only return data for the requested range, whereas
> > > > > ext4 has (had?) the behavior you describe.  generic/473 was an attempt
> > > > > to enforce the ext4 behavior across all filesystems, but I put it in my
> > > > > dead list and never run it.
> > > > > 
> > > > > So it's a behavioral change, but the new behavior isn't forbidden.
> > > > 
> > > > Sure, thanks.
> > > > 
> > > > > 
> > > > > >      b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
> > > > > >         fiemap_extent mapping range requested by the user via fm_length (
> > > > > >         if that has a valid mapped extent on the disk).
> > > > > 
> > > > > That sounds like a bug.  _LAST is supposed to be set on the last extent
> > > > > in the file, not the last record in the queried dataset.
> > > > 
> > > > Thought so too, sure will spend some time to try fixing it.
> > > 
> > > Looked into this. I think below should fix our above reported problem with
> > > current iomap code.
> > > If no objection I will send send PATCHv3 with below fix as the first
> > > patch in the series.
> > > 
> > > diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
> > > index bccf305ea9ce..ee53991810d5 100644
> > > --- a/fs/iomap/fiemap.c
> > > +++ b/fs/iomap/fiemap.c
> > > @@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct
> > > fiemap_extent_info *fi,
> > >          }
> > > 
> > >          if (ctx.prev.type != IOMAP_HOLE) {
> > > -               ret = iomap_to_fiemap(fi, &ctx.prev, FIEMAP_EXTENT_LAST);
> > > +               u32 flags = 0;
> > > +               loff_t isize = i_size_read(inode);
> > > +
> > > +               if (ctx.prev.offset + ctx.prev.length >= isize)
> > 
> > What happens if ctx.prev actually is the last iomap in the file, but
> > isize extends beyond that?  e.g.,
> > 
> > # xfs_io -f -c 'pwrite 0 64k' /a
> > # truncate -s 100m /a
> > # filefrag -v /a
> 
> Err.. should never miss this truncate case.
> 
> Digging further, I see even generic_block_fiemap() does not take care of
> this case if the file isize is extended by truncate.
> It happens to mark _LAST only based on i_size_read(). It seems only ext*
> family and hpfs filesystem seems to be using this currently.
> And gfs2 was the user of this api till sometime back before it finally
> moved to use iomap_fiemap() api.
> 
> 
> > 
> > I think we need the fiemap variant of the iomap_begin functions to pass
> > a flag in the iomap that the fiemap implementation can pick up.
> 
> Sure, let me do some digging on this one. One challenge which I think would
> be for filesystems to tell *efficiently* on whether this is the
> last extent or not (without checking on every iomap_begin call about,
> if there exist a next extent on the disk by doing more I/O - that's why
> *efficiently*).
> 
> If done then -
> we could use IOMAP_FIEMAP as the flag to pass to iomap_begin functions
> and it could return us the iomap->type marked with IOMAP_EXTENT_LAST which
> could represent that this is actually the last extent on disk for
> this inode.

So I think IOMAP_EXTENT_LAST should be treated as an optional flag. If the
fs can provide it in a cheap way, do so. Otherwise don't bother. Because
ultimately, the FIEMAP caller has to deal with not seeing IOMAP_EXTENT_LAST
anyway (e.g. if the file has no extents or if someone modifies the file
between the calls). So maybe we need to rather update the documentation
that the IOMAP_EXTENT_LAST is best-effort only?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-06 10:22         ` Jan Kara
@ 2020-02-20 17:03           ` Ritesh Harjani
  2020-02-20 17:09             ` Christoph Hellwig
  2020-02-21  4:10             ` Ritesh Harjani
  0 siblings, 2 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-02-20 17:03 UTC (permalink / raw)
  To: Jan Kara, Darrick J. Wong, tytso
  Cc: adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino



On 2/6/20 3:52 PM, Jan Kara wrote:
> On Thu 06-02-20 10:56:18, Ritesh Harjani wrote:
>>
>>
>> On 2/5/20 9:27 PM, Darrick J. Wong wrote:
>>> On Wed, Feb 05, 2020 at 06:17:44PM +0530, Ritesh Harjani wrote:
>>>>
>>>>
>>>> On 1/30/20 11:04 PM, Ritesh Harjani wrote:
>>>>>
>>>>>
>>>>> On 1/30/20 9:30 PM, Darrick J. Wong wrote:
>>>>>> On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
>>>>>>> Hello All,
>>>>>>>
>>>>>>> Background
>>>>>>> ==========
>>>>>>> There are RFCv2 patches to move ext4 bmap & fiemap calls to use
>>>>>>> iomap APIs.
>>>>>>> This reduces the users of ext4_get_block API and thus a step
>>>>>>> towards getting
>>>>>>> rid of buffer_heads from ext4. Also reduces a lot of code by
>>>>>>> making use of
>>>>>>> existing iomap_ops (except for xattr implementation).
>>>>>>>
>>>>>>> Testing (done on ext4 master branch)
>>>>>>> ========
>>>>>>> 'xfstests -g auto' passes with default mkfs/mount configuration
>>>>>>> (v/s which also pass with vanilla kernel without this patch). Except
>>>>>>> generic/473 which also failes on XFS. This seems to be the test
>>>>>>> case issue
>>>>>>> since it expects the data in slightly different way as compared
>>>>>>> to what iomap
>>>>>>> returns.
>>>>>>> Point 2.a below describes more about this.
>>>>>>>
>>>>>>> Observations/Review required
>>>>>>> ============================
>>>>>>> 1. bmap related old v/s new method differences:-
>>>>>>>       a. In case if addr > INT_MAX, it issues a warning and
>>>>>>>          returns 0 as the block no. While earlier it used to return the
>>>>>>>          truncated value with no warning.
>>>>>>
>>>>>> Good...
>>>>>>
>>>>>>>       b. block no. is only returned in case of iomap->type is
>>>>>>> IOMAP_MAPPED,
>>>>>>>          but not when iomap->type is IOMAP_UNWRITTEN. While with
>>>>>>> previously
>>>>>>>          we used to get block no. for both of above cases.
>>>>>>
>>>>>> Assuming the only remaining usecase of bmap is to tell old bootloaders
>>>>>> where to find vmlinuz blocks on disk, I don't see much reason to map
>>>>>> unwritten blocks -- there's no data there, and if your bootloader writes
>>>>>> to the filesystem(!) then you can't read whatever was written there
>>>>>> anyway.
>>>>>
>>>>> Yes, no objection there. Just wanted to get it reviewed.
>>>>>
>>>>>
>>>>>>
>>>>>> Uh, can we put this ioctl on the deprecation list, please? :)
>>>>>>
>>>>>>> 2. Fiemap related old v/s new method differences:-
>>>>>>>       a. iomap_fiemap returns the disk extent information in exact
>>>>>>>          correspondence with start of user requested logical
>>>>>>> offset till the
>>>>>>>          length requested by user. While in previous implementation the
>>>>>>>          returned information used to give the complete extent
>>>>>>> information if
>>>>>>>          the range requested by user lies in between the extent mapping.
>>>>>>
>>>>>> This is a topic of much disagreement.  The FIEMAP documentation says
>>>>>> that the call must return data for the requested range, but *may* return
>>>>>> a mapping that extends beyond the requested range.
>>>>>>
>>>>>> XFS (and now iomap) only return data for the requested range, whereas
>>>>>> ext4 has (had?) the behavior you describe.  generic/473 was an attempt
>>>>>> to enforce the ext4 behavior across all filesystems, but I put it in my
>>>>>> dead list and never run it.
>>>>>>
>>>>>> So it's a behavioral change, but the new behavior isn't forbidden.
>>>>>
>>>>> Sure, thanks.
>>>>>
>>>>>>
>>>>>>>       b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
>>>>>>>          fiemap_extent mapping range requested by the user via fm_length (
>>>>>>>          if that has a valid mapped extent on the disk).
>>>>>>
>>>>>> That sounds like a bug.  _LAST is supposed to be set on the last extent
>>>>>> in the file, not the last record in the queried dataset.
>>>>>
>>>>> Thought so too, sure will spend some time to try fixing it.
>>>>
>>>> Looked into this. I think below should fix our above reported problem with
>>>> current iomap code.
>>>> If no objection I will send send PATCHv3 with below fix as the first
>>>> patch in the series.
>>>>
>>>> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
>>>> index bccf305ea9ce..ee53991810d5 100644
>>>> --- a/fs/iomap/fiemap.c
>>>> +++ b/fs/iomap/fiemap.c
>>>> @@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct
>>>> fiemap_extent_info *fi,
>>>>           }
>>>>
>>>>           if (ctx.prev.type != IOMAP_HOLE) {
>>>> -               ret = iomap_to_fiemap(fi, &ctx.prev, FIEMAP_EXTENT_LAST);
>>>> +               u32 flags = 0;
>>>> +               loff_t isize = i_size_read(inode);
>>>> +
>>>> +               if (ctx.prev.offset + ctx.prev.length >= isize)
>>>
>>> What happens if ctx.prev actually is the last iomap in the file, but
>>> isize extends beyond that?  e.g.,
>>>
>>> # xfs_io -f -c 'pwrite 0 64k' /a
>>> # truncate -s 100m /a
>>> # filefrag -v /a
>>
>> Err.. should never miss this truncate case.
>>
>> Digging further, I see even generic_block_fiemap() does not take care of
>> this case if the file isize is extended by truncate.
>> It happens to mark _LAST only based on i_size_read(). It seems only ext*
>> family and hpfs filesystem seems to be using this currently.
>> And gfs2 was the user of this api till sometime back before it finally
>> moved to use iomap_fiemap() api.
>>
>>
>>>
>>> I think we need the fiemap variant of the iomap_begin functions to pass
>>> a flag in the iomap that the fiemap implementation can pick up.
>>
>> Sure, let me do some digging on this one. One challenge which I think would
>> be for filesystems to tell *efficiently* on whether this is the
>> last extent or not (without checking on every iomap_begin call about,
>> if there exist a next extent on the disk by doing more I/O - that's why
>> *efficiently*).
>>
>> If done then -
>> we could use IOMAP_FIEMAP as the flag to pass to iomap_begin functions
>> and it could return us the iomap->type marked with IOMAP_EXTENT_LAST which
>> could represent that this is actually the last extent on disk for
>> this inode.
> 
> So I think IOMAP_EXTENT_LAST should be treated as an optional flag. If the
> fs can provide it in a cheap way, do so. Otherwise don't bother. Because
> ultimately, the FIEMAP caller has to deal with not seeing IOMAP_EXTENT_LAST
> anyway (e.g. if the file has no extents or if someone modifies the file
> between the calls). So maybe we need to rather update the documentation
> that the IOMAP_EXTENT_LAST is best-effort only?
> 

So I was making some changes along the above lines and I think we can 
take below approach for filesystem which could determine the
_EXTENT_LAST relatively easily and for cases if it cannot
as Jan also mentioned we could keep the current behavior as is and let
iomap core decide the last disk extent.

For this -
1. We could use IOMAP_FIEMAP flag approach to pass this flag to 
->iomap_begin fs functions. But we also will need
IOMAP_F_EXTENT_LAST & IOMAP_F_EXTENT_NOT_LAST flag for filesystem
to tell that whether this is the last extent or not.
IOMAP_F_EXTENT_NOT_LAST is also needed so that FS can tell to iomap
that the FS is capable of letting the iomap core know that whether
it is the last extent or not. If this is not specified then iomap core
will retain it's current behavior to mark FIEMAP_EXTENT_LAST.

Now in this case for ext4 we could simply call ext4_map_blocks second
time with the next lblk to map to see if there exist a next mapping or
any hole. Here again if the next lblk is a hole which is less
the INT_MAX blocks then again it will mean that the current is not the
last extent mapping. With that we could determine if the current mapping
is the last extent mapping or not.

In case of non-extent based mapping, we still may not be able to detect
the last disk block very easily. Because ext4_map_blocks may just give
mapping upto the next direct block.
(e.g. in case if the file is data followed by a hole - case described
by Darrick above)
I see generic_block_fiemap also does not detect the last block mapping
correctly and relies on i_size_read(inode) for that.

So to keep the ext4_fiemap behavior same when
moving to iomap framework, we can make the above changes where in
_EXTENT_LAST will be mainly set by FS for extent based mapping and for 
non-extent it will let the iomap current code determine last mapped
extent/block.

But - as per Jan in above comment, if the caller anyways have to always
determine that whether it is the last extent or not, then I am not sure
whether we should add this extra complexity or not, but I guess
this should be mainly so that the current behavior of ext4_fiemap should 
not change while moving to iomap. Your thoughts pls?


Raw version of ext4_iomap_begin_report implementation to determine last 
extent, which I am currently testing based on above
discussed idea... will post the full version soon. But wanted to
get some initial feedback on this one.


diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0f8a196d8a61..85c755f2989b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3520,6 +3520,25 @@ static bool ext4_iomap_is_delalloc(struct inode 
*inode,
         return true;
  }

+static bool ext4_is_last_extent(struct inode *inode, ext4_lblk_t lblk)
+{
+       int ret;
+       bool delalloc = false;
+       struct ext4_map_blocks map;
+       unsigned int len = INT_MAX;
+
+       map.m_lblk = lblk;
+       map.m_len = len;
+
+       ret = ext4_map_blocks(NULL, inode, &map, 0);
+
+       if (ret == 0)
+               delalloc = ext4_iomap_is_delalloc(inode, &map);
+       if (ret > 0 || delalloc || map.m_len < len)
+               return false;
+       return true;
+}
+
  static int ext4_iomap_begin_report(struct inode *inode, loff_t offset,
                                    loff_t length, unsigned int flags,
                                    struct iomap *iomap, struct iomap 
*srcmap)
@@ -3548,16 +3567,36 @@ static int ext4_iomap_begin_report(struct inode 
*inode, loff_t offset,
         map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
                           EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;

+
+       if ((flags & IOMAP_FIEMAP) &&
+           !ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
+               if (offset >= i_size_read(inode)) {
+                       map.m_flags = 0;
+                       map.m_pblk = 0;
+                       goto set_iomap;
+               }
+       }
+
         ret = ext4_map_blocks(NULL, inode, &map, 0);

         if (ret < 0)
                 return ret;
         if (ret == 0)
                 delalloc = ext4_iomap_is_delalloc(inode, &map);

+set_iomap:
         ext4_set_iomap(inode, iomap, &map, offset, length);
         if (delalloc && iomap->type == IOMAP_HOLE)
                 iomap->type = IOMAP_DELALLOC;

+       if (flags & IOMAP_FIEMAP &&
+           ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
+               ext4_lblk_t lblk = map.m_lblk + map.m_len;
+               if (ext4_is_last_extent(inode, lblk)) {
+                       iomap->flags |= IOMAP_F_EXTENT_LAST;
+               } else {
+                       iomap->flags |= IOMAP_F_EXTENT_NOT_LAST;
+               }
+       }
         return 0;
  }



-ritesh


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-20 17:03           ` Ritesh Harjani
@ 2020-02-20 17:09             ` Christoph Hellwig
  2020-02-21  4:16               ` Ritesh Harjani
  2020-02-21  4:10             ` Ritesh Harjani
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2020-02-20 17:09 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Jan Kara, Darrick J. Wong, tytso, adilger.kernel, linux-ext4,
	linux-fsdevel, hch, cmaiolino

On Thu, Feb 20, 2020 at 10:33:03PM +0530, Ritesh Harjani wrote:
> So I was making some changes along the above lines and I think we can take
> below approach for filesystem which could determine the
> _EXTENT_LAST relatively easily and for cases if it cannot
> as Jan also mentioned we could keep the current behavior as is and let
> iomap core decide the last disk extent.

Well, given that _EXTENT_LAST never worked properly on any file system
since it was added this actually changes behavior and could break
existing users.  I'd rather update the documentation to match reality
rather than writing a lot of code for a feature no one obviously cared
about for years.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-20 17:03           ` Ritesh Harjani
  2020-02-20 17:09             ` Christoph Hellwig
@ 2020-02-21  4:10             ` Ritesh Harjani
  1 sibling, 0 replies; 20+ messages in thread
From: Ritesh Harjani @ 2020-02-21  4:10 UTC (permalink / raw)
  To: Jan Kara, Darrick J. Wong, tytso
  Cc: adilger.kernel, linux-ext4, linux-fsdevel, hch, cmaiolino


>>>>> Looked into this. I think below should fix our above reported 
>>>>> problem with
>>>>> current iomap code.
>>>>> If no objection I will send send PATCHv3 with below fix as the first
>>>>> patch in the series.
>>>>>
>>>>> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
>>>>> index bccf305ea9ce..ee53991810d5 100644
>>>>> --- a/fs/iomap/fiemap.c
>>>>> +++ b/fs/iomap/fiemap.c
>>>>> @@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct
>>>>> fiemap_extent_info *fi,
>>>>>           }
>>>>>
>>>>>           if (ctx.prev.type != IOMAP_HOLE) {
>>>>> -               ret = iomap_to_fiemap(fi, &ctx.prev, 
>>>>> FIEMAP_EXTENT_LAST);
>>>>> +               u32 flags = 0;
>>>>> +               loff_t isize = i_size_read(inode);
>>>>> +
>>>>> +               if (ctx.prev.offset + ctx.prev.length >= isize)
>>>>
>>>> What happens if ctx.prev actually is the last iomap in the file, but
>>>> isize extends beyond that?  e.g.,
>>>>
>>>> # xfs_io -f -c 'pwrite 0 64k' /a
>>>> # truncate -s 100m /a
>>>> # filefrag -v /a
>>>
>>> Err.. should never miss this truncate case.
>>>
>>> Digging further, I see even generic_block_fiemap() does not take care of
>>> this case if the file isize is extended by truncate.
>>> It happens to mark _LAST only based on i_size_read(). It seems only ext*
>>> family and hpfs filesystem seems to be using this currently.
>>> And gfs2 was the user of this api till sometime back before it finally
>>> moved to use iomap_fiemap() api.
>>>
>>>
>>>>
>>>> I think we need the fiemap variant of the iomap_begin functions to pass
>>>> a flag in the iomap that the fiemap implementation can pick up.
>>>
>>> Sure, let me do some digging on this one. One challenge which I think 
>>> would
>>> be for filesystems to tell *efficiently* on whether this is the
>>> last extent or not (without checking on every iomap_begin call about,
>>> if there exist a next extent on the disk by doing more I/O - that's why
>>> *efficiently*).
>>>
>>> If done then -
>>> we could use IOMAP_FIEMAP as the flag to pass to iomap_begin functions
>>> and it could return us the iomap->type marked with IOMAP_EXTENT_LAST 
>>> which
>>> could represent that this is actually the last extent on disk for
>>> this inode.
>>
>> So I think IOMAP_EXTENT_LAST should be treated as an optional flag. If 
>> the
>> fs can provide it in a cheap way, do so. Otherwise don't bother. Because
>> ultimately, the FIEMAP caller has to deal with not seeing 
>> IOMAP_EXTENT_LAST
>> anyway (e.g. if the file has no extents or if someone modifies the file
>> between the calls). So maybe we need to rather update the documentation
>> that the IOMAP_EXTENT_LAST is best-effort only?
>>
> 
> So I was making some changes along the above lines and I think we can 
> take below approach for filesystem which could determine the
> _EXTENT_LAST relatively easily and for cases if it cannot
> as Jan also mentioned we could keep the current behavior as is and let
> iomap core decide the last disk extent.
> 
> For this -
> 1. We could use IOMAP_FIEMAP flag approach to pass this flag to 
> ->iomap_begin fs functions. But we also will need
> IOMAP_F_EXTENT_LAST & IOMAP_F_EXTENT_NOT_LAST flag for filesystem
> to tell that whether this is the last extent or not.
> IOMAP_F_EXTENT_NOT_LAST is also needed so that FS can tell to iomap
> that the FS is capable of letting the iomap core know that whether
> it is the last extent or not. If this is not specified then iomap core
> will retain it's current behavior to mark FIEMAP_EXTENT_LAST.
> 
> Now in this case for ext4 we could simply call ext4_map_blocks second
> time with the next lblk to map to see if there exist a next mapping or
> any hole. Here again if the next lblk is a hole which is less
> the INT_MAX blocks then again it will mean that the current is not the
> last extent mapping. With that we could determine if the current mapping
> is the last extent mapping or not.
> 
> In case of non-extent based mapping, we still may not be able to detect
> the last disk block very easily. Because ext4_map_blocks may just give
> mapping upto the next direct block.
> (e.g. in case if the file is data followed by a hole - case described
> by Darrick above)
> I see generic_block_fiemap also does not detect the last block mapping
> correctly and relies on i_size_read(inode) for that.
> 
> So to keep the ext4_fiemap behavior same when
> moving to iomap framework, we can make the above changes where in
> _EXTENT_LAST will be mainly set by FS for extent based mapping and for 
> non-extent it will let the iomap current code determine last mapped
> extent/block.
> 
> But - as per Jan in above comment, if the caller anyways have to always
> determine that whether it is the last extent or not, then I am not sure
> whether we should add this extra complexity or not, but I guess
> this should be mainly so that the current behavior of ext4_fiemap should 
> not change while moving to iomap. Your thoughts pls?
> 
> 
> Raw version of ext4_iomap_begin_report implementation to determine last 
> extent, which I am currently testing based on above
> discussed idea... will post the full version soon. But wanted to
> get some initial feedback on this one.
> 
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 0f8a196d8a61..85c755f2989b 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3520,6 +3520,25 @@ static bool ext4_iomap_is_delalloc(struct inode 
> *inode,
>          return true;
>   }
> 
> +static bool ext4_is_last_extent(struct inode *inode, ext4_lblk_t lblk)
> +{
> +       int ret;
> +       bool delalloc = false;
> +       struct ext4_map_blocks map;
> +       unsigned int len = INT_MAX;

hmm.. I had my doubt over len = INT_MAX.


> +
> +       map.m_lblk = lblk;
> +       map.m_len = len;
> +
> +       ret = ext4_map_blocks(NULL, inode, &map, 0);
> +
> +       if (ret == 0)
> +               delalloc = ext4_iomap_is_delalloc(inode, &map);
> +       if (ret > 0 || delalloc || map.m_len < len)
> +               return false;

Digging more today morning found this:-
So I guess this map.m_len < len condition could go wrong here.
Please correct me here -
So even though an ext4_extent length field is only __le16, but
the cached extent_status could be upto u32. So in a large filesystem
below two cases could be possible:-
1. A file with an extent_status entry of a hole more than INT_MAX blocks.
2. A file with an extent_status entry of a data block more than INT_MAX 
blocks
(I guess this could hold with mk_hugefiles.c feature in mke2fs prog).

So if we were to detect the end of extent then the other possible
way is to open code the ext4_map_blocks for ext4_fiemap usecase
similar to what is being done today. But just use iomap to handle
the fiemap filling (like done in this patch). That way we may not have
to even call ext4_map_blocks twice.


-ritesh


> +       return true;
> +}
> +
>   static int ext4_iomap_begin_report(struct inode *inode, loff_t offset,
>                                     loff_t length, unsigned int flags,
>                                     struct iomap *iomap, struct iomap 
> *srcmap)
> @@ -3548,16 +3567,36 @@ static int ext4_iomap_begin_report(struct inode 
> *inode, loff_t offset,
>          map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
>                            EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
> 
> +
> +       if ((flags & IOMAP_FIEMAP) &&
> +           !ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
> +               if (offset >= i_size_read(inode)) {
> +                       map.m_flags = 0;
> +                       map.m_pblk = 0;
> +                       goto set_iomap;
> +               }
> +       }
> +
>          ret = ext4_map_blocks(NULL, inode, &map, 0);
> 
>          if (ret < 0)
>                  return ret;
>          if (ret == 0)
>                  delalloc = ext4_iomap_is_delalloc(inode, &map);
> 
> +set_iomap:
>          ext4_set_iomap(inode, iomap, &map, offset, length);
>          if (delalloc && iomap->type == IOMAP_HOLE)
>                  iomap->type = IOMAP_DELALLOC;
> 
> +       if (flags & IOMAP_FIEMAP &&
> +           ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
> +               ext4_lblk_t lblk = map.m_lblk + map.m_len;
> +               if (ext4_is_last_extent(inode, lblk)) {
> +                       iomap->flags |= IOMAP_F_EXTENT_LAST;
> +               } else {
> +                       iomap->flags |= IOMAP_F_EXTENT_NOT_LAST;
> +               }
> +       }
>          return 0;
>   }
> 
> 
> 
> -ritesh
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-20 17:09             ` Christoph Hellwig
@ 2020-02-21  4:16               ` Ritesh Harjani
  2020-02-21 11:47                 ` Jan Kara
  0 siblings, 1 reply; 20+ messages in thread
From: Ritesh Harjani @ 2020-02-21  4:16 UTC (permalink / raw)
  To: Christoph Hellwig, Jan Kara, tytso
  Cc: Darrick J. Wong, adilger.kernel, linux-ext4, linux-fsdevel, cmaiolino



On 2/20/20 10:39 PM, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 10:33:03PM +0530, Ritesh Harjani wrote:
>> So I was making some changes along the above lines and I think we can take
>> below approach for filesystem which could determine the
>> _EXTENT_LAST relatively easily and for cases if it cannot
>> as Jan also mentioned we could keep the current behavior as is and let
>> iomap core decide the last disk extent.
> 
> Well, given that _EXTENT_LAST never worked properly on any file system
> since it was added this actually changes behavior and could break
> existing users.  I'd rather update the documentation to match reality
> rather than writing a lot of code for a feature no one obviously cared
> about for years.

Well I agree to this. Since either ways the _EXTENT_LAST has never 
worked properly or in the same manner across different filesystems.
In ext4 itself it works differently for extent v/s non-extent based FS.
So updating the documentation would be a right way to go from here.

Ted/Jan - do you agree here:-
Shall we move ahead with this patch series in converting ext4_fiemap to
use iomap APIs, without worrying about how _EXTENT_LAST is being set via
iomap core code?



-ritesh


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap
  2020-02-21  4:16               ` Ritesh Harjani
@ 2020-02-21 11:47                 ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2020-02-21 11:47 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Christoph Hellwig, Jan Kara, tytso, Darrick J. Wong,
	adilger.kernel, linux-ext4, linux-fsdevel, cmaiolino

On Fri 21-02-20 09:46:43, Ritesh Harjani wrote:
> 
> 
> On 2/20/20 10:39 PM, Christoph Hellwig wrote:
> > On Thu, Feb 20, 2020 at 10:33:03PM +0530, Ritesh Harjani wrote:
> > > So I was making some changes along the above lines and I think we can take
> > > below approach for filesystem which could determine the
> > > _EXTENT_LAST relatively easily and for cases if it cannot
> > > as Jan also mentioned we could keep the current behavior as is and let
> > > iomap core decide the last disk extent.
> > 
> > Well, given that _EXTENT_LAST never worked properly on any file system
> > since it was added this actually changes behavior and could break
> > existing users.  I'd rather update the documentation to match reality
> > rather than writing a lot of code for a feature no one obviously cared
> > about for years.
> 
> Well I agree to this. Since either ways the _EXTENT_LAST has never worked
> properly or in the same manner across different filesystems.
> In ext4 itself it works differently for extent v/s non-extent based FS.
> So updating the documentation would be a right way to go from here.
> 
> Ted/Jan - do you agree here:-
> Shall we move ahead with this patch series in converting ext4_fiemap to
> use iomap APIs, without worrying about how _EXTENT_LAST is being set via
> iomap core code?

Yes, I'd go ahead with the conversion and don't really bother with backward
compatibility here. In the unlikely case someone comes with a real breakage
this causes, we can always think about how to fix this.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-02-21 11:47 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-28 10:18 [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Ritesh Harjani
2020-01-28 10:18 ` [RFCv2 1/4] ext4: Add IOMAP_F_MERGED for non-extent based mapping Ritesh Harjani
2020-01-28 10:18 ` [RFCv2 2/4] ext4: Optimize ext4_ext_precache for 0 depth Ritesh Harjani
2020-01-28 10:18 ` [RFCv2 3/4] ext4: Move ext4 bmap to use iomap infrastructure Ritesh Harjani
2020-01-28 10:18 ` [RFCv2 4/4] ext4: Move ext4_fiemap " Ritesh Harjani
2020-01-28 15:28   ` Darrick J. Wong
2020-01-29  6:19     ` Ritesh Harjani
2020-01-29 16:18       ` Darrick J. Wong
2020-01-29 23:54         ` Ritesh Harjani
2020-01-30 16:00 ` [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap Darrick J. Wong
2020-01-30 17:34   ` Ritesh Harjani
2020-02-05 12:47   ` Ritesh Harjani
2020-02-05 15:57     ` Darrick J. Wong
2020-02-06  5:26       ` Ritesh Harjani
2020-02-06 10:22         ` Jan Kara
2020-02-20 17:03           ` Ritesh Harjani
2020-02-20 17:09             ` Christoph Hellwig
2020-02-21  4:16               ` Ritesh Harjani
2020-02-21 11:47                 ` Jan Kara
2020-02-21  4:10             ` Ritesh Harjani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).