linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/2] ->bmap interface retirement
@ 2018-09-05 13:57 Carlos Maiolino
  2018-09-05 13:57 ` [PATCH 1/2] fs: Replace direct ->bmap calls by bmap() Carlos Maiolino
  2018-09-05 13:57 ` [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP Carlos Maiolino
  0 siblings, 2 replies; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-05 13:57 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, sandeen, david

Hi,

This series aims to deprecate the ->bmap interface, by using FIEMAP
infra-structure to handle FIBMAP calls. It doesn't change the FIBMAP api, so the
implementation is supposed to be transparent for users.

Although this patchset is working as I expected and it survived a few xfstests
runs, I have some questions about the way I implemented it that I'd like to
discuss, so, I'm tagging it as RFC.

The first patch is relatively simple, just replacing direct ->bmap calls by
bmap() function (as suggested by Dave Chinner), the second patch actually
changes bmap() function to call ->fiemap when available, falling back to ->bmap
when ->fiemap isn't implemented. So, in a (hopefully) not so far future, we can
get rid of the whole ->bmap interface.

These are the main points where I've doubts about:

I implemented it reutilizing the same infra-structure used by userspace FIEMAP
calls to avoid code duplication if possible, the problem is, to setup the struct
fiemap_extent_info to pass it to ->fiemap, I need to use (__force) casting, so
the compiler doesn't complain about the
"struct fiemap_extent __user *fi_extents_start" into it.
I saw a few other places in kernel doing something similar, so, that's why I
thought it would be ok to use such strategy.

I added a new FIEMAP flag (FIEMAP_KERNEL_FIBMAP), to signal the ->fiemap call
is supposed to be internal only and fiemap_fill_next_extent() can do a simple
memcpy() instead of a copy_to_user(). I added this flag to include/uapi, but I
suppose kernel only flags are not supposed to be in uapi, so, suggestions to
where I could place it are welcome.

Different filesystems will answer FIEMAP calls in slightly different ways. when
the requested range is somewhere in the middle of the extent, some filesystems
will set the map starting at the beginning of the extent (like ext4 for
example), and some (the ones based on iomap for example), will set the map
starting at the offset requested. So, to comply with both cases, bmap() checks
the logical offset returned by the filesystem, and take the appropriate action
if the filesystem returns the beginning of the extent, or not.

What you guys think?
Cheers.

Carlos Maiolino (2):
  fs: Replace direct ->bmap calls by bmap()
  Use fiemap internal infra-structure to handle FIBMAP

 fs/cachefiles/rdwr.c        |  5 ++---
 fs/ecryptfs/mmap.c          |  5 ++---
 fs/inode.c                  | 38 +++++++++++++++++++++++++++++++++++++-
 fs/ioctl.c                  | 25 +++++++++++++++++--------
 include/uapi/linux/fiemap.h |  1 +
 5 files changed, 59 insertions(+), 15 deletions(-)

-- 
2.14.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/2] fs: Replace direct ->bmap calls by bmap()
  2018-09-05 13:57 [PATCH RFC 0/2] ->bmap interface retirement Carlos Maiolino
@ 2018-09-05 13:57 ` Carlos Maiolino
  2018-09-05 14:23   ` Eric Sandeen
  2018-09-05 13:57 ` [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP Carlos Maiolino
  1 sibling, 1 reply; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-05 13:57 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, sandeen, david

Prepare the field to use ->fiemap for FIBMAP ioctl

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/cachefiles/rdwr.c |  5 ++---
 fs/ecryptfs/mmap.c   |  5 ++---
 fs/ioctl.c           | 14 ++++++++------
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 40f7595aad10..186e203d64a7 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -435,7 +435,7 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
 	block0 = page->index;
 	block0 <<= shift;
 
-	block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
+	block = bmap(inode, block0);
 	_debug("%llx -> %llx",
 	       (unsigned long long) block0,
 	       (unsigned long long) block);
@@ -737,8 +737,7 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
 		block0 = page->index;
 		block0 <<= shift;
 
-		block = inode->i_mapping->a_ops->bmap(inode->i_mapping,
-						      block0);
+		block = bmap(inode, block0);
 		_debug("%llx -> %llx",
 		       (unsigned long long) block0,
 		       (unsigned long long) block);
diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
index cdf358b209d9..16626fce5754 100644
--- a/fs/ecryptfs/mmap.c
+++ b/fs/ecryptfs/mmap.c
@@ -544,9 +544,8 @@ static sector_t ecryptfs_bmap(struct address_space *mapping, sector_t block)
 
 	inode = (struct inode *)mapping->host;
 	lower_inode = ecryptfs_inode_to_lower(inode);
-	if (lower_inode->i_mapping->a_ops->bmap)
-		rc = lower_inode->i_mapping->a_ops->bmap(lower_inode->i_mapping,
-							 block);
+
+	rc = bmap(lower_inode, block);
 	return rc;
 }
 
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 3212c29235ce..413585d58415 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -53,19 +53,21 @@ EXPORT_SYMBOL(vfs_ioctl);
 
 static int ioctl_fibmap(struct file *filp, int __user *p)
 {
-	struct address_space *mapping = filp->f_mapping;
+	struct inode *inode = filp->f_inode;
 	int res, block;
 
-	/* do we support this mess? */
-	if (!mapping->a_ops->bmap)
-		return -EINVAL;
 	if (!capable(CAP_SYS_RAWIO))
 		return -EPERM;
 	res = get_user(block, p);
 	if (res)
 		return res;
-	res = mapping->a_ops->bmap(mapping, block);
-	return put_user(res, p);
+
+	res = bmap(inode, block);
+
+	if (res)
+		return put_user(res, p);
+	else
+		return -EINVAL;
 }
 
 /**
-- 
2.14.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-05 13:57 [PATCH RFC 0/2] ->bmap interface retirement Carlos Maiolino
  2018-09-05 13:57 ` [PATCH 1/2] fs: Replace direct ->bmap calls by bmap() Carlos Maiolino
@ 2018-09-05 13:57 ` Carlos Maiolino
  2018-09-05 14:31   ` Matthew Wilcox
  2018-09-05 14:40   ` Eric Sandeen
  1 sibling, 2 replies; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-05 13:57 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, sandeen, david

This patch modifies ioctl_fibmap to use FIEMAP interface internally. The
FIBMAP API is not changed, but now ->bmap can go.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/inode.c                  | 38 +++++++++++++++++++++++++++++++++++++-
 fs/ioctl.c                  | 11 +++++++++--
 include/uapi/linux/fiemap.h |  1 +
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 41812a3e829e..07befc5f253b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1588,9 +1588,45 @@ EXPORT_SYMBOL(iput);
  */
 sector_t bmap(struct inode *inode, sector_t block)
 {
+
+	struct fiemap_extent_info fieinfo = { 0, };
+	struct fiemap_extent fextent;
+	u64 start = block << inode->i_blkbits;
+	int error;
 	sector_t res = 0;
-	if (inode->i_mapping->a_ops->bmap)
+
+	if (inode->i_op->fiemap) {
+		fextent.fe_logical = 0;
+		fextent.fe_physical = 0;
+		fieinfo.fi_flags = FIEMAP_KERNEL_FIBMAP;
+		fieinfo.fi_extents_max = 1;
+		fieinfo.fi_extents_start = (__force struct fiemap_extent __user *) &fextent;
+
+		error = inode->i_op->fiemap(inode, &fieinfo, start, 1);
+
+		/* FIBMAP expect a zero return for error */
+		if (error)
+			goto out;
+
+		/*
+		 * Fiemap implementation of some filesystems will return the
+		 * extent map beginning at the requested offset
+		 * (fextent.fi_logical == start), while other FSes will return
+		 * the beginning of the extent at fextent.fe_logical, even if
+		 * the requested offset is somehwere in the middle of the
+		 * extent. We must be sure to return the correct block block
+		 * here in both cases.
+		 */
+		if (fextent.fe_logical != start)
+			res = (fextent.fe_physical + start) >> inode->i_blkbits;
+		else
+			res = fextent.fe_physical >> inode->i_blkbits;
+
+	} else if (inode->i_mapping->a_ops->bmap) {
 		res = inode->i_mapping->a_ops->bmap(inode->i_mapping, block);
+	}
+
+out:
 	return res;
 }
 EXPORT_SYMBOL(bmap);
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 413585d58415..58de1dd507b1 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -117,8 +117,14 @@ int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
 	extent.fe_flags = flags;
 
 	dest += fieinfo->fi_extents_mapped;
-	if (copy_to_user(dest, &extent, sizeof(extent)))
-		return -EFAULT;
+
+	if (fieinfo->fi_flags & FIEMAP_KERNEL_FIBMAP) {
+		memcpy((__force struct fiemap_extent *)dest, &extent,
+		       sizeof(extent));
+	} else {
+		if (copy_to_user(dest, &extent, sizeof(extent)))
+			return -EFAULT;
+	}
 
 	fieinfo->fi_extents_mapped++;
 	if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max)
@@ -146,6 +152,7 @@ int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags)
 	u32 incompat_flags;
 
 	incompat_flags = fieinfo->fi_flags & ~(FIEMAP_FLAGS_COMPAT & fs_flags);
+	incompat_flags &= ~(FIEMAP_KERNEL_FIBMAP);
 	if (incompat_flags) {
 		fieinfo->fi_flags = incompat_flags;
 		return -EBADR;
diff --git a/include/uapi/linux/fiemap.h b/include/uapi/linux/fiemap.h
index 8c0bc24d5d95..7064a3fa5421 100644
--- a/include/uapi/linux/fiemap.h
+++ b/include/uapi/linux/fiemap.h
@@ -42,6 +42,7 @@ struct fiemap {
 #define FIEMAP_FLAG_SYNC	0x00000001 /* sync file data before map */
 #define FIEMAP_FLAG_XATTR	0x00000002 /* map extended attribute tree */
 #define FIEMAP_FLAG_CACHE	0x00000004 /* request caching of the extents */
+#define FIEMAP_KERNEL_FIBMAP	0x00000008 /* Use FIEMAP for FIBMAP */
 
 #define FIEMAP_FLAGS_COMPAT	(FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
 
-- 
2.14.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] fs: Replace direct ->bmap calls by bmap()
  2018-09-05 13:57 ` [PATCH 1/2] fs: Replace direct ->bmap calls by bmap() Carlos Maiolino
@ 2018-09-05 14:23   ` Eric Sandeen
  2018-09-05 18:55     ` Christoph Hellwig
  2018-09-06  8:03     ` Carlos Maiolino
  0 siblings, 2 replies; 17+ messages in thread
From: Eric Sandeen @ 2018-09-05 14:23 UTC (permalink / raw)
  To: Carlos Maiolino, linux-fsdevel; +Cc: hch, david

On 9/5/18 8:57 AM, Carlos Maiolino wrote:
> Prepare the field to use ->fiemap for FIBMAP ioctl
> 
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
>  fs/cachefiles/rdwr.c |  5 ++---
>  fs/ecryptfs/mmap.c   |  5 ++---
>  fs/ioctl.c           | 14 ++++++++------
>  3 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
> index 40f7595aad10..186e203d64a7 100644
> --- a/fs/cachefiles/rdwr.c
> +++ b/fs/cachefiles/rdwr.c
> @@ -435,7 +435,7 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
>  	block0 = page->index;
>  	block0 <<= shift;
>  
> -	block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
> +	block = bmap(inode, block0);

Prior to this there's an ASSERT that inode->i_mapping->a_ops->bmap
exists.  Should that stay, if the goal is to move all ->bmap use out
of calling code?  OTOH, what will this code do if bmap() finds that there
is no ->bmap present and returns 0?

>  	_debug("%llx -> %llx",
>  	       (unsigned long long) block0,
>  	       (unsigned long long) block);
> @@ -737,8 +737,7 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
>  		block0 = page->index;
>  		block0 <<= shift;
>  
> -		block = inode->i_mapping->a_ops->bmap(inode->i_mapping,
> -						      block0);
> +		block = bmap(inode, block0);
>  		_debug("%llx -> %llx",
>  		       (unsigned long long) block0,
>  		       (unsigned long long) block);
> diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
> index cdf358b209d9..16626fce5754 100644
> --- a/fs/ecryptfs/mmap.c
> +++ b/fs/ecryptfs/mmap.c
> @@ -544,9 +544,8 @@ static sector_t ecryptfs_bmap(struct address_space *mapping, sector_t block)
>  
>  	inode = (struct inode *)mapping->host;
>  	lower_inode = ecryptfs_inode_to_lower(inode);
> -	if (lower_inode->i_mapping->a_ops->bmap)
> -		rc = lower_inode->i_mapping->a_ops->bmap(lower_inode->i_mapping,
> -							 block);
> +
> +	rc = bmap(lower_inode, block);
>  	return rc;
>  }
>  
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 3212c29235ce..413585d58415 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -53,19 +53,21 @@ EXPORT_SYMBOL(vfs_ioctl);
>  
>  static int ioctl_fibmap(struct file *filp, int __user *p)
>  {
> -	struct address_space *mapping = filp->f_mapping;
> +	struct inode *inode = filp->f_inode;
>  	int res, block;
>  
> -	/* do we support this mess? */
> -	if (!mapping->a_ops->bmap)
> -		return -EINVAL;
>  	if (!capable(CAP_SYS_RAWIO))
>  		return -EPERM;
>  	res = get_user(block, p);
>  	if (res)
>  		return res;
> -	res = mapping->a_ops->bmap(mapping, block);
> -	return put_user(res, p);
> +
> +	res = bmap(inode, block);
> +
> +	if (res)
> +		return put_user(res, p);
> +	else
> +		return -EINVAL;

So now mapping a hole will return -EINVAL?  I don't think that change
in behavior is ok.

-Eric

>  }
>  
>  /**
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-05 13:57 ` [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP Carlos Maiolino
@ 2018-09-05 14:31   ` Matthew Wilcox
  2018-09-05 18:56     ` Christoph Hellwig
  2018-09-06  8:18     ` Carlos Maiolino
  2018-09-05 14:40   ` Eric Sandeen
  1 sibling, 2 replies; 17+ messages in thread
From: Matthew Wilcox @ 2018-09-05 14:31 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-fsdevel, hch, sandeen, david

On Wed, Sep 05, 2018 at 03:57:48PM +0200, Carlos Maiolino wrote:
> +	if (inode->i_op->fiemap) {
> +		fextent.fe_logical = 0;
> +		fextent.fe_physical = 0;
> +		fieinfo.fi_flags = FIEMAP_KERNEL_FIBMAP;
> +		fieinfo.fi_extents_max = 1;
> +		fieinfo.fi_extents_start = (__force struct fiemap_extent __user *) &fextent;
> +
> +		error = inode->i_op->fiemap(inode, &fieinfo, start, 1);

You'd have to play games with set_fs() and friends if you want to do this.
The fiemap implementation is going to access fi_extents_start with a call
to copy_to_user() and for machines with a 4G/4G split, you need that
address to be interpreted as kernel space, not user space.

See fiemap_fill_next_extent():

        struct fiemap_extent __user *dest = fieinfo->fi_extents_start;
...
        if (copy_to_user(dest, &extent, sizeof(extent)))
                return -EFAULT;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-05 13:57 ` [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP Carlos Maiolino
  2018-09-05 14:31   ` Matthew Wilcox
@ 2018-09-05 14:40   ` Eric Sandeen
  2018-09-06  8:12     ` Carlos Maiolino
  1 sibling, 1 reply; 17+ messages in thread
From: Eric Sandeen @ 2018-09-05 14:40 UTC (permalink / raw)
  To: Carlos Maiolino, linux-fsdevel; +Cc: hch, david

On 9/5/18 8:57 AM, Carlos Maiolino wrote:
> This patch modifies ioctl_fibmap to use FIEMAP interface internally. The
> FIBMAP API is not changed, but now ->bmap can go.

Not really, because you still call it if ->fiemap isn't there ;)

> 
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
>  fs/inode.c                  | 38 +++++++++++++++++++++++++++++++++++++-
>  fs/ioctl.c                  | 11 +++++++++--
>  include/uapi/linux/fiemap.h |  1 +
>  3 files changed, 47 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index 41812a3e829e..07befc5f253b 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1588,9 +1588,45 @@ EXPORT_SYMBOL(iput);
>   */
>  sector_t bmap(struct inode *inode, sector_t block)
>  {
> +
> +	struct fiemap_extent_info fieinfo = { 0, };
> +	struct fiemap_extent fextent;
> +	u64 start = block << inode->i_blkbits;
> +	int error;
>  	sector_t res = 0;
> -	if (inode->i_mapping->a_ops->bmap)
> +
> +	if (inode->i_op->fiemap) {
> +		fextent.fe_logical = 0;
> +		fextent.fe_physical = 0;
> +		fieinfo.fi_flags = FIEMAP_KERNEL_FIBMAP;

| FIEMAP_FLAG_SYNC?

> +		fieinfo.fi_extents_max = 1;
> +		fieinfo.fi_extents_start = (__force struct fiemap_extent __user *) &fextent;
> +
> +		error = inode->i_op->fiemap(inode, &fieinfo, start, 1);
> +
> +		/* FIBMAP expect a zero return for error */
> +		if (error)
> +			goto out;

do you need to test fieinfo.fi_extents_mapped > 0?  I don't know if it's
possible to return error == 0 && fi_extents_mapped == 0, TBH.

> +
> +		/*
> +		 * Fiemap implementation of some filesystems will return the
> +		 * extent map beginning at the requested offset
> +		 * (fextent.fi_logical == start), while other FSes will return
> +		 * the beginning of the extent at fextent.fe_logical, even if
> +		 * the requested offset is somehwere in the middle of the
> +		 * extent. We must be sure to return the correct block block
> +		 * here in both cases.
> +		 */
> +		if (fextent.fe_logical != start)
> +			res = (fextent.fe_physical + start) >> inode->i_blkbits;

I don't think this is correct, is it?  You can't add the entire requested
logical file offset (start) to the resulting physical extent start (fe_physical).

You'd need to add the delta, (start - fextent.fe_logical) to the physical extent
start to get the offset into the mapping, right?

		res = (fextent.fe_physical +
		       (start - fextent.fe_logical)) >> inode->i_blkbits;

You'll also need to test the resulting fm_flags to be sure that
this is something that's valid to return via FIBMAP.

For example, you might get a shared block, which we have recently learned
Must Not Be Provided Under Any Circumstances to userspace via FIBMAP.

(I'm still thinking about how this change is structured in general, but I think
the above are a few logical issues w/ the code.)

-Eric

> +		else
> +			res = fextent.fe_physical >> inode->i_blkbits;
> +
> +	} else if (inode->i_mapping->a_ops->bmap) {
>  		res = inode->i_mapping->a_ops->bmap(inode->i_mapping, block);
> +	}
> +
> +out:
>  	return res;
>  }
>  EXPORT_SYMBOL(bmap);
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 413585d58415..58de1dd507b1 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -117,8 +117,14 @@ int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
>  	extent.fe_flags = flags;
>  
>  	dest += fieinfo->fi_extents_mapped;
> -	if (copy_to_user(dest, &extent, sizeof(extent)))
> -		return -EFAULT;
> +
> +	if (fieinfo->fi_flags & FIEMAP_KERNEL_FIBMAP) {
> +		memcpy((__force struct fiemap_extent *)dest, &extent,
> +		       sizeof(extent));
> +	} else {
> +		if (copy_to_user(dest, &extent, sizeof(extent)))
> +			return -EFAULT;
> +	}
>  
>  	fieinfo->fi_extents_mapped++;
>  	if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max)
> @@ -146,6 +152,7 @@ int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags)
>  	u32 incompat_flags;
>  
>  	incompat_flags = fieinfo->fi_flags & ~(FIEMAP_FLAGS_COMPAT & fs_flags);
> +	incompat_flags &= ~(FIEMAP_KERNEL_FIBMAP);
>  	if (incompat_flags) {
>  		fieinfo->fi_flags = incompat_flags;
>  		return -EBADR;
> diff --git a/include/uapi/linux/fiemap.h b/include/uapi/linux/fiemap.h
> index 8c0bc24d5d95..7064a3fa5421 100644
> --- a/include/uapi/linux/fiemap.h
> +++ b/include/uapi/linux/fiemap.h
> @@ -42,6 +42,7 @@ struct fiemap {
>  #define FIEMAP_FLAG_SYNC	0x00000001 /* sync file data before map */
>  #define FIEMAP_FLAG_XATTR	0x00000002 /* map extended attribute tree */
>  #define FIEMAP_FLAG_CACHE	0x00000004 /* request caching of the extents */
> +#define FIEMAP_KERNEL_FIBMAP	0x00000008 /* Use FIEMAP for FIBMAP */
>  
>  #define FIEMAP_FLAGS_COMPAT	(FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
>  
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] fs: Replace direct ->bmap calls by bmap()
  2018-09-05 14:23   ` Eric Sandeen
@ 2018-09-05 18:55     ` Christoph Hellwig
  2018-09-06  8:04       ` Carlos Maiolino
  2018-09-06  8:03     ` Carlos Maiolino
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-05 18:55 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Carlos Maiolino, linux-fsdevel, hch, david

On Wed, Sep 05, 2018 at 09:23:26AM -0500, Eric Sandeen wrote:
> >  	block0 = page->index;
> >  	block0 <<= shift;
> >  
> > -	block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
> > +	block = bmap(inode, block0);
> 
> Prior to this there's an ASSERT that inode->i_mapping->a_ops->bmap
> exists.  Should that stay, if the goal is to move all ->bmap use out
> of calling code?

I think it needs to go away.

> OTOH, what will this code do if bmap() finds that there
> is no ->bmap present and returns 0?

I think we just need to fix the bmap() prototype so that it can return
error, which would be very useful for various reasons.  Something like
this:

int bmap(struct address_space *mapping, sector_t *block)
{
	if (!mapping->a_ops->bmap)
		return -EINVAL;
	*block = mapping->a_ops->bmap(mapping, *block);
	return 0;
}

then add fiemap support with real error handling later.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-05 14:31   ` Matthew Wilcox
@ 2018-09-05 18:56     ` Christoph Hellwig
  2018-09-06  8:31       ` Carlos Maiolino
  2018-09-06  8:18     ` Carlos Maiolino
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-05 18:56 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Carlos Maiolino, linux-fsdevel, hch, sandeen, david

On Wed, Sep 05, 2018 at 07:31:47AM -0700, Matthew Wilcox wrote:
> On Wed, Sep 05, 2018 at 03:57:48PM +0200, Carlos Maiolino wrote:
> > +	if (inode->i_op->fiemap) {
> > +		fextent.fe_logical = 0;
> > +		fextent.fe_physical = 0;
> > +		fieinfo.fi_flags = FIEMAP_KERNEL_FIBMAP;
> > +		fieinfo.fi_extents_max = 1;
> > +		fieinfo.fi_extents_start = (__force struct fiemap_extent __user *) &fextent;
> > +
> > +		error = inode->i_op->fiemap(inode, &fieinfo, start, 1);
> 
> You'd have to play games with set_fs() and friends if you want to do this.
> The fiemap implementation is going to access fi_extents_start with a call
> to copy_to_user() and for machines with a 4G/4G split, you need that
> address to be interpreted as kernel space, not user space.
> 
> See fiemap_fill_next_extent():

Yeah.  I think we need to pass fiemap_fill_next_extent() as a function
pointer to fiemap in a prep patch, and then pass a different pointer
for the in-kernel usage.  Which is good API design anyway, so we should
have done this from the beginning.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] fs: Replace direct ->bmap calls by bmap()
  2018-09-05 14:23   ` Eric Sandeen
  2018-09-05 18:55     ` Christoph Hellwig
@ 2018-09-06  8:03     ` Carlos Maiolino
  1 sibling, 0 replies; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-06  8:03 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-fsdevel, hch, david

On Wed, Sep 05, 2018 at 09:23:26AM -0500, Eric Sandeen wrote:
> On 9/5/18 8:57 AM, Carlos Maiolino wrote:
> > Prepare the field to use ->fiemap for FIBMAP ioctl
> > 
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> >  fs/cachefiles/rdwr.c |  5 ++---
> >  fs/ecryptfs/mmap.c   |  5 ++---
> >  fs/ioctl.c           | 14 ++++++++------
> >  3 files changed, 12 insertions(+), 12 deletions(-)
> > 
> > diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
> > index 40f7595aad10..186e203d64a7 100644
> > --- a/fs/cachefiles/rdwr.c
> > +++ b/fs/cachefiles/rdwr.c
> > @@ -435,7 +435,7 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
> >  	block0 = page->index;
> >  	block0 <<= shift;
> >  
> > -	block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
> > +	block = bmap(inode, block0);
> 
> Prior to this there's an ASSERT that inode->i_mapping->a_ops->bmap
> exists.  Should that stay, if the goal is to move all ->bmap use out
> of calling code?  OTOH, what will this code do if bmap() finds that there
> is no ->bmap present and returns 0?
> 
> > +	res = bmap(inode, block);
> > +
> > +	if (res)
> > +		return put_user(res, p);
> > +	else
> > +		return -EINVAL;
> 
> So now mapping a hole will return -EINVAL?  I don't think that change
> in behavior is ok.
> 

Thanks for catching it, I messed the return value here.

I'm going to think about your 2 points for the next review and work on a real
patch for retiring ->bmap once I've got more information now.


> -Eric
> 
> >  }
> >  
> >  /**
> > 
> 

-- 
Carlos

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] fs: Replace direct ->bmap calls by bmap()
  2018-09-05 18:55     ` Christoph Hellwig
@ 2018-09-06  8:04       ` Carlos Maiolino
  0 siblings, 0 replies; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-06  8:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Eric Sandeen, linux-fsdevel, david

On Wed, Sep 05, 2018 at 08:55:39PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 05, 2018 at 09:23:26AM -0500, Eric Sandeen wrote:
> > >  	block0 = page->index;
> > >  	block0 <<= shift;
> > >  
> > > -	block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
> > > +	block = bmap(inode, block0);
> > 
> > Prior to this there's an ASSERT that inode->i_mapping->a_ops->bmap
> > exists.  Should that stay, if the goal is to move all ->bmap use out
> > of calling code?
> 
> I think it needs to go away.

/me will keep it in mind
> 
> > OTOH, what will this code do if bmap() finds that there
> > is no ->bmap present and returns 0?
> 
> I think we just need to fix the bmap() prototype so that it can return
> error, which would be very useful for various reasons.  Something like
> this:
> 
> int bmap(struct address_space *mapping, sector_t *block)
> {
> 	if (!mapping->a_ops->bmap)
> 		return -EINVAL;
> 	*block = mapping->a_ops->bmap(mapping, *block);
> 	return 0;
> }
> 
> then add fiemap support with real error handling later.

Will do, thanks for the suggestion.


-- 
Carlos

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-05 14:40   ` Eric Sandeen
@ 2018-09-06  8:12     ` Carlos Maiolino
  2018-09-06 15:53       ` Eric Sandeen
  0 siblings, 1 reply; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-06  8:12 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-fsdevel, hch, david

On Wed, Sep 05, 2018 at 09:40:06AM -0500, Eric Sandeen wrote:
> On 9/5/18 8:57 AM, Carlos Maiolino wrote:
> > This patch modifies ioctl_fibmap to use FIEMAP interface internally. The
> > FIBMAP API is not changed, but now ->bmap can go.
> 
> Not really, because you still call it if ->fiemap isn't there ;)

Not sure if I understand what you meant here :(

What I meant to say is that, FIBMAP behavior from user's perspective isn't
supposed to change, neither the user API which needs to be used (we should
support FIBMAP forever, and breaking user's API is a big NO here). Not sure if I
didn't understand what you meant here, or maybe we are talking about different
things? =/

> 
> > 
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> >  fs/inode.c                  | 38 +++++++++++++++++++++++++++++++++++++-
> >  fs/ioctl.c                  | 11 +++++++++--
> >  include/uapi/linux/fiemap.h |  1 +
> >  3 files changed, 47 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/inode.c b/fs/inode.c
> > index 41812a3e829e..07befc5f253b 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -1588,9 +1588,45 @@ EXPORT_SYMBOL(iput);
> >   */
> >  sector_t bmap(struct inode *inode, sector_t block)
> >  {
> > +
> > +	struct fiemap_extent_info fieinfo = { 0, };
> > +	struct fiemap_extent fextent;
> > +	u64 start = block << inode->i_blkbits;
> > +	int error;
> >  	sector_t res = 0;
> > -	if (inode->i_mapping->a_ops->bmap)
> > +
> > +	if (inode->i_op->fiemap) {
> > +		fextent.fe_logical = 0;
> > +		fextent.fe_physical = 0;
> > +		fieinfo.fi_flags = FIEMAP_KERNEL_FIBMAP;
> 
> | FIEMAP_FLAG_SYNC?
> 
> > +		fieinfo.fi_extents_max = 1;
> > +		fieinfo.fi_extents_start = (__force struct fiemap_extent __user *) &fextent;
> > +
> > +		error = inode->i_op->fiemap(inode, &fieinfo, start, 1);
> > +
> > +		/* FIBMAP expect a zero return for error */
> > +		if (error)
> > +			goto out;
> 
> do you need to test fieinfo.fi_extents_mapped > 0?  I don't know if it's
> possible to return error == 0 && fi_extents_mapped == 0, TBH.
> 
> > +
> > +		/*
> > +		 * Fiemap implementation of some filesystems will return the
> > +		 * extent map beginning at the requested offset
> > +		 * (fextent.fi_logical == start), while other FSes will return
> > +		 * the beginning of the extent at fextent.fe_logical, even if
> > +		 * the requested offset is somehwere in the middle of the
> > +		 * extent. We must be sure to return the correct block block
> > +		 * here in both cases.
> > +		 */
> > +		if (fextent.fe_logical != start)
> > +			res = (fextent.fe_physical + start) >> inode->i_blkbits;
> 
> I don't think this is correct, is it?  You can't add the entire requested
> logical file offset (start) to the resulting physical extent start (fe_physical).
> 
> You'd need to add the delta, (start - fextent.fe_logical) to the physical extent
> start to get the offset into the mapping, right?
> 
> 		res = (fextent.fe_physical +
> 		       (start - fextent.fe_logical)) >> inode->i_blkbits;

I believe you're right, yes, when I wrote the calculation above I was thinking
about "fe_logical is either 0 or equal start", and I tested it against
multi-extents files. I really have no idea how it survived my tests and xfstests
=/

> 
> You'll also need to test the resulting fm_flags to be sure that
> this is something that's valid to return via FIBMAP.
> 
> For example, you might get a shared block, which we have recently learned
> Must Not Be Provided Under Any Circumstances to userspace via FIBMAP.

Right, good point.

> 
> (I'm still thinking about how this change is structured in general, but I think
> the above are a few logical issues w/ the code.)
> 

Yes, makes sense, and I'm still also thinking on a way to better structure it.
And that's the point I decided to send an RFC, better to start a conversation
with a somehow working patch. Thanks for the input, much appreciated.

> -Eric
> 
> > +		else
> > +			res = fextent.fe_physical >> inode->i_blkbits;
> > +
> > +	} else if (inode->i_mapping->a_ops->bmap) {
> >  		res = inode->i_mapping->a_ops->bmap(inode->i_mapping, block);
> > +	}
> > +
> > +out:
> >  	return res;
> >  }
> >  EXPORT_SYMBOL(bmap);
> > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > index 413585d58415..58de1dd507b1 100644
> > --- a/fs/ioctl.c
> > +++ b/fs/ioctl.c
> > @@ -117,8 +117,14 @@ int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
> >  	extent.fe_flags = flags;
> >  
> >  	dest += fieinfo->fi_extents_mapped;
> > -	if (copy_to_user(dest, &extent, sizeof(extent)))
> > -		return -EFAULT;
> > +
> > +	if (fieinfo->fi_flags & FIEMAP_KERNEL_FIBMAP) {
> > +		memcpy((__force struct fiemap_extent *)dest, &extent,
> > +		       sizeof(extent));
> > +	} else {
> > +		if (copy_to_user(dest, &extent, sizeof(extent)))
> > +			return -EFAULT;
> > +	}
> >  
> >  	fieinfo->fi_extents_mapped++;
> >  	if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max)
> > @@ -146,6 +152,7 @@ int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags)
> >  	u32 incompat_flags;
> >  
> >  	incompat_flags = fieinfo->fi_flags & ~(FIEMAP_FLAGS_COMPAT & fs_flags);
> > +	incompat_flags &= ~(FIEMAP_KERNEL_FIBMAP);
> >  	if (incompat_flags) {
> >  		fieinfo->fi_flags = incompat_flags;
> >  		return -EBADR;
> > diff --git a/include/uapi/linux/fiemap.h b/include/uapi/linux/fiemap.h
> > index 8c0bc24d5d95..7064a3fa5421 100644
> > --- a/include/uapi/linux/fiemap.h
> > +++ b/include/uapi/linux/fiemap.h
> > @@ -42,6 +42,7 @@ struct fiemap {
> >  #define FIEMAP_FLAG_SYNC	0x00000001 /* sync file data before map */
> >  #define FIEMAP_FLAG_XATTR	0x00000002 /* map extended attribute tree */
> >  #define FIEMAP_FLAG_CACHE	0x00000004 /* request caching of the extents */
> > +#define FIEMAP_KERNEL_FIBMAP	0x00000008 /* Use FIEMAP for FIBMAP */
> >  
> >  #define FIEMAP_FLAGS_COMPAT	(FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
> >  
> > 
> 

-- 
Carlos

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-05 14:31   ` Matthew Wilcox
  2018-09-05 18:56     ` Christoph Hellwig
@ 2018-09-06  8:18     ` Carlos Maiolino
  1 sibling, 0 replies; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-06  8:18 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, hch, sandeen, david

Hi Matthew.

On Wed, Sep 05, 2018 at 07:31:47AM -0700, Matthew Wilcox wrote:
> On Wed, Sep 05, 2018 at 03:57:48PM +0200, Carlos Maiolino wrote:
> > +	if (inode->i_op->fiemap) {
> > +		fextent.fe_logical = 0;
> > +		fextent.fe_physical = 0;
> > +		fieinfo.fi_flags = FIEMAP_KERNEL_FIBMAP;
> > +		fieinfo.fi_extents_max = 1;
> > +		fieinfo.fi_extents_start = (__force struct fiemap_extent __user *) &fextent;
> > +
> > +		error = inode->i_op->fiemap(inode, &fieinfo, start, 1);
> 
> You'd have to play games with set_fs() and friends if you want to do this.
> The fiemap implementation is going to access fi_extents_start with a call
> to copy_to_user() and for machines with a 4G/4G split, you need that
> address to be interpreted as kernel space, not user space.
> 

Thanks for the review, It's the very first time I try to 'mix' __user pointers
with internal-only kernel code, I did some research on how other parts of kernel
handles it, so I'm kind of stepping on eggs here.

I do have a question about your point though:

> See fiemap_fill_next_extent():
> 
>         struct fiemap_extent __user *dest = fieinfo->fi_extents_start;
> ...
>         if (copy_to_user(dest, &extent, sizeof(extent)))
>                 return -EFAULT;
> 

Yes, I noticed it, and, that's why I added the FIEMAP_KERNEL_FIBMAP flag. This
is the changes I did to fiemap_fill_next_extent():

@@ -117,8 +117,14 @@ int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
        extent.fe_flags = flags;
 
        dest += fieinfo->fi_extents_mapped;
-       if (copy_to_user(dest, &extent, sizeof(extent)))
-               return -EFAULT;
+
+       if (fieinfo->fi_flags & FIEMAP_KERNEL_FIBMAP) {
+               memcpy((__force struct fiemap_extent *)dest, &extent,
+                      sizeof(extent));
+       } else {
+               if (copy_to_user(dest, &extent, sizeof(extent)))
+                       return -EFAULT;
+       }


So, my patch is going to use memcpy() directly, instead of copy_to_user(), if
the request comes from the bmap() function, other than that, I'm not really sure
I understand your point here.


-- 
Carlos

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-05 18:56     ` Christoph Hellwig
@ 2018-09-06  8:31       ` Carlos Maiolino
  2018-09-10  7:50         ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-06  8:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Matthew Wilcox, linux-fsdevel, sandeen, david

Hi Christoph,

On Wed, Sep 05, 2018 at 08:56:49PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 05, 2018 at 07:31:47AM -0700, Matthew Wilcox wrote:
> > On Wed, Sep 05, 2018 at 03:57:48PM +0200, Carlos Maiolino wrote:
> > > +	if (inode->i_op->fiemap) {
> > > +		fextent.fe_logical = 0;
> > > +		fextent.fe_physical = 0;
> > > +		fieinfo.fi_flags = FIEMAP_KERNEL_FIBMAP;
> > > +		fieinfo.fi_extents_max = 1;
> > > +		fieinfo.fi_extents_start = (__force struct fiemap_extent __user *) &fextent;
> > > +
> > > +		error = inode->i_op->fiemap(inode, &fieinfo, start, 1);
> > 
> > You'd have to play games with set_fs() and friends if you want to do this.
> > The fiemap implementation is going to access fi_extents_start with a call
> > to copy_to_user() and for machines with a 4G/4G split, you need that
> > address to be interpreted as kernel space, not user space.
> > 
> > See fiemap_fill_next_extent():
> 
> Yeah.  I think we need to pass fiemap_fill_next_extent() as a function
> pointer to fiemap in a prep patch, and then pass a different pointer
> for the in-kernel usage.  Which is good API design anyway, so we should
> have done this from the beginning.

Sorry, I'm not 100% sure I followed your point here, do you mind to detail it a
bit?
Passing fiemap_fill_next_extent() as a pointer to what? ->fiemap() interface?
Sounds interesting, but doing this looks like I'll need to do what I was trying
to avoid from the beginning, which is the creating of a second struct
fiemap_extent_info, to be used in-lernel only, so, the last field doesn't need
to be tagged as __user. I'm ok with that though, I was just trying a way to
avoid adding unneeded data structures if possible, but looks like it ended up
not being a good approach :P

Thanks a lot for the review

-- 
Carlos

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-06  8:12     ` Carlos Maiolino
@ 2018-09-06 15:53       ` Eric Sandeen
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Sandeen @ 2018-09-06 15:53 UTC (permalink / raw)
  To: Carlos Maiolino, Eric Sandeen; +Cc: linux-fsdevel, hch, david

On 9/6/18 3:12 AM, Carlos Maiolino wrote:
>>> +
>>> +		/*
>>> +		 * Fiemap implementation of some filesystems will return the
>>> +		 * extent map beginning at the requested offset
>>> +		 * (fextent.fi_logical == start), while other FSes will return
>>> +		 * the beginning of the extent at fextent.fe_logical, even if
>>> +		 * the requested offset is somehwere in the middle of the
>>> +		 * extent. We must be sure to return the correct block block
>>> +		 * here in both cases.
>>> +		 */
>>> +		if (fextent.fe_logical != start)
>>> +			res = (fextent.fe_physical + start) >> inode->i_blkbits;
>> I don't think this is correct, is it?  You can't add the entire requested
>> logical file offset (start) to the resulting physical extent start (fe_physical).
>>
>> You'd need to add the delta, (start - fextent.fe_logical) to the physical extent
>> start to get the offset into the mapping, right?
>>
>> 		res = (fextent.fe_physical +
>> 		       (start - fextent.fe_logical)) >> inode->i_blkbits;
> I believe you're right, yes, when I wrote the calculation above I was thinking
> about "fe_logical is either 0 or equal start", and I tested it against
> multi-extents files. I really have no idea how it survived my tests and xfstests
> =/

Yeay... given the recent undetected FIBMAP breakage thanks to iomap conversion
in xfs, it appears that we could use better test coverage of this crappy old
interface.  ;)
 
-Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-06  8:31       ` Carlos Maiolino
@ 2018-09-10  7:50         ` Christoph Hellwig
  2018-09-10 10:31           ` Carlos Maiolino
  2018-09-26 13:34           ` Carlos Maiolino
  0 siblings, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-10  7:50 UTC (permalink / raw)
  To: Carlos Maiolino
  Cc: Christoph Hellwig, Matthew Wilcox, linux-fsdevel, sandeen, david

On Thu, Sep 06, 2018 at 10:31:05AM +0200, Carlos Maiolino wrote:
> 
> Sorry, I'm not 100% sure I followed your point here, do you mind to detail it a
> bit?
> Passing fiemap_fill_next_extent() as a pointer to what? ->fiemap() interface?

Yes.

> Sounds interesting, but doing this looks like I'll need to do what I was trying
> to avoid from the beginning, which is the creating of a second struct
> fiemap_extent_info, to be used in-lernel only, so, the last field doesn't need
> to be tagged as __user. I'm ok with that though, I was just trying a way to
> avoid adding unneeded data structures if possible, but looks like it ended up
> not being a good approach :P

fieinfo is mostly used as an opaqueue cookie, so I think it should
be possible to just pass a void pointer to fiemap and only let the 'filler'
callback (that is fiemap_fill_next_extent or whatever a kernel caller
passed) interpret it.

The only thing breaking this right now seems to be fi_flags, so maybe
we just need to pass that explicitly as another argument to ->fiemap.

Something like:

typedef int (fiemap_fill_cb)(struct inode *inode, void *data, u64 logical,
		u64 phys, u64 len, u32 flags);

struct file_operations {
	...
	int (*fiemap)(struct inode *, unsigned int, fiemap_fill_cb,
			void *, u64 start, u64 len);
	...
};

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-10  7:50         ` Christoph Hellwig
@ 2018-09-10 10:31           ` Carlos Maiolino
  2018-09-26 13:34           ` Carlos Maiolino
  1 sibling, 0 replies; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-10 10:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Matthew Wilcox, linux-fsdevel, sandeen, david

On Mon, Sep 10, 2018 at 09:50:39AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 06, 2018 at 10:31:05AM +0200, Carlos Maiolino wrote:
> > 
> > Sorry, I'm not 100% sure I followed your point here, do you mind to detail it a
> > bit?
> > Passing fiemap_fill_next_extent() as a pointer to what? ->fiemap() interface?
> 
> Yes.
> 
> > Sounds interesting, but doing this looks like I'll need to do what I was trying
> > to avoid from the beginning, which is the creating of a second struct
> > fiemap_extent_info, to be used in-lernel only, so, the last field doesn't need
> > to be tagged as __user. I'm ok with that though, I was just trying a way to
> > avoid adding unneeded data structures if possible, but looks like it ended up
> > not being a good approach :P
> 
> fieinfo is mostly used as an opaqueue cookie, so I think it should
> be possible to just pass a void pointer to fiemap and only let the 'filler'
> callback (that is fiemap_fill_next_extent or whatever a kernel caller
> passed) interpret it.
> 
> The only thing breaking this right now seems to be fi_flags, so maybe
> we just need to pass that explicitly as another argument to ->fiemap.
> 
> Something like:
> 
> typedef int (fiemap_fill_cb)(struct inode *inode, void *data, u64 logical,
> 		u64 phys, u64 len, u32 flags);
> 
> struct file_operations {
> 	...
> 	int (*fiemap)(struct inode *, unsigned int, fiemap_fill_cb,
> 			void *, u64 start, u64 len);
> 	...

Thanks, this gives me some extra information to keep working on it. And I agree
this is a good way to refactor this infra-structure. I'll work on something
based on this and send it out to the list.

> };

-- 
Carlos

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP
  2018-09-10  7:50         ` Christoph Hellwig
  2018-09-10 10:31           ` Carlos Maiolino
@ 2018-09-26 13:34           ` Carlos Maiolino
  1 sibling, 0 replies; 17+ messages in thread
From: Carlos Maiolino @ 2018-09-26 13:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Matthew Wilcox, linux-fsdevel, sandeen, david

Hi Christoph.

On Mon, Sep 10, 2018 at 09:50:39AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 06, 2018 at 10:31:05AM +0200, Carlos Maiolino wrote:
> > 
> > Sorry, I'm not 100% sure I followed your point here, do you mind to detail it a
> > bit?
> > Passing fiemap_fill_next_extent() as a pointer to what? ->fiemap() interface?
> 
> Yes.
> 
> > Sounds interesting, but doing this looks like I'll need to do what I was trying
> > to avoid from the beginning, which is the creating of a second struct
> > fiemap_extent_info, to be used in-lernel only, so, the last field doesn't need
> > to be tagged as __user. I'm ok with that though, I was just trying a way to
> > avoid adding unneeded data structures if possible, but looks like it ended up
> > not being a good approach :P
> 
> fieinfo is mostly used as an opaqueue cookie, so I think it should
> be possible to just pass a void pointer to fiemap and only let the 'filler'
> callback (that is fiemap_fill_next_extent or whatever a kernel caller
> passed) interpret it.
> 
> The only thing breaking this right now seems to be fi_flags, so maybe
> we just need to pass that explicitly as another argument to ->fiemap.
> 
> Something like:
> 
> typedef int (fiemap_fill_cb)(struct inode *inode, void *data, u64 logical,
> 		u64 phys, u64 len, u32 flags);
> 
> struct file_operations {
> 	...
> 	int (*fiemap)(struct inode *, unsigned int, fiemap_fill_cb,
> 			void *, u64 start, u64 len);
> 	...
> };

I've been playing around with this idea, and one thing that I've been having
problems with (at least that I'm not liking the end result), is the fact some
filesystems keep juggling with the *fiemap_extent_info passed to them. Like
ext4, for example, where it takes the pointer and passes it to its internal
functions or to generic_block_fiemap if that's the case.

I was wondering if, maybe, creating a new structure to embed the cookie, the
flags, and the fill callback wouldn't be better, I've been thinking it might be
easier to refactor the code needed to handle this new structure with all needed
data into it, instead of handling the callback, flags and the cookie by their
own. Maybe even create some helpers to handle the data into it.
What you think?


-- 
Carlos

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-09-26 19:47 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-05 13:57 [PATCH RFC 0/2] ->bmap interface retirement Carlos Maiolino
2018-09-05 13:57 ` [PATCH 1/2] fs: Replace direct ->bmap calls by bmap() Carlos Maiolino
2018-09-05 14:23   ` Eric Sandeen
2018-09-05 18:55     ` Christoph Hellwig
2018-09-06  8:04       ` Carlos Maiolino
2018-09-06  8:03     ` Carlos Maiolino
2018-09-05 13:57 ` [PATCH 2/2] Use fiemap internal infra-structure to handle FIBMAP Carlos Maiolino
2018-09-05 14:31   ` Matthew Wilcox
2018-09-05 18:56     ` Christoph Hellwig
2018-09-06  8:31       ` Carlos Maiolino
2018-09-10  7:50         ` Christoph Hellwig
2018-09-10 10:31           ` Carlos Maiolino
2018-09-26 13:34           ` Carlos Maiolino
2018-09-06  8:18     ` Carlos Maiolino
2018-09-05 14:40   ` Eric Sandeen
2018-09-06  8:12     ` Carlos Maiolino
2018-09-06 15:53       ` Eric Sandeen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).