All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] xfs: a few fixes and cleanups to GETFSMAP
@ 2020-10-02  4:49 Darrick J. Wong
  2020-10-02  4:49 ` [PATCH 1/2] xfs: limit entries returned when counting fsmap records Darrick J. Wong
  2020-10-02  4:49 ` [PATCH 2/2] xfs: streamline xfs_getfsmap performance Darrick J. Wong
  0 siblings, 2 replies; 11+ messages in thread
From: Darrick J. Wong @ 2020-10-02  4:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

Hi all,

This quick series cleans up a few warts in the XFS GETFSMAP ioctl
implementation.  The first patch prevents an integer overflow when
counting the mappings.  The second patch improves performance of the
ioctl by formatting reverse mappings to an in-kernel buffer and then
copying the entire buffer to userspace (instead of copying the records
one by one).  That eliminates an indirect call and a lot of overhead
from copying things to userspace, which is a bit expensive.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=getfsmap-cleanups-5.10
---
 fs/xfs/xfs_fsmap.c |   38 +++++++++++++++++------------
 fs/xfs/xfs_fsmap.h |    6 +----
 fs/xfs/xfs_ioctl.c |   69 ++++++++++++++++++++++------------------------------
 3 files changed, 53 insertions(+), 60 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] xfs: limit entries returned when counting fsmap records
  2020-10-02  4:49 [PATCH 0/2] xfs: a few fixes and cleanups to GETFSMAP Darrick J. Wong
@ 2020-10-02  4:49 ` Darrick J. Wong
  2020-10-02  7:06   ` Christoph Hellwig
  2020-10-02  8:51   ` Chandan Babu R
  2020-10-02  4:49 ` [PATCH 2/2] xfs: streamline xfs_getfsmap performance Darrick J. Wong
  1 sibling, 2 replies; 11+ messages in thread
From: Darrick J. Wong @ 2020-10-02  4:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

If userspace asked fsmap to count the number of entries, we cannot
return more than UINT_MAX entries because fmh_entries is u32.
Therefore, stop counting if we hit this limit or else we will waste time
to return truncated results.

Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_fsmap.c |    3 +++
 1 file changed, 3 insertions(+)


diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index 4eebcec4aae6..aa36e7daf82c 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -256,6 +256,9 @@ xfs_getfsmap_helper(
 
 	/* Are we just counting mappings? */
 	if (info->head->fmh_count == 0) {
+		if (info->head->fmh_entries == UINT_MAX)
+			return -ECANCELED;
+
 		if (rec_daddr > info->next_daddr)
 			info->head->fmh_entries++;
 


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] xfs: streamline xfs_getfsmap performance
  2020-10-02  4:49 [PATCH 0/2] xfs: a few fixes and cleanups to GETFSMAP Darrick J. Wong
  2020-10-02  4:49 ` [PATCH 1/2] xfs: limit entries returned when counting fsmap records Darrick J. Wong
@ 2020-10-02  4:49 ` Darrick J. Wong
  2020-10-02  7:15   ` Christoph Hellwig
                     ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Darrick J. Wong @ 2020-10-02  4:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Refactor xfs_getfsmap to improve its performance: instead of indirectly
calling a function that copies one record to userspace at a time, create
a shadow buffer in the kernel and copy the whole array once at the end.
This should speed it up significantly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_fsmap.c |   35 +++++++++++++++-----------
 fs/xfs/xfs_fsmap.h |    6 +----
 fs/xfs/xfs_ioctl.c |   69 ++++++++++++++++++++++------------------------------
 3 files changed, 50 insertions(+), 60 deletions(-)


diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index aa36e7daf82c..3f2666c62a60 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -155,8 +155,7 @@ xfs_fsmap_owner_from_rmap(
 /* getfsmap query state */
 struct xfs_getfsmap_info {
 	struct xfs_fsmap_head	*head;
-	xfs_fsmap_format_t	formatter;	/* formatting fn */
-	void			*format_arg;	/* format buffer */
+	struct fsmap		*fsmap_recs;	/* mapping records */
 	struct xfs_buf		*agf_bp;	/* AGF, for refcount queries */
 	xfs_daddr_t		next_daddr;	/* next daddr we expect */
 	u64			missing_owner;	/* owner of holes */
@@ -224,6 +223,20 @@ xfs_getfsmap_is_shared(
 	return 0;
 }
 
+static inline void
+xfs_getfsmap_format(
+	struct xfs_mount		*mp,
+	struct xfs_fsmap		*xfm,
+	struct xfs_getfsmap_info	*info)
+{
+	struct fsmap			*rec;
+
+	trace_xfs_getfsmap_mapping(mp, xfm);
+
+	rec = &info->fsmap_recs[info->head->fmh_entries++];
+	xfs_fsmap_from_internal(rec, xfm);
+}
+
 /*
  * Format a reverse mapping for getfsmap, having translated rm_startblock
  * into the appropriate daddr units.
@@ -288,10 +301,7 @@ xfs_getfsmap_helper(
 		fmr.fmr_offset = 0;
 		fmr.fmr_length = rec_daddr - info->next_daddr;
 		fmr.fmr_flags = FMR_OF_SPECIAL_OWNER;
-		error = info->formatter(&fmr, info->format_arg);
-		if (error)
-			return error;
-		info->head->fmh_entries++;
+		xfs_getfsmap_format(mp, &fmr, info);
 	}
 
 	if (info->last)
@@ -323,11 +333,8 @@ xfs_getfsmap_helper(
 		if (shared)
 			fmr.fmr_flags |= FMR_OF_SHARED;
 	}
-	error = info->formatter(&fmr, info->format_arg);
-	if (error)
-		return error;
-	info->head->fmh_entries++;
 
+	xfs_getfsmap_format(mp, &fmr, info);
 out:
 	rec_daddr += XFS_FSB_TO_BB(mp, rec->rm_blockcount);
 	if (info->next_daddr < rec_daddr)
@@ -796,7 +803,7 @@ xfs_getfsmap_check_keys(
 
 /*
  * Get filesystem's extents as described in head, and format for
- * output.  Calls formatter to fill the user's buffer until all
+ * output.  Calls xfs_getfsmap_format to fill the user's buffer until all
  * extents are mapped, until the passed-in head->fmh_count slots have
  * been filled, or until the formatter short-circuits the loop, if it
  * is tracking filled-in extents on its own.
@@ -819,8 +826,7 @@ int
 xfs_getfsmap(
 	struct xfs_mount		*mp,
 	struct xfs_fsmap_head		*head,
-	xfs_fsmap_format_t		formatter,
-	void				*arg)
+	struct fsmap			*fsmap_recs)
 {
 	struct xfs_trans		*tp = NULL;
 	struct xfs_fsmap		dkeys[2];	/* per-dev keys */
@@ -895,8 +901,7 @@ xfs_getfsmap(
 
 	info.next_daddr = head->fmh_keys[0].fmr_physical +
 			  head->fmh_keys[0].fmr_length;
-	info.formatter = formatter;
-	info.format_arg = arg;
+	info.fsmap_recs = fsmap_recs;
 	info.head = head;
 
 	/*
diff --git a/fs/xfs/xfs_fsmap.h b/fs/xfs/xfs_fsmap.h
index c6c57739b862..a0775788e7b1 100644
--- a/fs/xfs/xfs_fsmap.h
+++ b/fs/xfs/xfs_fsmap.h
@@ -27,13 +27,9 @@ struct xfs_fsmap_head {
 	struct xfs_fsmap fmh_keys[2];	/* low and high keys */
 };
 
-void xfs_fsmap_from_internal(struct fsmap *dest, struct xfs_fsmap *src);
 void xfs_fsmap_to_internal(struct xfs_fsmap *dest, struct fsmap *src);
 
-/* fsmap to userspace formatter - copy to user & advance pointer */
-typedef int (*xfs_fsmap_format_t)(struct xfs_fsmap *, void *);
-
 int xfs_getfsmap(struct xfs_mount *mp, struct xfs_fsmap_head *head,
-		xfs_fsmap_format_t formatter, void *arg);
+		struct fsmap *out_recs);
 
 #endif /* __XFS_FSMAP_H__ */
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index bca7659fb5c6..e0f2100b74b0 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1716,38 +1716,15 @@ xfs_ioc_getbmap(
 	return error;
 }
 
-struct getfsmap_info {
-	struct xfs_mount	*mp;
-	struct fsmap_head __user *data;
-	unsigned int		idx;
-	__u32			last_flags;
-};
-
-STATIC int
-xfs_getfsmap_format(struct xfs_fsmap *xfm, void *priv)
-{
-	struct getfsmap_info	*info = priv;
-	struct fsmap		fm;
-
-	trace_xfs_getfsmap_mapping(info->mp, xfm);
-
-	info->last_flags = xfm->fmr_flags;
-	xfs_fsmap_from_internal(&fm, xfm);
-	if (copy_to_user(&info->data->fmh_recs[info->idx++], &fm,
-			sizeof(struct fsmap)))
-		return -EFAULT;
-
-	return 0;
-}
-
 STATIC int
 xfs_ioc_getfsmap(
 	struct xfs_inode	*ip,
 	struct fsmap_head	__user *arg)
 {
-	struct getfsmap_info	info = { NULL };
 	struct xfs_fsmap_head	xhead = {0};
 	struct fsmap_head	head;
+	struct fsmap		*recs;
+	unsigned int		count;
 	bool			aborted = false;
 	int			error;
 
@@ -1760,38 +1737,50 @@ xfs_ioc_getfsmap(
 		       sizeof(head.fmh_keys[1].fmr_reserved)))
 		return -EINVAL;
 
+	count = min_t(unsigned int, head.fmh_count,
+			UINT_MAX / sizeof(struct fsmap));
+	recs = kvzalloc(count * sizeof(struct fsmap), GFP_KERNEL);
+	if (!recs)
+		return -ENOMEM;
+
 	xhead.fmh_iflags = head.fmh_iflags;
-	xhead.fmh_count = head.fmh_count;
+	xhead.fmh_count = count;
 	xfs_fsmap_to_internal(&xhead.fmh_keys[0], &head.fmh_keys[0]);
 	xfs_fsmap_to_internal(&xhead.fmh_keys[1], &head.fmh_keys[1]);
 
 	trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
 	trace_xfs_getfsmap_high_key(ip->i_mount, &xhead.fmh_keys[1]);
 
-	info.mp = ip->i_mount;
-	info.data = arg;
-	error = xfs_getfsmap(ip->i_mount, &xhead, xfs_getfsmap_format, &info);
+	error = xfs_getfsmap(ip->i_mount, &xhead, recs);
 	if (error == -ECANCELED) {
 		error = 0;
 		aborted = true;
 	} else if (error)
-		return error;
-
-	/* If we didn't abort, set the "last" flag in the last fmx */
-	if (!aborted && info.idx) {
-		info.last_flags |= FMR_OF_LAST;
-		if (copy_to_user(&info.data->fmh_recs[info.idx - 1].fmr_flags,
-				&info.last_flags, sizeof(info.last_flags)))
-			return -EFAULT;
-	}
+		goto out_free;
 
 	/* copy back header */
+	error = -EFAULT;
 	head.fmh_entries = xhead.fmh_entries;
 	head.fmh_oflags = xhead.fmh_oflags;
 	if (copy_to_user(arg, &head, sizeof(struct fsmap_head)))
-		return -EFAULT;
+		goto out_free;
 
-	return 0;
+	/* Copy records if userspace wasn't asking only for a record count. */
+	if (head.fmh_count > 0) {
+		/* If we didn't abort, set the "last" flag in the last record */
+		if (!aborted && xhead.fmh_entries > 0)
+			recs[xhead.fmh_entries - 1].fmr_flags |= FMR_OF_LAST;
+
+		/* Copy all records out to userspace. */
+		if (copy_to_user(arg->fmh_recs, recs,
+				 xhead.fmh_entries * sizeof(struct fsmap)))
+			goto out_free;
+	}
+
+	error = 0;
+out_free:
+	kmem_free(recs);
+	return error;
 }
 
 STATIC int


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] xfs: limit entries returned when counting fsmap records
  2020-10-02  4:49 ` [PATCH 1/2] xfs: limit entries returned when counting fsmap records Darrick J. Wong
@ 2020-10-02  7:06   ` Christoph Hellwig
  2020-10-02  8:51   ` Chandan Babu R
  1 sibling, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2020-10-02  7:06 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Oct 01, 2020 at 09:49:24PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If userspace asked fsmap to count the number of entries, we cannot
> return more than UINT_MAX entries because fmh_entries is u32.
> Therefore, stop counting if we hit this limit or else we will waste time
> to return truncated results.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

(after remembering the magic repurposing of -ECANCELED in the fsmap
code)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] xfs: streamline xfs_getfsmap performance
  2020-10-02  4:49 ` [PATCH 2/2] xfs: streamline xfs_getfsmap performance Darrick J. Wong
@ 2020-10-02  7:15   ` Christoph Hellwig
  2020-10-02 17:58     ` Darrick J. Wong
  2020-10-02  8:52   ` Chandan Babu R
  2020-10-04 19:14   ` [PATCH v1.2 2/2] xfs: fix deadlock and " Darrick J. Wong
  2 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2020-10-02  7:15 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Oct 01, 2020 at 09:49:30PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Refactor xfs_getfsmap to improve its performance: instead of indirectly
> calling a function that copies one record to userspace at a time, create
> a shadow buffer in the kernel and copy the whole array once at the end.
> This should speed it up significantly.

So we save an indirect call and uaccess_enable/disable per entry,
but pay for it with the cost of a memory allocation (and actually
using that memory).

Doesn't seem like an obvious win to me, do you have numbers?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] xfs: limit entries returned when counting fsmap records
  2020-10-02  4:49 ` [PATCH 1/2] xfs: limit entries returned when counting fsmap records Darrick J. Wong
  2020-10-02  7:06   ` Christoph Hellwig
@ 2020-10-02  8:51   ` Chandan Babu R
  1 sibling, 0 replies; 11+ messages in thread
From: Chandan Babu R @ 2020-10-02  8:51 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Friday 2 October 2020 10:19:24 AM IST Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If userspace asked fsmap to count the number of entries, we cannot
> return more than UINT_MAX entries because fmh_entries is u32.
> Therefore, stop counting if we hit this limit or else we will waste time
> to return truncated results.
> 
> Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

The upper bound check is correct.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

> ---
>  fs/xfs/xfs_fsmap.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> 
> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> index 4eebcec4aae6..aa36e7daf82c 100644
> --- a/fs/xfs/xfs_fsmap.c
> +++ b/fs/xfs/xfs_fsmap.c
> @@ -256,6 +256,9 @@ xfs_getfsmap_helper(
>  
>  	/* Are we just counting mappings? */
>  	if (info->head->fmh_count == 0) {
> +		if (info->head->fmh_entries == UINT_MAX)
> +			return -ECANCELED;
> +
>  		if (rec_daddr > info->next_daddr)
>  			info->head->fmh_entries++;
>  
> 
> 


-- 
chandan




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] xfs: streamline xfs_getfsmap performance
  2020-10-02  4:49 ` [PATCH 2/2] xfs: streamline xfs_getfsmap performance Darrick J. Wong
  2020-10-02  7:15   ` Christoph Hellwig
@ 2020-10-02  8:52   ` Chandan Babu R
  2020-10-04 19:14   ` [PATCH v1.2 2/2] xfs: fix deadlock and " Darrick J. Wong
  2 siblings, 0 replies; 11+ messages in thread
From: Chandan Babu R @ 2020-10-02  8:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Friday 2 October 2020 10:19:30 AM IST Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Refactor xfs_getfsmap to improve its performance: instead of indirectly
> calling a function that copies one record to userspace at a time, create
> a shadow buffer in the kernel and copy the whole array once at the end.
> This should speed it up significantly.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Overall code flow is logically correct.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
> ---
>  fs/xfs/xfs_fsmap.c |   35 +++++++++++++++-----------
>  fs/xfs/xfs_fsmap.h |    6 +----
>  fs/xfs/xfs_ioctl.c |   69 ++++++++++++++++++++++------------------------------
>  3 files changed, 50 insertions(+), 60 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> index aa36e7daf82c..3f2666c62a60 100644
> --- a/fs/xfs/xfs_fsmap.c
> +++ b/fs/xfs/xfs_fsmap.c
> @@ -155,8 +155,7 @@ xfs_fsmap_owner_from_rmap(
>  /* getfsmap query state */
>  struct xfs_getfsmap_info {
>  	struct xfs_fsmap_head	*head;
> -	xfs_fsmap_format_t	formatter;	/* formatting fn */
> -	void			*format_arg;	/* format buffer */
> +	struct fsmap		*fsmap_recs;	/* mapping records */
>  	struct xfs_buf		*agf_bp;	/* AGF, for refcount queries */
>  	xfs_daddr_t		next_daddr;	/* next daddr we expect */
>  	u64			missing_owner;	/* owner of holes */
> @@ -224,6 +223,20 @@ xfs_getfsmap_is_shared(
>  	return 0;
>  }
>  
> +static inline void
> +xfs_getfsmap_format(
> +	struct xfs_mount		*mp,
> +	struct xfs_fsmap		*xfm,
> +	struct xfs_getfsmap_info	*info)
> +{
> +	struct fsmap			*rec;
> +
> +	trace_xfs_getfsmap_mapping(mp, xfm);
> +
> +	rec = &info->fsmap_recs[info->head->fmh_entries++];
> +	xfs_fsmap_from_internal(rec, xfm);
> +}
> +
>  /*
>   * Format a reverse mapping for getfsmap, having translated rm_startblock
>   * into the appropriate daddr units.
> @@ -288,10 +301,7 @@ xfs_getfsmap_helper(
>  		fmr.fmr_offset = 0;
>  		fmr.fmr_length = rec_daddr - info->next_daddr;
>  		fmr.fmr_flags = FMR_OF_SPECIAL_OWNER;
> -		error = info->formatter(&fmr, info->format_arg);
> -		if (error)
> -			return error;
> -		info->head->fmh_entries++;
> +		xfs_getfsmap_format(mp, &fmr, info);
>  	}
>  
>  	if (info->last)
> @@ -323,11 +333,8 @@ xfs_getfsmap_helper(
>  		if (shared)
>  			fmr.fmr_flags |= FMR_OF_SHARED;
>  	}
> -	error = info->formatter(&fmr, info->format_arg);
> -	if (error)
> -		return error;
> -	info->head->fmh_entries++;
>  
> +	xfs_getfsmap_format(mp, &fmr, info);
>  out:
>  	rec_daddr += XFS_FSB_TO_BB(mp, rec->rm_blockcount);
>  	if (info->next_daddr < rec_daddr)
> @@ -796,7 +803,7 @@ xfs_getfsmap_check_keys(
>  
>  /*
>   * Get filesystem's extents as described in head, and format for
> - * output.  Calls formatter to fill the user's buffer until all
> + * output.  Calls xfs_getfsmap_format to fill the user's buffer until all
>   * extents are mapped, until the passed-in head->fmh_count slots have
>   * been filled, or until the formatter short-circuits the loop, if it
>   * is tracking filled-in extents on its own.
> @@ -819,8 +826,7 @@ int
>  xfs_getfsmap(
>  	struct xfs_mount		*mp,
>  	struct xfs_fsmap_head		*head,
> -	xfs_fsmap_format_t		formatter,
> -	void				*arg)
> +	struct fsmap			*fsmap_recs)
>  {
>  	struct xfs_trans		*tp = NULL;
>  	struct xfs_fsmap		dkeys[2];	/* per-dev keys */
> @@ -895,8 +901,7 @@ xfs_getfsmap(
>  
>  	info.next_daddr = head->fmh_keys[0].fmr_physical +
>  			  head->fmh_keys[0].fmr_length;
> -	info.formatter = formatter;
> -	info.format_arg = arg;
> +	info.fsmap_recs = fsmap_recs;
>  	info.head = head;
>  
>  	/*
> diff --git a/fs/xfs/xfs_fsmap.h b/fs/xfs/xfs_fsmap.h
> index c6c57739b862..a0775788e7b1 100644
> --- a/fs/xfs/xfs_fsmap.h
> +++ b/fs/xfs/xfs_fsmap.h
> @@ -27,13 +27,9 @@ struct xfs_fsmap_head {
>  	struct xfs_fsmap fmh_keys[2];	/* low and high keys */
>  };
>  
> -void xfs_fsmap_from_internal(struct fsmap *dest, struct xfs_fsmap *src);
>  void xfs_fsmap_to_internal(struct xfs_fsmap *dest, struct fsmap *src);
>  
> -/* fsmap to userspace formatter - copy to user & advance pointer */
> -typedef int (*xfs_fsmap_format_t)(struct xfs_fsmap *, void *);
> -
>  int xfs_getfsmap(struct xfs_mount *mp, struct xfs_fsmap_head *head,
> -		xfs_fsmap_format_t formatter, void *arg);
> +		struct fsmap *out_recs);
>  
>  #endif /* __XFS_FSMAP_H__ */
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index bca7659fb5c6..e0f2100b74b0 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1716,38 +1716,15 @@ xfs_ioc_getbmap(
>  	return error;
>  }
>  
> -struct getfsmap_info {
> -	struct xfs_mount	*mp;
> -	struct fsmap_head __user *data;
> -	unsigned int		idx;
> -	__u32			last_flags;
> -};
> -
> -STATIC int
> -xfs_getfsmap_format(struct xfs_fsmap *xfm, void *priv)
> -{
> -	struct getfsmap_info	*info = priv;
> -	struct fsmap		fm;
> -
> -	trace_xfs_getfsmap_mapping(info->mp, xfm);
> -
> -	info->last_flags = xfm->fmr_flags;
> -	xfs_fsmap_from_internal(&fm, xfm);
> -	if (copy_to_user(&info->data->fmh_recs[info->idx++], &fm,
> -			sizeof(struct fsmap)))
> -		return -EFAULT;
> -
> -	return 0;
> -}
> -
>  STATIC int
>  xfs_ioc_getfsmap(
>  	struct xfs_inode	*ip,
>  	struct fsmap_head	__user *arg)
>  {
> -	struct getfsmap_info	info = { NULL };
>  	struct xfs_fsmap_head	xhead = {0};
>  	struct fsmap_head	head;
> +	struct fsmap		*recs;
> +	unsigned int		count;
>  	bool			aborted = false;
>  	int			error;
>  
> @@ -1760,38 +1737,50 @@ xfs_ioc_getfsmap(
>  		       sizeof(head.fmh_keys[1].fmr_reserved)))
>  		return -EINVAL;
>  
> +	count = min_t(unsigned int, head.fmh_count,
> +			UINT_MAX / sizeof(struct fsmap));
> +	recs = kvzalloc(count * sizeof(struct fsmap), GFP_KERNEL);
> +	if (!recs)
> +		return -ENOMEM;
> +
>  	xhead.fmh_iflags = head.fmh_iflags;
> -	xhead.fmh_count = head.fmh_count;
> +	xhead.fmh_count = count;
>  	xfs_fsmap_to_internal(&xhead.fmh_keys[0], &head.fmh_keys[0]);
>  	xfs_fsmap_to_internal(&xhead.fmh_keys[1], &head.fmh_keys[1]);
>  
>  	trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
>  	trace_xfs_getfsmap_high_key(ip->i_mount, &xhead.fmh_keys[1]);
>  
> -	info.mp = ip->i_mount;
> -	info.data = arg;
> -	error = xfs_getfsmap(ip->i_mount, &xhead, xfs_getfsmap_format, &info);
> +	error = xfs_getfsmap(ip->i_mount, &xhead, recs);
>  	if (error == -ECANCELED) {
>  		error = 0;
>  		aborted = true;
>  	} else if (error)
> -		return error;
> -
> -	/* If we didn't abort, set the "last" flag in the last fmx */
> -	if (!aborted && info.idx) {
> -		info.last_flags |= FMR_OF_LAST;
> -		if (copy_to_user(&info.data->fmh_recs[info.idx - 1].fmr_flags,
> -				&info.last_flags, sizeof(info.last_flags)))
> -			return -EFAULT;
> -	}
> +		goto out_free;
>  
>  	/* copy back header */
> +	error = -EFAULT;
>  	head.fmh_entries = xhead.fmh_entries;
>  	head.fmh_oflags = xhead.fmh_oflags;
>  	if (copy_to_user(arg, &head, sizeof(struct fsmap_head)))
> -		return -EFAULT;
> +		goto out_free;
>  
> -	return 0;
> +	/* Copy records if userspace wasn't asking only for a record count. */
> +	if (head.fmh_count > 0) {
> +		/* If we didn't abort, set the "last" flag in the last record */
> +		if (!aborted && xhead.fmh_entries > 0)
> +			recs[xhead.fmh_entries - 1].fmr_flags |= FMR_OF_LAST;
> +
> +		/* Copy all records out to userspace. */
> +		if (copy_to_user(arg->fmh_recs, recs,
> +				 xhead.fmh_entries * sizeof(struct fsmap)))
> +			goto out_free;
> +	}
> +
> +	error = 0;
> +out_free:
> +	kmem_free(recs);
> +	return error;
>  }
>  
>  STATIC int
> 
> 


-- 
chandan




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] xfs: streamline xfs_getfsmap performance
  2020-10-02  7:15   ` Christoph Hellwig
@ 2020-10-02 17:58     ` Darrick J. Wong
  2020-10-05  6:30       ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Darrick J. Wong @ 2020-10-02 17:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, Oct 02, 2020 at 08:15:05AM +0100, Christoph Hellwig wrote:
> On Thu, Oct 01, 2020 at 09:49:30PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Refactor xfs_getfsmap to improve its performance: instead of indirectly
> > calling a function that copies one record to userspace at a time, create
> > a shadow buffer in the kernel and copy the whole array once at the end.
> > This should speed it up significantly.
> 
> So we save an indirect call and uaccess_enable/disable per entry,
> but pay for it with the cost of a memory allocation (and actually
> using that memory).
> 
> Doesn't seem like an obvious win to me, do you have numbers?

On my 2TB /home partition, the runtime to retrieve 6.4 million fsmap
records drops from 107s to 85s.

It also occurs to me that the current code also copies to userspace
while holding the AGF buffer lock (or the rt bitmap ilock) which we
shouldn't be doing because the userspace buffer could very well be a
mmap file on the same volume, in which case a page_mkwrite could cause
allocation activity.

Hmm, maybe I should tamp down the allocation behavior too, since we
could just do 64k or pagesize or whatever; userspace can call us back
since we set it up for easy iteration.

--D

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v1.2 2/2] xfs: fix deadlock and streamline xfs_getfsmap performance
  2020-10-02  4:49 ` [PATCH 2/2] xfs: streamline xfs_getfsmap performance Darrick J. Wong
  2020-10-02  7:15   ` Christoph Hellwig
  2020-10-02  8:52   ` Chandan Babu R
@ 2020-10-04 19:14   ` Darrick J. Wong
  2 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2020-10-04 19:14 UTC (permalink / raw)
  To: linux-xfs, Christoph Hellwig

From: Darrick J. Wong <darrick.wong@oracle.com>

Refactor xfs_getfsmap to improve its performance: instead of indirectly
calling a function that copies one record to userspace at a time, create
a shadow buffer in the kernel and copy the whole array once at the end.
On the author's computer, this reduces the runtime on his /home by ~20%.

This also eliminates a deadlock when running GETFSMAP against the
realtime device.  The current code locks the rtbitmap to create
fsmappings and copies them into userspace, having not released the
rtbitmap lock.  If the userspace buffer is an mmap of a sparse file that
itself resides on the realtime device, the write page fault will recurse
into the fs for allocation, which will deadlock on the rtbitmap lock.

While we're at it, constrain the kernel memory buffer size so that
userspace can't whack the kernel with huge memory allocations.

Fixes: 4c934c7dd60c ("xfs: report realtime space information via the rtbitmap")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: constrain the internal buffer size and loop xfs_getfsmap calls
until we fill the userspace buffer or run out of result rows
---
 fs/xfs/xfs_fsmap.c |   45 +++++++++-------
 fs/xfs/xfs_fsmap.h |    6 --
 fs/xfs/xfs_ioctl.c |  147 +++++++++++++++++++++++++++++++++++-----------------
 3 files changed, 126 insertions(+), 72 deletions(-)

diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index aa36e7daf82c..9ce5e7d5bf8f 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -26,7 +26,7 @@
 #include "xfs_rtalloc.h"
 
 /* Convert an xfs_fsmap to an fsmap. */
-void
+static void
 xfs_fsmap_from_internal(
 	struct fsmap		*dest,
 	struct xfs_fsmap	*src)
@@ -155,8 +155,7 @@ xfs_fsmap_owner_from_rmap(
 /* getfsmap query state */
 struct xfs_getfsmap_info {
 	struct xfs_fsmap_head	*head;
-	xfs_fsmap_format_t	formatter;	/* formatting fn */
-	void			*format_arg;	/* format buffer */
+	struct fsmap		*fsmap_recs;	/* mapping records */
 	struct xfs_buf		*agf_bp;	/* AGF, for refcount queries */
 	xfs_daddr_t		next_daddr;	/* next daddr we expect */
 	u64			missing_owner;	/* owner of holes */
@@ -224,6 +223,20 @@ xfs_getfsmap_is_shared(
 	return 0;
 }
 
+static inline void
+xfs_getfsmap_format(
+	struct xfs_mount		*mp,
+	struct xfs_fsmap		*xfm,
+	struct xfs_getfsmap_info	*info)
+{
+	struct fsmap			*rec;
+
+	trace_xfs_getfsmap_mapping(mp, xfm);
+
+	rec = &info->fsmap_recs[info->head->fmh_entries++];
+	xfs_fsmap_from_internal(rec, xfm);
+}
+
 /*
  * Format a reverse mapping for getfsmap, having translated rm_startblock
  * into the appropriate daddr units.
@@ -288,10 +301,7 @@ xfs_getfsmap_helper(
 		fmr.fmr_offset = 0;
 		fmr.fmr_length = rec_daddr - info->next_daddr;
 		fmr.fmr_flags = FMR_OF_SPECIAL_OWNER;
-		error = info->formatter(&fmr, info->format_arg);
-		if (error)
-			return error;
-		info->head->fmh_entries++;
+		xfs_getfsmap_format(mp, &fmr, info);
 	}
 
 	if (info->last)
@@ -323,11 +333,8 @@ xfs_getfsmap_helper(
 		if (shared)
 			fmr.fmr_flags |= FMR_OF_SHARED;
 	}
-	error = info->formatter(&fmr, info->format_arg);
-	if (error)
-		return error;
-	info->head->fmh_entries++;
 
+	xfs_getfsmap_format(mp, &fmr, info);
 out:
 	rec_daddr += XFS_FSB_TO_BB(mp, rec->rm_blockcount);
 	if (info->next_daddr < rec_daddr)
@@ -795,11 +802,11 @@ xfs_getfsmap_check_keys(
 #endif /* CONFIG_XFS_RT */
 
 /*
- * Get filesystem's extents as described in head, and format for
- * output.  Calls formatter to fill the user's buffer until all
- * extents are mapped, until the passed-in head->fmh_count slots have
- * been filled, or until the formatter short-circuits the loop, if it
- * is tracking filled-in extents on its own.
+ * Get filesystem's extents as described in head, and format for output. Fills
+ * in the supplied records array until there are no more reverse mappings to
+ * return or head.fmh_entries == head.fmh_count.  In the second case, this
+ * function returns -ECANCELED to indicate that more records would have been
+ * returned.
  *
  * Key to Confusion
  * ----------------
@@ -819,8 +826,7 @@ int
 xfs_getfsmap(
 	struct xfs_mount		*mp,
 	struct xfs_fsmap_head		*head,
-	xfs_fsmap_format_t		formatter,
-	void				*arg)
+	struct fsmap			*fsmap_recs)
 {
 	struct xfs_trans		*tp = NULL;
 	struct xfs_fsmap		dkeys[2];	/* per-dev keys */
@@ -895,8 +901,7 @@ xfs_getfsmap(
 
 	info.next_daddr = head->fmh_keys[0].fmr_physical +
 			  head->fmh_keys[0].fmr_length;
-	info.formatter = formatter;
-	info.format_arg = arg;
+	info.fsmap_recs = fsmap_recs;
 	info.head = head;
 
 	/*
diff --git a/fs/xfs/xfs_fsmap.h b/fs/xfs/xfs_fsmap.h
index c6c57739b862..a0775788e7b1 100644
--- a/fs/xfs/xfs_fsmap.h
+++ b/fs/xfs/xfs_fsmap.h
@@ -27,13 +27,9 @@ struct xfs_fsmap_head {
 	struct xfs_fsmap fmh_keys[2];	/* low and high keys */
 };
 
-void xfs_fsmap_from_internal(struct fsmap *dest, struct xfs_fsmap *src);
 void xfs_fsmap_to_internal(struct xfs_fsmap *dest, struct fsmap *src);
 
-/* fsmap to userspace formatter - copy to user & advance pointer */
-typedef int (*xfs_fsmap_format_t)(struct xfs_fsmap *, void *);
-
 int xfs_getfsmap(struct xfs_mount *mp, struct xfs_fsmap_head *head,
-		xfs_fsmap_format_t formatter, void *arg);
+		struct fsmap *out_recs);
 
 #endif /* __XFS_FSMAP_H__ */
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index bca7659fb5c6..a08fc7eb3e93 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1716,39 +1716,17 @@ xfs_ioc_getbmap(
 	return error;
 }
 
-struct getfsmap_info {
-	struct xfs_mount	*mp;
-	struct fsmap_head __user *data;
-	unsigned int		idx;
-	__u32			last_flags;
-};
-
-STATIC int
-xfs_getfsmap_format(struct xfs_fsmap *xfm, void *priv)
-{
-	struct getfsmap_info	*info = priv;
-	struct fsmap		fm;
-
-	trace_xfs_getfsmap_mapping(info->mp, xfm);
-
-	info->last_flags = xfm->fmr_flags;
-	xfs_fsmap_from_internal(&fm, xfm);
-	if (copy_to_user(&info->data->fmh_recs[info->idx++], &fm,
-			sizeof(struct fsmap)))
-		return -EFAULT;
-
-	return 0;
-}
-
 STATIC int
 xfs_ioc_getfsmap(
 	struct xfs_inode	*ip,
 	struct fsmap_head	__user *arg)
 {
-	struct getfsmap_info	info = { NULL };
 	struct xfs_fsmap_head	xhead = {0};
 	struct fsmap_head	head;
-	bool			aborted = false;
+	struct fsmap		*recs;
+	unsigned int		count;
+	__u32			last_flags = 0;
+	bool			done = false;
 	int			error;
 
 	if (copy_from_user(&head, arg, sizeof(struct fsmap_head)))
@@ -1760,38 +1738,113 @@ xfs_ioc_getfsmap(
 		       sizeof(head.fmh_keys[1].fmr_reserved)))
 		return -EINVAL;
 
+	/*
+	 * Use an internal memory buffer so that we don't have to copy fsmap
+	 * data to userspace while holding locks.  Start by trying to allocate
+	 * up to 16 pages for the buffer, but fall back to a single page if we
+	 * have to.
+	 */
+	count = min_t(unsigned int, head.fmh_count,
+			(16 * PAGE_SIZE) / sizeof(struct fsmap));
+	recs = kvzalloc(count * sizeof(struct fsmap), GFP_KERNEL);
+	if (!recs) {
+		count = min_t(unsigned int, head.fmh_count,
+				PAGE_SIZE / sizeof(struct fsmap));
+		recs = kvzalloc(count * sizeof(struct fsmap), GFP_KERNEL);
+		if (!recs)
+			return -ENOMEM;
+	}
+
 	xhead.fmh_iflags = head.fmh_iflags;
-	xhead.fmh_count = head.fmh_count;
 	xfs_fsmap_to_internal(&xhead.fmh_keys[0], &head.fmh_keys[0]);
 	xfs_fsmap_to_internal(&xhead.fmh_keys[1], &head.fmh_keys[1]);
 
 	trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
 	trace_xfs_getfsmap_high_key(ip->i_mount, &xhead.fmh_keys[1]);
 
-	info.mp = ip->i_mount;
-	info.data = arg;
-	error = xfs_getfsmap(ip->i_mount, &xhead, xfs_getfsmap_format, &info);
-	if (error == -ECANCELED) {
-		error = 0;
-		aborted = true;
-	} else if (error)
-		return error;
-
-	/* If we didn't abort, set the "last" flag in the last fmx */
-	if (!aborted && info.idx) {
-		info.last_flags |= FMR_OF_LAST;
-		if (copy_to_user(&info.data->fmh_recs[info.idx - 1].fmr_flags,
-				&info.last_flags, sizeof(info.last_flags)))
-			return -EFAULT;
+	head.fmh_entries = 0;
+	do {
+		struct fsmap __user	*user_recs;
+		struct fsmap		*last_rec;
+
+		user_recs = &arg->fmh_recs[head.fmh_entries];
+		xhead.fmh_entries = 0;
+		xhead.fmh_count = min_t(unsigned int, count,
+					head.fmh_count - head.fmh_entries);
+
+		/* Run query, record how many entries we got. */
+		error = xfs_getfsmap(ip->i_mount, &xhead, recs);
+		switch (error) {
+		case 0:
+			/*
+			 * There are no more records in the result set.  Copy
+			 * whatever we got to userspace and break out.
+			 */
+			done = true;
+			break;
+		case -ECANCELED:
+			/*
+			 * The internal memory buffer is full.  Copy whatever
+			 * records we got to userspace and go again if we have
+			 * not yet filled the userspace buffer.
+			 */
+			error = 0;
+			break;
+		default:
+			goto out_free;
+		}
+		head.fmh_entries += xhead.fmh_entries;
+		head.fmh_oflags = xhead.fmh_oflags;
+
+		/*
+		 * If the caller wanted a record count or there aren't any
+		 * new records to return, we're done.
+		 */
+		if (head.fmh_count == 0 || xhead.fmh_entries == 0)
+			break;
+
+		/* Copy all the records we got out to userspace. */
+		if (copy_to_user(user_recs, recs,
+				 xhead.fmh_entries * sizeof(struct fsmap))) {
+			error = -EFAULT;
+			goto out_free;
+		}
+
+		/* Remember the last record flags we copied to userspace. */
+		last_rec = &recs[xhead.fmh_entries - 1];
+		last_flags = last_rec->fmr_flags;
+
+		/* Set up the low key for the next iteration. */
+		xfs_fsmap_to_internal(&xhead.fmh_keys[0], last_rec);
+		trace_xfs_getfsmap_low_key(ip->i_mount, &xhead.fmh_keys[0]);
+	} while (!done && head.fmh_entries < head.fmh_count);
+
+	/*
+	 * If there are no more records in the query result set and we're not
+	 * in counting mode, mark the last record returned with the LAST flag.
+	 */
+	if (done && head.fmh_count > 0 && head.fmh_entries > 0) {
+		struct fsmap __user	*user_rec;
+
+		last_flags |= FMR_OF_LAST;
+		user_rec = &arg->fmh_recs[head.fmh_entries - 1];
+
+		if (copy_to_user(&user_rec->fmr_flags, &last_flags,
+					sizeof(last_flags))) {
+			error = -EFAULT;
+			goto out_free;
+		}
 	}
 
 	/* copy back header */
-	head.fmh_entries = xhead.fmh_entries;
-	head.fmh_oflags = xhead.fmh_oflags;
-	if (copy_to_user(arg, &head, sizeof(struct fsmap_head)))
-		return -EFAULT;
+	if (copy_to_user(arg, &head, sizeof(struct fsmap_head))) {
+		error = -EFAULT;
+		goto out_free;
+	}
 
-	return 0;
+out_free:
+	kmem_free(recs);
+	return error;
 }
 
 STATIC int

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] xfs: streamline xfs_getfsmap performance
  2020-10-02 17:58     ` Darrick J. Wong
@ 2020-10-05  6:30       ` Christoph Hellwig
  2020-10-05 17:02         ` Darrick J. Wong
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2020-10-05  6:30 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Fri, Oct 02, 2020 at 10:58:08AM -0700, Darrick J. Wong wrote:
> On my 2TB /home partition, the runtime to retrieve 6.4 million fsmap
> records drops from 107s to 85s.

Please add these sorts of numbers to the commit log..

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] xfs: streamline xfs_getfsmap performance
  2020-10-05  6:30       ` Christoph Hellwig
@ 2020-10-05 17:02         ` Darrick J. Wong
  0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2020-10-05 17:02 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Oct 05, 2020 at 07:30:01AM +0100, Christoph Hellwig wrote:
> On Fri, Oct 02, 2020 at 10:58:08AM -0700, Darrick J. Wong wrote:
> > On my 2TB /home partition, the runtime to retrieve 6.4 million fsmap
> > records drops from 107s to 85s.
> 
> Please add these sorts of numbers to the commit log..

I did add them to the V2 patch, but maybe you're just waiting for the
resend... :)

(On its way shortly)

--D

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-10-05 17:03 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-02  4:49 [PATCH 0/2] xfs: a few fixes and cleanups to GETFSMAP Darrick J. Wong
2020-10-02  4:49 ` [PATCH 1/2] xfs: limit entries returned when counting fsmap records Darrick J. Wong
2020-10-02  7:06   ` Christoph Hellwig
2020-10-02  8:51   ` Chandan Babu R
2020-10-02  4:49 ` [PATCH 2/2] xfs: streamline xfs_getfsmap performance Darrick J. Wong
2020-10-02  7:15   ` Christoph Hellwig
2020-10-02 17:58     ` Darrick J. Wong
2020-10-05  6:30       ` Christoph Hellwig
2020-10-05 17:02         ` Darrick J. Wong
2020-10-02  8:52   ` Chandan Babu R
2020-10-04 19:14   ` [PATCH v1.2 2/2] xfs: fix deadlock and " Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.