All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] xfs: skip unnecessary discards
@ 2018-05-07 18:11 Brian Foster
  2018-05-07 18:11 ` [PATCH 1/4] xfs: add bmapi nodiscard flag Brian Foster
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Brian Foster @ 2018-05-07 18:11 UTC (permalink / raw)
  To: linux-xfs

Hi all,

This series allows XFS to skip discards in a couple cases where they are
spurious as the associated blocks have not been written to (e.g.,
eofblocks and unwritten extents). This v1 series incorporates Dave's
feedback on the previously posted RFC.

Patch 1 defines a new bmapi flag, patch 2 uses the flag for eofblocks
trim operations, patch 3 skips discards of unwritten extents and patch 4
is an rfc based on the discussion in the original rfc. While the rfc
mostly works, it's not suitable for merge from a code standpoint (and
while I've worked out some kinks, I still think there are lingering
issues) and so is included only for informational purposes.

Otherwise, patches 1-3 survive xfstests using a couple thin provisioned
volumes with '-o discard.' More '-o discard' test runs with alternate
configurations are in progress, but it hasn't spontaneously combusted to
this point so I'm posting for review.

Thoughts, reviews, flames appreciated.

Brian

v1:
- Define and use bmapi flag for eofblocks trims.
- Separate patch to skip discards of unwritten extents.
- Various factoring cleanups.
rfc: https://marc.info/?l=linux-xfs&m=152476598825563&w=2

Brian Foster (4):
  xfs: add bmapi nodiscard flag
  xfs: skip online discard during eofblocks trims
  xfs: don't discard on free of unwritten extents
  xfs: convert speculative preallocation to unwritten extents

 fs/xfs/libxfs/xfs_alloc.c  | 10 +++++---
 fs/xfs/libxfs/xfs_alloc.h  | 27 ++++++++++++++++++--
 fs/xfs/libxfs/xfs_bmap.c   | 17 ++++++++++---
 fs/xfs/libxfs/xfs_bmap.h   | 30 ++++++++++++++++++++--
 fs/xfs/xfs_aops.c          |  3 +++
 fs/xfs/xfs_attr_inactive.c |  3 ++-
 fs/xfs/xfs_bmap_util.c     |  2 +-
 fs/xfs/xfs_extfree_item.c  |  2 +-
 fs/xfs/xfs_inode.c         | 11 +++++---
 fs/xfs/xfs_inode.h         |  2 +-
 fs/xfs/xfs_iomap.c         | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_iops.c          |  3 ++-
 fs/xfs/xfs_qm_syscalls.c   |  2 +-
 fs/xfs/xfs_trans.h         |  3 ++-
 fs/xfs/xfs_trans_extfree.c | 13 +++++++---
 15 files changed, 165 insertions(+), 25 deletions(-)

-- 
2.14.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] xfs: add bmapi nodiscard flag
  2018-05-07 18:11 [PATCH 0/4] xfs: skip unnecessary discards Brian Foster
@ 2018-05-07 18:11 ` Brian Foster
  2018-05-07 18:11 ` [PATCH 2/4] xfs: skip online discard during eofblocks trims Brian Foster
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Brian Foster @ 2018-05-07 18:11 UTC (permalink / raw)
  To: linux-xfs

Freed extents are unconditionally discarded when online discard is
enabled. Define XFS_BMAPI_NODISCARD to allow callers to bypass
discards when unnecessary. For example, this will be useful for
eofblocks trimming.

This patch does not change behavior.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c  | 10 +++++++---
 fs/xfs/libxfs/xfs_alloc.h  | 27 +++++++++++++++++++++++++--
 fs/xfs/libxfs/xfs_bmap.c   | 16 ++++++++++++----
 fs/xfs/libxfs/xfs_bmap.h   | 30 ++++++++++++++++++++++++++++--
 fs/xfs/xfs_extfree_item.c  |  2 +-
 fs/xfs/xfs_trans.h         |  3 ++-
 fs/xfs/xfs_trans_extfree.c | 13 +++++++++----
 7 files changed, 84 insertions(+), 17 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 4bcc095fe44a..06e4b0c47c01 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2949,18 +2949,20 @@ xfs_free_extent_fix_freelist(
  * after fixing up the freelist.
  */
 int				/* error */
-xfs_free_extent(
+__xfs_free_extent(
 	struct xfs_trans	*tp,	/* transaction pointer */
 	xfs_fsblock_t		bno,	/* starting block number of extent */
 	xfs_extlen_t		len,	/* length of extent */
 	struct xfs_owner_info	*oinfo,	/* extent owner */
-	enum xfs_ag_resv_type	type)	/* block reservation type */
+	enum xfs_ag_resv_type	type,	/* block reservation type */
+	bool			skip_discard)
 {
 	struct xfs_mount	*mp = tp->t_mountp;
 	struct xfs_buf		*agbp;
 	xfs_agnumber_t		agno = XFS_FSB_TO_AGNO(mp, bno);
 	xfs_agblock_t		agbno = XFS_FSB_TO_AGBNO(mp, bno);
 	int			error;
+	unsigned int		busy_flags = 0;
 
 	ASSERT(len != 0);
 	ASSERT(type != XFS_AG_RESV_AGFL);
@@ -2984,7 +2986,9 @@ xfs_free_extent(
 	if (error)
 		goto err;
 
-	xfs_extent_busy_insert(tp, agno, agbno, len, 0);
+	if (skip_discard)
+		busy_flags |= XFS_EXTENT_BUSY_SKIP_DISCARD;
+	xfs_extent_busy_insert(tp, agno, agbno, len, busy_flags);
 	return 0;
 
 err:
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index cbf789ea5a4e..4a8e8dcaf352 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -191,12 +191,35 @@ xfs_alloc_vextent(
  * Free an extent.
  */
 int				/* error */
-xfs_free_extent(
+__xfs_free_extent(
 	struct xfs_trans	*tp,	/* transaction pointer */
 	xfs_fsblock_t		bno,	/* starting block number of extent */
 	xfs_extlen_t		len,	/* length of extent */
 	struct xfs_owner_info	*oinfo,	/* extent owner */
-	enum xfs_ag_resv_type	type);	/* block reservation type */
+	enum xfs_ag_resv_type	type,	/* block reservation type */
+	bool			skip_discard);
+
+static inline int
+xfs_free_extent(
+	struct xfs_trans	*tp,
+	xfs_fsblock_t		bno,
+	xfs_extlen_t		len,
+	struct xfs_owner_info	*oinfo,
+	enum xfs_ag_resv_type	type)
+{
+	return __xfs_free_extent(tp, bno, len, oinfo, type, false);
+}
+
+static inline int
+xfs_free_extent_nodiscard(
+	struct xfs_trans	*tp,
+	xfs_fsblock_t		bno,
+	xfs_extlen_t		len,
+	struct xfs_owner_info	*oinfo,
+	enum xfs_ag_resv_type	type)
+{
+	return __xfs_free_extent(tp, bno, len, oinfo, type, true);
+}
 
 int				/* error */
 xfs_alloc_lookup_le(
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 040eeda8426f..b171f4185adb 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -542,12 +542,13 @@ xfs_bmap_validate_ret(
  * The list is maintained sorted (by block number).
  */
 void
-xfs_bmap_add_free(
+__xfs_bmap_add_free(
 	struct xfs_mount		*mp,
 	struct xfs_defer_ops		*dfops,
 	xfs_fsblock_t			bno,
 	xfs_filblks_t			len,
-	struct xfs_owner_info		*oinfo)
+	struct xfs_owner_info		*oinfo,
+	bool				skip_discard)
 {
 	struct xfs_extent_free_item	*new;		/* new element */
 #ifdef DEBUG
@@ -574,6 +575,7 @@ xfs_bmap_add_free(
 		new->xefi_oinfo = *oinfo;
 	else
 		xfs_rmap_skip_owner_update(&new->xefi_oinfo);
+	new->xefi_skip_discard = skip_discard;
 	trace_xfs_bmap_free_defer(mp, XFS_FSB_TO_AGNO(mp, bno), 0,
 			XFS_FSB_TO_AGBNO(mp, bno), len);
 	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_FREE, &new->xefi_list);
@@ -5104,9 +5106,15 @@ xfs_bmap_del_extent_real(
 			error = xfs_refcount_decrease_extent(mp, dfops, del);
 			if (error)
 				goto done;
-		} else
-			xfs_bmap_add_free(mp, dfops, del->br_startblock,
+		} else {
+			if (bflags & XFS_BMAPI_NODISCARD)
+				xfs_bmap_add_free_nodiscard(mp, dfops,
+					del->br_startblock, del->br_blockcount,
+					NULL);
+			else
+				xfs_bmap_add_free(mp, dfops, del->br_startblock,
 					del->br_blockcount, NULL);
+		}
 	}
 
 	/*
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 2b766b37096d..d8832e049636 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -68,6 +68,7 @@ struct xfs_extent_free_item
 	xfs_extlen_t		xefi_blockcount;/* number of blocks in extent */
 	struct list_head	xefi_list;
 	struct xfs_owner_info	xefi_oinfo;	/* extent owner */
+	bool			xefi_skip_discard;
 };
 
 #define	XFS_BMAP_MAX_NMAP	4
@@ -116,6 +117,9 @@ struct xfs_extent_free_item
 /* Only convert unwritten extents, don't allocate new blocks */
 #define XFS_BMAPI_CONVERT_ONLY	0x800
 
+/* Skip online discard of freed extents */
+#define XFS_BMAPI_NODISCARD	0x1000
+
 #define XFS_BMAPI_FLAGS \
 	{ XFS_BMAPI_ENTIRE,	"ENTIRE" }, \
 	{ XFS_BMAPI_METADATA,	"METADATA" }, \
@@ -192,9 +196,9 @@ void	xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno,
 void	xfs_trim_extent_eof(struct xfs_bmbt_irec *, struct xfs_inode *);
 int	xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
 void	xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
-void	xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
+void	__xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
 			  xfs_fsblock_t bno, xfs_filblks_t len,
-			  struct xfs_owner_info *oinfo);
+			  struct xfs_owner_info *oinfo, bool skip_discard);
 void	xfs_bmap_compute_maxlevels(struct xfs_mount *mp, int whichfork);
 int	xfs_bmap_first_unused(struct xfs_trans *tp, struct xfs_inode *ip,
 		xfs_extlen_t len, xfs_fileoff_t *unused, int whichfork);
@@ -240,6 +244,28 @@ int	xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork,
 		struct xfs_bmbt_irec *got, struct xfs_iext_cursor *cur,
 		int eof);
 
+static inline void
+xfs_bmap_add_free(
+	struct xfs_mount		*mp,
+	struct xfs_defer_ops		*dfops,
+	xfs_fsblock_t			bno,
+	xfs_filblks_t			len,
+	struct xfs_owner_info		*oinfo)
+{
+	__xfs_bmap_add_free(mp, dfops, bno, len, oinfo, false);
+}
+
+static inline void
+xfs_bmap_add_free_nodiscard(
+	struct xfs_mount		*mp,
+	struct xfs_defer_ops		*dfops,
+	xfs_fsblock_t			bno,
+	xfs_filblks_t			len,
+	struct xfs_owner_info		*oinfo)
+{
+	__xfs_bmap_add_free(mp, dfops, bno, len, oinfo, true);
+}
+
 enum xfs_bmap_intent_type {
 	XFS_BMAP_MAP = 1,
 	XFS_BMAP_UNMAP,
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index b5b1e567b9f4..4735a31793b0 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -542,7 +542,7 @@ xfs_efi_recover(
 	for (i = 0; i < efip->efi_format.efi_nextents; i++) {
 		extp = &efip->efi_format.efi_extents[i];
 		error = xfs_trans_free_extent(tp, efdp, extp->ext_start,
-					      extp->ext_len, &oinfo);
+					      extp->ext_len, &oinfo, false);
 		if (error)
 			goto abort_error;
 
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 9d542dfe0052..d5be3f6a3e8f 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -228,7 +228,8 @@ struct xfs_efd_log_item	*xfs_trans_get_efd(struct xfs_trans *,
 				  uint);
 int		xfs_trans_free_extent(struct xfs_trans *,
 				      struct xfs_efd_log_item *, xfs_fsblock_t,
-				      xfs_extlen_t, struct xfs_owner_info *);
+				      xfs_extlen_t, struct xfs_owner_info *,
+				      bool);
 int		xfs_trans_commit(struct xfs_trans *);
 int		xfs_trans_roll(struct xfs_trans **);
 int		xfs_trans_roll_inode(struct xfs_trans **, struct xfs_inode *);
diff --git a/fs/xfs/xfs_trans_extfree.c b/fs/xfs/xfs_trans_extfree.c
index ab438647592a..dab8b3b7a9b8 100644
--- a/fs/xfs/xfs_trans_extfree.c
+++ b/fs/xfs/xfs_trans_extfree.c
@@ -68,7 +68,8 @@ xfs_trans_free_extent(
 	struct xfs_efd_log_item	*efdp,
 	xfs_fsblock_t		start_block,
 	xfs_extlen_t		ext_len,
-	struct xfs_owner_info	*oinfo)
+	struct xfs_owner_info	*oinfo,
+	bool			skip_discard)
 {
 	struct xfs_mount	*mp = tp->t_mountp;
 	uint			next_extent;
@@ -79,8 +80,12 @@ xfs_trans_free_extent(
 
 	trace_xfs_bmap_free_deferred(tp->t_mountp, agno, 0, agbno, ext_len);
 
-	error = xfs_free_extent(tp, start_block, ext_len, oinfo,
-			XFS_AG_RESV_NONE);
+	if (skip_discard)
+		error = xfs_free_extent_nodiscard(tp, start_block, ext_len,
+						  oinfo, XFS_AG_RESV_NONE);
+	else
+		error = xfs_free_extent(tp, start_block, ext_len, oinfo,
+					XFS_AG_RESV_NONE);
 
 	/*
 	 * Mark the transaction dirty, even on error. This ensures the
@@ -195,7 +200,7 @@ xfs_extent_free_finish_item(
 	error = xfs_trans_free_extent(tp, done_item,
 			free->xefi_startblock,
 			free->xefi_blockcount,
-			&free->xefi_oinfo);
+			&free->xefi_oinfo, free->xefi_skip_discard);
 	kmem_free(free);
 	return error;
 }
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] xfs: skip online discard during eofblocks trims
  2018-05-07 18:11 [PATCH 0/4] xfs: skip unnecessary discards Brian Foster
  2018-05-07 18:11 ` [PATCH 1/4] xfs: add bmapi nodiscard flag Brian Foster
@ 2018-05-07 18:11 ` Brian Foster
  2018-05-07 23:38   ` Dave Chinner
  2018-05-07 18:11 ` [PATCH 3/4] xfs: don't discard on free of unwritten extents Brian Foster
  2018-05-07 18:11 ` [PATCH RFC 4/4] xfs: convert speculative preallocation to " Brian Foster
  3 siblings, 1 reply; 9+ messages in thread
From: Brian Foster @ 2018-05-07 18:11 UTC (permalink / raw)
  To: linux-xfs

We've had reports of online discard operations being sent from XFS
on write-only workloads. These discards occur as a result of
eofblocks trims that can occur after a large file copy completes.

These discards are slightly confusing for users who might be paying
close attention to online discards (i.e., vdo) due to performance
sensitivity. They also happen to be spurious because freed post-eof
blocks by definition have not been written to during the current
allocation cycle.

Update xfs_free_eofblocks() to skip discards that are purely
attributed to eofblocks trims. This cuts down the number of spurious
discards that may occur on write-only workloads due to normal
preallocation activity.

Note that discards of post-eof extents can still occur from other
codepaths that do not isolate handling of post-eof blocks from those
within eof. For example, file unlinks and truncates may still cause
discards for any file blocks affected by the operation.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_attr_inactive.c |  3 ++-
 fs/xfs/xfs_bmap_util.c     |  2 +-
 fs/xfs/xfs_inode.c         | 11 ++++++++---
 fs/xfs/xfs_inode.h         |  2 +-
 fs/xfs/xfs_iops.c          |  3 ++-
 fs/xfs/xfs_qm_syscalls.c   |  2 +-
 6 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
index 52818ea2eb50..639311e60c5d 100644
--- a/fs/xfs/xfs_attr_inactive.c
+++ b/fs/xfs/xfs_attr_inactive.c
@@ -434,7 +434,8 @@ xfs_attr_inactive(
 		if (error)
 			goto out_cancel;
 
-		error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0);
+		error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0,
+					      false);
 		if (error)
 			goto out_cancel;
 	}
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 8cd8c412f52d..7b95d8976488 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -872,7 +872,7 @@ xfs_free_eofblocks(
 		 * may be full of holes (ie NULL files bug).
 		 */
 		error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK,
-					      XFS_ISIZE(ip));
+					      XFS_ISIZE(ip), true);
 		if (error) {
 			/*
 			 * If we get an error at this point we simply don't
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 2b70c8b4cee2..03b2b7099f90 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1538,7 +1538,8 @@ xfs_itruncate_extents(
 	struct xfs_trans	**tpp,
 	struct xfs_inode	*ip,
 	int			whichfork,
-	xfs_fsize_t		new_size)
+	xfs_fsize_t		new_size,
+	bool			skip_discard)
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp = *tpp;
@@ -1549,6 +1550,7 @@ xfs_itruncate_extents(
 	xfs_filblks_t		unmap_len;
 	int			error = 0;
 	int			done = 0;
+	int			flags = 0;
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
 	ASSERT(!atomic_read(&VFS_I(ip)->i_count) ||
@@ -1561,6 +1563,9 @@ xfs_itruncate_extents(
 
 	trace_xfs_itruncate_extents_start(ip, new_size);
 
+	if (skip_discard)
+		flags |= XFS_BMAPI_NODISCARD;
+
 	/*
 	 * Since it is possible for space to become allocated beyond
 	 * the end of the file (in a crash where the space is allocated
@@ -1581,7 +1586,7 @@ xfs_itruncate_extents(
 		xfs_defer_init(&dfops, &first_block);
 		error = xfs_bunmapi(tp, ip,
 				    first_unmap_block, unmap_len,
-				    xfs_bmapi_aflag(whichfork),
+				    xfs_bmapi_aflag(whichfork) | flags,
 				    XFS_ITRUNC_MAX_EXTENTS,
 				    &first_block, &dfops,
 				    &done);
@@ -1743,7 +1748,7 @@ xfs_inactive_truncate(
 	ip->i_d.di_size = 0;
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
-	error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK, 0);
+	error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK, 0, false);
 	if (error)
 		goto error_trans_cancel;
 
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 1eebc53df7d7..b7877692ce4b 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -416,7 +416,7 @@ uint		xfs_ip2xflags(struct xfs_inode *);
 int		xfs_ifree(struct xfs_trans *, xfs_inode_t *,
 			   struct xfs_defer_ops *);
 int		xfs_itruncate_extents(struct xfs_trans **, struct xfs_inode *,
-				      int, xfs_fsize_t);
+				      int, xfs_fsize_t, bool);
 void		xfs_iext_realloc(xfs_inode_t *, int, int);
 
 void		xfs_iunpin_wait(xfs_inode_t *);
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a3ed3c811dfa..fee5fb4b45de 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -966,7 +966,8 @@ xfs_setattr_size(
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
 	if (newsize <= oldsize) {
-		error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK, newsize);
+		error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK, newsize,
+					      false);
 		if (error)
 			goto out_trans_cancel;
 
diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
index 9cb5c381b01c..de498691d84a 100644
--- a/fs/xfs/xfs_qm_syscalls.c
+++ b/fs/xfs/xfs_qm_syscalls.c
@@ -248,7 +248,7 @@ xfs_qm_scall_trunc_qfile(
 	ip->i_d.di_size = 0;
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
-	error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK, 0);
+	error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK, 0, false);
 	if (error) {
 		xfs_trans_cancel(tp);
 		goto out_unlock;
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] xfs: don't discard on free of unwritten extents
  2018-05-07 18:11 [PATCH 0/4] xfs: skip unnecessary discards Brian Foster
  2018-05-07 18:11 ` [PATCH 1/4] xfs: add bmapi nodiscard flag Brian Foster
  2018-05-07 18:11 ` [PATCH 2/4] xfs: skip online discard during eofblocks trims Brian Foster
@ 2018-05-07 18:11 ` Brian Foster
  2018-05-07 23:35   ` Dave Chinner
  2018-05-07 18:11 ` [PATCH RFC 4/4] xfs: convert speculative preallocation to " Brian Foster
  3 siblings, 1 reply; 9+ messages in thread
From: Brian Foster @ 2018-05-07 18:11 UTC (permalink / raw)
  To: linux-xfs

Unwritten extents by definition have not been written to until they
are converted to normal written extents. If unwritten extents are
freed from a file, it is therefore guaranteed that the blocks have
not been written to since allocation (note that zero range punches
and reallocates blocks).

To cut down on online discards generated from workloads that make
use of preallocation, skip discards of extents if they are in the
unwritten state when the extent is freed.

Note that this optimization does not apply to log recovery, during
which all freed extents are discarded if online discard is enabled.
Also note that it may be possible for a filesystem crash to occur
after write completion of an unwritten extent but before unwritten
conversion such that the extent remains unwritten after log
recovery. Since this pseudo-inconsistency may already be possible
after a crash (consider writing to recently allocated blocks where
the allocation transaction is lost after a crash), this change
shouldn't introduce any fundamental limitations that don't already
exist. In short, on storage stacks where discards are important,
it's good practice to run an occasional fstrim even with online
discard enabled in the filesystem, particularly after a crash.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index b171f4185adb..a50c197d426f 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5107,7 +5107,8 @@ xfs_bmap_del_extent_real(
 			if (error)
 				goto done;
 		} else {
-			if (bflags & XFS_BMAPI_NODISCARD)
+			if (bflags & XFS_BMAPI_NODISCARD ||
+			    del->br_state == XFS_EXT_UNWRITTEN)
 				xfs_bmap_add_free_nodiscard(mp, dfops,
 					del->br_startblock, del->br_blockcount,
 					NULL);
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC 4/4] xfs: convert speculative preallocation to unwritten extents
  2018-05-07 18:11 [PATCH 0/4] xfs: skip unnecessary discards Brian Foster
                   ` (2 preceding siblings ...)
  2018-05-07 18:11 ` [PATCH 3/4] xfs: don't discard on free of unwritten extents Brian Foster
@ 2018-05-07 18:11 ` Brian Foster
  3 siblings, 0 replies; 9+ messages in thread
From: Brian Foster @ 2018-05-07 18:11 UTC (permalink / raw)
  To: linux-xfs

XXX: This is hacky/racy/broken and for educational purposes only.
     DO NOT APPLY.

Post-eof blocks are typically converted from delalloc to real extent
allocations when some file-mapped portion of the extent is converted
(at writeback time). When this occurs, sample the current size of
the file and convert any post-eof portion of the extent to an
unwritten extent.

The advantages of this behavior are:

- Eliminates the only user of XFS_BMAPI_NODISCARD (eofblocks).

- Reduces the window of stale data exposure associated with a
  filesystem crash during writeback. Incomplete writeback over
  previously converted speculative preallocation leaves behind
  unwritten extents rather than previous content of the blocks.

- Dramatically reduces the overhead of block zeroing during writes
  that jump over eof (and eofblocks). These writes currently issue
  buffered writes over all preallocated blocks, which consumes time,
  bandwidth and memory. For example, this patch improves the
  performance of the following (on a low-end vm) from ~29s to <1s:

	xfs_io -fc "falloc 0 8g"
		-c "pwrite 8g 4k"
		-c fsync
		-c "pwrite 16g 4k" <file>

  The above may also facilitate further optimizations such as
  invoking a selective/partial writeback @ EOF in
  xfs_file_aio_write_checks() to proactively convert preallocation.

The problems with this patch are that it is quite hacky and racy
with respect to buffered writes that may occur over preallocated
extents in parallel with writeback. This requires careful attention
to i_size as well as some hacks to ensure that delalloc page buffers
are handled properly when the underlying extent may have been
converted after the page was allocated (observed during generic/032
and generic/464). Finally, the separate conversion means that the
operation requires optimistic transaction block reservation rather
than relying on indlen blocks originally attached to the delalloc
extents.

The broader goal in this area of the writeback code is the
elimination of buffer_heads entirely, replaced with reliance on
underlying extent state, and unconditional conversion of all
delalloc extents to unwritten. This closes the stale data exposure
problem completely and eliminates the need for the selective
post-eof hack implemented here.

As a result, this patch was an experiment to document the complexity
involved with and benefits of selective unwritten conversion of
post-eof blocks. The fact that it still requires mucking around with
page buffer handling suggests that in its current form, it's
probably best to defer this to a broader writeback rework that
treats all delalloc blocks consistently.

Not-Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_aops.c  |  3 +++
 fs/xfs/xfs_iomap.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 0ab824f574ed..3f507125d623 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -979,6 +979,9 @@ xfs_writepage_map(
 					     wpc->io_type);
 			if (error)
 				goto out;
+			if (wpc->io_type == XFS_IO_DELALLOC &&
+			    wpc->imap.br_state == XFS_EXT_UNWRITTEN)
+				wpc->io_type = XFS_IO_UNWRITTEN;
 			wpc->imap_valid = xfs_imap_valid(inode, &wpc->imap,
 							 offset);
 		}
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 046469fcc1b8..670404025ece 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -657,6 +657,54 @@ xfs_file_iomap_begin_delay(
 	return error;
 }
 
+/*
+ * Convert any post-eof blocks that are part of the current allocation to
+ * unwritten extents. This optimizations prevents unnecessary online discards of
+ * speculative prealloc on file removal or truncate. It also prevents zeroing on
+ * file extending writes that might skip over preallocated blocks.
+ *
+ * NOTE: This can be removed once we finally kill off buffer_heads and convert
+ * all delalloc blocks to unwritten by default.
+ */
+static int
+xfs_iomap_convert_eofblocks(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	struct xfs_bmbt_irec	*imap,
+	xfs_fileoff_t		end_fsb,
+	xfs_fsblock_t		*first_block,
+	struct xfs_defer_ops	*dfops,
+	xfs_fileoff_t		*conv_fsb)
+{
+	struct xfs_mount	*mp	= ip->i_mount;
+	struct xfs_bmbt_irec	crec	= *imap;
+	int			nimaps	= 1;
+	int			error;
+	int64_t			nres;
+
+	/* check for post-eof blocks, nothing to do there are none */
+	xfs_trim_extent(&crec, end_fsb, crec.br_blockcount);
+	if (!crec.br_blockcount || crec.br_blockcount == imap->br_blockcount)
+		return 0;
+	ASSERT(crec.br_state == XFS_EXT_NORM);
+
+	/* XXX: transaction hack */
+	nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
+	error = xfs_mod_fdblocks(tp->t_mountp, -nres, false);
+	if (error)
+		return error == -ENOSPC ? 0 : error;
+	tp->t_blk_res += nres;
+
+	*conv_fsb = crec.br_startoff;
+
+	error = xfs_bmapi_write(tp, ip, crec.br_startoff, crec.br_blockcount,
+				XFS_BMAPI_PREALLOC|XFS_BMAPI_CONVERT,
+				first_block, nres, &crec, &nimaps, dfops);
+	if (error)
+		return error;
+	return 0;
+}
+
 /*
  * Pass in a delayed allocate extent, convert it to real extents;
  * return to the caller the extent we create which maps on top of
@@ -677,6 +725,7 @@ xfs_iomap_write_allocate(
 	xfs_mount_t	*mp = ip->i_mount;
 	xfs_fileoff_t	offset_fsb, last_block;
 	xfs_fileoff_t	end_fsb, map_start_fsb;
+	xfs_fileoff_t	conv_fsb;
 	xfs_fsblock_t	first_block;
 	struct xfs_defer_ops	dfops;
 	xfs_filblks_t	count_fsb;
@@ -712,6 +761,7 @@ xfs_iomap_write_allocate(
 		 * but before our buffer starts.
 		 */
 		nimaps = 0;
+		conv_fsb = NULLFILEOFF;
 		while (nimaps == 0) {
 			nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
 			/*
@@ -785,6 +835,11 @@ xfs_iomap_write_allocate(
 						count_fsb, flags, &first_block,
 						nres, imap, &nimaps,
 						&dfops);
+			/* convert any speculative prealloc in this mapping */
+			if (!error && nimaps)
+				error = xfs_iomap_convert_eofblocks(tp, ip,
+						imap, end_fsb, &first_block, &dfops,
+						&conv_fsb);
 			if (error)
 				goto trans_cancel;
 
@@ -809,6 +864,13 @@ xfs_iomap_write_allocate(
 		if ((offset_fsb >= imap->br_startoff) &&
 		    (offset_fsb < (imap->br_startoff +
 				   imap->br_blockcount))) {
+			if (conv_fsb != NULLFILEOFF) {
+				xfs_trim_extent(imap, imap->br_startoff,
+						conv_fsb - imap->br_startoff);
+				ASSERT(offset_fsb >= imap->br_startoff &&
+				       offset_fsb < (imap->br_startoff +
+						     imap->br_blockcount));
+			}
 			XFS_STATS_INC(mp, xs_xstrat_quick);
 			return 0;
 		}
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/4] xfs: don't discard on free of unwritten extents
  2018-05-07 18:11 ` [PATCH 3/4] xfs: don't discard on free of unwritten extents Brian Foster
@ 2018-05-07 23:35   ` Dave Chinner
  2018-05-08 12:36     ` Brian Foster
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2018-05-07 23:35 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Mon, May 07, 2018 at 02:11:37PM -0400, Brian Foster wrote:
> Unwritten extents by definition have not been written to until they
> are converted to normal written extents. If unwritten extents are
> freed from a file, it is therefore guaranteed that the blocks have
> not been written to since allocation (note that zero range punches
> and reallocates blocks).
> 
> To cut down on online discards generated from workloads that make
> use of preallocation, skip discards of extents if they are in the
> unwritten state when the extent is freed.
> 
> Note that this optimization does not apply to log recovery, during
> which all freed extents are discarded if online discard is enabled.
> Also note that it may be possible for a filesystem crash to occur
> after write completion of an unwritten extent but before unwritten
> conversion such that the extent remains unwritten after log
> recovery. Since this pseudo-inconsistency may already be possible
> after a crash (consider writing to recently allocated blocks where
> the allocation transaction is lost after a crash), this change
> shouldn't introduce any fundamental limitations that don't already
> exist. In short, on storage stacks where discards are important,
> it's good practice to run an occasional fstrim even with online
> discard enabled in the filesystem, particularly after a crash.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index b171f4185adb..a50c197d426f 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -5107,7 +5107,8 @@ xfs_bmap_del_extent_real(
>  			if (error)
>  				goto done;
>  		} else {
> -			if (bflags & XFS_BMAPI_NODISCARD)
> +			if (bflags & XFS_BMAPI_NODISCARD ||
> +			    del->br_state == XFS_EXT_UNWRITTEN)

Doesn't gcc warn about parenthesis challenged logic like this by
default these days?

I also find code with multi-line function calls easier to follow if
there are {} aroudn them like so:

			if ((bflags & XFS_BMAPI_NODISCARD) ||
			    del->br_state == XFS_EXT_UNWRITTEN) {
>  				xfs_bmap_add_free_nodiscard(mp, dfops,
>  					del->br_startblock, del->br_blockcount,
>  					NULL);
			}


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] xfs: skip online discard during eofblocks trims
  2018-05-07 18:11 ` [PATCH 2/4] xfs: skip online discard during eofblocks trims Brian Foster
@ 2018-05-07 23:38   ` Dave Chinner
  2018-05-08 12:35     ` Brian Foster
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2018-05-07 23:38 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Mon, May 07, 2018 at 02:11:36PM -0400, Brian Foster wrote:
> We've had reports of online discard operations being sent from XFS
> on write-only workloads. These discards occur as a result of
> eofblocks trims that can occur after a large file copy completes.
> 
> These discards are slightly confusing for users who might be paying
> close attention to online discards (i.e., vdo) due to performance
> sensitivity. They also happen to be spurious because freed post-eof
> blocks by definition have not been written to during the current
> allocation cycle.
> 
> Update xfs_free_eofblocks() to skip discards that are purely
> attributed to eofblocks trims. This cuts down the number of spurious
> discards that may occur on write-only workloads due to normal
> preallocation activity.
> 
> Note that discards of post-eof extents can still occur from other
> codepaths that do not isolate handling of post-eof blocks from those
> within eof. For example, file unlinks and truncates may still cause
> discards for any file blocks affected by the operation.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_attr_inactive.c |  3 ++-
>  fs/xfs/xfs_bmap_util.c     |  2 +-
>  fs/xfs/xfs_inode.c         | 11 ++++++++---
>  fs/xfs/xfs_inode.h         |  2 +-
>  fs/xfs/xfs_iops.c          |  3 ++-
>  fs/xfs/xfs_qm_syscalls.c   |  2 +-
>  6 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> index 52818ea2eb50..639311e60c5d 100644
> --- a/fs/xfs/xfs_attr_inactive.c
> +++ b/fs/xfs/xfs_attr_inactive.c
> @@ -434,7 +434,8 @@ xfs_attr_inactive(
>  		if (error)
>  			goto out_cancel;
>  
> -		error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0);
> +		error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0,
> +					      false);
>  		if (error)
>  			goto out_cancel;
>  	}
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 8cd8c412f52d..7b95d8976488 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -872,7 +872,7 @@ xfs_free_eofblocks(
>  		 * may be full of holes (ie NULL files bug).
>  		 */
>  		error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK,
> -					      XFS_ISIZE(ip));
> +					      XFS_ISIZE(ip), true);

xfs_itruncate_extents_nodiscard(), and then all the other
xfs_itruncate_extents() can remain unchanged.

>  		if (error) {
>  			/*
>  			 * If we get an error at this point we simply don't
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 2b70c8b4cee2..03b2b7099f90 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1538,7 +1538,8 @@ xfs_itruncate_extents(
>  	struct xfs_trans	**tpp,
>  	struct xfs_inode	*ip,
>  	int			whichfork,
> -	xfs_fsize_t		new_size)
> +	xfs_fsize_t		new_size,
> +	bool			skip_discard)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_trans	*tp = *tpp;
> @@ -1549,6 +1550,7 @@ xfs_itruncate_extents(
>  	xfs_filblks_t		unmap_len;
>  	int			error = 0;
>  	int			done = 0;
> +	int			flags = 0;
>  
>  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
>  	ASSERT(!atomic_read(&VFS_I(ip)->i_count) ||
> @@ -1561,6 +1563,9 @@ xfs_itruncate_extents(
>  
>  	trace_xfs_itruncate_extents_start(ip, new_size);
>  
> +	if (skip_discard)
> +		flags |= XFS_BMAPI_NODISCARD;
> +

	flags = xfs_bmapi_aflag(whichfork);
	if (skip_discard)
		flags |= XFS_BMAPI_NODISCARD;

>  	/*
>  	 * Since it is possible for space to become allocated beyond
>  	 * the end of the file (in a crash where the space is allocated
> @@ -1581,7 +1586,7 @@ xfs_itruncate_extents(
>  		xfs_defer_init(&dfops, &first_block);
>  		error = xfs_bunmapi(tp, ip,
>  				    first_unmap_block, unmap_len,
> -				    xfs_bmapi_aflag(whichfork),
> +				    xfs_bmapi_aflag(whichfork) | flags,

so this gets simpler and easier to read.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] xfs: skip online discard during eofblocks trims
  2018-05-07 23:38   ` Dave Chinner
@ 2018-05-08 12:35     ` Brian Foster
  0 siblings, 0 replies; 9+ messages in thread
From: Brian Foster @ 2018-05-08 12:35 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, May 08, 2018 at 09:38:19AM +1000, Dave Chinner wrote:
> On Mon, May 07, 2018 at 02:11:36PM -0400, Brian Foster wrote:
> > We've had reports of online discard operations being sent from XFS
> > on write-only workloads. These discards occur as a result of
> > eofblocks trims that can occur after a large file copy completes.
> > 
> > These discards are slightly confusing for users who might be paying
> > close attention to online discards (i.e., vdo) due to performance
> > sensitivity. They also happen to be spurious because freed post-eof
> > blocks by definition have not been written to during the current
> > allocation cycle.
> > 
> > Update xfs_free_eofblocks() to skip discards that are purely
> > attributed to eofblocks trims. This cuts down the number of spurious
> > discards that may occur on write-only workloads due to normal
> > preallocation activity.
> > 
> > Note that discards of post-eof extents can still occur from other
> > codepaths that do not isolate handling of post-eof blocks from those
> > within eof. For example, file unlinks and truncates may still cause
> > discards for any file blocks affected by the operation.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_attr_inactive.c |  3 ++-
> >  fs/xfs/xfs_bmap_util.c     |  2 +-
> >  fs/xfs/xfs_inode.c         | 11 ++++++++---
> >  fs/xfs/xfs_inode.h         |  2 +-
> >  fs/xfs/xfs_iops.c          |  3 ++-
> >  fs/xfs/xfs_qm_syscalls.c   |  2 +-
> >  6 files changed, 15 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > index 52818ea2eb50..639311e60c5d 100644
> > --- a/fs/xfs/xfs_attr_inactive.c
> > +++ b/fs/xfs/xfs_attr_inactive.c
> > @@ -434,7 +434,8 @@ xfs_attr_inactive(
> >  		if (error)
> >  			goto out_cancel;
> >  
> > -		error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0);
> > +		error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0,
> > +					      false);
> >  		if (error)
> >  			goto out_cancel;
> >  	}
> > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > index 8cd8c412f52d..7b95d8976488 100644
> > --- a/fs/xfs/xfs_bmap_util.c
> > +++ b/fs/xfs/xfs_bmap_util.c
> > @@ -872,7 +872,7 @@ xfs_free_eofblocks(
> >  		 * may be full of holes (ie NULL files bug).
> >  		 */
> >  		error = xfs_itruncate_extents(&tp, ip, XFS_DATA_FORK,
> > -					      XFS_ISIZE(ip));
> > +					      XFS_ISIZE(ip), true);
> 
> xfs_itruncate_extents_nodiscard(), and then all the other
> xfs_itruncate_extents() can remain unchanged.
> 

Oops, yeah.. I'll apply the same wrapper factoring approach here.

> >  		if (error) {
> >  			/*
> >  			 * If we get an error at this point we simply don't
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index 2b70c8b4cee2..03b2b7099f90 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -1538,7 +1538,8 @@ xfs_itruncate_extents(
> >  	struct xfs_trans	**tpp,
> >  	struct xfs_inode	*ip,
> >  	int			whichfork,
> > -	xfs_fsize_t		new_size)
> > +	xfs_fsize_t		new_size,
> > +	bool			skip_discard)
> >  {
> >  	struct xfs_mount	*mp = ip->i_mount;
> >  	struct xfs_trans	*tp = *tpp;
> > @@ -1549,6 +1550,7 @@ xfs_itruncate_extents(
> >  	xfs_filblks_t		unmap_len;
> >  	int			error = 0;
> >  	int			done = 0;
> > +	int			flags = 0;
> >  
> >  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> >  	ASSERT(!atomic_read(&VFS_I(ip)->i_count) ||
> > @@ -1561,6 +1563,9 @@ xfs_itruncate_extents(
> >  
> >  	trace_xfs_itruncate_extents_start(ip, new_size);
> >  
> > +	if (skip_discard)
> > +		flags |= XFS_BMAPI_NODISCARD;
> > +
> 
> 	flags = xfs_bmapi_aflag(whichfork);
> 	if (skip_discard)
> 		flags |= XFS_BMAPI_NODISCARD;
> 
> >  	/*
> >  	 * Since it is possible for space to become allocated beyond
> >  	 * the end of the file (in a crash where the space is allocated
> > @@ -1581,7 +1586,7 @@ xfs_itruncate_extents(
> >  		xfs_defer_init(&dfops, &first_block);
> >  		error = xfs_bunmapi(tp, ip,
> >  				    first_unmap_block, unmap_len,
> > -				    xfs_bmapi_aflag(whichfork),
> > +				    xfs_bmapi_aflag(whichfork) | flags,
> 
> so this gets simpler and easier to read.
> 

Indeed, thanks.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/4] xfs: don't discard on free of unwritten extents
  2018-05-07 23:35   ` Dave Chinner
@ 2018-05-08 12:36     ` Brian Foster
  0 siblings, 0 replies; 9+ messages in thread
From: Brian Foster @ 2018-05-08 12:36 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, May 08, 2018 at 09:35:12AM +1000, Dave Chinner wrote:
> On Mon, May 07, 2018 at 02:11:37PM -0400, Brian Foster wrote:
> > Unwritten extents by definition have not been written to until they
> > are converted to normal written extents. If unwritten extents are
> > freed from a file, it is therefore guaranteed that the blocks have
> > not been written to since allocation (note that zero range punches
> > and reallocates blocks).
> > 
> > To cut down on online discards generated from workloads that make
> > use of preallocation, skip discards of extents if they are in the
> > unwritten state when the extent is freed.
> > 
> > Note that this optimization does not apply to log recovery, during
> > which all freed extents are discarded if online discard is enabled.
> > Also note that it may be possible for a filesystem crash to occur
> > after write completion of an unwritten extent but before unwritten
> > conversion such that the extent remains unwritten after log
> > recovery. Since this pseudo-inconsistency may already be possible
> > after a crash (consider writing to recently allocated blocks where
> > the allocation transaction is lost after a crash), this change
> > shouldn't introduce any fundamental limitations that don't already
> > exist. In short, on storage stacks where discards are important,
> > it's good practice to run an occasional fstrim even with online
> > discard enabled in the filesystem, particularly after a crash.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/libxfs/xfs_bmap.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index b171f4185adb..a50c197d426f 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -5107,7 +5107,8 @@ xfs_bmap_del_extent_real(
> >  			if (error)
> >  				goto done;
> >  		} else {
> > -			if (bflags & XFS_BMAPI_NODISCARD)
> > +			if (bflags & XFS_BMAPI_NODISCARD ||
> > +			    del->br_state == XFS_EXT_UNWRITTEN)
> 
> Doesn't gcc warn about parenthesis challenged logic like this by
> default these days?
> 
> I also find code with multi-line function calls easier to follow if
> there are {} aroudn them like so:
> 

Ok, thanks..

Brian

> 			if ((bflags & XFS_BMAPI_NODISCARD) ||
> 			    del->br_state == XFS_EXT_UNWRITTEN) {
> >  				xfs_bmap_add_free_nodiscard(mp, dfops,
> >  					del->br_startblock, del->br_blockcount,
> >  					NULL);
> 			}
> 
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-05-08 12:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-07 18:11 [PATCH 0/4] xfs: skip unnecessary discards Brian Foster
2018-05-07 18:11 ` [PATCH 1/4] xfs: add bmapi nodiscard flag Brian Foster
2018-05-07 18:11 ` [PATCH 2/4] xfs: skip online discard during eofblocks trims Brian Foster
2018-05-07 23:38   ` Dave Chinner
2018-05-08 12:35     ` Brian Foster
2018-05-07 18:11 ` [PATCH 3/4] xfs: don't discard on free of unwritten extents Brian Foster
2018-05-07 23:35   ` Dave Chinner
2018-05-08 12:36     ` Brian Foster
2018-05-07 18:11 ` [PATCH RFC 4/4] xfs: convert speculative preallocation to " Brian Foster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.