All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFCv2 0/9] xfs: defer agfl block frees
@ 2018-01-24 18:44 Brian Foster
  2018-01-24 18:44 ` [RFCv2 1/9] xfs: create agfl block free helper function Brian Foster
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

Hi all,

Here's a second RFC of the mechanism to defer AGFL block frees. The
purpose of this change is to minimize log reservation consumption (and
thus reduce the likelihood of overruns) due to too many allocation
operations being performed in a single transaction context.

The major difference in this version is that rather than explicitly
plumb an xfs_defer_ops structure to the allocator code from specific
contexts where we want to defer AGFL block frees, we update struct
xfs_trans to carry an optional dfops reference and use that to decide
whether to defer such frees during AGFL fixups.

This approach implies a few additional thoughts:

- We eventually want to defer AGFL block frees by default (or at least
  from any code that already has a use for xfs_defer_ops).
- With most transactions converted, we can facilitate further cleanups
  where we've already had to manually plumb xfs_defer_ops to places that
  already have access to the associated transaction.
- Going further, we ultimately may be able to consider some further
  dfops interface cleanups, such as associating dfops to the tx via
  xfs_defer_init(), processing deferred ops at transaction commit time,
  rethinking how dfops memory is allocated, etc.

If the broader ideas to defer AGFL block frees and to do so via
refactoring the transaction/dfops relationship are acceptable, I'd like
to take an incremental approach (starting with enabling deferred AGFL
frees from the inode-related contexts modified in this series) to
incorporate this change so we can enable the behavior in the contexts
where it's known to be most effective without having to refactor the
world all at once.

With regard to testing, this series survives some basic xfstests testing
so far. I've also run some fairly basic "performance" tests to try and
assess how much of an effect this has on transaction regrants and
whether or not it's worth bumping transaction log counts to cover
regrants due to deferred AGFL frees. The final two patches are included
for reference purposes and discuss this further. In summary, the testing
I've done so far[1] suggests that bumping logcounts is probably overkill
given the low frequency of AGFL block frees relative to all transaction
commits.

Thoughts, reviews, flames appreciated.

Brian

rfcv2:
- Use transaction to carry deferred ops struct to allocation context.
- Remove dfops param from dir interfaces where possible.
- Defer AGFL block frees from more contexts.
- Define runtime stat for transaction regrant and logcount patch (to be
  potentially removed).
rfcv1: https://marc.info/?l=linux-xfs&m=151267309423883&w=2

[1] Some notes around regrant testing using xs_regrant_logspace.

* Sustained file removal test (known to reproduce log res. overruns):

 - for-next - 6641554
 - deferred AGFL frees - 6742268
 - deferred AGFL w/ tr_inactive - 6715692
 - deferred AGFL w/ inactive+truncate - 74138

Deferring AGFL block frees increases the regrant count by about ~1.5% in
this test. Upping the tr_inactive logcount reduced this by 0.4%. The
significant effect from the bump of the tr_truncate logcount shows that
this workload is dominated by bmap extent frees, so I don't think this
particular number is relevant to the patchset.

* fsstress test against a 15GB fs (runs for ~5m and
  consumes ~9GB of available space, repeated a couple runs, all using
  the same seed value):

 - for-next - 35469, 35616
 - deferred AGFL frees - 35446, 35445
 - deferred AGFL w/ patch 9 (AGFL count == 1) - 35510, 35412

This short, but more mixed workload doesn't show much of a difference
between the baseline and deferred AGFL free code (in fact the counts
drop, which suggests this is all just noise).

Brian Foster (9):
  xfs: create agfl block free helper function
  xfs: allow attach of dfops to transaction
  xfs: defer agfl block frees when dfops is available
  xfs: defer agfl block frees from deferred ops processing context
  xfs: defer agfl frees from inode inactivation
  xfs: defer frees from common inode allocation paths
  xfs: defer agfl frees from directory op transactions
  xfs: add a runtime stat for extra transaction log regrants
  xfs: add extra log count for transactions that defer agfl frees

 fs/xfs/libxfs/xfs_alloc.c      | 82 ++++++++++++++++++++++++++++++++++++------
 fs/xfs/libxfs/xfs_alloc.h      |  2 ++
 fs/xfs/libxfs/xfs_defer.c      |  9 +++++
 fs/xfs/libxfs/xfs_defer.h      |  1 +
 fs/xfs/libxfs/xfs_dir2.c       | 16 ++++-----
 fs/xfs/libxfs/xfs_dir2.h       |  9 ++---
 fs/xfs/libxfs/xfs_ialloc.c     |  6 ++--
 fs/xfs/libxfs/xfs_ialloc.h     |  1 -
 fs/xfs/libxfs/xfs_trans_resv.h | 19 +++++-----
 fs/xfs/xfs_inode.c             | 69 ++++++++++++++++++-----------------
 fs/xfs/xfs_inode.h             |  3 +-
 fs/xfs/xfs_log.c               |  1 +
 fs/xfs/xfs_stats.h             |  3 +-
 fs/xfs/xfs_symlink.c           |  3 +-
 fs/xfs/xfs_trace.h             |  2 ++
 fs/xfs/xfs_trans.c             |  7 ++--
 fs/xfs/xfs_trans.h             |  1 +
 fs/xfs/xfs_trans_extfree.c     | 70 ++++++++++++++++++++++++++++++++++++
 18 files changed, 227 insertions(+), 77 deletions(-)

-- 
2.13.6


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFCv2 1/9] xfs: create agfl block free helper function
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 2/9] xfs: allow attach of dfops to transaction Brian Foster
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

Refactor the AGFL block free code into a new helper such that it can
be invoked from deferred context. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c | 37 +++++++++++++++++++++++++++----------
 fs/xfs/libxfs/xfs_alloc.h |  2 ++
 2 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 6883a7668de6..fd1699d00986 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2053,6 +2053,30 @@ xfs_alloc_space_available(
 	return true;
 }
 
+int
+xfs_free_agfl_block(
+	struct xfs_trans	*tp,
+	xfs_agnumber_t		agno,
+	xfs_agblock_t		agbno,
+	struct xfs_buf		*agbp,
+	struct xfs_owner_info	*oinfo)
+{
+	int			error;
+	struct xfs_buf		*bp;
+
+	error = xfs_free_ag_extent(tp, agbp, agno, agbno, 1, oinfo,
+				   XFS_AG_RESV_AGFL);
+	if (error)
+		return error;
+
+	bp = xfs_btree_get_bufs(tp->t_mountp, tp, agno, agbno, 0);
+	if (!bp)
+		return -EFSCORRUPTED;
+	xfs_trans_binval(tp, bp);
+
+	return 0;
+}
+
 /*
  * Decide whether to use this allocation group for this allocation.
  * If so, fix up the btree freelist's size.
@@ -2149,21 +2173,14 @@ xfs_alloc_fix_freelist(
 	else
 		xfs_rmap_ag_owner(&targs.oinfo, XFS_RMAP_OWN_AG);
 	while (!(flags & XFS_ALLOC_FLAG_NOSHRINK) && pag->pagf_flcount > need) {
-		struct xfs_buf	*bp;
-
 		error = xfs_alloc_get_freelist(tp, agbp, &bno, 0);
 		if (error)
 			goto out_agbp_relse;
-		error = xfs_free_ag_extent(tp, agbp, args->agno, bno, 1,
-					   &targs.oinfo, XFS_AG_RESV_AGFL);
+
+		error = xfs_free_agfl_block(tp, args->agno, bno, agbp,
+					    &targs.oinfo);
 		if (error)
 			goto out_agbp_relse;
-		bp = xfs_btree_get_bufs(mp, tp, args->agno, bno, 0);
-		if (!bp) {
-			error = -EFSCORRUPTED;
-			goto out_agbp_relse;
-		}
-		xfs_trans_binval(tp, bp);
 	}
 
 	targs.tp = tp;
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index 65a0cafe06e4..2afd0980dba6 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -222,6 +222,8 @@ int xfs_read_agf(struct xfs_mount *mp, struct xfs_trans *tp,
 			xfs_agnumber_t agno, int flags, struct xfs_buf **bpp);
 int xfs_alloc_read_agfl(struct xfs_mount *mp, struct xfs_trans *tp,
 			xfs_agnumber_t agno, struct xfs_buf **bpp);
+int xfs_free_agfl_block(struct xfs_trans *, xfs_agnumber_t, xfs_agblock_t,
+			struct xfs_buf *, struct xfs_owner_info *);
 int xfs_alloc_fix_freelist(struct xfs_alloc_arg *args, int flags);
 int xfs_free_extent_fix_freelist(struct xfs_trans *tp, xfs_agnumber_t agno,
 		struct xfs_buf **agbp);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 2/9] xfs: allow attach of dfops to transaction
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
  2018-01-24 18:44 ` [RFCv2 1/9] xfs: create agfl block free helper function Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 3/9] xfs: defer agfl block frees when dfops is available Brian Foster
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

xfs_defer_ops is a separate data structure from the xfs_trans data
structure. The former is typically allocated on the stack and
completed/processed before the associated thread returns outside of
the scope that defines the structure.

While this currently works fine, there are many places where
xfs_dfops has to be plumbed through deep callchains and/or subsystem
data structures (e.g., struct xfs_da_args) to other contexts. Since
deferred operations cannot be processed without a transaction, the
scope of xfs_defer_ops is essentially a subset of that of a
transaction. Further, an upcoming enhancement to defer AGFL block
frees requires to plumb xfs_defer_ops through yet another context
(struct xfs_alloc_arg) that already carries a transaction.

Rather than continue to pass dfops around independently, support the
ability to optionally carry an xfs_defer_ops structure in the
transaction itself. This facilitates the addition of deferred AGFL
block free behavior from selective contexts and incremental
clean up of other callchains to set/use the transaction reference
rather than plumb the dfops through explicitly.

Note that this patch does not change behavior nor dictate any
changes to how dfops structs are allocated (on the stack) and so all
existing rules apply. Changes to how dfops are allocated can be
considered once all paths are converted and thus would use a
consistent allocation pattern.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_trans.c | 7 ++++---
 fs/xfs/xfs_trans.h | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 86f92df32c42..50ee39faf64e 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -94,11 +94,11 @@ xfs_trans_free(
  * blocks.  Locks and log items, however, are no inherited.  They must
  * be added to the new transaction explicitly.
  */
-STATIC xfs_trans_t *
+STATIC struct xfs_trans *
 xfs_trans_dup(
-	xfs_trans_t	*tp)
+	struct xfs_trans	*tp)
 {
-	xfs_trans_t	*ntp;
+	struct xfs_trans	*ntp;
 
 	ntp = kmem_zone_zalloc(xfs_trans_zone, KM_SLEEP);
 
@@ -124,6 +124,7 @@ xfs_trans_dup(
 	ntp->t_rtx_res = tp->t_rtx_res - tp->t_rtx_res_used;
 	tp->t_rtx_res = tp->t_rtx_res_used;
 	ntp->t_pflags = tp->t_pflags;
+	ntp->t_dfops = tp->t_dfops;
 
 	xfs_trans_dup_dqinfo(tp, ntp);
 
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 815b53d20e26..d3e0599a4ce9 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -111,6 +111,7 @@ typedef struct xfs_trans {
 	struct xlog_ticket	*t_ticket;	/* log mgr ticket */
 	struct xfs_mount	*t_mountp;	/* ptr to fs mount struct */
 	struct xfs_dquot_acct   *t_dqinfo;	/* acctg info for dquots */
+	struct xfs_defer_ops	*t_dfops;	/* deferred ops reference */
 	unsigned int		t_flags;	/* misc flags */
 	int64_t			t_icount_delta;	/* superblock icount change */
 	int64_t			t_ifree_delta;	/* superblock ifree change */
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 3/9] xfs: defer agfl block frees when dfops is available
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
  2018-01-24 18:44 ` [RFCv2 1/9] xfs: create agfl block free helper function Brian Foster
  2018-01-24 18:44 ` [RFCv2 2/9] xfs: allow attach of dfops to transaction Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 4/9] xfs: defer agfl block frees from deferred ops processing context Brian Foster
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

The AGFL fixup code executes before every block allocation/free and
rectifies the AGFL based on the current, dynamic allocation
requirements of the fs. The AGFL must hold a minimum number of
blocks to satisfy a worst case split of the free space btrees caused
by the impending allocation operation. The AGFL is also updated to
maintain the implicit requirement for a minimum number of free slots
to satisfy a worst case join of the free space btrees.

Since the AGFL caches individual blocks, AGFL reduction typically
involves multiple, single block frees. We've had reports of
transaction overrun problems during certain workloads that boil down
to AGFL reduction freeing multiple blocks and consuming more space
in the log than was reserved for the transaction.

Since the objective of freeing AGFL blocks is to ensure free AGFL
free slots are available for the upcoming allocation, one way to
address this problem is to release surplus blocks from the AGFL
immediately but defer the free of those blocks (similar to how
file-mapped blocks are unmapped from the file in one transaction and
freed via a deferred operation) until the transaction is rolled.
This turns AGFL reduction into an operation with predictable log
reservation consumption.

Add the capability to defer AGFL block frees when a deferred ops
list is available to the AGFL fixup code. While deferring AGFL frees
is currently a conditional behavior (based on whether the caller has
populated the new dfops field of the transaction structure), the
long term objective is to convert all codepaths to use tp->t_dfops
and thus defer AGFL block frees consistently.

A bit of customization is required to handle deferred completion
processing because AGFL blocks are accounted against a separate
reservation pool and AGFL blocks are not inserted into the extent
busy list when freed (they are inserted when used and released back
to the AGFL). Reuse the majority of the existing deferred extent
free infrastructure and customize it appropriately to handle AGFL
blocks.

Note that this patch only adds infrastructure. It does not change
behavior because no callers have been updated to pass ->t_dfops into
the allocation code.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c  | 51 ++++++++++++++++++++++++++++++---
 fs/xfs/libxfs/xfs_defer.h  |  1 +
 fs/xfs/xfs_trace.h         |  2 ++
 fs/xfs/xfs_trans_extfree.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 120 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index fd1699d00986..e9db68386bdb 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -39,6 +39,9 @@
 #include "xfs_buf_item.h"
 #include "xfs_log.h"
 #include "xfs_ag_resv.h"
+#include "xfs_bmap.h"
+
+extern kmem_zone_t	*xfs_bmap_free_item_zone;
 
 struct workqueue_struct *xfs_alloc_wq;
 
@@ -2078,6 +2081,40 @@ xfs_free_agfl_block(
 }
 
 /*
+ * Defer an AGFL block free. This is effectively equivalent to
+ * xfs_bmap_add_free() with some special handling particular to AGFL blocks.
+ *
+ * Deferring AGFL frees helps prevent log reservation overruns due to too many
+ * allocation operations in a transaction. AGFL frees are prone to this problem
+ * because for one they are always freed one at a time. Further, an immediate
+ * AGFL block free can cause a btree join and require another block free before
+ * the real allocation can proceed. Deferring the free disconnects freeing up
+ * the AGFL slot from freeing the block.
+ */
+STATIC void
+xfs_defer_agfl_block(
+	struct xfs_mount		*mp,
+	struct xfs_defer_ops		*dfops,
+	xfs_agnumber_t			agno,
+	xfs_fsblock_t			agbno,
+	struct xfs_owner_info		*oinfo)
+{
+	struct xfs_extent_free_item	*new;		/* new element */
+
+	ASSERT(xfs_bmap_free_item_zone != NULL);
+	ASSERT(oinfo != NULL);
+
+	new = kmem_zone_alloc(xfs_bmap_free_item_zone, KM_SLEEP);
+	new->xefi_startblock = XFS_AGB_TO_FSB(mp, agno, agbno);
+	new->xefi_blockcount = 1;
+	new->xefi_oinfo = *oinfo;
+
+	trace_xfs_agfl_free_defer(mp, agno, 0, agbno, 1);
+
+	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
+}
+
+/*
  * Decide whether to use this allocation group for this allocation.
  * If so, fix up the btree freelist's size.
  */
@@ -2177,10 +2214,16 @@ xfs_alloc_fix_freelist(
 		if (error)
 			goto out_agbp_relse;
 
-		error = xfs_free_agfl_block(tp, args->agno, bno, agbp,
-					    &targs.oinfo);
-		if (error)
-			goto out_agbp_relse;
+		/* defer agfl frees if dfops is provided */
+		if (tp->t_dfops) {
+			xfs_defer_agfl_block(mp, tp->t_dfops, args->agno, bno,
+					     &targs.oinfo);
+		} else {
+			error = xfs_free_agfl_block(tp, args->agno, bno, agbp,
+						    &targs.oinfo);
+			if (error)
+				goto out_agbp_relse;
+		}
 	}
 
 	targs.tp = tp;
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index 045beacdd37d..e70725ba1f5f 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -55,6 +55,7 @@ enum xfs_defer_ops_type {
 	XFS_DEFER_OPS_TYPE_REFCOUNT,
 	XFS_DEFER_OPS_TYPE_RMAP,
 	XFS_DEFER_OPS_TYPE_FREE,
+	XFS_DEFER_OPS_TYPE_AGFL_FREE,
 	XFS_DEFER_OPS_TYPE_MAX,
 };
 
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 945de08af7ba..466ce0586d3c 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -2426,6 +2426,8 @@ DEFINE_DEFER_PENDING_EVENT(xfs_defer_pending_abort);
 #define DEFINE_BMAP_FREE_DEFERRED_EVENT DEFINE_PHYS_EXTENT_DEFERRED_EVENT
 DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_bmap_free_defer);
 DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_bmap_free_deferred);
+DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_agfl_free_defer);
+DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_agfl_free_deferred);
 
 /* rmap tracepoints */
 DECLARE_EVENT_CLASS(xfs_rmap_class,
diff --git a/fs/xfs/xfs_trans_extfree.c b/fs/xfs/xfs_trans_extfree.c
index ab438647592a..f5620796ae25 100644
--- a/fs/xfs/xfs_trans_extfree.c
+++ b/fs/xfs/xfs_trans_extfree.c
@@ -231,9 +231,79 @@ static const struct xfs_defer_op_type xfs_extent_free_defer_type = {
 	.cancel_item	= xfs_extent_free_cancel_item,
 };
 
+/*
+ * AGFL blocks are accounted differently in the reserve pools and are not
+ * inserted into the busy extent list.
+ */
+STATIC int
+xfs_agfl_free_finish_item(
+	struct xfs_trans		*tp,
+	struct xfs_defer_ops		*dop,
+	struct list_head		*item,
+	void				*done_item,
+	void				**state)
+{
+	struct xfs_mount		*mp = tp->t_mountp;
+	struct xfs_efd_log_item		*efdp = done_item;
+	struct xfs_extent_free_item	*free;
+	struct xfs_extent		*extp;
+	struct xfs_buf			*agbp;
+	int				error;
+	xfs_agnumber_t			agno;
+	xfs_agblock_t			agbno;
+	uint				next_extent;
+
+	free = container_of(item, struct xfs_extent_free_item, xefi_list);
+	ASSERT(free->xefi_blockcount == 1);
+	agno = XFS_FSB_TO_AGNO(mp, free->xefi_startblock);
+	agbno = XFS_FSB_TO_AGBNO(mp, free->xefi_startblock);
+
+	trace_xfs_agfl_free_deferred(mp, agno, 0, agbno, free->xefi_blockcount);
+
+	error = xfs_alloc_read_agf(mp, tp, agno, 0, &agbp);
+	if (!error)
+		error = xfs_free_agfl_block(tp, agno, agbno, agbp,
+					    &free->xefi_oinfo);
+
+	/*
+	 * Mark the transaction dirty, even on error. This ensures the
+	 * transaction is aborted, which:
+	 *
+	 * 1.) releases the EFI and frees the EFD
+	 * 2.) shuts down the filesystem
+	 */
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	efdp->efd_item.li_desc->lid_flags |= XFS_LID_DIRTY;
+
+	next_extent = efdp->efd_next_extent;
+	ASSERT(next_extent < efdp->efd_format.efd_nextents);
+	extp = &(efdp->efd_format.efd_extents[next_extent]);
+	extp->ext_start = free->xefi_startblock;
+	extp->ext_len = free->xefi_blockcount;
+	efdp->efd_next_extent++;
+
+	kmem_free(free);
+	return error;
+}
+
+
+/* sub-type with special handling for AGFL deferred frees */
+static const struct xfs_defer_op_type xfs_agfl_free_defer_type = {
+	.type		= XFS_DEFER_OPS_TYPE_AGFL_FREE,
+	.max_items	= XFS_EFI_MAX_FAST_EXTENTS,
+	.diff_items	= xfs_extent_free_diff_items,
+	.create_intent	= xfs_extent_free_create_intent,
+	.abort_intent	= xfs_extent_free_abort_intent,
+	.log_item	= xfs_extent_free_log_item,
+	.create_done	= xfs_extent_free_create_done,
+	.finish_item	= xfs_agfl_free_finish_item,
+	.cancel_item	= xfs_extent_free_cancel_item,
+};
+
 /* Register the deferred op type. */
 void
 xfs_extent_free_init_defer_op(void)
 {
 	xfs_defer_init_op_type(&xfs_extent_free_defer_type);
+	xfs_defer_init_op_type(&xfs_agfl_free_defer_type);
 }
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 4/9] xfs: defer agfl block frees from deferred ops processing context
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
                   ` (2 preceding siblings ...)
  2018-01-24 18:44 ` [RFCv2 3/9] xfs: defer agfl block frees when dfops is available Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 5/9] xfs: defer agfl frees from inode inactivation Brian Foster
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

Now that AGFL block frees can be deferred when dfops is carried
through to the allocator via the transaction, we want to start
deferring AGFL block frees from the contexts that are known to
stress existing log reservations.

The first such context is deferred operation processing itself. This
primarily targets deferred extent frees (such as file extents and
inode chunks), but in doing so covers all allocation operations that
occur in deferred operation processing context.

Update xfs_defer_finish() to set and reset ->t_dfops across the
finish sequence. This means that any AGFL block frees due to
allocation events result in the addition of new EFIs to the dfops
rather than being processed immediately. xfs_defer_finish() rolls
the transaction at least once more to process the frees of the AGFL
blocks back to the allocation btrees and returns once the AGFL is
rectified.

XXX: cache/reset old_dfops rather than NULL?

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_defer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index 087fea02c389..de71d292b9f3 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -357,6 +357,14 @@ xfs_defer_finish(
 
 	trace_xfs_defer_finish((*tp)->t_mountp, dop);
 
+	/*
+	 * Attach dfops to the transaction during deferred ops processing. This
+	 * causes calls into the allocator to defer AGFL block frees. The
+	 * pointer is set to NULL before we return.
+	 */
+	ASSERT(!(*tp)->t_dfops || ((*tp)->t_dfops == dop));
+	(*tp)->t_dfops = dop;
+
 	/* Until we run out of pending work to finish... */
 	while (xfs_defer_has_unfinished_work(dop)) {
 		/* Log intents for work items sitting in the intake. */
@@ -428,6 +436,7 @@ xfs_defer_finish(
 	}
 
 out:
+	(*tp)->t_dfops = NULL;
 	if (error)
 		trace_xfs_defer_finish_error((*tp)->t_mountp, dop, error);
 	else
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 5/9] xfs: defer agfl frees from inode inactivation
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
                   ` (3 preceding siblings ...)
  2018-01-24 18:44 ` [RFCv2 4/9] xfs: defer agfl block frees from deferred ops processing context Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 6/9] xfs: defer frees from common inode allocation paths Brian Foster
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

XFS inode chunks are already freed via deferred operations (which
now also defer AGFL block frees), but inode btree blocks are freed
directly in the associated context. This has been known to lead to
log reservation overruns in particular workloads where an inobt
block free may require several AGFL block frees (and thus several
allocation btree modifications) before the inobt block itself is
actually freed.

To avoid this problem, we want to defer the frees of any AGFL blocks
that must be dropped before the inobt block free can take place.
This requires passing the dfops from xfs_inactive_ifree() down
through the inobt ->[alloc|free]_block() callouts, which essentially
only requires to attach the dfops to the transaction since it is
already carried all the way through to the inobt update and
allocation.

Since dfops is attached to the transaction, we no longer have to
pass it down explicitly through xfs_difree() and friends to defer
the physical inode chunk free. Kill two stones with one bird and
clean up the entire codepath to reference ->t_dfops.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_ialloc.c |  6 ++----
 fs/xfs/libxfs/xfs_ialloc.h |  1 -
 fs/xfs/xfs_inode.c         | 15 ++++++++++-----
 fs/xfs/xfs_inode.h         |  3 +--
 4 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 3625d1da7462..52f36adceef5 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -1897,7 +1897,6 @@ xfs_difree_inobt(
 	struct xfs_trans		*tp,
 	struct xfs_buf			*agbp,
 	xfs_agino_t			agino,
-	struct xfs_defer_ops		*dfops,
 	struct xfs_icluster		*xic,
 	struct xfs_inobt_rec_incore	*orec)
 {
@@ -1984,7 +1983,7 @@ xfs_difree_inobt(
 			goto error0;
 		}
 
-		xfs_difree_inode_chunk(mp, agno, &rec, dfops);
+		xfs_difree_inode_chunk(mp, agno, &rec, tp->t_dfops);
 	} else {
 		xic->deleted = false;
 
@@ -2129,7 +2128,6 @@ int
 xfs_difree(
 	struct xfs_trans	*tp,		/* transaction pointer */
 	xfs_ino_t		inode,		/* inode to be freed */
-	struct xfs_defer_ops	*dfops,		/* extents to free */
 	struct xfs_icluster	*xic)	/* cluster info if deleted */
 {
 	/* REFERENCED */
@@ -2181,7 +2179,7 @@ xfs_difree(
 	/*
 	 * Fix up the inode allocation btree.
 	 */
-	error = xfs_difree_inobt(mp, tp, agbp, agino, dfops, xic, &rec);
+	error = xfs_difree_inobt(mp, tp, agbp, agino, xic, &rec);
 	if (error)
 		goto error0;
 
diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
index c5402bb4ce0c..884562efd708 100644
--- a/fs/xfs/libxfs/xfs_ialloc.h
+++ b/fs/xfs/libxfs/xfs_ialloc.h
@@ -94,7 +94,6 @@ int					/* error */
 xfs_difree(
 	struct xfs_trans *tp,		/* transaction pointer */
 	xfs_ino_t	inode,		/* inode to be freed */
-	struct xfs_defer_ops *dfops,	/* extents to free */
 	struct xfs_icluster *ifree);	/* cluster info if deleted */
 
 /*
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index c9e40d4fc939..4c4a2fa3ad11 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1801,8 +1801,14 @@ xfs_inactive_ifree(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
+	/*
+	 * Attach dfops to tp to defer any physical inode chunk frees and/or
+	 * AGFL block frees that occur as a result of inobt block allocation
+	 * operations.
+	 */
 	xfs_defer_init(&dfops, &first_block);
-	error = xfs_ifree(tp, ip, &dfops);
+	tp->t_dfops = &dfops;
+	error = xfs_ifree(tp, ip);
 	if (error) {
 		/*
 		 * If we fail to free the inode, shut down.  The cancel
@@ -2421,9 +2427,8 @@ xfs_ifree_local_data(
  */
 int
 xfs_ifree(
-	xfs_trans_t	*tp,
-	xfs_inode_t	*ip,
-	struct xfs_defer_ops	*dfops)
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip)
 {
 	int			error;
 	struct xfs_icluster	xic = { 0 };
@@ -2442,7 +2447,7 @@ xfs_ifree(
 	if (error)
 		return error;
 
-	error = xfs_difree(tp, ip->i_ino, dfops, &xic);
+	error = xfs_difree(tp, ip->i_ino, &xic);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 386b0bb3c92a..efd676374ac3 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -413,8 +413,7 @@ uint		xfs_ilock_data_map_shared(struct xfs_inode *);
 uint		xfs_ilock_attr_map_shared(struct xfs_inode *);
 
 uint		xfs_ip2xflags(struct xfs_inode *);
-int		xfs_ifree(struct xfs_trans *, xfs_inode_t *,
-			   struct xfs_defer_ops *);
+int		xfs_ifree(struct xfs_trans *, struct xfs_inode *);
 int		xfs_itruncate_extents(struct xfs_trans **, struct xfs_inode *,
 				      int, xfs_fsize_t);
 void		xfs_iext_realloc(xfs_inode_t *, int, int);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 6/9] xfs: defer frees from common inode allocation paths
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
                   ` (4 preceding siblings ...)
  2018-01-24 18:44 ` [RFCv2 5/9] xfs: defer agfl frees from inode inactivation Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 7/9] xfs: defer agfl frees from directory op transactions Brian Foster
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

Inode allocation can require block allocation for physical inode
chunk allocation, inode btree record insertion, and/or directory
block allocation for entry insertion. Any of these block allocation
requests can require AGFL fixups prior to the actual allocation.
Update the common file creation transacions to defer AGFL frees from
these contexts to avoid too much log reservation consumption
per-transaction.

Since these transactions are already passed down through the btree
cursors and da_args structure, this simply requires to attach dfops
to the transaction. Note that this covers tr_create, tr_mkdir and
tr_symlink. Other transactions such as tr_create_tmpfile do not
already make use of deferred operations and so are left alone for
the time being.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_inode.c   | 1 +
 fs/xfs/xfs_symlink.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 4c4a2fa3ad11..924dbd5a1b40 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1187,6 +1187,7 @@ xfs_create(
 	unlock_dp_on_error = true;
 
 	xfs_defer_init(&dfops, &first_block);
+	tp->t_dfops = &dfops;
 
 	/*
 	 * Reserve disk quota and the inode.
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 2e9e793a8f9d..8be021228266 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -259,6 +259,7 @@ xfs_symlink(
 	 * bmapi or the directory create code.
 	 */
 	xfs_defer_init(&dfops, &first_block);
+	tp->t_dfops = &dfops;
 
 	/*
 	 * Allocate an inode for the symlink.
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 7/9] xfs: defer agfl frees from directory op transactions
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
                   ` (5 preceding siblings ...)
  2018-01-24 18:44 ` [RFCv2 6/9] xfs: defer frees from common inode allocation paths Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 8/9] xfs: add a runtime stat for extra transaction log regrants Brian Foster
  2018-01-24 18:44 ` [RFCv2 9/9] xfs: add extra log count for transactions that defer agfl frees Brian Foster
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

Directory operations can perform block allocations as entries are
added/removed from directories. Defer AGFL block frees from the
remaining directory operation transactions. This covers the hard
link, remove and rename operations.

Since this fixes up all callers of the dir2 create, remove and
replace functions to pass dfops via the transaction, clean up the
associated interfaces to no longer require a dfops parameter.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_dir2.c | 16 +++++++--------
 fs/xfs/libxfs/xfs_dir2.h |  9 +++-----
 fs/xfs/xfs_inode.c       | 53 ++++++++++++++++++++++++------------------------
 fs/xfs/xfs_symlink.c     |  2 +-
 4 files changed, 38 insertions(+), 42 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 92f94e190f04..3a6e126212ec 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -256,7 +256,6 @@ xfs_dir_createname(
 	struct xfs_name		*name,
 	xfs_ino_t		inum,		/* new entry inode number */
 	xfs_fsblock_t		*first,		/* bmap's firstblock */
-	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
 	xfs_extlen_t		total)		/* bmap's total block count */
 {
 	struct xfs_da_args	*args;
@@ -282,11 +281,12 @@ xfs_dir_createname(
 	args->hashval = dp->i_mount->m_dirnameops->hashname(name);
 	args->inumber = inum;
 	args->dp = dp;
-	args->firstblock = first;
-	args->dfops = dfops;
 	args->total = total;
 	args->whichfork = XFS_DATA_FORK;
 	args->trans = tp;
+	ASSERT(tp->t_dfops || !first);
+	args->dfops = tp->t_dfops;
+	args->firstblock = first;
 	args->op_flags = XFS_DA_OP_ADDNAME | XFS_DA_OP_OKNOENT;
 	if (!inum)
 		args->op_flags |= XFS_DA_OP_JUSTCHECK;
@@ -433,7 +433,6 @@ xfs_dir_removename(
 	struct xfs_name	*name,
 	xfs_ino_t	ino,
 	xfs_fsblock_t	*first,		/* bmap's firstblock */
-	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
 	xfs_extlen_t	total)		/* bmap's total block count */
 {
 	struct xfs_da_args *args;
@@ -455,10 +454,11 @@ xfs_dir_removename(
 	args->inumber = ino;
 	args->dp = dp;
 	args->firstblock = first;
-	args->dfops = dfops;
 	args->total = total;
 	args->whichfork = XFS_DATA_FORK;
 	args->trans = tp;
+	ASSERT(tp->t_dfops);
+	args->dfops = tp->t_dfops;
 
 	if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
 		rval = xfs_dir2_sf_removename(args);
@@ -495,7 +495,6 @@ xfs_dir_replace(
 	struct xfs_name	*name,		/* name of entry to replace */
 	xfs_ino_t	inum,		/* new inode number */
 	xfs_fsblock_t	*first,		/* bmap's firstblock */
-	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
 	xfs_extlen_t	total)		/* bmap's total block count */
 {
 	struct xfs_da_args *args;
@@ -520,10 +519,11 @@ xfs_dir_replace(
 	args->inumber = inum;
 	args->dp = dp;
 	args->firstblock = first;
-	args->dfops = dfops;
 	args->total = total;
 	args->whichfork = XFS_DATA_FORK;
 	args->trans = tp;
+	ASSERT(tp->t_dfops);
+	args->dfops = tp->t_dfops;
 
 	if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
 		rval = xfs_dir2_sf_replace(args);
@@ -559,7 +559,7 @@ xfs_dir_canenter(
 	xfs_inode_t	*dp,
 	struct xfs_name	*name)		/* name of entry to add */
 {
-	return xfs_dir_createname(tp, dp, name, 0, NULL, NULL, 0);
+	return xfs_dir_createname(tp, dp, name, 0, NULL, 0);
 }
 
 /*
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index 388d67c5c903..0e4ee72c9135 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -130,19 +130,16 @@ extern int xfs_dir_init(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_inode *pdp);
 extern int xfs_dir_createname(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t inum,
-				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				xfs_fsblock_t *first, xfs_extlen_t tot);
 extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t *inum,
 				struct xfs_name *ci_name);
 extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t ino,
-				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				xfs_fsblock_t *first, xfs_extlen_t tot);
 extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t inum,
-				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				xfs_fsblock_t *first, xfs_extlen_t tot);
 extern int xfs_dir_canenter(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 924dbd5a1b40..3fc645313e35 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1217,8 +1217,8 @@ xfs_create(
 	xfs_trans_ijoin(tp, dp, XFS_ILOCK_EXCL);
 	unlock_dp_on_error = false;
 
-	error = xfs_dir_createname(tp, dp, name, ip->i_ino,
-					&first_block, &dfops, resblks ?
+	error = xfs_dir_createname(tp, dp, name, ip->i_ino, &first_block,
+				   resblks ?
 					resblks - XFS_IALLOC_SPACE_RES(mp) : 0);
 	if (error) {
 		ASSERT(error != -ENOSPC);
@@ -1445,6 +1445,7 @@ xfs_link(
 	}
 
 	xfs_defer_init(&dfops, &first_block);
+	tp->t_dfops = &dfops;
 
 	/*
 	 * Handle initial link state of O_TMPFILE inode
@@ -1456,7 +1457,7 @@ xfs_link(
 	}
 
 	error = xfs_dir_createname(tp, tdp, target_name, sip->i_ino,
-					&first_block, &dfops, resblks);
+				   &first_block, resblks);
 	if (error)
 		goto error_return;
 	xfs_trans_ichgtime(tp, tdp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
@@ -2636,8 +2637,9 @@ xfs_remove(
 		goto out_trans_cancel;
 
 	xfs_defer_init(&dfops, &first_block);
-	error = xfs_dir_removename(tp, dp, name, ip->i_ino,
-					&first_block, &dfops, resblks);
+	tp->t_dfops = &dfops;
+	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
+				   resblks);
 	if (error) {
 		ASSERT(error != -ENOENT);
 		goto out_bmap_cancel;
@@ -2725,9 +2727,9 @@ xfs_sort_for_rename(
 
 static int
 xfs_finish_rename(
-	struct xfs_trans	*tp,
-	struct xfs_defer_ops	*dfops)
+	struct xfs_trans	*tp)
 {
+	struct xfs_defer_ops	*dfops = tp->t_dfops;
 	int			error;
 
 	/*
@@ -2761,7 +2763,6 @@ xfs_cross_rename(
 	struct xfs_inode	*dp2,
 	struct xfs_name		*name2,
 	struct xfs_inode	*ip2,
-	struct xfs_defer_ops	*dfops,
 	xfs_fsblock_t		*first_block,
 	int			spaceres)
 {
@@ -2771,16 +2772,14 @@ xfs_cross_rename(
 	int		dp2_flags = 0;
 
 	/* Swap inode number for dirent in first parent */
-	error = xfs_dir_replace(tp, dp1, name1,
-				ip2->i_ino,
-				first_block, dfops, spaceres);
+	error = xfs_dir_replace(tp, dp1, name1, ip2->i_ino, first_block,
+				spaceres);
 	if (error)
 		goto out_trans_abort;
 
 	/* Swap inode number for dirent in second parent */
-	error = xfs_dir_replace(tp, dp2, name2,
-				ip1->i_ino,
-				first_block, dfops, spaceres);
+	error = xfs_dir_replace(tp, dp2, name2, ip1->i_ino, first_block,
+				spaceres);
 	if (error)
 		goto out_trans_abort;
 
@@ -2795,7 +2794,7 @@ xfs_cross_rename(
 		if (S_ISDIR(VFS_I(ip2)->i_mode)) {
 			error = xfs_dir_replace(tp, ip2, &xfs_name_dotdot,
 						dp1->i_ino, first_block,
-						dfops, spaceres);
+						spaceres);
 			if (error)
 				goto out_trans_abort;
 
@@ -2822,7 +2821,7 @@ xfs_cross_rename(
 		if (S_ISDIR(VFS_I(ip1)->i_mode)) {
 			error = xfs_dir_replace(tp, ip1, &xfs_name_dotdot,
 						dp2->i_ino, first_block,
-						dfops, spaceres);
+						spaceres);
 			if (error)
 				goto out_trans_abort;
 
@@ -2861,10 +2860,10 @@ xfs_cross_rename(
 	}
 	xfs_trans_ichgtime(tp, dp1, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
 	xfs_trans_log_inode(tp, dp1, XFS_ILOG_CORE);
-	return xfs_finish_rename(tp, dfops);
+	return xfs_finish_rename(tp);
 
 out_trans_abort:
-	xfs_defer_cancel(dfops);
+	xfs_defer_cancel(tp->t_dfops);
 	xfs_trans_cancel(tp);
 	return error;
 }
@@ -3003,12 +3002,13 @@ xfs_rename(
 	}
 
 	xfs_defer_init(&dfops, &first_block);
+	tp->t_dfops = &dfops;
 
 	/* RENAME_EXCHANGE is unique from here on. */
 	if (flags & RENAME_EXCHANGE)
 		return xfs_cross_rename(tp, src_dp, src_name, src_ip,
 					target_dp, target_name, target_ip,
-					&dfops, &first_block, spaceres);
+					&first_block, spaceres);
 
 	/*
 	 * Set up the target.
@@ -3030,7 +3030,7 @@ xfs_rename(
 		 */
 		error = xfs_dir_createname(tp, target_dp, target_name,
 						src_ip->i_ino, &first_block,
-						&dfops, spaceres);
+						spaceres);
 		if (error)
 			goto out_bmap_cancel;
 
@@ -3069,8 +3069,7 @@ xfs_rename(
 		 * name at the destination directory, remove it first.
 		 */
 		error = xfs_dir_replace(tp, target_dp, target_name,
-					src_ip->i_ino,
-					&first_block, &dfops, spaceres);
+					src_ip->i_ino, &first_block, spaceres);
 		if (error)
 			goto out_bmap_cancel;
 
@@ -3104,8 +3103,8 @@ xfs_rename(
 		 * directory.
 		 */
 		error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot,
-					target_dp->i_ino,
-					&first_block, &dfops, spaceres);
+					target_dp->i_ino, &first_block,
+					spaceres);
 		ASSERT(error != -EEXIST);
 		if (error)
 			goto out_bmap_cancel;
@@ -3144,10 +3143,10 @@ xfs_rename(
 	 */
 	if (wip) {
 		error = xfs_dir_replace(tp, src_dp, src_name, wip->i_ino,
-					&first_block, &dfops, spaceres);
+					&first_block, spaceres);
 	} else
 		error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
-					   &first_block, &dfops, spaceres);
+					   &first_block, spaceres);
 	if (error)
 		goto out_bmap_cancel;
 
@@ -3182,7 +3181,7 @@ xfs_rename(
 	if (new_parent)
 		xfs_trans_log_inode(tp, target_dp, XFS_ILOG_CORE);
 
-	error = xfs_finish_rename(tp, &dfops);
+	error = xfs_finish_rename(tp);
 	if (wip)
 		IRELE(wip);
 	return error;
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 8be021228266..6cbc2b9d465f 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -351,7 +351,7 @@ xfs_symlink(
 	 * Create the directory entry for the symlink.
 	 */
 	error = xfs_dir_createname(tp, dp, link_name, ip->i_ino,
-					&first_block, &dfops, resblks);
+				   &first_block, resblks);
 	if (error)
 		goto out_bmap_cancel;
 	xfs_trans_ichgtime(tp, dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 8/9] xfs: add a runtime stat for extra transaction log regrants
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
                   ` (6 preceding siblings ...)
  2018-01-24 18:44 ` [RFCv2 7/9] xfs: defer agfl frees from directory op transactions Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  2018-01-24 18:44 ` [RFCv2 9/9] xfs: add extra log count for transactions that defer agfl frees Brian Foster
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

Add a runtime stat to track the number of regrants that occur that
were not satisfied by the initial transaction reservation (i.e.,
->tr_logcount), regardless of whether blocking for reservation was
required. The purpose of this stat is to help shed some light on how
certain transaction changes may affect runtime behavior. For example,
deferring AGFL block frees can cause an additional transaction rolls
in cases where it is enabled. The regrant stat helps track how this
behavior affects workloads and to consider whether to update
transaction reservation counts.

Update the tail pushing section of the stats structure with the new
field since it already includes a couple other logspace-related
stats. Add the new stat at the end to minimize compatibility issues
with any stats-parsing scripts, etc. that may exist out in the wild.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c   | 1 +
 fs/xfs/xfs_stats.h | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index c1f266c34af7..0efe2af7e4fe 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -399,6 +399,7 @@ xfs_log_regrant(
 		return 0;
 
 	trace_xfs_log_regrant(log, tic);
+	XFS_STATS_INC(mp, xs_regrant_logspace);
 
 	error = xlog_grant_head_check(log, &log->l_write_head, tic,
 				      &need_bytes);
diff --git a/fs/xfs/xfs_stats.h b/fs/xfs/xfs_stats.h
index f64d0ae345c4..76c6c099b65e 100644
--- a/fs/xfs/xfs_stats.h
+++ b/fs/xfs/xfs_stats.h
@@ -99,7 +99,7 @@ struct __xfsstats {
 	uint32_t		xs_log_noiclogs;
 	uint32_t		xs_log_force;
 	uint32_t		xs_log_force_sleep;
-# define XFSSTAT_END_TAIL_PUSHING	(XFSSTAT_END_LOG_OPS+10)
+# define XFSSTAT_END_TAIL_PUSHING	(XFSSTAT_END_LOG_OPS+11)
 	uint32_t		xs_try_logspace;
 	uint32_t		xs_sleep_logspace;
 	uint32_t		xs_push_ail;
@@ -110,6 +110,7 @@ struct __xfsstats {
 	uint32_t		xs_push_ail_flushing;
 	uint32_t		xs_push_ail_restarts;
 	uint32_t		xs_push_ail_flush;
+	uint32_t		xs_regrant_logspace;
 # define XFSSTAT_END_WRITE_CONVERT	(XFSSTAT_END_TAIL_PUSHING+2)
 	uint32_t		xs_xstrat_quick;
 	uint32_t		xs_xstrat_split;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFCv2 9/9] xfs: add extra log count for transactions that defer agfl frees
  2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
                   ` (7 preceding siblings ...)
  2018-01-24 18:44 ` [RFCv2 8/9] xfs: add a runtime stat for extra transaction log regrants Brian Foster
@ 2018-01-24 18:44 ` Brian Foster
  8 siblings, 0 replies; 10+ messages in thread
From: Brian Foster @ 2018-01-24 18:44 UTC (permalink / raw)
  To: linux-xfs

Now that AGFL frees are deferred from particular contexts, the
associated transactions may require an increased number of
transaction rolls during deferred operation processing. Update the
log counts to prevent these additional rolls from blocking on
reservation regrant.

XXX: This patch is currently a no-op as XFS_AGFL_LOG_COUNT is set to
zero. Initially it was 1, but testing has shown a couple issues:

- The increase in number of regrants due to deferred AGFL block frees
appears to be minimal on the workloads examined so far.
- Increasing the logcount of certain larger transactions (e.g.,
tr_rename and some of the other directory ops) can increase the max
transaction reservation and have side effects like increasing the
minimum required log size at mkfs time.

Based on the above and the reasoning that deferred AGFL block frees
are a subset of all AGFL fixup operations (only frees), which in
turn are a subset of allocations (typically those that involve btree
shape changes), I think that it's probably overkill to increase
the logcount of any existing transactions at the moment. That
doesn't preclude from doing so in the future, of course.

I'm including this patch for reference and discussion purposes for
the RFC. I'm open to running more/different tests and/or considering
subsets of transactions, though I would prefer any change be backed
by a test that demonstrates the cost of deferring AGFL frees and
benefit of increasing the logcount of the associated transaction(s)
and workload. I've simply not been able to do that with enough
clarity to justify a change so far.

Not-Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_trans_resv.h | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_trans_resv.h b/fs/xfs/libxfs/xfs_trans_resv.h
index b7e5357d060a..05fdcef07216 100644
--- a/fs/xfs/libxfs/xfs_trans_resv.h
+++ b/fs/xfs/libxfs/xfs_trans_resv.h
@@ -82,20 +82,23 @@ struct xfs_trans_resv {
 	 XFS_DAENTER_BMAPS(mp, XFS_DATA_FORK) + 1)
 
 /*
- * Various log count values.
+ * Various log count values. The AGFL count is a sub-component that covers
+ * additional tx rolls due to AGFL fixups (for transactions that defer AGFL
+ * block frees).
  */
+#define	XFS_AGFL_LOG_COUNT		0 /* for agfl fixups */
 #define	XFS_DEFAULT_LOG_COUNT		1
 #define	XFS_DEFAULT_PERM_LOG_COUNT	2
 #define	XFS_ITRUNCATE_LOG_COUNT		2
 #define	XFS_ITRUNCATE_LOG_COUNT_REFLINK	8
-#define XFS_INACTIVE_LOG_COUNT		2
-#define	XFS_CREATE_LOG_COUNT		2
+#define	XFS_INACTIVE_LOG_COUNT		(2 + XFS_AGFL_LOG_COUNT)
+#define	XFS_CREATE_LOG_COUNT		(2 + XFS_AGFL_LOG_COUNT)
 #define	XFS_CREATE_TMPFILE_LOG_COUNT	2
-#define	XFS_MKDIR_LOG_COUNT		3
-#define	XFS_SYMLINK_LOG_COUNT		3
-#define	XFS_REMOVE_LOG_COUNT		2
-#define	XFS_LINK_LOG_COUNT		2
-#define	XFS_RENAME_LOG_COUNT		2
+#define	XFS_MKDIR_LOG_COUNT		(3 + XFS_AGFL_LOG_COUNT)
+#define	XFS_SYMLINK_LOG_COUNT		(3 + XFS_AGFL_LOG_COUNT)
+#define	XFS_REMOVE_LOG_COUNT		(2 + XFS_AGFL_LOG_COUNT)
+#define	XFS_LINK_LOG_COUNT		(2 + XFS_AGFL_LOG_COUNT)
+#define	XFS_RENAME_LOG_COUNT		(2 + XFS_AGFL_LOG_COUNT)
 #define	XFS_WRITE_LOG_COUNT		2
 #define	XFS_WRITE_LOG_COUNT_REFLINK	8
 #define	XFS_ADDAFORK_LOG_COUNT		2
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-01-24 18:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-24 18:44 [RFCv2 0/9] xfs: defer agfl block frees Brian Foster
2018-01-24 18:44 ` [RFCv2 1/9] xfs: create agfl block free helper function Brian Foster
2018-01-24 18:44 ` [RFCv2 2/9] xfs: allow attach of dfops to transaction Brian Foster
2018-01-24 18:44 ` [RFCv2 3/9] xfs: defer agfl block frees when dfops is available Brian Foster
2018-01-24 18:44 ` [RFCv2 4/9] xfs: defer agfl block frees from deferred ops processing context Brian Foster
2018-01-24 18:44 ` [RFCv2 5/9] xfs: defer agfl frees from inode inactivation Brian Foster
2018-01-24 18:44 ` [RFCv2 6/9] xfs: defer frees from common inode allocation paths Brian Foster
2018-01-24 18:44 ` [RFCv2 7/9] xfs: defer agfl frees from directory op transactions Brian Foster
2018-01-24 18:44 ` [RFCv2 8/9] xfs: add a runtime stat for extra transaction log regrants Brian Foster
2018-01-24 18:44 ` [RFCv2 9/9] xfs: add extra log count for transactions that defer agfl frees Brian Foster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.