* [PATCH v14 00/15] xfs: Delayed Attributes
@ 2020-12-18  7:29 Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step Allison Henderson
                   ` (14 more replies)
  0 siblings, 15 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

Hi all,

This set is a subset of a larger series for parent pointers. Delayed attributes
allow attribute operations (set and remove) to be logged and committed in the same
way that other delayed operations are. This allows more complex operations (like
parent pointers) to be broken up into multiple smaller transactions. To do
this, the existing attr operations must be modified to operate as delayed
operations.  This means that they cannot roll, commit, or finish transactions.
Instead, they return -EAGAIN to allow the calling function to handle the transaction.
In this series, we focus only on the delayed attribute portion. We will introduce
parent pointers in a later set.
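
As a rough illustration of the -EAGAIN convention described above, here is a
minimal user-space sketch. It is not code from this series: every demo_* name
is hypothetical, and the "transaction" is just a counter. It only shows the
division of labor, where the operation does one unit of work per call and the
caller alone rolls the transaction.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a transaction; counts how often it was rolled. */
struct demo_trans {
	int	rolls;
};

/* Hypothetical operation that needs several transactions to complete. */
struct demo_op {
	int	steps_left;	/* units of work still to do */
};

/*
 * Do one unit of work.  This never rolls or commits the transaction itself;
 * it returns -EAGAIN while more work remains and 0 once it is finished.
 */
static int demo_op_iter(struct demo_op *op)
{
	if (op->steps_left > 0)
		op->steps_left--;
	return op->steps_left > 0 ? -EAGAIN : 0;
}

/* Stand-in for rolling the transaction; it always succeeds in this sketch. */
static int demo_trans_roll(struct demo_trans *tp)
{
	tp->rolls++;
	return 0;
}

/*
 * The caller owns the transaction: it keeps re-calling the operation and
 * rolls the transaction each time -EAGAIN comes back.
 */
static int demo_op_run(struct demo_trans *tp, struct demo_op *op)
{
	int	error;

	for (;;) {
		error = demo_op_iter(op);
		if (error != -EAGAIN)
			break;
		error = demo_trans_roll(tp);
		if (error)
			break;
	}
	return error;
}
```

Because the operation never touches the transaction, the same iter function
could in principle be driven either by a simple loop like this one or by a
deferred-operation callback.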

At the moment, I would like people to focus their review efforts on just this
"delayed attribute" sub series, as I think that is a more conservative use of people's
review time.  I also think the set is a bit much to manage all at once, and we
need to get the infrastructure ironed out before we focus too much on anything
that depends on it. But I do have the extended series for folks that want to
see the bigger picture of where this is going.

To help organize the set, I've arranged the patches to make sort of mini sets.
I thought it would help reviewers break down the reviewing some. For reviewing
purposes, the set could be broken up into 2 phases:

Delay Ready Attributes: (patches 1-8)
These are the remaining patches belonging to the "Delay Ready" series that
we've been working with.  In these patches, transaction handling is removed
from the attr routines, and replaced with a state machine that allows a high
level function to roll the transaction and repeatedly recall the attr routines
until they are finished.  The behavior of the attr set/remove routines
is now also compatible with use as a .finish_item callback.
  xfs: Add helper xfs_attr_node_remove_step
  xfs: Hoist transaction handling in xfs_attr_node_remove_step
  xfs: Add xfs_attr_node_remove_cleanup
  xfs: Add delay ready attr remove routines
  xfs: Add delay ready attr set routines
  xfs: Add statemachine tracepoints
  xfs: Rename __xfs_attr_rmtval_remove
  xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans

Delayed Attributes: (patches 9 - 15)
These patches go on to fully implement delayed attributes.  New attr intent and
done items are introduced for use in the existing logging infrastructure.  A
mount option is added to toggle the feature on and off, and an error tag is added
to test the log replay.
  xfs: Set up infastructure for deferred attribute operations
  xfs: Skip flip flags for delayed attrs
  xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  xfs: Remove unused xfs_attr_*_args
  xfs: Add delayed attributes error tag
  xfs: Add delattr mount option
  xfs: Merge xfs_delattr_context into xfs_attr_item

Updates since v13: Mostly integrating review feedback, which involved quite
a few changes.  The most significant is the removal of the incompat flag, which
has been replaced with a mount option.  Though we do need to set an incompat
flag, we cannot use it to enable/disable the feature.  It needs to be set
whenever a new log item is written to disk.  ATM, writing the flag to the super
block on the fly is not supported, but this overlaps with another work effort
that needs it for other reasons.  For now, writing new log entries to the disk
is simply disabled unless specified at mount time.

xfs: Add helper xfs_attr_node_remove_step
   Initialized state to null in xfs_attr_node_removename
   Fixed typo "shirnk"

xfs: Add xfs_attr_node_remove_cleanup
   NEW

xfs: Hoist transaction handling in xfs_attr_node_remove_step
   NEW

xfs: Add delay ready attr remove routines
   Removed unneeded error = 0; initialization in xfs_attr_remove_args
   Simplified if/else logic in xfs_attr_remove_iter.  Added some commentary
   Collapsed extra state arg in xfs_attr_node_removename_setup
   Moved up state init in xfs_attr_node_removename_setup
   Fixed variable alignment in xfs_attr_node_remove_step
   Removed XFS_DAC_NODE_RMVNAME_INIT
   Modified default switch case in xfs_attr_node_removename_iter to goto out instead of return
   Updated /* fallthrough */ comment in xfs_attr_node_removename_iter
   Updated commentary for xfs_attr_node_remove_step in xfs_attr_node_removename_iter
   Updated xfs_attr_trans_roll to skip roll if defer_finish is called
   Expanded state machine diagram in b/fs/xfs/libxfs/xfs_attr.h
   Fixed typo in xfs_attr_rmtval_remove comment 
   Added explanation to stateless -EAGAIN return
   
xfs: Add delay ready attr set routines
   Rebase adjustments
   Removed unneeded bhold/join from xfs_attr_set_args
   Removed xfs_attr_rmtval_remove prototype from xfs_attr_remote.h.
   Added comments to EAGAIN returns that don't need states
   Added XFS_DAS_RM_LBLK and XFS_DAS_UNINIT to switch in xfs_attr_set_iter
   Assert on states that belong to remove path 
   Expanded comments in xfs_attr_set_iter
   Fixed ENOSPC handler for xfs_attr_leaf_try_add
   Expanded state machine diagram in b/fs/xfs/libxfs/xfs_attr.h
   Fix a bug found with generic/449
      Do not return EAGAIN on successful leaf add if there's nothing left to do
   Moved state set to before xfs_attr_rmtval_remove call

xfs: Add statemachine tracepoints
  NEW

xfs: Rename __xfs_attr_rmtval_remove
   Rebase adjustments

xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans
   NEW

xfs: Set up infastructure for deferred attribute operations
  Added extra buffer_size parameter to xfs_attri_log_item
    This allows a trailing buffer to be allocated along with the intent
    Simplifies having to realloc the intent during the log commit
  Misc white space cleanups
  Removed unused xfs_attrlen_t
  Removed wrapper function xfs_attri_item_sizeof and xfs_attrd_item_sizeof
  Added inline helper ATTR_NVEC_SIZE.  Simplified with roundup function
  Rephrased comment for xfs_trans_attr
  Moved attrip cleanup in xfs_attr_finish_item to xfs_attri_item_committed
  Removed unused xfs_attri_item_committing  and xfs_attrd_item_committing
  Merged xfs_attrd_init into xfs_trans_get_attrd
  Added XFS_ITEM_RELEASE_WHEN_COMMITTED to xfs_attrd_item_ops
  Reworked xfs_attri_item_recover to preserve log replay order
  Expanded xfs_attri_item_recover validate check
  Added xfs_attri_item_relog
  Added validate check to xfs_attr_leaf_try_add
  Trimmed down xfs_attri_log_item commentary
  Reordered xfs_attri_log_item members to remove holes
  Renamed xfs_sb_version_hasdelattr to xfs_hasdelattr, and moved to xfs_attr.h
    This will be used for a mount option later instead of an incompat flag

xfs: Skip flip flags for delayed attrs
  NEW

xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  Merged with previous patch "xfs: Enable delayed attributes"
  Changed kmem_alloc_large to kmem_zalloc

xfs: Enable delayed attributes
  REMOVED

xfs: Remove unused xfs_attr_*_args
  Collapsed in leaf_bp parameter in xfs_attr_set_iter

xfs: Add delayed attributes error tag
   No change

xfs: Add feature bit XFS_SB_FEAT_INCOMPAT_LOG_DELATTR
   REMOVED

xfs: Add delattr mount option
   NEW

xfs: Merge xfs_delattr_context into xfs_attr_item
   NEW

This series can be viewed on github here:
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_v14

As well as the extended delayed attribute and parent pointer series:
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_v14_extended

And the test cases:
https://github.com/allisonhenderson/xfs_work/tree/pptr_xfstestsv2

In order to run the test cases, you will need to have the corresponding xfsprogs
changes as well, which can be found here:
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_xfsprogs_v14
https://github.com/allisonhenderson/xfs_work/tree/delay_ready_attrs_xfsprogs_v14_extended

To run the xfs attribute tests, run:
check -g attr

To run them with delayed attributes enabled, run:
export MOUNT_OPTIONS="-o delattr"
check -g attr

To run parent pointer tests:
check -g parent

I've also made the corresponding updates to the user space side, and ported anything
it needs to seat correctly.

Questions, comments and feedback appreciated!

Thanks all!
Allison 


Allison Collins (3):
  xfs: Add helper xfs_attr_node_remove_step
  xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  xfs: Add delayed attributes error tag

Allison Henderson (12):
  xfs: Add xfs_attr_node_remove_cleanup
  xfs: Hoist transaction handling in xfs_attr_node_remove_step
  xfs: Add delay ready attr remove routines
  xfs: Add delay ready attr set routines
  xfs: Add state machine tracepoints
  xfs: Rename __xfs_attr_rmtval_remove
  xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans
  xfs: Set up infastructure for deferred attribute operations
  xfs: Skip flip flags for delayed attrs
  xfs: Remove unused xfs_attr_*_args
  xfs: Add delattr mount option
  xfs: Merge xfs_delattr_context into xfs_attr_item

 fs/xfs/Makefile                 |   1 +
 fs/xfs/libxfs/xfs_attr.c        | 633 ++++++++++++++++++++----------
 fs/xfs/libxfs/xfs_attr.h        | 360 ++++++++++++++++-
 fs/xfs/libxfs/xfs_attr_leaf.c   |   5 +-
 fs/xfs/libxfs/xfs_attr_remote.c | 126 +++---
 fs/xfs/libxfs/xfs_attr_remote.h |   7 +-
 fs/xfs/libxfs/xfs_defer.c       |   1 +
 fs/xfs/libxfs/xfs_defer.h       |   3 +
 fs/xfs/libxfs/xfs_errortag.h    |   4 +-
 fs/xfs/libxfs/xfs_log_format.h  |  44 ++-
 fs/xfs/libxfs/xfs_log_recover.h |   2 +
 fs/xfs/scrub/common.c           |   2 +
 fs/xfs/xfs_acl.c                |   2 +
 fs/xfs/xfs_attr_inactive.c      |   2 +-
 fs/xfs/xfs_attr_item.c          | 830 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr_item.h          |  52 +++
 fs/xfs/xfs_attr_list.c          |   1 +
 fs/xfs/xfs_error.c              |   3 +
 fs/xfs/xfs_ioctl.c              |   2 +
 fs/xfs/xfs_ioctl32.c            |   2 +
 fs/xfs/xfs_iops.c               |   2 +
 fs/xfs/xfs_log.c                |   4 +
 fs/xfs/xfs_log_recover.c        |   7 +-
 fs/xfs/xfs_mount.h              |   1 +
 fs/xfs/xfs_ondisk.h             |   2 +
 fs/xfs/xfs_super.c              |   6 +-
 fs/xfs/xfs_trace.h              |  21 +-
 fs/xfs/xfs_xattr.c              |   3 +
 28 files changed, 1874 insertions(+), 254 deletions(-)
 create mode 100644 fs/xfs/xfs_attr_item.c
 create mode 100644 fs/xfs/xfs_attr_item.h

-- 
2.7.4



* [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-21  6:45   ` Chandan Babu R
  2020-12-22 16:50   ` Brian Foster
  2020-12-18  7:29 ` [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup Allison Henderson
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

From: Allison Collins <allison.henderson@oracle.com>

This patch adds a new helper function xfs_attr_node_remove_step.  This
will help simplify and modularize the calling function
xfs_attr_node_removename.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 46 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index fd8e641..8b55a8d 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -1228,19 +1228,14 @@ xfs_attr_node_remove_rmt(
  * the root node (a special case of an intermediate node).
  */
 STATIC int
-xfs_attr_node_removename(
-	struct xfs_da_args	*args)
+xfs_attr_node_remove_step(
+	struct xfs_da_args	*args,
+	struct xfs_da_state	*state)
 {
-	struct xfs_da_state	*state;
 	struct xfs_da_state_blk	*blk;
 	int			retval, error;
 	struct xfs_inode	*dp = args->dp;
 
-	trace_xfs_attr_node_removename(args);
-
-	error = xfs_attr_node_removename_setup(args, &state);
-	if (error)
-		goto out;
 
 	/*
 	 * If there is an out-of-line value, de-allocate the blocks.
@@ -1250,7 +1245,7 @@ xfs_attr_node_removename(
 	if (args->rmtblkno > 0) {
 		error = xfs_attr_node_remove_rmt(args, state);
 		if (error)
-			goto out;
+			return error;
 	}
 
 	/*
@@ -1267,18 +1262,45 @@ xfs_attr_node_removename(
 	if (retval && (state->path.active > 1)) {
 		error = xfs_da3_join(state);
 		if (error)
-			goto out;
+			return error;
 		error = xfs_defer_finish(&args->trans);
 		if (error)
-			goto out;
+			return error;
 		/*
 		 * Commit the Btree join operation and start a new trans.
 		 */
 		error = xfs_trans_roll_inode(&args->trans, dp);
 		if (error)
-			goto out;
+			return error;
 	}
 
+	return error;
+}
+
+/*
+ * Remove a name from a B-tree attribute list.
+ *
+ * This routine will find the blocks of the name to remove, remove them and
+ * shrink the tree if needed.
+ */
+STATIC int
+xfs_attr_node_removename(
+	struct xfs_da_args	*args)
+{
+	struct xfs_da_state	*state = NULL;
+	int			error;
+	struct xfs_inode	*dp = args->dp;
+
+	trace_xfs_attr_node_removename(args);
+
+	error = xfs_attr_node_removename_setup(args, &state);
+	if (error)
+		goto out;
+
+	error = xfs_attr_node_remove_step(args, state);
+	if (error)
+		goto out;
+
 	/*
 	 * If the result is small enough, push it all into the inode.
 	 */
-- 
2.7.4



* [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-21  6:45   ` Chandan Babu R
  2020-12-22 16:50   ` Brian Foster
  2020-12-18  7:29 ` [PATCH v14 03/15] xfs: Hoist transaction handling in xfs_attr_node_remove_step Allison Henderson
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This patch pulls a new helper function xfs_attr_node_remove_cleanup out
of xfs_attr_node_remove_step.  This helps to modularize
xfs_attr_node_remove_step, which will help make the delayed attribute
code easier to follow.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 8b55a8d..e93d76a 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -1220,6 +1220,25 @@ xfs_attr_node_remove_rmt(
 	return xfs_attr_refillstate(state);
 }
 
+STATIC int
+xfs_attr_node_remove_cleanup(
+	struct xfs_da_args	*args,
+	struct xfs_da_state	*state)
+{
+	struct xfs_da_state_blk	*blk;
+	int			retval;
+
+	/*
+	 * Remove the name and update the hashvals in the tree.
+	 */
+	blk = &state->path.blk[state->path.active-1];
+	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
+	retval = xfs_attr3_leaf_remove(blk->bp, args);
+	xfs_da3_fixhashpath(state, &state->path);
+
+	return retval;
+}
+
 /*
  * Remove a name from a B-tree attribute list.
  *
@@ -1232,7 +1251,6 @@ xfs_attr_node_remove_step(
 	struct xfs_da_args	*args,
 	struct xfs_da_state	*state)
 {
-	struct xfs_da_state_blk	*blk;
 	int			retval, error;
 	struct xfs_inode	*dp = args->dp;
 
@@ -1247,14 +1265,7 @@ xfs_attr_node_remove_step(
 		if (error)
 			return error;
 	}
-
-	/*
-	 * Remove the name and update the hashvals in the tree.
-	 */
-	blk = &state->path.blk[ state->path.active-1 ];
-	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
-	retval = xfs_attr3_leaf_remove(blk->bp, args);
-	xfs_da3_fixhashpath(state, &state->path);
+	retval = xfs_attr_node_remove_cleanup(args, state);
 
 	/*
 	 * Check to see if the tree needs to be collapsed.
-- 
2.7.4



* [PATCH v14 03/15] xfs: Hoist transaction handling in xfs_attr_node_remove_step
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-21  6:45   ` Chandan Babu R
  2020-12-18  7:29 ` [PATCH v14 04/15] xfs: Add delay ready attr remove routines Allison Henderson
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This patch hoists transaction handling out of xfs_attr_node_remove_step
and into its caller, xfs_attr_node_removename.  This will help keep
transaction handling in higher level functions instead of buried in
subfunctions when we introduce delayed attributes.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index e93d76a..1969b88 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -1251,7 +1251,7 @@ xfs_attr_node_remove_step(
 	struct xfs_da_args	*args,
 	struct xfs_da_state	*state)
 {
-	int			retval, error;
+	int			error;
 	struct xfs_inode	*dp = args->dp;
 
 
@@ -1265,25 +1265,6 @@ xfs_attr_node_remove_step(
 		if (error)
 			return error;
 	}
-	retval = xfs_attr_node_remove_cleanup(args, state);
-
-	/*
-	 * Check to see if the tree needs to be collapsed.
-	 */
-	if (retval && (state->path.active > 1)) {
-		error = xfs_da3_join(state);
-		if (error)
-			return error;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
-		/*
-		 * Commit the Btree join operation and start a new trans.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
-	}
 
 	return error;
 }
@@ -1299,7 +1280,7 @@ xfs_attr_node_removename(
 	struct xfs_da_args	*args)
 {
 	struct xfs_da_state	*state = NULL;
-	int			error;
+	int			retval, error;
 	struct xfs_inode	*dp = args->dp;
 
 	trace_xfs_attr_node_removename(args);
@@ -1312,6 +1293,26 @@ xfs_attr_node_removename(
 	if (error)
 		goto out;
 
+	retval = xfs_attr_node_remove_cleanup(args, state);
+
+	/*
+	 * Check to see if the tree needs to be collapsed.
+	 */
+	if (retval && (state->path.active > 1)) {
+		error = xfs_da3_join(state);
+		if (error)
+			return error;
+		error = xfs_defer_finish(&args->trans);
+		if (error)
+			return error;
+		/*
+		 * Commit the Btree join operation and start a new trans.
+		 */
+		error = xfs_trans_roll_inode(&args->trans, dp);
+		if (error)
+			return error;
+	}
+
 	/*
 	 * If the result is small enough, push it all into the inode.
 	 */
-- 
2.7.4



* [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (2 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 03/15] xfs: Hoist transaction handling in xfs_attr_node_remove_step Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-22  7:22   ` Chandan Babu R
  2020-12-22 17:11   ` Brian Foster
  2020-12-18  7:29 ` [PATCH v14 05/15] xfs: Add delay ready attr set routines Allison Henderson
                   ` (10 subsequent siblings)
  14 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This patch modifies the attr remove routines to be delay ready. This
means they no longer roll or commit transactions, but instead return
-EAGAIN to have the calling routine roll and refresh the transaction. In
this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
uses a state-machine-like switch to keep track of where it was
when -EAGAIN was returned. xfs_attr_node_removename has also been
modified to use the switch, and a new version of xfs_attr_remove_args
consists of a simple loop to refresh the transaction until the operation
is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
transaction wherever the existing code used to.

Calls to xfs_attr_rmtval_remove are replaced with the delay ready
version __xfs_attr_rmtval_remove. We will rename
__xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
done.

xfs_attr_rmtval_remove itself is still in use by the set routines (used
during a rename).  To preserve existing behavior, we modify
xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is set,
similar to how xfs_attr_remove_args does here.  Once we transition
the set routines to be delay ready, xfs_attr_rmtval_remove will no longer
be used and will be removed.

This patch also adds a new struct xfs_delattr_context, which we will use
to keep track of the current state of an attribute operation. The new
xfs_delattr_state enum is used to track various operations that are in
progress so that we know not to repeat them, and can resume where we left
off before -EAGAIN was returned to cycle out the transaction. Other
members take the place of local variables that need to retain their
values across multiple function recalls.  See xfs_attr.h for a more
detailed diagram of the states.
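
For readers new to the pattern, the resume-by-state idea can be sketched in
plain user-space C roughly as follows. This is only an illustration under
simplifying assumptions, not the code in this patch: the demo_* names and
fields are hypothetical, the states are loosely modelled on XFS_DAS_UNINIT and
XFS_DAS_RM_SHRINK, and the real btree work is replaced by counters.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states, loosely modelled on the enum added by this patch. */
enum demo_state {
	DEMO_UNINIT = 0,	/* no state has been recorded yet */
	DEMO_SHRINK,		/* entry removed; shrink still pending */
};

#define DEMO_DEFER_FINISH	0x01	/* ask the caller to finish deferred ops */

/*
 * Context that survives across calls.  Its members replace what would
 * otherwise be local variables lost each time -EAGAIN is returned.
 */
struct demo_ctx {
	enum demo_state	state;
	unsigned int	flags;
	int		blocks_left;	/* remote blocks still to free */
	int		need_join;	/* nonzero if the tree must be collapsed */
	int		removed;	/* set once the entry has been removed */
	int		shrunk;		/* set once the shrink step has run */
};

/* One step of the operation; returns -EAGAIN until everything is done. */
static int demo_remove_iter(struct demo_ctx *ctx)
{
	switch (ctx->state) {
	case DEMO_UNINIT:
		/*
		 * Free one remote block per call.  No state change is needed:
		 * re-entry lands here again with one less block to remove.
		 */
		if (ctx->blocks_left > 0) {
			ctx->blocks_left--;
			return -EAGAIN;
		}

		ctx->removed = 1;

		/* Record where to resume before handing control back. */
		if (ctx->need_join) {
			ctx->flags |= DEMO_DEFER_FINISH;
			ctx->state = DEMO_SHRINK;
			return -EAGAIN;
		}
		/* fallthrough */
	case DEMO_SHRINK:
		ctx->shrunk = 1;
		return 0;
	default:
		return -EINVAL;
	}
}
```

The point of the sketch is that the steps before the saved state are never
repeated on re-entry, and the block-removal loop needs no state of its own
because its progress is already carried by the context.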

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 218 +++++++++++++++++++++++++++++-----------
 fs/xfs/libxfs/xfs_attr.h        | 100 ++++++++++++++++++
 fs/xfs/libxfs/xfs_attr_leaf.c   |   2 +-
 fs/xfs/libxfs/xfs_attr_remote.c |  48 +++++----
 fs/xfs/libxfs/xfs_attr_remote.h |   2 +-
 fs/xfs/xfs_attr_inactive.c      |   2 +-
 6 files changed, 288 insertions(+), 84 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 1969b88..b6330f9 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -53,7 +53,7 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
  */
 STATIC int xfs_attr_node_get(xfs_da_args_t *args);
 STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
-STATIC int xfs_attr_node_removename(xfs_da_args_t *args);
+STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
 STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
 				 struct xfs_da_state **state);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
@@ -264,6 +264,34 @@ xfs_attr_set_shortform(
 }
 
 /*
+ * Checks to see if a delayed attribute transaction should be rolled.  If so,
+ * also checks for a defer finish.  Transaction is finished and rolled as
+ * needed.  Returns 0 on success, or a negative error code on failure.
+ */
+int
+xfs_attr_trans_roll(
+	struct xfs_delattr_context	*dac)
+{
+	struct xfs_da_args		*args = dac->da_args;
+	int				error;
+
+	if (dac->flags & XFS_DAC_DEFER_FINISH) {
+		/*
+		 * The caller wants us to finish all the deferred ops so that we
+		 * avoid pinning the log tail with a large number of deferred
+		 * ops.
+		 */
+		dac->flags &= ~XFS_DAC_DEFER_FINISH;
+		error = xfs_defer_finish(&args->trans);
+		if (error)
+			return error;
+	} else
+		error = xfs_trans_roll_inode(&args->trans, args->dp);
+
+	return error;
+}
+
+/*
  * Set the attribute specified in @args.
  */
 int
@@ -364,23 +392,58 @@ xfs_has_attr(
  */
 int
 xfs_attr_remove_args(
-	struct xfs_da_args      *args)
+	struct xfs_da_args	*args)
 {
-	struct xfs_inode	*dp = args->dp;
-	int			error;
+	int				error;
+	struct xfs_delattr_context	dac = {
+		.da_args	= args,
+	};
+
+	do {
+		error = xfs_attr_remove_iter(&dac);
+		if (error != -EAGAIN)
+			break;
+
+		error = xfs_attr_trans_roll(&dac);
+		if (error)
+			return error;
+
+	} while (true);
+
+	return error;
+}
 
-	if (!xfs_inode_hasattr(dp)) {
-		error = -ENOATTR;
-	} else if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
+/*
+ * Remove the attribute specified in @args.
+ *
+ * This function may return -EAGAIN to signal that the transaction needs to be
+ * rolled.  Callers should continue calling this function until they receive a
+ * return value other than -EAGAIN.
+ */
+int
+xfs_attr_remove_iter(
+	struct xfs_delattr_context	*dac)
+{
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_inode		*dp = args->dp;
+
+	/* If we are shrinking a node, resume shrink */
+	if (dac->dela_state == XFS_DAS_RM_SHRINK)
+		goto node;
+
+	if (!xfs_inode_hasattr(dp))
+		return -ENOATTR;
+
+	if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
 		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
-		error = xfs_attr_shortform_remove(args);
-	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
-		error = xfs_attr_leaf_removename(args);
-	} else {
-		error = xfs_attr_node_removename(args);
+		return xfs_attr_shortform_remove(args);
 	}
 
-	return error;
+	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+		return xfs_attr_leaf_removename(args);
+node:
+	/* If we are not short form or leaf, then proceed to remove node */
+	return  xfs_attr_node_removename_iter(dac);
 }
 
 /*
@@ -1178,10 +1241,11 @@ xfs_attr_leaf_mark_incomplete(
  */
 STATIC
 int xfs_attr_node_removename_setup(
-	struct xfs_da_args	*args,
-	struct xfs_da_state	**state)
+	struct xfs_delattr_context	*dac)
 {
-	int			error;
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_state		**state = &dac->da_state;
+	int				error;
 
 	error = xfs_attr_node_hasname(args, state);
 	if (error != -EEXIST)
@@ -1203,13 +1267,16 @@ int xfs_attr_node_removename_setup(
 }
 
 STATIC int
-xfs_attr_node_remove_rmt(
-	struct xfs_da_args	*args,
-	struct xfs_da_state	*state)
+xfs_attr_node_remove_rmt (
+	struct xfs_delattr_context	*dac,
+	struct xfs_da_state		*state)
 {
-	int			error = 0;
+	int				error = 0;
 
-	error = xfs_attr_rmtval_remove(args);
+	/*
+	 * May return -EAGAIN to request that the caller recall this function
+	 */
+	error = __xfs_attr_rmtval_remove(dac);
 	if (error)
 		return error;
 
@@ -1240,28 +1307,34 @@ xfs_attr_node_remove_cleanup(
 }
 
 /*
- * Remove a name from a B-tree attribute list.
+ * Step through removing a name from a B-tree attribute list.
  *
  * This will involve walking down the Btree, and may involve joining
  * leaf nodes and even joining intermediate nodes up to and including
  * the root node (a special case of an intermediate node).
+ *
+ * This routine is meant to function as either an inline or delayed operation,
+ * and may return -EAGAIN when the transaction needs to be rolled.  Calling
+ * functions will need to handle this, and recall the function until a
+ * successful error code is returned.
  */
 STATIC int
 xfs_attr_node_remove_step(
-	struct xfs_da_args	*args,
-	struct xfs_da_state	*state)
+	struct xfs_delattr_context	*dac)
 {
-	int			error;
-	struct xfs_inode	*dp = args->dp;
-
-
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_state		*state = dac->da_state;
+	int				error = 0;
 	/*
 	 * If there is an out-of-line value, de-allocate the blocks.
 	 * This is done before we remove the attribute so that we don't
 	 * overflow the maximum size of a transaction and/or hit a deadlock.
 	 */
 	if (args->rmtblkno > 0) {
-		error = xfs_attr_node_remove_rmt(args, state);
+		/*
+		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
+		 */
+		error = xfs_attr_node_remove_rmt(dac, state);
 		if (error)
 			return error;
 	}
@@ -1274,51 +1347,74 @@ xfs_attr_node_remove_step(
  *
  * This routine will find the blocks of the name to remove, remove them and
  * shrink the tree if needed.
+ *
+ * This routine is meant to function as either an inline or delayed operation,
+ * and may return -EAGAIN when the transaction needs to be rolled.  Calling
+ * functions will need to handle this, and recall the function until a
+ * successful error code is returned.
  */
 STATIC int
-xfs_attr_node_removename(
-	struct xfs_da_args	*args)
+xfs_attr_node_removename_iter(
+	struct xfs_delattr_context	*dac)
 {
-	struct xfs_da_state	*state = NULL;
-	int			retval, error;
-	struct xfs_inode	*dp = args->dp;
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_state		*state = NULL;
+	int				retval, error;
+	struct xfs_inode		*dp = args->dp;
 
 	trace_xfs_attr_node_removename(args);
 
-	error = xfs_attr_node_removename_setup(args, &state);
-	if (error)
-		goto out;
+	if (!dac->da_state) {
+		error = xfs_attr_node_removename_setup(dac);
+		if (error)
+			goto out;
+	}
+	state = dac->da_state;
 
-	error = xfs_attr_node_remove_step(args, state);
-	if (error)
-		goto out;
+	switch (dac->dela_state) {
+	case XFS_DAS_UNINIT:
+		/*
+		 * repeatedly remove remote blocks, remove the entry and join.
+		 * returns -EAGAIN or 0 for completion of the step.
+		 */
+		error = xfs_attr_node_remove_step(dac);
+		if (error)
+			break;
 
-	retval = xfs_attr_node_remove_cleanup(args, state);
+		retval = xfs_attr_node_remove_cleanup(args, state);
 
-	/*
-	 * Check to see if the tree needs to be collapsed.
-	 */
-	if (retval && (state->path.active > 1)) {
-		error = xfs_da3_join(state);
-		if (error)
-			return error;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
 		/*
-		 * Commit the Btree join operation and start a new trans.
+		 * Check to see if the tree needs to be collapsed. Set the flag
+		 * to indicate that the calling function needs to move the
+		 * shrink operation
 		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
-	}
+		if (retval && (state->path.active > 1)) {
+			error = xfs_da3_join(state);
+			if (error)
+				return error;
 
-	/*
-	 * If the result is small enough, push it all into the inode.
-	 */
-	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
-		error = xfs_attr_node_shrink(args, state);
+			dac->flags |= XFS_DAC_DEFER_FINISH;
+			dac->dela_state = XFS_DAS_RM_SHRINK;
+			return -EAGAIN;
+		}
+
+		/* fallthrough */
+	case XFS_DAS_RM_SHRINK:
+		/*
+		 * If the result is small enough, push it all into the inode.
+		 */
+		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+			error = xfs_attr_node_shrink(args, state);
+
+		break;
+	default:
+		ASSERT(0);
+		error = -EINVAL;
+		goto out;
+	}
 
+	if (error == -EAGAIN)
+		return error;
 out:
 	if (state)
 		xfs_da_state_free(state);
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 3e97a93..3154ef4 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -74,6 +74,102 @@ struct xfs_attr_list_context {
 };
 
 
+/*
+ * ========================================================================
+ * Structure used to pass context around among the delayed routines.
+ * ========================================================================
+ */
+
+/*
+ * Below is a state machine diagram for attr remove operations. The  XFS_DAS_*
+ * states indicate places where the function would return -EAGAIN, and then
+ * immediately resume from after being recalled by the calling function. States
+ * marked as a "subroutine state" indicate that they belong to a subroutine, and
+ * so the calling function needs to pass them back to that subroutine to allow
+ * it to finish where it left off. But they otherwise do not have a role in the
+ * calling function other than just passing through.
+ *
+ * xfs_attr_remove_iter()
+ *              │
+ *              v
+ *        found attr blks? ───n──┐
+ *              │                v
+ *              │         find and invalidate
+ *              y         the blocks. mark
+ *              │         attr incomplete
+ *              ├────────────────┘
+ *              │
+ *              v
+ *      remove a block with
+ *    xfs_attr_node_remove_step <────┐
+ *              │                    │
+ *              v                    │
+ *      still have blks ──y──> return -EAGAIN.
+ *        to remove?          re-enter with one
+ *              │            less blk to remove
+ *              n
+ *              │
+ *              v
+ *       remove leaf and
+ *       update hash with
+ *   xfs_attr_node_remove_cleanup
+ *              │
+ *              v
+ *           need to
+ *        shrink tree? ─n─┐
+ *              │         │
+ *              y         │
+ *              │         │
+ *              v         │
+ *          join leaf     │
+ *              │         │
+ *              v         │
+ *      XFS_DAS_RM_SHRINK │
+ *              │         │
+ *              v         │
+ *       do the shrink    │
+ *              │         │
+ *              v         │
+ *          free state <──┘
+ *              │
+ *              v
+ *            done
+ *
+ */
+
+/*
+ * Enum values for xfs_delattr_context.dela_state
+ *
+ * These values are used by delayed attribute operations to keep track of where
+ * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
+ * calling function to roll the transaction, and then re-call the subroutine to
+ * finish the operation.  The enum is then used by the subroutine to jump back
+ * to where it was and resume executing where it left off.
+ */
+enum xfs_delattr_state {
+	XFS_DAS_UNINIT		= 0,  /* No state has been set yet */
+	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
+};
+
+/*
+ * Defines for xfs_delattr_context.flags
+ */
+#define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
+
+/*
+ * Context used for keeping track of delayed attribute operations
+ */
+struct xfs_delattr_context {
+	struct xfs_da_args      *da_args;
+
+	/* Used in xfs_attr_node_removename to roll through removing blocks */
+	struct xfs_da_state     *da_state;
+
+	/* Used to keep track of current state of delayed operation */
+	unsigned int            flags;
+	enum xfs_delattr_state  dela_state;
+};
+
 /*========================================================================
  * Function prototypes for the kernel.
  *========================================================================*/
@@ -91,6 +187,10 @@ int xfs_attr_set(struct xfs_da_args *args);
 int xfs_attr_set_args(struct xfs_da_args *args);
 int xfs_has_attr(struct xfs_da_args *args);
 int xfs_attr_remove_args(struct xfs_da_args *args);
+int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
+int xfs_attr_trans_roll(struct xfs_delattr_context *dac);
 bool xfs_attr_namecheck(const void *name, size_t length);
+void xfs_delattr_context_init(struct xfs_delattr_context *dac,
+			      struct xfs_da_args *args);
 
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index d6ef69a..3780141 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -19,8 +19,8 @@
 #include "xfs_bmap_btree.h"
 #include "xfs_bmap.h"
 #include "xfs_attr_sf.h"
-#include "xfs_attr_remote.h"
 #include "xfs_attr.h"
+#include "xfs_attr_remote.h"
 #include "xfs_attr_leaf.h"
 #include "xfs_error.h"
 #include "xfs_trace.h"
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 48d8e9c..f09820c 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -674,10 +674,12 @@ xfs_attr_rmtval_invalidate(
  */
 int
 xfs_attr_rmtval_remove(
-	struct xfs_da_args      *args)
+	struct xfs_da_args		*args)
 {
-	int			error;
-	int			retval;
+	int				error;
+	struct xfs_delattr_context	dac  = {
+		.da_args	= args,
+	};
 
 	trace_xfs_attr_rmtval_remove(args);
 
@@ -685,31 +687,29 @@ xfs_attr_rmtval_remove(
 	 * Keep de-allocating extents until the remote-value region is gone.
 	 */
 	do {
-		retval = __xfs_attr_rmtval_remove(args);
-		if (retval && retval != -EAGAIN)
-			return retval;
+		error = __xfs_attr_rmtval_remove(&dac);
+		if (error != -EAGAIN)
+			break;
 
-		/*
-		 * Close out trans and start the next one in the chain.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, args->dp);
+		error = xfs_attr_trans_roll(&dac);
 		if (error)
 			return error;
-	} while (retval == -EAGAIN);
+	} while (true);
 
-	return 0;
+	return error;
 }
 
 /*
  * Remove the value associated with an attribute by deleting the out-of-line
- * buffer that it is stored on. Returns EAGAIN for the caller to refresh the
+ * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the
  * transaction and re-call the function
  */
 int
 __xfs_attr_rmtval_remove(
-	struct xfs_da_args	*args)
+	struct xfs_delattr_context	*dac)
 {
-	int			error, done;
+	struct xfs_da_args		*args = dac->da_args;
+	int				error, done;
 
 	/*
 	 * Unmap value blocks for this attr.
@@ -719,12 +719,20 @@ __xfs_attr_rmtval_remove(
 	if (error)
 		return error;
 
-	error = xfs_defer_finish(&args->trans);
-	if (error)
-		return error;
-
-	if (!done)
+	/*
+	 * We don't need an explicit state here to pick up where we left off.  We
+	 * can figure it out using the !done return code.  Calling function only
+	 * needs to keep recalling this routine until we indicate to stop by
+	 * returning anything other than -EAGAIN. The actual value of
+	 * dac->dela_state may be some value reminiscent of the calling
+	 * function, but its value is irrelevant within the context of this
+	 * function.  Once we are done here, the next state is set as needed
+	 * by the parent.
+	 */
+	if (!done) {
+		dac->flags |= XFS_DAC_DEFER_FINISH;
 		return -EAGAIN;
+	}
 
 	return error;
 }
diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
index 9eee615..002fd30 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.h
+++ b/fs/xfs/libxfs/xfs_attr_remote.h
@@ -14,5 +14,5 @@ int xfs_attr_rmtval_remove(struct xfs_da_args *args);
 int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
 		xfs_buf_flags_t incore_flags);
 int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
-int __xfs_attr_rmtval_remove(struct xfs_da_args *args);
+int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
 #endif /* __XFS_ATTR_REMOTE_H__ */
diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
index bfad669..aaa7e66 100644
--- a/fs/xfs/xfs_attr_inactive.c
+++ b/fs/xfs/xfs_attr_inactive.c
@@ -15,10 +15,10 @@
 #include "xfs_da_format.h"
 #include "xfs_da_btree.h"
 #include "xfs_inode.h"
+#include "xfs_attr.h"
 #include "xfs_attr_remote.h"
 #include "xfs_trans.h"
 #include "xfs_bmap.h"
-#include "xfs_attr.h"
 #include "xfs_attr_leaf.h"
 #include "xfs_quota.h"
 #include "xfs_dir2.h"
-- 
2.7.4



* [PATCH v14 05/15] xfs: Add delay ready attr set routines
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (3 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 04/15] xfs: Add delay ready attr remove routines Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-23  8:00   ` Chandan Babu R
  2020-12-18  7:29 ` [PATCH v14 06/15] xfs: Add state machine tracepoints Allison Henderson
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This patch modifies the attr set routines to be delay ready. This means
they no longer roll or commit transactions, but instead return -EAGAIN
to have the calling routine roll and refresh the transaction.  In this
series, the body of xfs_attr_set_args has moved into xfs_attr_set_iter,
which uses a state-machine-like switch to keep track of where it was when
-EAGAIN was returned. See xfs_attr.h for a more detailed diagram of the
states.

Two new helper functions have been added: xfs_attr_rmtval_find_space and
xfs_attr_rmtval_set_blk.  They provide a subset of logic similar to
xfs_attr_rmtval_set, but they store the current block in the delay attr
context to allow the caller to roll the transaction between allocations.
This helps to simplify and consolidate code used by
xfs_attr_leaf_addname and xfs_attr_node_addname. xfs_attr_set_args is
now a simple loop that refreshes the transaction until the operation
is completed.  Lastly, xfs_attr_rmtval_remove is no longer used, and is
removed.
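
For readers new to the pattern, here is a minimal userspace sketch of the
-EAGAIN loop described above.  It is not the kernel code: demo_state,
demo_ctx, demo_set_iter, demo_trans_roll and demo_set_args are hypothetical
stand-ins for enum xfs_delattr_state, struct xfs_delattr_context,
xfs_attr_set_iter, xfs_attr_trans_roll and xfs_attr_set_args, and the
"transaction roll" here only bumps a counter.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states, standing in for enum xfs_delattr_state. */
enum demo_state {
	DEMO_UNINIT = 0,	/* no state has been set yet */
	DEMO_ALLOC,		/* allocating blocks one at a time */
};

/* Hypothetical context, standing in for struct xfs_delattr_context. */
struct demo_ctx {
	enum demo_state	state;
	int		blkcnt;	/* blocks still to allocate */
	int		rolls;	/* how many times the caller rolled */
};

/*
 * One step of the operation.  Returns -EAGAIN when the caller should roll
 * the transaction and call again, or 0 when the operation is complete.
 */
static int demo_set_iter(struct demo_ctx *ctx)
{
	switch (ctx->state) {
	case DEMO_UNINIT:
		/* First entry: record where to resume on later calls. */
		ctx->state = DEMO_ALLOC;
		/* fallthrough */
	case DEMO_ALLOC:
		if (ctx->blkcnt > 0) {
			ctx->blkcnt--;	/* "allocate" one block */
			return -EAGAIN;
		}
		return 0;
	}
	return -EINVAL;
}

/* Stand-in for rolling the transaction; it always succeeds here. */
static int demo_trans_roll(struct demo_ctx *ctx)
{
	ctx->rolls++;
	return 0;
}

/* Caller loop, shaped like xfs_attr_set_args in this patch. */
static int demo_set_args(struct demo_ctx *ctx)
{
	int error;

	do {
		error = demo_set_iter(ctx);
		if (error != -EAGAIN)
			break;

		error = demo_trans_roll(ctx);
		if (error)
			return error;
	} while (1);

	return error;
}
```

The step function never touches the transaction itself; it only records
where to resume and hands control back, which is the property the set
routines need in order to run as a delayed operation.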

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 357 ++++++++++++++++++++++++++--------------
 fs/xfs/libxfs/xfs_attr.h        | 235 +++++++++++++++++++++++++-
 fs/xfs/libxfs/xfs_attr_remote.c |  98 +++++++----
 fs/xfs/libxfs/xfs_attr_remote.h |   5 +-
 fs/xfs/xfs_trace.h              |   1 -
 5 files changed, 541 insertions(+), 155 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index b6330f9..cd72512 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -44,7 +44,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
  * Internal routines when attribute list is one block.
  */
 STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
-STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args);
+STATIC int xfs_attr_leaf_addname(struct xfs_delattr_context *dac);
 STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
 STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
 
@@ -52,12 +52,15 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
  * Internal routines when attribute list is more than one block.
  */
 STATIC int xfs_attr_node_get(xfs_da_args_t *args);
-STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
+STATIC int xfs_attr_node_addname(struct xfs_delattr_context *dac);
 STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
 STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
 				 struct xfs_da_state **state);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
+STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *args, struct xfs_buf *bp);
+STATIC int xfs_attr_set_iter(struct xfs_delattr_context *dac,
+			     struct xfs_buf **leaf_bp);
 
 int
 xfs_inode_hasattr(
@@ -218,8 +221,11 @@ xfs_attr_is_shortform(
 
 /*
  * Attempts to set an attr in shortform, or converts short form to leaf form if
- * there is not enough room.  If the attr is set, the transaction is committed
- * and set to NULL.
+ * there is not enough room.  This function is meant to operate as a helper
+ * routine to the delayed attribute functions.  It returns -EAGAIN to indicate
+ * that the calling function should roll the transaction, and then proceed to
+ * add the attr in leaf form.  This subroutine does not expect to be recalled
+ * again like the other delayed attr routines do.
  */
 STATIC int
 xfs_attr_set_shortform(
@@ -227,16 +233,16 @@ xfs_attr_set_shortform(
 	struct xfs_buf		**leaf_bp)
 {
 	struct xfs_inode	*dp = args->dp;
-	int			error, error2 = 0;
+	int			error = 0;
 
 	/*
 	 * Try to add the attr to the attribute list in the inode.
 	 */
 	error = xfs_attr_try_sf_addname(dp, args);
+
+	/* Should only be 0, -EEXIST or -ENOSPC */
 	if (error != -ENOSPC) {
-		error2 = xfs_trans_commit(args->trans);
-		args->trans = NULL;
-		return error ? error : error2;
+		return error;
 	}
 	/*
 	 * It won't fit in the shortform, transform to a leaf block.  GROT:
@@ -249,18 +255,15 @@ xfs_attr_set_shortform(
 	/*
 	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
 	 * push cannot grab the half-baked leaf buffer and run into problems
-	 * with the write verifier. Once we're done rolling the transaction we
-	 * can release the hold and add the attr to the leaf.
+	 * with the write verifier.
 	 */
 	xfs_trans_bhold(args->trans, *leaf_bp);
-	error = xfs_defer_finish(&args->trans);
-	xfs_trans_bhold_release(args->trans, *leaf_bp);
-	if (error) {
-		xfs_trans_brelse(args->trans, *leaf_bp);
-		return error;
-	}
 
-	return 0;
+	/*
+	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
+	 * fork to leaf format and will restart with the leaf add.
+	 */
+	return -EAGAIN;
 }
 
 /*
@@ -268,7 +271,7 @@ xfs_attr_set_shortform(
  * also checks for a defer finish.  Transaction is finished and rolled as
  * needed, and returns true of false if the delayed operation should continue.
  */
-int
+STATIC int
 xfs_attr_trans_roll(
 	struct xfs_delattr_context	*dac)
 {
@@ -298,34 +301,95 @@ int
 xfs_attr_set_args(
 	struct xfs_da_args	*args)
 {
-	struct xfs_inode	*dp = args->dp;
-	struct xfs_buf          *leaf_bp = NULL;
-	int			error = 0;
+	struct xfs_buf			*leaf_bp = NULL;
+	int				error = 0;
+	struct xfs_delattr_context	dac = {
+		.da_args	= args,
+	};
+
+	do {
+		error = xfs_attr_set_iter(&dac, &leaf_bp);
+		if (error != -EAGAIN)
+			break;
+
+		error = xfs_attr_trans_roll(&dac);
+		if (error)
+			return error;
+	} while (true);
+
+	return error;
+}
+
+/*
+ * Set the attribute specified in @args.
+ * This routine is meant to function as a delayed operation, and may return
+ * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
+ * to handle this, and re-call the function until it returns something other
+ * than -EAGAIN.
+ */
+STATIC int
+xfs_attr_set_iter(
+	struct xfs_delattr_context	*dac,
+	struct xfs_buf			**leaf_bp)
+{
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_inode		*dp = args->dp;
+	int				error = 0;
+
+	/* State machine switch */
+	switch (dac->dela_state) {
+	case XFS_DAS_FLIP_LFLAG:
+	case XFS_DAS_FOUND_LBLK:
+	case XFS_DAS_RM_LBLK:
+		return xfs_attr_leaf_addname(dac);
+	case XFS_DAS_FOUND_NBLK:
+	case XFS_DAS_FLIP_NFLAG:
+	case XFS_DAS_ALLOC_NODE:
+		return xfs_attr_node_addname(dac);
+	case XFS_DAS_UNINIT:
+		break;
+	default:
+		ASSERT(dac->dela_state != XFS_DAS_RM_SHRINK);
+		break;
+	}
 
 	/*
 	 * If the attribute list is already in leaf format, jump straight to
 	 * leaf handling.  Otherwise, try to add the attribute to the shortform
 	 * list; if there's no room then convert the list to leaf format and try
-	 * again.
+	 * again. No need to set state as we will be in leaf form when we come
+	 * back.
 	 */
 	if (xfs_attr_is_shortform(dp)) {
 
 		/*
-		 * If the attr was successfully set in shortform, the
-		 * transaction is committed and set to NULL.  Otherwise, is it
-		 * converted from shortform to leaf, and the transaction is
-		 * retained.
+		 * If the attr was successfully set in shortform, no need to
+		 * continue.  Otherwise, it is converted from shortform to leaf
+		 * and -EAGAIN is returned.
 		 */
-		error = xfs_attr_set_shortform(args, &leaf_bp);
-		if (error || !args->trans)
-			return error;
+		error = xfs_attr_set_shortform(args, leaf_bp);
+		if (error == -EAGAIN)
+			dac->flags |= XFS_DAC_DEFER_FINISH;
+
+		return error;
 	}
 
-	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
-		error = xfs_attr_leaf_addname(args);
-		if (error != -ENOSPC)
-			return error;
+	/*
+	 * After a shortform to leaf conversion, we need to hold the leaf and
+	 * cycle out the transaction.  When we get back, we need to release
+	 * the hold on the leaf buffer.
+	 */
+	if (*leaf_bp != NULL) {
+		xfs_trans_bhold_release(args->trans, *leaf_bp);
+		*leaf_bp = NULL;
+	}
+
+	if (!xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+		return xfs_attr_node_addname(dac);
 
+	error = xfs_attr_leaf_try_add(args, *leaf_bp);
+	switch (error) {
+	case -ENOSPC:
 		/*
 		 * Promote the attribute list to the Btree format.
 		 */
@@ -334,25 +398,22 @@ xfs_attr_set_args(
 			return error;
 
 		/*
-		 * Finish any deferred work items and roll the transaction once
-		 * more.  The goal here is to call node_addname with the inode
-		 * and transaction in the same state (inode locked and joined,
-		 * transaction clean) no matter how we got to this step.
-		 */
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
-
-		/*
-		 * Commit the current trans (including the inode) and
-		 * start a new one.
+		 * Finish any deferred work items and roll the
+		 * transaction once more.  The goal here is to call
+		 * node_addname with the inode and transaction in the
+		 * same state (inode locked and joined, transaction
+		 * clean) no matter how we got to this step.
+		 *
+		 * At this point, we are still in XFS_DAS_UNINIT, but
+		 * when we come back, we'll be a node, so we'll take
+		 * the xfs_attr_node_addname() path on re-entry.
 		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
+		dac->flags |= XFS_DAC_DEFER_FINISH;
+		return -EAGAIN;
+	case 0:
+		dac->dela_state = XFS_DAS_FOUND_LBLK;
+		return -EAGAIN;
 	}
-
-	error = xfs_attr_node_addname(args);
 	return error;
 }
 
@@ -728,28 +789,30 @@ xfs_attr_leaf_try_add(
  *
  * This leaf block cannot have a "remote" value, we only call this routine
  * if bmap_one_block() says there is only one block (ie: no remote blks).
+ *
+ * This routine is meant to function as a delayed operation, and may return
+ * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
+ * to handle this, and re-call the function until it returns something other
+ * than -EAGAIN.
  */
 STATIC int
 xfs_attr_leaf_addname(
-	struct xfs_da_args	*args)
+	struct xfs_delattr_context	*dac)
 {
-	int			error, forkoff;
-	struct xfs_buf		*bp = NULL;
-	struct xfs_inode	*dp = args->dp;
-
-	trace_xfs_attr_leaf_addname(args);
-
-	error = xfs_attr_leaf_try_add(args, bp);
-	if (error)
-		return error;
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_buf			*bp = NULL;
+	int				error, forkoff;
+	struct xfs_inode		*dp = args->dp;
 
-	/*
-	 * Commit the transaction that added the attr name so that
-	 * later routines can manage their own transactions.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
-	if (error)
-		return error;
+	/* State machine switch */
+	switch (dac->dela_state) {
+	case XFS_DAS_FLIP_LFLAG:
+		goto das_flip_flag;
+	case XFS_DAS_RM_LBLK:
+		goto das_rm_lblk;
+	default:
+		break;
+	}
 
 	/*
 	 * If there was an out-of-line value, allocate the blocks we
@@ -757,12 +820,34 @@ xfs_attr_leaf_addname(
 	 * after we create the attribute so that we don't overflow the
 	 * maximum size of a transaction and/or hit a deadlock.
 	 */
-	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args);
+
+	/* Open coded xfs_attr_rmtval_set without trans handling */
+	if ((dac->flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
+		dac->flags |= XFS_DAC_LEAF_ADDNAME_INIT;
+		if (args->rmtblkno > 0) {
+			error = xfs_attr_rmtval_find_space(dac);
+			if (error)
+				return error;
+		}
+	}
+
+	/*
+	 * Roll through the "value", allocating blocks on disk as
+	 * required.
+	 */
+	if (dac->blkcnt > 0) {
+		error = xfs_attr_rmtval_set_blk(dac);
 		if (error)
 			return error;
+
+		dac->flags |= XFS_DAC_DEFER_FINISH;
+		return -EAGAIN;
 	}
 
+	error = xfs_attr_rmtval_set_value(args);
+	if (error)
+		return error;
+
 	if (!(args->op_flags & XFS_DA_OP_RENAME)) {
 		/*
 		 * Added a "remote" value, just clear the incomplete flag.
@@ -782,29 +867,30 @@ xfs_attr_leaf_addname(
 	 * In a separate transaction, set the incomplete flag on the "old" attr
 	 * and clear the incomplete flag on the "new" attr.
 	 */
-
 	error = xfs_attr3_leaf_flipflags(args);
 	if (error)
 		return error;
 	/*
 	 * Commit the flag value change and start the next trans in series.
 	 */
-	error = xfs_trans_roll_inode(&args->trans, args->dp);
-	if (error)
-		return error;
-
+	dac->dela_state = XFS_DAS_FLIP_LFLAG;
+	return -EAGAIN;
+das_flip_flag:
 	/*
 	 * Dismantle the "old" attribute/value pair by removing a "remote" value
 	 * (if it exists).
 	 */
 	xfs_attr_restore_rmt_blk(args);
 
-	if (args->rmtblkno) {
-		error = xfs_attr_rmtval_invalidate(args);
-		if (error)
-			return error;
+	error = xfs_attr_rmtval_invalidate(args);
+	if (error)
+		return error;
 
-		error = xfs_attr_rmtval_remove(args);
+	/* Set state in case __xfs_attr_rmtval_remove returns -EAGAIN */
+	dac->dela_state = XFS_DAS_RM_LBLK;
+das_rm_lblk:
+	if (args->rmtblkno) {
+		error = __xfs_attr_rmtval_remove(dac);
 		if (error)
 			return error;
 	}
@@ -970,23 +1056,38 @@ xfs_attr_node_hasname(
  *
  * "Remote" attribute values confuse the issue and atomic rename operations
  * add a whole extra layer of confusion on top of that.
+ *
+ * This routine is meant to function as a delayed operation, and may return
+ * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
+ * to handle this, and re-call the function until it returns something other
+ * than -EAGAIN.
  */
 STATIC int
 xfs_attr_node_addname(
-	struct xfs_da_args	*args)
+	struct xfs_delattr_context	*dac)
 {
-	struct xfs_da_state	*state;
-	struct xfs_da_state_blk	*blk;
-	struct xfs_inode	*dp;
-	int			retval, error;
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_state		*state = NULL;
+	struct xfs_da_state_blk		*blk;
+	int				retval = 0;
+	int				error = 0;
 
 	trace_xfs_attr_node_addname(args);
 
-	/*
-	 * Fill in bucket of arguments/results/context to carry around.
-	 */
-	dp = args->dp;
-restart:
+	/* State machine switch */
+	switch (dac->dela_state) {
+	case XFS_DAS_FLIP_NFLAG:
+		goto das_flip_flag;
+	case XFS_DAS_FOUND_NBLK:
+		goto das_found_nblk;
+	case XFS_DAS_ALLOC_NODE:
+		goto das_alloc_node;
+	case XFS_DAS_RM_NBLK:
+		goto das_rm_nblk;
+	default:
+		break;
+	}
+
 	/*
 	 * Search to see if name already exists, and get back a pointer
 	 * to where it should go.
@@ -1032,19 +1133,16 @@ xfs_attr_node_addname(
 			error = xfs_attr3_leaf_to_node(args);
 			if (error)
 				goto out;
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				goto out;
 
 			/*
-			 * Commit the node conversion and start the next
-			 * trans in the chain.
+			 * Now that we have converted the leaf to a node, we can
+			 * roll the transaction, and try xfs_attr3_leaf_add
+			 * again on re-entry.  No need to set dela_state to do
+			 * this. dela_state is still unset by this function at
+			 * this point.
 			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
-			if (error)
-				goto out;
-
-			goto restart;
+			dac->flags |= XFS_DAC_DEFER_FINISH;
+			return -EAGAIN;
 		}
 
 		/*
@@ -1056,9 +1154,7 @@ xfs_attr_node_addname(
 		error = xfs_da3_split(state);
 		if (error)
 			goto out;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			goto out;
+		dac->flags |= XFS_DAC_DEFER_FINISH;
 	} else {
 		/*
 		 * Addition succeeded, update Btree hashvals.
@@ -1066,6 +1162,11 @@ xfs_attr_node_addname(
 		xfs_da3_fixhashpath(state, &state->path);
 	}
 
+	if (!args->rmtblkno && !(args->op_flags & XFS_DA_OP_RENAME)) {
+		retval = error;
+		goto out;
+	}
+
 	/*
 	 * Kill the state structure, we're done with it and need to
 	 * allow the buffers to come back later.
@@ -1073,13 +1174,9 @@ xfs_attr_node_addname(
 	xfs_da_state_free(state);
 	state = NULL;
 
-	/*
-	 * Commit the leaf addition or btree split and start the next
-	 * trans in the chain.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
-	if (error)
-		goto out;
+	dac->dela_state = XFS_DAS_FOUND_NBLK;
+	return -EAGAIN;
+das_found_nblk:
 
 	/*
 	 * If there was an out-of-line value, allocate the blocks we
@@ -1088,7 +1185,27 @@ xfs_attr_node_addname(
 	 * maximum size of a transaction and/or hit a deadlock.
 	 */
 	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args);
+		/* Open coded xfs_attr_rmtval_set without trans handling */
+		error = xfs_attr_rmtval_find_space(dac);
+		if (error)
+			return error;
+
+		/*
+		 * Roll through the "value", allocating blocks on disk as
+		 * required.  Set the state in case of -EAGAIN return code.
+		 */
+		dac->dela_state = XFS_DAS_ALLOC_NODE;
+das_alloc_node:
+		if (dac->blkcnt > 0) {
+			error = xfs_attr_rmtval_set_blk(dac);
+			if (error)
+				return error;
+
+			dac->flags |= XFS_DAC_DEFER_FINISH;
+			return -EAGAIN;
+		}
+
+		error = xfs_attr_rmtval_set_value(args);
 		if (error)
 			return error;
 	}
@@ -1118,22 +1235,24 @@ xfs_attr_node_addname(
 	/*
 	 * Commit the flag value change and start the next trans in series
 	 */
-	error = xfs_trans_roll_inode(&args->trans, args->dp);
-	if (error)
-		goto out;
-
+	dac->dela_state = XFS_DAS_FLIP_NFLAG;
+	return -EAGAIN;
+das_flip_flag:
 	/*
 	 * Dismantle the "old" attribute/value pair by removing a "remote" value
 	 * (if it exists).
 	 */
 	xfs_attr_restore_rmt_blk(args);
 
-	if (args->rmtblkno) {
-		error = xfs_attr_rmtval_invalidate(args);
-		if (error)
-			return error;
+	error = xfs_attr_rmtval_invalidate(args);
+	if (error)
+		return error;
 
-		error = xfs_attr_rmtval_remove(args);
+	/* Set state in case __xfs_attr_rmtval_remove returns -EAGAIN */
+	dac->dela_state = XFS_DAS_RM_NBLK;
+das_rm_nblk:
+	if (args->rmtblkno) {
+		error = __xfs_attr_rmtval_remove(dac);
 		if (error)
 			return error;
 	}
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 3154ef4..e101238 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -135,6 +135,227 @@ struct xfs_attr_list_context {
  *              v
  *            done
  *
+ *
+ * Below is a state machine diagram for attr set operations.
+ *
+ * It seems the challenge with understanding this system comes from trying to
+ * absorb the state machine all at once, when really one should only be looking
+ * at it within the context of a single function.  Once a state sensitive
+ * function is called, the idea is that it "takes ownership" of the state
+ * machine.  It isn't concerned with the states that may have belonged to its
+ * calling parent, only with the states relevant to itself or to any of its
+ * subroutines.  Once a calling function hands off the state machine to a
+ * subroutine, it needs to respect the simple rule that it doesn't "own" the
+ * state machine anymore, and that it is responsible for propagating the
+ * -EAGAIN back up the call stack.  Upon reentry, it is committed to re-calling
+ * that subroutine until it returns something other than -EAGAIN.  Once that
+ * subroutine signals completion (by returning anything other than -EAGAIN),
+ * the calling function can resume using the state machine.
+ *
+ *  xfs_attr_set_iter()
+ *              │
+ *              v
+ *   ┌─y─ has an attr fork?
+ *   │          │
+ *   │          n
+ *   │          │
+ *   │          v
+ *   │       add a fork
+ *   │          │
+ *   └──────────┤
+ *              │
+ *              v
+ *   ┌─n── is shortform?
+ *   │          │
+ *   │          y
+ *   │          │
+ *   │          v
+ *   │ xfs_attr_set_shortform
+ *   │          │
+ *   │          v
+ *   │      had enough ──y──> done
+ *   │        space?
+ *   │          │
+ *   │          n
+ *   │          │
+ *   │          v
+ *   │     return -EAGAIN
+ *   │   re-enter in leaf form
+ *   │          │
+ *   └──────────┤
+ *              │
+ *              v
+ *       release leaf buffer
+ *          if needed
+ *              │
+ *              v
+ *   ┌───n── fork has
+ *   │      only 1 blk?
+ *   │          │
+ *   │          y
+ *   │          │
+ *   │          v
+ *   │ xfs_attr_leaf_try_add()
+ *   │                  │
+ *   │                  v
+ *   │              had enough
+ *   │       ┌────n── space?
+ *   │       │          │
+ *   │       v          │
+ *   │ return -EAGAIN   │
+ *   │  re-enter in     y
+ *   │   node form      │
+ *   │       │          │
+ *   ├───────┘          │
+ *   │                  v
+ *   │  XFS_DAS_FOUND_LBLK ──┐
+ *   │                       │
+ *   │  XFS_DAS_FLIP_LFLAG ──┤
+ *   │  (subroutine state)   │
+ *   │                       │
+ *   │                       └─>xfs_attr_leaf_addname()
+ *   │                                │
+ *   │                                v
+ *   │                     ┌──first time through?
+ *   │                     │          │
+ *   │                     │          y
+ *   │                     │          │
+ *   │                     n          v
+ *   │                     │    if we have rmt blks
+ *   │                     │    find space for them
+ *   │                     │          │
+ *   │                     └──────────┤
+ *   │                                │
+ *   │                                v
+ *   │                           still have
+ *   │                     ┌─n─ blks to alloc? <──┐
+ *   │                     │          │           │
+ *   │                     │          y           │
+ *   │                     │          │           │
+ *   │                     │          v           │
+ *   │                     │     alloc one blk    │
+ *   │                     │     return -EAGAIN ──┘
+ *   │                     │    re-enter with one
+ *   │                     │    less blk to alloc
+ *   │                     │
+ *   │                     │
+ *   │                     └───> set the rmt
+ *   │                              value
+ *   │                                │
+ *   │                                v
+ *   │                              was this
+ *   │                             a rename? ──n─┐
+ *   │                                │          │
+ *   │                                y          │
+ *   │                                │          │
+ *   │                                v          │
+ *   │                          flip incomplete  │
+ *   │                              flag         │
+ *   │                                │          │
+ *   │                                v          │
+ *   │                        XFS_DAS_FLIP_LFLAG │
+ *   │                                │          │
+ *   │                                v          │
+ *   │                              remove       │
+ *   │          XFS_DAS_RM_LBLK ─> old name      │
+ *   │                   ^            │          │
+ *   │                   │            v          │
+ *   │                   └──────y── more to      │
+ *   │                              remove       │
+ *   │                                │          │
+ *   │                                n          │
+ *   │                                │          │
+ *   │                                v          │
+ *   │                               done <──────┘
+ *   └──> XFS_DAS_FOUND_NBLK ──┐
+ *        (subroutine state)   │
+ *                             │
+ *        XFS_DAS_ALLOC_NODE ──┤
+ *        (subroutine state)   │
+ *                             │
+ *        XFS_DAS_FLIP_NFLAG ──┤
+ *        (subroutine state)   │
+ *                             │
+ *                             └─>xfs_attr_node_addname()
+ *                                     │
+ *                                     v
+ *                               determine if this
+ *                              is create or rename
+ *                            find space to store attr
+ *                                     │
+ *                                     v
+ *               ┌──────n──── fits in a node leaf?
+ *               │               ^     │
+ *       single leaf node?       │     │
+ *         │            │        │     y
+ *         n            y        │     │
+ *         │            │        │     v
+ *         v            v        │   update
+ *     split if   grow the leaf ─┘  hashvals
+ *      needed     return -EAGAIN      │
+ *         │      retry leaf add       │
+ *         │        on reentry         │
+ *         │                           │
+ *         └───────────────────────────┤
+ *                                     v
+ *                                need to alloc ──n──> done
+ *                                or flip flag?
+ *                                     │
+ *                                     y
+ *                                     │
+ *                                     v
+ *                             XFS_DAS_FOUND_NBLK
+ *                                     │
+ *                                     v
+ *                       ┌─────n──  need to
+ *                       │        alloc blks?
+ *                       │             │
+ *                       │             y
+ *                       │             │
+ *                       │             v
+ *                       │        find space
+ *                       │             │
+ *                       │             v
+ *                       │  ┌─>XFS_DAS_ALLOC_NODE
+ *                       │  │          │
+ *                       │  │          v
+ *                       │  │      alloc blk
+ *                       │  │          │
+ *                       │  │          v
+ *                       │  └──y── need to alloc
+ *                       │         more blocks?
+ *                       │             │
+ *                       │             n
+ *                       │             │
+ *                       │             v
+ *                       │      set the rmt value
+ *                       │             │
+ *                       │             v
+ *                       │          was this
+ *                       └────────> a rename? ──n─┐
+ *                                     │          │
+ *                                     y          │
+ *                                     │          │
+ *                                     v          │
+ *                               flip incomplete  │
+ *                                   flag         │
+ *                                     │          │
+ *                                     v          │
+ *                             XFS_DAS_FLIP_NFLAG │
+ *                                     │          │
+ *                                     v          │
+ *                                   remove       │
+ *               XFS_DAS_RM_NBLK ─> old name      │
+ *                        ^            │          │
+ *                        │            v          │
+ *                        └──────y── more to      │
+ *                                   remove       │
+ *                                     │          │
+ *                                     n          │
+ *                                     │          │
+ *                                     v          │
+ *                                    done <──────┘
+ *
  */
 
 /*
@@ -149,12 +370,20 @@ struct xfs_attr_list_context {
 enum xfs_delattr_state {
 	XFS_DAS_UNINIT		= 0,  /* No state has been set yet */
 	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
+	XFS_DAS_FOUND_LBLK,	      /* We found leaf blk for attr */
+	XFS_DAS_FOUND_NBLK,	      /* We found node blk for attr */
+	XFS_DAS_FLIP_LFLAG,	      /* Flipped leaf INCOMPLETE attr flag */
+	XFS_DAS_RM_LBLK,	      /* A rename is removing leaf blocks */
+	XFS_DAS_ALLOC_NODE,	      /* We are allocating node blocks */
+	XFS_DAS_FLIP_NFLAG,	      /* Flipped node INCOMPLETE attr flag */
+	XFS_DAS_RM_NBLK,	      /* A rename is removing node blocks */
 };
 
 /*
  * Defines for xfs_delattr_context.flags
  */
 #define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
+#define XFS_DAC_LEAF_ADDNAME_INIT	0x02 /* xfs_attr_leaf_addname init*/
 
 /*
  * Context used for keeping track of delayed attribute operations
@@ -162,6 +391,11 @@ enum xfs_delattr_state {
 struct xfs_delattr_context {
 	struct xfs_da_args      *da_args;
 
+	/* Used in xfs_attr_rmtval_set_blk to roll through allocating blocks */
+	struct xfs_bmbt_irec	map;
+	xfs_dablk_t		lblkno;
+	int			blkcnt;
+
 	/* Used in xfs_attr_node_removename to roll through removing blocks */
 	struct xfs_da_state     *da_state;
 
@@ -188,7 +422,6 @@ int xfs_attr_set_args(struct xfs_da_args *args);
 int xfs_has_attr(struct xfs_da_args *args);
 int xfs_attr_remove_args(struct xfs_da_args *args);
 int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
-int xfs_attr_trans_roll(struct xfs_delattr_context *dac);
 bool xfs_attr_namecheck(const void *name, size_t length);
 void xfs_delattr_context_init(struct xfs_delattr_context *dac,
 			      struct xfs_da_args *args);
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index f09820c..6af86bf 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -441,7 +441,7 @@ xfs_attr_rmtval_get(
  * Find a "hole" in the attribute address space large enough for us to drop the
  * new attribute's value into
  */
-STATIC int
+int
 xfs_attr_rmt_find_hole(
 	struct xfs_da_args	*args)
 {
@@ -468,7 +468,7 @@ xfs_attr_rmt_find_hole(
 	return 0;
 }
 
-STATIC int
+int
 xfs_attr_rmtval_set_value(
 	struct xfs_da_args	*args)
 {
@@ -628,6 +628,69 @@ xfs_attr_rmtval_set(
 }
 
 /*
+ * Find a hole for the attr and store it in the delayed attr context.  This
+ * initializes the context to roll through allocating an attr extent for a
+ * delayed attr operation
+ */
+int
+xfs_attr_rmtval_find_space(
+	struct xfs_delattr_context	*dac)
+{
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_bmbt_irec		*map = &dac->map;
+	int				error;
+
+	dac->lblkno = 0;
+	dac->blkcnt = 0;
+	args->rmtblkcnt = 0;
+	args->rmtblkno = 0;
+	memset(map, 0, sizeof(struct xfs_bmbt_irec));
+
+	error = xfs_attr_rmt_find_hole(args);
+	if (error)
+		return error;
+
+	dac->blkcnt = args->rmtblkcnt;
+	dac->lblkno = args->rmtblkno;
+
+	return 0;
+}
+
+/*
+ * Write one block of the value associated with an attribute into the
+ * out-of-line buffer that we have defined for it. This is similar to a subset
+ * of xfs_attr_rmtval_set, but records the current block to the delayed attr
+ * context, and leaves transaction handling to the caller.
+ */
+int
+xfs_attr_rmtval_set_blk(
+	struct xfs_delattr_context	*dac)
+{
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_inode		*dp = args->dp;
+	struct xfs_bmbt_irec		*map = &dac->map;
+	int nmap;
+	int error;
+
+	nmap = 1;
+	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)dac->lblkno,
+				dac->blkcnt, XFS_BMAPI_ATTRFORK, args->total,
+				map, &nmap);
+	if (error)
+		return error;
+
+	ASSERT(nmap == 1);
+	ASSERT((map->br_startblock != DELAYSTARTBLOCK) &&
+	       (map->br_startblock != HOLESTARTBLOCK));
+
+	/* roll attribute extent map forwards */
+	dac->lblkno += map->br_blockcount;
+	dac->blkcnt -= map->br_blockcount;
+
+	return 0;
+}
+
+/*
  * Remove the value associated with an attribute by deleting the
  * out-of-line buffer that it is stored on.
  */
@@ -669,37 +732,6 @@ xfs_attr_rmtval_invalidate(
 }
 
 /*
- * Remove the value associated with an attribute by deleting the
- * out-of-line buffer that it is stored on.
- */
-int
-xfs_attr_rmtval_remove(
-	struct xfs_da_args		*args)
-{
-	int				error;
-	struct xfs_delattr_context	dac  = {
-		.da_args	= args,
-	};
-
-	trace_xfs_attr_rmtval_remove(args);
-
-	/*
-	 * Keep de-allocating extents until the remote-value region is gone.
-	 */
-	do {
-		error = __xfs_attr_rmtval_remove(&dac);
-		if (error != -EAGAIN)
-			break;
-
-		error = xfs_attr_trans_roll(&dac);
-		if (error)
-			return error;
-	} while (true);
-
-	return error;
-}
-
-/*
  * Remove the value associated with an attribute by deleting the out-of-line
  * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the
  * transaction and re-call the function
diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
index 002fd30..8ad68d5 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.h
+++ b/fs/xfs/libxfs/xfs_attr_remote.h
@@ -10,9 +10,12 @@ int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
 
 int xfs_attr_rmtval_get(struct xfs_da_args *args);
 int xfs_attr_rmtval_set(struct xfs_da_args *args);
-int xfs_attr_rmtval_remove(struct xfs_da_args *args);
 int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
 		xfs_buf_flags_t incore_flags);
 int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
 int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
+int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
+int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
+int xfs_attr_rmtval_set_blk(struct xfs_delattr_context *dac);
+int xfs_attr_rmtval_find_space(struct xfs_delattr_context *dac);
 #endif /* __XFS_ATTR_REMOTE_H__ */
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 5a263ae..9074b8b 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1943,7 +1943,6 @@ DEFINE_ATTR_EVENT(xfs_attr_refillstate);
 
 DEFINE_ATTR_EVENT(xfs_attr_rmtval_get);
 DEFINE_ATTR_EVENT(xfs_attr_rmtval_set);
-DEFINE_ATTR_EVENT(xfs_attr_rmtval_remove);
 
 #define DEFINE_DA_EVENT(name) \
 DEFINE_EVENT(xfs_da_class, name, \
-- 
2.7.4



* [PATCH v14 06/15] xfs: Add state machine tracepoints
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (4 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 05/15] xfs: Add delay ready attr set routines Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2021-01-05  4:50   ` Chandan Babu R
  2021-01-05  5:28   ` Darrick J. Wong
  2020-12-18  7:29 ` [PATCH v14 07/15] xfs: Rename __xfs_attr_rmtval_remove Allison Henderson
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This is a quick patch to add a new tracepoint: xfs_das_state_return.  We
use this to track whenever a new state is set or -EAGAIN is returned.
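
For readers who want to see the pattern in isolation, here is a minimal
userspace sketch of where the tracepoint fires.  It is only a model: the
names (das_state, dac, attr_set_iter, attr_set_args and
trace_das_state_return) are hypothetical stand-ins, not the kernel
functions, and the "transaction roll" is reduced to a counter.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-ins for the kernel types in this series. */
enum das_state {
	DAS_UNINIT = 0,		/* no state has been set yet */
	DAS_FOUND_LBLK,		/* found the leaf block for the attr */
	DAS_FLIP_LFLAG,		/* flipped the INCOMPLETE flag */
};

struct dac {
	enum das_state	dela_state;
};

static int trace_events;	/* counts emitted trace events */

/* Stand-in for trace_xfs_das_state_return(): log the state being returned. */
static void trace_das_state_return(int das)
{
	trace_events++;
	printf("state change %d\n", das);
}

/*
 * One step of the operation.  A step that needs a fresh transaction records
 * where to resume, emits the tracepoint, and returns -EAGAIN.
 */
static int attr_set_iter(struct dac *dac)
{
	switch (dac->dela_state) {
	case DAS_UNINIT:
		dac->dela_state = DAS_FOUND_LBLK;
		trace_das_state_return(dac->dela_state);
		return -EAGAIN;
	case DAS_FOUND_LBLK:
		dac->dela_state = DAS_FLIP_LFLAG;
		trace_das_state_return(dac->dela_state);
		return -EAGAIN;
	case DAS_FLIP_LFLAG:
		return 0;	/* nothing left to do */
	}
	return -EINVAL;
}

/* The caller owns the transaction: it rolls on -EAGAIN and calls back in. */
static int attr_set_args(struct dac *dac, int *rolls)
{
	int error;

	*rolls = 0;
	while ((error = attr_set_iter(dac)) == -EAGAIN)
		(*rolls)++;	/* a real caller would roll the transaction here */
	return error;
}
```

In this model every -EAGAIN return is paired with exactly one trace
event, so the trace output alone is enough to reconstruct the sequence
of states an operation walked through.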

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 22 +++++++++++++++++++++-
 fs/xfs/libxfs/xfs_attr_remote.c |  1 +
 fs/xfs/xfs_trace.h              | 20 ++++++++++++++++++++
 3 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index cd72512..8ed00bc 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -263,6 +263,7 @@ xfs_attr_set_shortform(
 	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
 	 * fork to leaf format and will restart with the leaf add.
 	 */
+	trace_xfs_das_state_return(XFS_DAS_UNINIT);
 	return -EAGAIN;
 }
 
@@ -409,9 +410,11 @@ xfs_attr_set_iter(
 		 * down into the node handling code below
 		 */
 		dac->flags |= XFS_DAC_DEFER_FINISH;
+		trace_xfs_das_state_return(dac->dela_state);
 		return -EAGAIN;
 	case 0:
 		dac->dela_state = XFS_DAS_FOUND_LBLK;
+		trace_xfs_das_state_return(dac->dela_state);
 		return -EAGAIN;
 	}
 	return error;
@@ -841,6 +844,7 @@ xfs_attr_leaf_addname(
 			return error;
 
 		dac->flags |= XFS_DAC_DEFER_FINISH;
+		trace_xfs_das_state_return(dac->dela_state);
 		return -EAGAIN;
 	}
 
@@ -874,6 +878,7 @@ xfs_attr_leaf_addname(
 	 * Commit the flag value change and start the next trans in series.
 	 */
 	dac->dela_state = XFS_DAS_FLIP_LFLAG;
+	trace_xfs_das_state_return(dac->dela_state);
 	return -EAGAIN;
 das_flip_flag:
 	/*
@@ -891,6 +896,8 @@ xfs_attr_leaf_addname(
 das_rm_lblk:
 	if (args->rmtblkno) {
 		error = __xfs_attr_rmtval_remove(dac);
+		if (error == -EAGAIN)
+			trace_xfs_das_state_return(dac->dela_state);
 		if (error)
 			return error;
 	}
@@ -1142,6 +1149,7 @@ xfs_attr_node_addname(
 			 * this point.
 			 */
 			dac->flags |= XFS_DAC_DEFER_FINISH;
+			trace_xfs_das_state_return(dac->dela_state);
 			return -EAGAIN;
 		}
 
@@ -1175,6 +1183,7 @@ xfs_attr_node_addname(
 	state = NULL;
 
 	dac->dela_state = XFS_DAS_FOUND_NBLK;
+	trace_xfs_das_state_return(dac->dela_state);
 	return -EAGAIN;
 das_found_nblk:
 
@@ -1202,6 +1211,7 @@ xfs_attr_node_addname(
 				return error;
 
 			dac->flags |= XFS_DAC_DEFER_FINISH;
+			trace_xfs_das_state_return(dac->dela_state);
 			return -EAGAIN;
 		}
 
@@ -1236,6 +1246,7 @@ xfs_attr_node_addname(
 	 * Commit the flag value change and start the next trans in series
 	 */
 	dac->dela_state = XFS_DAS_FLIP_NFLAG;
+	trace_xfs_das_state_return(dac->dela_state);
 	return -EAGAIN;
 das_flip_flag:
 	/*
@@ -1253,6 +1264,10 @@ xfs_attr_node_addname(
 das_rm_nblk:
 	if (args->rmtblkno) {
 		error = __xfs_attr_rmtval_remove(dac);
+
+		if (error == -EAGAIN)
+			trace_xfs_das_state_return(dac->dela_state);
+
 		if (error)
 			return error;
 	}
@@ -1396,6 +1411,8 @@ xfs_attr_node_remove_rmt (
 	 * May return -EAGAIN to request that the caller recall this function
 	 */
 	error = __xfs_attr_rmtval_remove(dac);
+	if (error == -EAGAIN)
+		trace_xfs_das_state_return(dac->dela_state);
 	if (error)
 		return error;
 
@@ -1514,6 +1531,7 @@ xfs_attr_node_removename_iter(
 
 			dac->flags |= XFS_DAC_DEFER_FINISH;
 			dac->dela_state = XFS_DAS_RM_SHRINK;
+			trace_xfs_das_state_return(dac->dela_state);
 			return -EAGAIN;
 		}
 
@@ -1532,8 +1550,10 @@ xfs_attr_node_removename_iter(
 		goto out;
 	}
 
-	if (error == -EAGAIN)
+	if (error == -EAGAIN) {
+		trace_xfs_das_state_return(dac->dela_state);
 		return error;
+	}
 out:
 	if (state)
 		xfs_da_state_free(state);
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 6af86bf..4840de9 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -763,6 +763,7 @@ __xfs_attr_rmtval_remove(
 	 */
 	if (!done) {
 		dac->flags |= XFS_DAC_DEFER_FINISH;
+		trace_xfs_das_state_return(dac->dela_state);
 		return -EAGAIN;
 	}
 
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 9074b8b..4f6939b4 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -3887,6 +3887,26 @@ DEFINE_EVENT(xfs_timestamp_range_class, name, \
 DEFINE_TIMESTAMP_RANGE_EVENT(xfs_inode_timestamp_range);
 DEFINE_TIMESTAMP_RANGE_EVENT(xfs_quota_expiry_range);
 
+
+DECLARE_EVENT_CLASS(xfs_das_state_class,
+	TP_PROTO(int das),
+	TP_ARGS(das),
+	TP_STRUCT__entry(
+		__field(int, das)
+	),
+	TP_fast_assign(
+		__entry->das = das;
+	),
+	TP_printk("state change %d",
+		  __entry->das)
+)
+
+#define DEFINE_DAS_STATE_EVENT(name) \
+DEFINE_EVENT(xfs_das_state_class, name, \
+	TP_PROTO(int das), \
+	TP_ARGS(das))
+DEFINE_DAS_STATE_EVENT(xfs_das_state_return);
+
 #endif /* _TRACE_XFS_H */
 
 #undef TRACE_INCLUDE_PATH
-- 
2.7.4



* [PATCH v14 07/15] xfs: Rename __xfs_attr_rmtval_remove
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (5 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 06/15] xfs: Add state machine tracepoints Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans Allison Henderson
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

Now that xfs_attr_rmtval_remove is gone, rename __xfs_attr_rmtval_remove
to xfs_attr_rmtval_remove.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 6 +++---
 fs/xfs/libxfs/xfs_attr_remote.c | 2 +-
 fs/xfs/libxfs/xfs_attr_remote.h | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 8ed00bc..47261a3 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -895,7 +895,7 @@ xfs_attr_leaf_addname(
 	dac->dela_state = XFS_DAS_RM_LBLK;
 das_rm_lblk:
 	if (args->rmtblkno) {
-		error = __xfs_attr_rmtval_remove(dac);
+		error = xfs_attr_rmtval_remove(dac);
 		if (error == -EAGAIN)
 			trace_xfs_das_state_return(dac->dela_state);
 		if (error)
@@ -1263,7 +1263,7 @@ xfs_attr_node_addname(
 	dac->dela_state = XFS_DAS_RM_NBLK;
 das_rm_nblk:
 	if (args->rmtblkno) {
-		error = __xfs_attr_rmtval_remove(dac);
+		error = xfs_attr_rmtval_remove(dac);
 
 		if (error == -EAGAIN)
 			trace_xfs_das_state_return(dac->dela_state);
@@ -1410,7 +1410,7 @@ xfs_attr_node_remove_rmt (
 	/*
 	 * May return -EAGAIN to request that the caller recall this function
 	 */
-	error = __xfs_attr_rmtval_remove(dac);
+	error = xfs_attr_rmtval_remove(dac);
 	if (error == -EAGAIN)
 		trace_xfs_das_state_return(dac->dela_state);
 	if (error)
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4840de9..25639c0 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -737,7 +737,7 @@ xfs_attr_rmtval_invalidate(
  * transaction and re-call the function
  */
 int
-__xfs_attr_rmtval_remove(
+xfs_attr_rmtval_remove(
 	struct xfs_delattr_context	*dac)
 {
 	struct xfs_da_args		*args = dac->da_args;
diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
index 8ad68d5..6ae91af 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.h
+++ b/fs/xfs/libxfs/xfs_attr_remote.h
@@ -13,7 +13,7 @@ int xfs_attr_rmtval_set(struct xfs_da_args *args);
 int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
 		xfs_buf_flags_t incore_flags);
 int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
-int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
+int xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
 int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
 int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
 int xfs_attr_rmtval_set_blk(struct xfs_delattr_context *dac);
-- 
2.7.4



* [PATCH v14 08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (6 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 07/15] xfs: Rename __xfs_attr_rmtval_remove Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2021-01-05  5:38   ` Darrick J. Wong
  2020-12-18  7:29 ` [PATCH v14 09/15] xfs: Set up infastructure for deferred attribute operations Allison Henderson
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

Because xattrs can be over a page in size, drop __GFP_NOFAIL from the
krealloc call in xlog_recover_add_to_cont_trans and handle a possible
allocation failure by returning -ENOMEM, to avoid warnings.
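
The error handling follows the usual "grow, check, then commit" shape.
Below is a minimal userspace sketch of that shape using realloc() in
place of krealloc().  The struct region type and the region_append()
helper are hypothetical and only stand in for an ri_buf[] entry and the
append logic of the recovery code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one recovered log region (an ri_buf[] entry). */
struct region {
	char	*addr;
	size_t	len;
};

/*
 * Append a continuation fragment to the region.  If the allocation fails,
 * the original buffer is left untouched and -ENOMEM is returned, so the
 * caller can unwind instead of copying through a NULL pointer.
 */
static int region_append(struct region *r, const void *frag, size_t len)
{
	char *ptr;

	ptr = realloc(r->addr, r->len + len);
	if (ptr == NULL)
		return -ENOMEM;

	/* Only update the region once the larger buffer is known to exist. */
	memcpy(ptr + r->len, frag, len);
	r->addr = ptr;
	r->len += len;
	return 0;
}
```

Because the region is only updated after the allocation succeeds, a
failed call leaves the old pointer and length valid for whatever cleanup
the caller performs.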

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/xfs_log_recover.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 97f3130..295a5c6 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2061,7 +2061,10 @@ xlog_recover_add_to_cont_trans(
 	old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
 	old_len = item->ri_buf[item->ri_cnt-1].i_len;
 
-	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
+	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL);
+	if (ptr == NULL)
+		return -ENOMEM;
+
 	memcpy(&ptr[old_len], dp, len);
 	item->ri_buf[item->ri_cnt-1].i_len += len;
 	item->ri_buf[item->ri_cnt-1].i_addr = ptr;
-- 
2.7.4



* [PATCH v14 09/15] xfs: Set up infastructure for deferred attribute operations
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (7 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 10/15] xfs: Skip flip flags for delayed attrs Allison Henderson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

Currently attributes are modified directly across one or more
transactions, but they are not logged or replayed in the event of an
error. The goal of delayed attributes is to enable logging and replaying
of attribute operations using the existing delayed operations
infrastructure.  This will later enable the attributes to become part of
larger multi-part operations that also must first be recorded to the
log.  This is mostly of interest for parent pointers, which need to
maintain an attribute containing parent inode information any time an
inode is moved, created, or removed.  Parent pointers would then be of
interest to any feature that needs to quickly derive an inode path from
the mount point. Online scrub, NFS lookups and fs grow or shrink
operations are all features that could take advantage of this.

This patch adds two new log item types for setting or removing
attributes as deferred operations.  The xfs_attri_log_item logs an
intent to set or remove an attribute.  The corresponding
xfs_attrd_log_item holds a reference to the xfs_attri_log_item and is
freed once the transaction is done.  The intent is laid out in the log
by an xfs_attri_log_format structure that contains the inode, the attr
flags, the name and value lengths, and an op_flags field that indicates
if the operation is a set or remove; the name and value themselves are
logged as separate regions.  The done item uses the smaller
xfs_attrd_log_format, which only carries the id of the matching intent.
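
As a quick sanity check of the layout, the two log formats added to
xfs_log_format.h by this patch can be copied into a userspace file and
measured.  This is a sketch under assumptions: the sizes (40 and 16
bytes) are computed here from the field lists, not quoted from the
patch, and attr_op_type() is a hypothetical helper that only illustrates
how the type mask separates the op code from any upper flag bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Op codes and mask as defined by this patch in xfs_log_format.h. */
#define XFS_ATTR_OP_FLAGS_SET		1	/* Set the attribute */
#define XFS_ATTR_OP_FLAGS_REMOVE	2	/* Remove the attribute */
#define XFS_ATTR_OP_FLAGS_TYPE_MASK	0x0FF	/* Flags type mask */

/* Userspace copy of the intent format. */
struct xfs_attri_log_format {
	uint16_t	alfi_type;	/* attri log item type */
	uint16_t	alfi_size;	/* size of this item */
	uint32_t	__pad;		/* pad to 64 bit aligned */
	uint64_t	alfi_id;	/* attri identifier */
	uint64_t	alfi_ino;	/* the inode for this attr operation */
	uint32_t	alfi_op_flags;	/* marks the op as a set or remove */
	uint32_t	alfi_name_len;	/* attr name length */
	uint32_t	alfi_value_len;	/* attr value length */
	uint32_t	alfi_attr_flags;/* attr flags */
};

/* Userspace copy of the done format. */
struct xfs_attrd_log_format {
	uint16_t	alfd_type;	/* attrd log item type */
	uint16_t	alfd_size;	/* size of this item */
	uint32_t	__pad;		/* pad to 64 bit aligned */
	uint64_t	alfd_alf_id;	/* id of corresponding attri */
};

/* Sizes computed from the fields above; there is no implicit padding. */
_Static_assert(sizeof(struct xfs_attri_log_format) == 40, "attri size");
_Static_assert(sizeof(struct xfs_attrd_log_format) == 16, "attrd size");

/* Hypothetical helper: extract the op code from the low byte of op_flags. */
static inline uint32_t attr_op_type(uint32_t op_flags)
{
	return op_flags & XFS_ATTR_OP_FLAGS_TYPE_MASK;
}
```

The explicit __pad field keeps the 64-bit members naturally aligned, so
the structures should have the same size on 32-bit and 64-bit builds.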

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/Makefile                 |   1 +
 fs/xfs/libxfs/xfs_attr.c        |   7 +-
 fs/xfs/libxfs/xfs_attr.h        |  31 ++
 fs/xfs/libxfs/xfs_defer.c       |   1 +
 fs/xfs/libxfs/xfs_defer.h       |   3 +
 fs/xfs/libxfs/xfs_log_format.h  |  44 ++-
 fs/xfs/libxfs/xfs_log_recover.h |   2 +
 fs/xfs/scrub/common.c           |   2 +
 fs/xfs/xfs_acl.c                |   2 +
 fs/xfs/xfs_attr_item.c          | 828 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr_item.h          |  52 +++
 fs/xfs/xfs_attr_list.c          |   1 +
 fs/xfs/xfs_ioctl.c              |   2 +
 fs/xfs/xfs_ioctl32.c            |   2 +
 fs/xfs/xfs_iops.c               |   2 +
 fs/xfs/xfs_log.c                |   4 +
 fs/xfs/xfs_log_recover.c        |   2 +
 fs/xfs/xfs_ondisk.h             |   2 +
 fs/xfs/xfs_xattr.c              |   1 +
 19 files changed, 983 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 04611a1..b056cfc 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -102,6 +102,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_buf_item_recover.o \
 				   xfs_dquot_item_recover.o \
 				   xfs_extfree_item.o \
+				   xfs_attr_item.o \
 				   xfs_icreate_item.o \
 				   xfs_inode_item.o \
 				   xfs_inode_item_recover.o \
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 47261a3..d108866 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -24,6 +24,7 @@
 #include "xfs_quota.h"
 #include "xfs_trans_space.h"
 #include "xfs_trace.h"
+#include "xfs_attr_item.h"
 
 /*
  * xfs_attr.c
@@ -59,8 +60,6 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *args, struct xfs_buf *bp);
-STATIC int xfs_attr_set_iter(struct xfs_delattr_context *dac,
-			     struct xfs_buf **leaf_bp);
 
 int
 xfs_inode_hasattr(
@@ -142,7 +141,7 @@ xfs_attr_get(
 /*
  * Calculate how many blocks we need for the new attribute,
  */
-STATIC int
+int
 xfs_attr_calc_size(
 	struct xfs_da_args	*args,
 	int			*local)
@@ -328,7 +327,7 @@ xfs_attr_set_args(
  * to handle this, and recall the function until a successful error code is
  * returned.
  */
-STATIC int
+int
 xfs_attr_set_iter(
 	struct xfs_delattr_context	*dac,
 	struct xfs_buf			**leaf_bp)
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index e101238..7c7af0a 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -28,6 +28,11 @@ struct xfs_attr_list_context;
  */
 #define	ATTR_MAX_VALUELEN	(64*1024)	/* max length of a value */
 
+static inline bool xfs_hasdelattr(struct xfs_mount *mp)
+{
+	return false;
+}
+
 /*
  * Kernel-internal version of the attrlist cursor.
  */
@@ -384,6 +389,7 @@ enum xfs_delattr_state {
  */
 #define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
 #define XFS_DAC_LEAF_ADDNAME_INIT	0x02 /* xfs_attr_leaf_addname init*/
+#define XFS_DAC_DELAYED_OP_INIT		0x04 /* delayed operations init*/
 
 /*
  * Context used for keeping track of delayed attribute operations
@@ -391,6 +397,11 @@ enum xfs_delattr_state {
 struct xfs_delattr_context {
 	struct xfs_da_args      *da_args;
 
+	/*
+	 * Used by xfs_attr_set to hold a leaf buffer across a transaction roll
+	 */
+	struct xfs_buf		*leaf_bp;
+
 	/* Used in xfs_attr_rmtval_set_blk to roll through allocating blocks */
 	struct xfs_bmbt_irec	map;
 	xfs_dablk_t		lblkno;
@@ -404,6 +415,23 @@ struct xfs_delattr_context {
 	enum xfs_delattr_state  dela_state;
 };
 
+/*
+ * List of attrs to commit later.
+ */
+struct xfs_attr_item {
+	struct xfs_delattr_context	xattri_dac;
+
+	/*
+	 * Indicates if the attr operation is a set or a remove
+	 * XFS_ATTR_OP_FLAGS_{SET,REMOVE}
+	 */
+	uint32_t			xattri_op_flags;
+
+	/* used to log this item to an intent */
+	struct list_head		xattri_list;
+};
+
+
 /*========================================================================
  * Function prototypes for the kernel.
  *========================================================================*/
@@ -419,11 +447,14 @@ int xfs_attr_get_ilocked(struct xfs_da_args *args);
 int xfs_attr_get(struct xfs_da_args *args);
 int xfs_attr_set(struct xfs_da_args *args);
 int xfs_attr_set_args(struct xfs_da_args *args);
+int xfs_attr_set_iter(struct xfs_delattr_context *dac,
+		      struct xfs_buf **leaf_bp);
 int xfs_has_attr(struct xfs_da_args *args);
 int xfs_attr_remove_args(struct xfs_da_args *args);
 int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
 bool xfs_attr_namecheck(const void *name, size_t length);
 void xfs_delattr_context_init(struct xfs_delattr_context *dac,
 			      struct xfs_da_args *args);
+int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index eff4a12..e9caff7 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -178,6 +178,7 @@ static const struct xfs_defer_op_type *defer_op_types[] = {
 	[XFS_DEFER_OPS_TYPE_RMAP]	= &xfs_rmap_update_defer_type,
 	[XFS_DEFER_OPS_TYPE_FREE]	= &xfs_extent_free_defer_type,
 	[XFS_DEFER_OPS_TYPE_AGFL_FREE]	= &xfs_agfl_free_defer_type,
+	[XFS_DEFER_OPS_TYPE_ATTR]	= &xfs_attr_defer_type,
 };
 
 static void
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index 05472f7..72a5789 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -19,6 +19,7 @@ enum xfs_defer_ops_type {
 	XFS_DEFER_OPS_TYPE_RMAP,
 	XFS_DEFER_OPS_TYPE_FREE,
 	XFS_DEFER_OPS_TYPE_AGFL_FREE,
+	XFS_DEFER_OPS_TYPE_ATTR,
 	XFS_DEFER_OPS_TYPE_MAX,
 };
 
@@ -63,6 +64,8 @@ extern const struct xfs_defer_op_type xfs_refcount_update_defer_type;
 extern const struct xfs_defer_op_type xfs_rmap_update_defer_type;
 extern const struct xfs_defer_op_type xfs_extent_free_defer_type;
 extern const struct xfs_defer_op_type xfs_agfl_free_defer_type;
+extern const struct xfs_defer_op_type xfs_attr_defer_type;
+
 
 /*
  * This structure enables a dfops user to detach the chain of deferred
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 8bd00da..19963b6 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -117,7 +117,12 @@ struct xfs_unmount_log_format {
 #define XLOG_REG_TYPE_CUD_FORMAT	24
 #define XLOG_REG_TYPE_BUI_FORMAT	25
 #define XLOG_REG_TYPE_BUD_FORMAT	26
-#define XLOG_REG_TYPE_MAX		26
+#define XLOG_REG_TYPE_ATTRI_FORMAT	27
+#define XLOG_REG_TYPE_ATTRD_FORMAT	28
+#define XLOG_REG_TYPE_ATTR_NAME	29
+#define XLOG_REG_TYPE_ATTR_VALUE	30
+#define XLOG_REG_TYPE_MAX		30
+
 
 /*
  * Flags to log operation header
@@ -240,6 +245,8 @@ typedef struct xfs_trans_header {
 #define	XFS_LI_CUD		0x1243
 #define	XFS_LI_BUI		0x1244	/* bmbt update intent */
 #define	XFS_LI_BUD		0x1245
+#define	XFS_LI_ATTRI		0x1246  /* attr set/remove intent*/
+#define	XFS_LI_ATTRD		0x1247  /* attr set/remove done */
 
 #define XFS_LI_TYPE_DESC \
 	{ XFS_LI_EFI,		"XFS_LI_EFI" }, \
@@ -255,7 +262,9 @@ typedef struct xfs_trans_header {
 	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
 	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
 	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
-	{ XFS_LI_BUD,		"XFS_LI_BUD" }
+	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
+	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
+	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
 
 /*
  * Inode Log Item Format definitions.
@@ -863,4 +872,35 @@ struct xfs_icreate_log {
 	__be32		icl_gen;	/* inode generation number to use */
 };
 
+/*
+ * Flags for deferred attribute operations.
+ * Upper bits are flags, lower byte is type code
+ */
+#define XFS_ATTR_OP_FLAGS_SET		1	/* Set the attribute */
+#define XFS_ATTR_OP_FLAGS_REMOVE	2	/* Remove the attribute */
+#define XFS_ATTR_OP_FLAGS_TYPE_MASK	0x0FF	/* Flags type mask */
+
+/*
+ * This is the structure used to lay out an attr log item in the
+ * log.
+ */
+struct xfs_attri_log_format {
+	uint16_t	alfi_type;	/* attri log item type */
+	uint16_t	alfi_size;	/* size of this item */
+	uint32_t	__pad;		/* pad to 64 bit aligned */
+	uint64_t	alfi_id;	/* attri identifier */
+	uint64_t	alfi_ino;	/* the inode for this attr operation */
+	uint32_t	alfi_op_flags;	/* marks the op as a set or remove */
+	uint32_t	alfi_name_len;	/* attr name length */
+	uint32_t	alfi_value_len;	/* attr value length */
+	uint32_t	alfi_attr_flags;/* attr flags */
+};
+
+struct xfs_attrd_log_format {
+	uint16_t	alfd_type;	/* attrd log item type */
+	uint16_t	alfd_size;	/* size of this item */
+	uint32_t	__pad;		/* pad to 64 bit aligned */
+	uint64_t	alfd_alf_id;	/* id of corresponding attri */
+};
+
 #endif /* __XFS_LOG_FORMAT_H__ */
diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index 3cca2bf..b6e5514 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -72,6 +72,8 @@ extern const struct xlog_recover_item_ops xlog_rui_item_ops;
 extern const struct xlog_recover_item_ops xlog_rud_item_ops;
 extern const struct xlog_recover_item_ops xlog_cui_item_ops;
 extern const struct xlog_recover_item_ops xlog_cud_item_ops;
+extern const struct xlog_recover_item_ops xlog_attri_item_ops;
+extern const struct xlog_recover_item_ops xlog_attrd_item_ops;
 
 /*
  * Macros, structures, prototypes for internal log manager use.
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 8ea6d4a..1b918d5 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -24,6 +24,8 @@
 #include "xfs_rmap_btree.h"
 #include "xfs_log.h"
 #include "xfs_trans_priv.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_reflink.h"
 #include "scrub/scrub.h"
diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
index 779cb73..79f7bd2 100644
--- a/fs/xfs/xfs_acl.c
+++ b/fs/xfs/xfs_acl.c
@@ -10,6 +10,8 @@
 #include "xfs_trans_resv.h"
 #include "xfs_mount.h"
 #include "xfs_inode.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_trace.h"
 #include "xfs_error.h"
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
new file mode 100644
index 0000000..c3b94a7
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.c
@@ -0,0 +1,828 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2021 Oracle.  All Rights Reserved.
+ * Author: Allison Collins <allison.henderson@oracle.com>
+ */
+
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_shared.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_da_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_bmap.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_trans_priv.h"
+#include "xfs_buf_item.h"
+#include "xfs_attr_item.h"
+#include "xfs_log.h"
+#include "xfs_btree.h"
+#include "xfs_rmap.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_attr.h"
+#include "xfs_shared.h"
+#include "xfs_attr_item.h"
+#include "xfs_alloc.h"
+#include "xfs_bmap.h"
+#include "xfs_trace.h"
+#include "libxfs/xfs_da_format.h"
+#include "xfs_inode.h"
+#include "xfs_quota.h"
+#include "xfs_trans_space.h"
+#include "xfs_log_priv.h"
+#include "xfs_log_recover.h"
+
+static const struct xfs_item_ops xfs_attri_item_ops;
+static const struct xfs_item_ops xfs_attrd_item_ops;
+
+/* iovec length must be 32-bit aligned */
+static inline size_t ATTR_NVEC_SIZE(size_t size)
+{
+	return size == sizeof(int32_t) ? size :
+	       sizeof(int32_t) + round_up(size, sizeof(int32_t));
+}
+
+static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attri_log_item, attri_item);
+}
+
+STATIC void
+xfs_attri_item_free(
+	struct xfs_attri_log_item	*attrip)
+{
+	kmem_free(attrip->attri_item.li_lv_shadow);
+	kmem_free(attrip);
+}
+
+/*
+ * Freeing the attrip requires that we remove it from the AIL if it has already
+ * been placed there. However, the ATTRI may not yet have been placed in the
+ * AIL when called by xfs_attri_release() from ATTRD processing due to the
+ * ordering of committed vs unpin operations in bulk insert operations. Hence
+ * the reference count to ensure only the last caller frees the ATTRI.
+ */
+STATIC void
+xfs_attri_release(
+	struct xfs_attri_log_item	*attrip)
+{
+	ASSERT(atomic_read(&attrip->attri_refcount) > 0);
+	if (atomic_dec_and_test(&attrip->attri_refcount)) {
+		xfs_trans_ail_delete(&attrip->attri_item,
+				     SHUTDOWN_LOG_IO_ERROR);
+		xfs_attri_item_free(attrip);
+	}
+}
+
+STATIC void
+xfs_attri_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
+
+	*nvecs += 1;
+	*nbytes += sizeof(struct xfs_attri_log_format);
+
+	/* Attr set and remove operations require a name */
+	ASSERT(attrip->attri_name_len > 0);
+
+	*nvecs += 1;
+	*nbytes += ATTR_NVEC_SIZE(attrip->attri_name_len);
+
+	/*
+	 * Set ops can accept a value of 0 len to clear an attr value.  Remove
+	 * ops do not need a value at all.  So only account for the value
+	 * when it is needed.
+	 */
+	if (attrip->attri_value_len > 0) {
+		*nvecs += 1;
+		*nbytes += ATTR_NVEC_SIZE(attrip->attri_value_len);
+	}
+}
+
+/*
+ * This is called to fill in the log iovecs for the given attri log
+ * item. We use 1 iovec for the attri_format_item, 1 for the name, and
+ * another for the value if it is present.
+ */
+STATIC void
+xfs_attri_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+	struct xfs_log_iovec		*vecp = NULL;
+
+	attrip->attri_format.alfi_type = XFS_LI_ATTRI;
+	attrip->attri_format.alfi_size = 1;
+
+	/*
+	 * This size accounting must be done before copying the attrip into the
+	 * iovec.  If we do it after, the wrong size will be recorded to the log
+	 * and we trip across assertion checks for bad region sizes later during
+	 * the log recovery.
+	 */
+
+	ASSERT(attrip->attri_name_len > 0);
+	attrip->attri_format.alfi_size++;
+
+	if (attrip->attri_value_len > 0)
+		attrip->attri_format.alfi_size++;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
+			&attrip->attri_format,
+			sizeof(struct xfs_attri_log_format));
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
+			attrip->attri_name,
+			ATTR_NVEC_SIZE(attrip->attri_name_len));
+	if (attrip->attri_value_len > 0)
+		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
+				attrip->attri_value,
+				ATTR_NVEC_SIZE(attrip->attri_value_len));
+}
+
+/*
+ * The unpin operation is the last place an ATTRI is manipulated in the log. It
+ * is either inserted in the AIL or aborted in the event of a log I/O error. In
+ * either case, the ATTRI transaction has been successfully committed to make
+ * it this far. Therefore, we expect whoever committed the ATTRI to either
+ * construct and commit the ATTRD or drop the ATTRD's reference in the event of
+ * error. Simply drop the log's ATTRI reference now that the log is done with
+ * it.
+ */
+STATIC void
+xfs_attri_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+	xfs_attri_release(ATTRI_ITEM(lip));
+}
+
+
+STATIC void
+xfs_attri_item_release(
+	struct xfs_log_item	*lip)
+{
+	xfs_attri_release(ATTRI_ITEM(lip));
+}
+
+/*
+ * Allocate and initialize an attri item.  Caller may allocate an additional
+ * trailing buffer of the specified size.
+ */
+STATIC struct xfs_attri_log_item *
+xfs_attri_init(
+	struct xfs_mount		*mp,
+	int				buffer_size)
+
+{
+	struct xfs_attri_log_item	*attrip;
+	uint				size;
+
+	size = sizeof(struct xfs_attri_log_item) + buffer_size;
+	attrip = kmem_alloc_large(size, KM_ZERO);
+	if (attrip == NULL)
+		return NULL;
+
+	xfs_log_item_init(mp, &attrip->attri_item, XFS_LI_ATTRI,
+			  &xfs_attri_item_ops);
+	attrip->attri_format.alfi_id = (uintptr_t)(void *)attrip;
+	atomic_set(&attrip->attri_refcount, 2);
+
+	return attrip;
+}
+
+/*
+ * Copy an attr format buffer from the given buf, and into the destination attr
+ * format structure.
+ */
+STATIC int
+xfs_attri_copy_format(
+	struct xfs_log_iovec		*buf,
+	struct xfs_attri_log_format	*dst_attr_fmt)
+{
+	struct xfs_attri_log_format 	*src_attr_fmt = buf->i_addr;
+	uint 				len;
+
+	len = sizeof(struct xfs_attri_log_format);
+	if (buf->i_len != len)
+		return -EFSCORRUPTED;
+
+	memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
+	return 0;
+}
+
+static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attrd_log_item, attrd_item);
+}
+
+STATIC void
+xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
+{
+	kmem_free(attrdp->attrd_item.li_lv_shadow);
+	kmem_free(attrdp);
+}
+
+STATIC void
+xfs_attrd_item_size(
+	struct xfs_log_item		*lip,
+	int				*nvecs,
+	int				*nbytes)
+{
+	*nvecs += 1;
+	*nbytes += sizeof(struct xfs_attrd_log_format);
+}
+
+/*
+ * This is called to fill in the log iovecs for the given attrd log item. We use
+ * only 1 iovec for the attrd_format, and we point that at the attr_log_format
+ * structure embedded in the attrd item.
+ */
+STATIC void
+xfs_attrd_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+	struct xfs_log_iovec		*vecp = NULL;
+
+	attrdp->attrd_format.alfd_type = XFS_LI_ATTRD;
+	attrdp->attrd_format.alfd_size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
+			&attrdp->attrd_format,
+			sizeof(struct xfs_attrd_log_format));
+}
+
+/*
+ * The ATTRD is either committed or aborted if the transaction is cancelled. If
+ * the transaction is cancelled, drop our reference to the ATTRI and free the
+ * ATTRD.
+ */
+STATIC void
+xfs_attrd_item_release(
+	struct xfs_log_item		*lip)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+
+	xfs_attri_release(attrdp->attrd_attrip);
+	xfs_attrd_item_free(attrdp);
+}
+
+/*
+ * Performs one step of an attribute update intent and marks the attrd item
+ * dirty.  An attr operation may be a set or a remove.  Note that the
+ * transaction is marked dirty regardless of whether the operation succeeds or
+ * fails, to support the ATTRI/ATTRD lifecycle rules.
+ */
+int
+xfs_trans_attr(
+	struct xfs_delattr_context	*dac,
+	struct xfs_attrd_log_item	*attrdp,
+	struct xfs_buf			**leaf_bp,
+	uint32_t			op_flags)
+{
+	struct xfs_da_args		*args = dac->da_args;
+	int				error;
+
+	error = xfs_qm_dqattach_locked(args->dp, 0);
+	if (error)
+		return error;
+
+	switch (op_flags) {
+	case XFS_ATTR_OP_FLAGS_SET:
+		args->op_flags |= XFS_DA_OP_ADDNAME;
+		error = xfs_attr_set_iter(dac, leaf_bp);
+		break;
+	case XFS_ATTR_OP_FLAGS_REMOVE:
+		ASSERT(XFS_IFORK_Q(args->dp));
+		error = xfs_attr_remove_iter(dac);
+		break;
+	default:
+		error = -EFSCORRUPTED;
+		break;
+	}
+
+	/*
+	 * Mark the transaction dirty, even on error. This ensures the
+	 * transaction is aborted, which:
+	 *
+	 * 1.) releases the ATTRI and frees the ATTRD
+	 * 2.) shuts down the filesystem
+	 */
+	args->trans->t_flags |= XFS_TRANS_DIRTY;
+
+	/*
+	 * attr intent/done items are null when delayed attributes are disabled
+	 */
+	if (attrdp)
+		set_bit(XFS_LI_DIRTY, &attrdp->attrd_item.li_flags);
+
+	return error;
+}
+
+/* Log an attr to the intent item. */
+STATIC void
+xfs_attr_log_item(
+	struct xfs_trans		*tp,
+	struct xfs_attri_log_item	*attrip,
+	struct xfs_attr_item		*attr)
+{
+	struct xfs_attri_log_format	*attrp;
+
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	set_bit(XFS_LI_DIRTY, &attrip->attri_item.li_flags);
+
+	/*
+	 * At this point the xfs_attr_item has been constructed, and we've
+	 * created the log intent. Fill in the attri log item and log format
+	 * structure with fields from this xfs_attr_item
+	 */
+	attrp = &attrip->attri_format;
+	attrp->alfi_ino = attr->xattri_dac.da_args->dp->i_ino;
+	attrp->alfi_op_flags = attr->xattri_op_flags;
+	attrp->alfi_value_len = attr->xattri_dac.da_args->valuelen;
+	attrp->alfi_name_len = attr->xattri_dac.da_args->namelen;
+	attrp->alfi_attr_flags = attr->xattri_dac.da_args->attr_filter;
+
+	attrip->attri_name = (void *)attr->xattri_dac.da_args->name;
+	attrip->attri_value = attr->xattri_dac.da_args->value;
+	attrip->attri_name_len = attr->xattri_dac.da_args->namelen;
+	attrip->attri_value_len = attr->xattri_dac.da_args->valuelen;
+}
+
+/* Get an ATTRI. */
+static struct xfs_log_item *
+xfs_attr_create_intent(
+	struct xfs_trans		*tp,
+	struct list_head		*items,
+	unsigned int			count,
+	bool				sort)
+{
+	struct xfs_mount		*mp = tp->t_mountp;
+	struct xfs_attri_log_item	*attrip;
+	struct xfs_attr_item		*attr;
+
+	ASSERT(count == 1);
+
+	if (!xfs_hasdelattr(mp))
+		return NULL;
+
+	attrip = xfs_attri_init(mp, 0);
+	if (attrip == NULL)
+		return NULL;
+
+	xfs_trans_add_item(tp, &attrip->attri_item);
+	list_for_each_entry(attr, items, xattri_list)
+		xfs_attr_log_item(tp, attrip, attr);
+	return &attrip->attri_item;
+}
+
+/* Process an attr. */
+STATIC int
+xfs_attr_finish_item(
+	struct xfs_trans		*tp,
+	struct xfs_log_item		*done,
+	struct list_head		*item,
+	struct xfs_btree_cur		**state)
+{
+	struct xfs_attr_item		*attr;
+	struct xfs_attrd_log_item	*done_item = NULL;
+	int				error;
+	struct xfs_delattr_context	*dac;
+
+	attr = container_of(item, struct xfs_attr_item, xattri_list);
+	dac = &attr->xattri_dac;
+	if (done)
+		done_item = ATTRD_ITEM(done);
+
+	/*
+	 * Corner case that can happen during a recovery.  The first iteration
+	 * of a multi-part delayed op happens in xfs_attri_item_recover to
+	 * maintain the order of the log replay items, but new transactions do
+	 * not automatically rejoin the leaf buffer during a recovery as they do
+	 * in a standard delayed op.  So we need to catch this here and rejoin
+	 * the leaf to the new transaction.
+	 */
+	if (attr->xattri_dac.leaf_bp &&
+	    attr->xattri_dac.leaf_bp->b_transp != tp) {
+		xfs_trans_bjoin(tp, attr->xattri_dac.leaf_bp);
+		xfs_trans_bhold(tp, attr->xattri_dac.leaf_bp);
+	}
+
+	/*
+	 * Always reset trans after EAGAIN cycle
+	 * since the transaction is new
+	 */
+	dac->da_args->trans = tp;
+
+	error = xfs_trans_attr(dac, done_item, &dac->leaf_bp,
+			       attr->xattri_op_flags);
+	if (error != -EAGAIN)
+		kmem_free(attr);
+
+	return error;
+}
+
+/* Abort all pending ATTRs. */
+STATIC void
+xfs_attr_abort_intent(
+	struct xfs_log_item		*intent)
+{
+	xfs_attri_release(ATTRI_ITEM(intent));
+}
+
+/* Cancel an attr */
+STATIC void
+xfs_attr_cancel_item(
+	struct list_head		*item)
+{
+	struct xfs_attr_item		*attr;
+
+	attr = container_of(item, struct xfs_attr_item, xattri_list);
+	kmem_free(attr);
+}
+
+STATIC xfs_lsn_t
+xfs_attri_item_committed(
+	struct xfs_log_item		*lip,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_attri_log_item	*attrip;
+	/*
+	 * The attrip refers to xfs_attr_item memory to log the name and value
+	 * with the intent item. This already occurred when the intent was
+	 * committed so these fields are no longer accessed. Clear them out of
+	 * caution since we're about to free the xfs_attr_item.
+	 */
+	attrip = ATTRI_ITEM(lip);
+	attrip->attri_name = NULL;
+	attrip->attri_value = NULL;
+
+	/*
+	 * The ATTRI is logged only once and cannot be moved in the log, so
+	 * simply return the lsn at which it's been logged.
+	 */
+	return lsn;
+}
+
+STATIC bool
+xfs_attri_item_match(
+	struct xfs_log_item	*lip,
+	uint64_t		intent_id)
+{
+	return ATTRI_ITEM(lip)->attri_format.alfi_id == intent_id;
+}
+
+/*
+ * This routine is called to allocate an "attr done" log item.
+ */
+struct xfs_attrd_log_item *
+xfs_trans_get_attrd(struct xfs_trans		*tp,
+		  struct xfs_attri_log_item	*attrip)
+{
+	struct xfs_attrd_log_item		*attrdp;
+	uint					size;
+
+	ASSERT(tp != NULL);
+
+	size = sizeof(struct xfs_attrd_log_item);
+	attrdp = kmem_zalloc(size, 0);
+
+	xfs_log_item_init(tp->t_mountp, &attrdp->attrd_item, XFS_LI_ATTRD,
+			  &xfs_attrd_item_ops);
+	attrdp->attrd_attrip = attrip;
+	attrdp->attrd_format.alfd_alf_id = attrip->attri_format.alfi_id;
+
+	xfs_trans_add_item(tp, &attrdp->attrd_item);
+	return attrdp;
+}
+
+static const struct xfs_item_ops xfs_attrd_item_ops = {
+	.flags		= XFS_ITEM_RELEASE_WHEN_COMMITTED,
+	.iop_size	= xfs_attrd_item_size,
+	.iop_format	= xfs_attrd_item_format,
+	.iop_release    = xfs_attrd_item_release,
+};
+
+
+/* Get an ATTRD so we can process all the attrs. */
+static struct xfs_log_item *
+xfs_attr_create_done(
+	struct xfs_trans		*tp,
+	struct xfs_log_item		*intent,
+	unsigned int			count)
+{
+	if (!intent)
+		return NULL;
+
+	return &xfs_trans_get_attrd(tp, ATTRI_ITEM(intent))->attrd_item;
+}
+
+const struct xfs_defer_op_type xfs_attr_defer_type = {
+	.max_items	= 1,
+	.create_intent	= xfs_attr_create_intent,
+	.abort_intent	= xfs_attr_abort_intent,
+	.create_done	= xfs_attr_create_done,
+	.finish_item	= xfs_attr_finish_item,
+	.cancel_item	= xfs_attr_cancel_item,
+};
+
+/*
+ * Process an attr intent item that was recovered from the log.  We need to
+ * set or remove the attr that it describes.
+ */
+STATIC int
+xfs_attri_item_recover(
+	struct xfs_log_item		*lip,
+	struct list_head		*capture_list)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+	struct xfs_attr_item		*new_attr;
+	struct xfs_mount		*mp = lip->li_mountp;
+	struct xfs_inode		*ip;
+	struct xfs_da_args		args;
+	struct xfs_da_args		*new_args;
+	struct xfs_trans_res		tres;
+	bool				rsvd;
+	struct xfs_attri_log_format	*attrp;
+	int				error;
+	int				total;
+	int				local;
+	struct xfs_attrd_log_item	*done_item = NULL;
+	struct xfs_attr_item		attr = {
+		.xattri_op_flags	= attrip->attri_format.alfi_op_flags,
+		.xattri_dac.da_args	= &args,
+	};
+
+	/*
+	 * First check the validity of the attr described by the ATTRI.  If any
+	 * are bad, then assume that all are bad and just toss the ATTRI.
+	 */
+	attrp = &attrip->attri_format;
+	if (!(attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET ||
+	      attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE) ||
+	    (attrp->alfi_value_len > XATTR_SIZE_MAX) ||
+	    (attrp->alfi_name_len > XATTR_NAME_MAX) ||
+	    (attrp->alfi_name_len == 0) ||
+	    xfs_verify_ino(mp, attrp->alfi_ino) == false ||
+	    !xfs_hasdelattr(mp)) {
+		return -EFSCORRUPTED;
+	}
+
+	error = xfs_iget(mp, 0, attrp->alfi_ino, 0, 0, &ip);
+	if (error)
+		return error;
+
+	if (VFS_I(ip)->i_nlink == 0)
+		xfs_iflags_set(ip, XFS_IRECOVERY);
+
+	memset(&args, 0, sizeof(struct xfs_da_args));
+	args.dp = ip;
+	args.geo = mp->m_attr_geo;
+	args.op_flags = attrp->alfi_op_flags;
+	args.whichfork = XFS_ATTR_FORK;
+	args.name = attrip->attri_name;
+	args.namelen = attrp->alfi_name_len;
+	args.hashval = xfs_da_hashname(args.name, args.namelen);
+	args.attr_filter = attrp->alfi_attr_flags;
+
+	if (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET) {
+		args.value = attrip->attri_value;
+		args.valuelen = attrp->alfi_value_len;
+		args.total = xfs_attr_calc_size(&args, &local);
+
+		tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
+				 M_RES(mp)->tr_attrsetrt.tr_logres *
+					args.total;
+		tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
+		tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
+		total = args.total;
+	} else {
+		tres = M_RES(mp)->tr_attrrm;
+		total = XFS_ATTRRM_SPACE_RES(mp);
+	}
+	error = xfs_trans_alloc(mp, &tres, total, 0,
+				rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
+	if (error)
+		return error;
+
+	done_item = xfs_trans_get_attrd(args.trans, attrip);
+
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(args.trans, ip, 0);
+
+	error = xfs_trans_attr(&attr.xattri_dac, done_item,
+			       &attr.xattri_dac.leaf_bp, attrp->alfi_op_flags);
+	if (error == -EAGAIN) {
+		/*
+		 * There's more work to do, so make a new xfs_attr_item and add
+		 * it to this transaction.  We don't use xfs_attr_item_init here
+		 * because we need the info stored in the current attr to
+		 * continue with this multi-part operation.  So, alloc space
+		 * for it and the args and copy everything there.
+		 */
+		new_attr = kmem_zalloc(sizeof(struct xfs_attr_item) +
+				       sizeof(struct xfs_da_args), KM_NOFS);
+		new_args = (struct xfs_da_args *)((char *)new_attr +
+			   sizeof(struct xfs_attr_item));
+
+		memcpy(new_args, &args, sizeof(struct xfs_da_args));
+		memcpy(new_attr, &attr, sizeof(struct xfs_attr_item));
+
+		new_attr->xattri_dac.da_args = new_args;
+		memset(&new_attr->xattri_list, 0, sizeof(struct list_head));
+
+		xfs_defer_add(args.trans, XFS_DEFER_OPS_TYPE_ATTR,
+			      &new_attr->xattri_list);
+
+		/* Do not send -EAGAIN back to caller */
+		error = 0;
+	} else if (error) {
+		xfs_trans_cancel(args.trans);
+		goto out;
+	}
+
+	xfs_defer_ops_capture_and_commit(args.trans, ip, capture_list);
+
+out:
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	xfs_irele(ip);
+	return error;
+}
+
+/* Relog an intent item to push the log tail forward. */
+static struct xfs_log_item *
+xfs_attri_item_relog(
+	struct xfs_log_item		*intent,
+	struct xfs_trans		*tp)
+{
+	struct xfs_attrd_log_item	*attrdp;
+	struct xfs_attri_log_item	*old_attrip;
+	struct xfs_attri_log_item	*new_attrip;
+	struct xfs_attri_log_format	*new_attrp;
+	struct xfs_attri_log_format	*old_attrp;
+	int				buffer_size;
+
+	old_attrip = ATTRI_ITEM(intent);
+	old_attrp = &old_attrip->attri_format;
+	buffer_size = old_attrp->alfi_value_len + old_attrp->alfi_name_len;
+
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	attrdp = xfs_trans_get_attrd(tp, old_attrip);
+	set_bit(XFS_LI_DIRTY, &attrdp->attrd_item.li_flags);
+
+	new_attrip = xfs_attri_init(tp->t_mountp, buffer_size);
+	new_attrp = &new_attrip->attri_format;
+
+	new_attrp->alfi_ino = old_attrp->alfi_ino;
+	new_attrp->alfi_op_flags = old_attrp->alfi_op_flags;
+	new_attrp->alfi_value_len = old_attrp->alfi_value_len;
+	new_attrp->alfi_name_len = old_attrp->alfi_name_len;
+	new_attrp->alfi_attr_flags = old_attrp->alfi_attr_flags;
+
+	new_attrip->attri_name_len = old_attrip->attri_name_len;
+	new_attrip->attri_name = ((char *)new_attrip) +
+				 sizeof(struct xfs_attri_log_item);
+	memcpy(new_attrip->attri_name, old_attrip->attri_name,
+		new_attrip->attri_name_len);
+
+	new_attrip->attri_value_len = old_attrip->attri_value_len;
+	if (new_attrip->attri_value_len > 0) {
+		new_attrip->attri_value = new_attrip->attri_name +
+					  new_attrip->attri_name_len;
+
+		memcpy(new_attrip->attri_value, old_attrip->attri_value,
+		       new_attrip->attri_value_len);
+	}
+
+	xfs_trans_add_item(tp, &new_attrip->attri_item);
+	set_bit(XFS_LI_DIRTY, &new_attrip->attri_item.li_flags);
+
+	return &new_attrip->attri_item;
+}
+
+static const struct xfs_item_ops xfs_attri_item_ops = {
+	.iop_size	= xfs_attri_item_size,
+	.iop_format	= xfs_attri_item_format,
+	.iop_unpin	= xfs_attri_item_unpin,
+	.iop_committed	= xfs_attri_item_committed,
+	.iop_release    = xfs_attri_item_release,
+	.iop_recover	= xfs_attri_item_recover,
+	.iop_match	= xfs_attri_item_match,
+	.iop_relog	= xfs_attri_item_relog,
+};
+
+
+
+STATIC int
+xlog_recover_attri_commit_pass2(
+	struct xlog                     *log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item        *item,
+	xfs_lsn_t                       lsn)
+{
+	int                             error;
+	struct xfs_mount                *mp = log->l_mp;
+	struct xfs_attri_log_item       *attrip;
+	struct xfs_attri_log_format     *attri_formatp;
+	char				*name = NULL;
+	char				*value = NULL;
+	int				region = 0;
+	int				buffer_size;
+
+	attri_formatp = item->ri_buf[region].i_addr;
+
+	/* Validate xfs_attri_log_format */
+	if (attri_formatp->__pad != 0 || attri_formatp->alfi_name_len == 0 ||
+	    (attri_formatp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE &&
+	    attri_formatp->alfi_value_len != 0))
+		return -EFSCORRUPTED;
+
+	buffer_size = attri_formatp->alfi_name_len +
+		      attri_formatp->alfi_value_len;
+
+	attrip = xfs_attri_init(mp, buffer_size);
+	if (attrip == NULL)
+		return -ENOMEM;
+
+	error = xfs_attri_copy_format(&item->ri_buf[region],
+				      &attrip->attri_format);
+	if (error) {
+		xfs_attri_item_free(attrip);
+		return error;
+	}
+
+	attrip->attri_name_len = attri_formatp->alfi_name_len;
+	attrip->attri_value_len = attri_formatp->alfi_value_len;
+	region++;
+	name = ((char *)attrip) + sizeof(struct xfs_attri_log_item);
+	memcpy(name, item->ri_buf[region].i_addr, attrip->attri_name_len);
+	attrip->attri_name = name;
+
+	if (attrip->attri_value_len > 0) {
+		region++;
+		value = ((char *)attrip) + sizeof(struct xfs_attri_log_item) +
+			attrip->attri_name_len;
+		memcpy(value, item->ri_buf[region].i_addr,
+			attrip->attri_value_len);
+		attrip->attri_value = value;
+	}
+
+	/*
+	 * The ATTRI has two references. One for the ATTRD and one for ATTRI to
+	 * ensure it makes it into the AIL. Insert the ATTRI into the AIL
+	 * directly and drop the ATTRI reference. Note that
+	 * xfs_trans_ail_update() drops the AIL lock.
+	 */
+	xfs_trans_ail_insert(log->l_ailp, &attrip->attri_item, lsn);
+	xfs_attri_release(attrip);
+	return 0;
+}
+
+const struct xlog_recover_item_ops xlog_attri_item_ops = {
+	.item_type	= XFS_LI_ATTRI,
+	.commit_pass2	= xlog_recover_attri_commit_pass2,
+};
+
+/*
+ * This routine is called when an ATTRD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding ATTRI if
+ * it was still in the log. To do this it searches the AIL for the ATTRI with
+ * an id equal to that in the ATTRD format structure. If we find it we drop
+ * the ATTRD reference, which removes the ATTRI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_attrd_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_attrd_log_format	*attrd_formatp;
+
+	attrd_formatp = item->ri_buf[0].i_addr;
+	ASSERT((item->ri_buf[0].i_len ==
+				(sizeof(struct xfs_attrd_log_format))));
+
+	xlog_recover_release_intent(log, XFS_LI_ATTRI,
+				    attrd_formatp->alfd_alf_id);
+	return 0;
+}
+
+const struct xlog_recover_item_ops xlog_attrd_item_ops = {
+	.item_type	= XFS_LI_ATTRD,
+	.commit_pass2	= xlog_recover_attrd_commit_pass2,
+};
diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
new file mode 100644
index 0000000..27c6bae
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * Copyright (C) 2019 Oracle.  All Rights Reserved.
+ * Author: Allison Collins <allison.henderson@oracle.com>
+ */
+#ifndef	__XFS_ATTR_ITEM_H__
+#define	__XFS_ATTR_ITEM_H__
+
+/* kernel only ATTRI/ATTRD definitions */
+
+struct xfs_mount;
+struct kmem_zone;
+
+/*
+ * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
+ */
+#define	XFS_ATTRI_RECOVERED	1
+
+
+/*
+ * This is the "attr intention" log item.  It is used to log the fact that some
+ * attribute operations need to be processed.  An operation is currently either
+ * a set or remove.  Set or remove operations are described by the xfs_attr_item
+ * which may be logged to this intent.
+ *
+ * During a normal attr operation, name and value point to the name and value
+ * fields of the calling function's xfs_da_args.  During a recovery, the name
+ * and value buffers are copied from the log, and stored in a trailing buffer
+ * attached to the xfs_attr_item until they are committed.  They are freed when
+ * the xfs_attr_item itself is freed when the work is done.
+ */
+struct xfs_attri_log_item {
+	struct xfs_log_item		attri_item;
+	atomic_t			attri_refcount;
+	int				attri_name_len;
+	int				attri_value_len;
+	void				*attri_name;
+	void				*attri_value;
+	struct xfs_attri_log_format	attri_format;
+};
+
+/*
+ * This is the "attr done" log item.  It is used to log the fact that some attrs
+ * earlier mentioned in an attri item have been processed.
+ */
+struct xfs_attrd_log_item {
+	struct xfs_attri_log_item	*attrd_attrip;
+	struct xfs_log_item		attrd_item;
+	struct xfs_attrd_log_format	attrd_format;
+};
+
+#endif	/* __XFS_ATTR_ITEM_H__ */
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index 8f8837f..d7787a5 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -15,6 +15,7 @@
 #include "xfs_inode.h"
 #include "xfs_trans.h"
 #include "xfs_bmap.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_attr_sf.h"
 #include "xfs_attr_leaf.h"
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 3fbd98f..d5d1959 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -15,6 +15,8 @@
 #include "xfs_iwalk.h"
 #include "xfs_itable.h"
 #include "xfs_error.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_bmap.h"
 #include "xfs_bmap_util.h"
diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
index c1771e7..62e1534 100644
--- a/fs/xfs/xfs_ioctl32.c
+++ b/fs/xfs/xfs_ioctl32.c
@@ -17,6 +17,8 @@
 #include "xfs_itable.h"
 #include "xfs_fsops.h"
 #include "xfs_rtalloc.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_ioctl.h"
 #include "xfs_ioctl32.h"
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 67c8dc9..6ec9858 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -13,6 +13,8 @@
 #include "xfs_inode.h"
 #include "xfs_acl.h"
 #include "xfs_quota.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_trans.h"
 #include "xfs_trace.h"
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index fa2d05e..3457f22 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1993,6 +1993,10 @@ xlog_print_tic_res(
 	    REG_TYPE_STR(CUD_FORMAT, "cud_format"),
 	    REG_TYPE_STR(BUI_FORMAT, "bui_format"),
 	    REG_TYPE_STR(BUD_FORMAT, "bud_format"),
+	    REG_TYPE_STR(ATTRI_FORMAT, "attri_format"),
+	    REG_TYPE_STR(ATTRD_FORMAT, "attrd_format"),
+	    REG_TYPE_STR(ATTR_NAME, "attr_name"),
+	    REG_TYPE_STR(ATTR_VALUE, "attr_value"),
 	};
 	BUILD_BUG_ON(ARRAY_SIZE(res_type_str) != XLOG_REG_TYPE_MAX + 1);
 #undef REG_TYPE_STR
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 295a5c6..c0821b6 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1775,6 +1775,8 @@ static const struct xlog_recover_item_ops *xlog_recover_item_ops[] = {
 	&xlog_cud_item_ops,
 	&xlog_bui_item_ops,
 	&xlog_bud_item_ops,
+	&xlog_attri_item_ops,
+	&xlog_attrd_item_ops,
 };
 
 static const struct xlog_recover_item_ops *
diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h
index 0aa87c21..bc9c25e 100644
--- a/fs/xfs/xfs_ondisk.h
+++ b/fs/xfs/xfs_ondisk.h
@@ -132,6 +132,8 @@ xfs_check_ondisk_structs(void)
 	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format,	56);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_qoff_logformat,	20);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_trans_header,		16);
+	XFS_CHECK_STRUCT_SIZE(struct xfs_attri_log_format,	40);
+	XFS_CHECK_STRUCT_SIZE(struct xfs_attrd_log_format,	16);
 
 	/*
 	 * The v5 superblock format extended several v4 header structures with
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index bca48b3..9b0c790 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -10,6 +10,7 @@
 #include "xfs_log_format.h"
 #include "xfs_da_format.h"
 #include "xfs_inode.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_acl.h"
 #include "xfs_da_btree.h"
-- 
2.7.4
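
The ATTRI reference counting described in the comments of xfs_attr_item.c
above (the count starts at 2, one reference for the log and one for the
ATTRD, and only the last caller frees the item) can be illustrated with a
small userspace sketch.  This is not kernel code: it uses C11 atomics in
place of the kernel's atomic_t, and every sketch_* name plus the "freed"
tracking pointer are hypothetical and exist only for the demonstration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xfs_attri_log_item. */
struct sketch_intent {
	atomic_int	refcount;
	bool		*freed;		/* demo only: records the final free */
};

/* Allocate an intent holding two references, as xfs_attri_init() does. */
static struct sketch_intent *sketch_intent_init(bool *freed)
{
	struct sketch_intent *ip = calloc(1, sizeof(*ip));

	if (ip == NULL)
		return NULL;
	atomic_init(&ip->refcount, 2);
	ip->freed = freed;
	*freed = false;
	return ip;
}

/* Drop one reference; only the caller that drops the last one frees. */
static void sketch_intent_release(struct sketch_intent *ip)
{
	assert(atomic_load(&ip->refcount) > 0);

	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&ip->refcount, 1) == 1) {
		*ip->freed = true;
		free(ip);
	}
}
```

In this sketch the first sketch_intent_release() call (the log side) leaves
the item alive, and the second (the done-item side) frees it, whichever
order the two callers happen to run in.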



* [PATCH v14 10/15] xfs: Skip flip flags for delayed attrs
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (8 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 09/15] xfs: Set up infastructure for deferred attribute operations Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 11/15] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred Allison Henderson
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This is a cleanup patch that skips the flip flag logic for delayed attr
renames.  Since the log replay keeps the inode locked, we do not need to
worry about race windows with attr lookups.  So we can skip flipping
the flag and the extra transaction roll that goes with it.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c      | 43 ++++++++++++++++++++++++-------------------
 fs/xfs/libxfs/xfs_attr_leaf.c |  3 ++-
 2 files changed, 26 insertions(+), 20 deletions(-)
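
As a rough illustration of the control flow described above, here is a
standalone userspace sketch, not kernel code.  It assumes a rename is
modelled as a tiny state machine whose step function returns -EAGAIN when
the caller must roll the transaction.  The sketch_* names and states are
hypothetical; the delayed_attrs field merely stands in for
xfs_hasdelattr(mp).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states; the real series uses XFS_DAS_* values. */
enum sketch_state {
	SKETCH_ADD_DONE,	/* new attr added, old attr still present */
	SKETCH_FLIP_FLAG,	/* flags were flipped in an earlier transaction */
	SKETCH_OLD_REMOVED,	/* old attr dismantled, rename complete */
};

struct sketch_ctx {
	enum sketch_state	state;
	bool			delayed_attrs;	/* models xfs_hasdelattr(mp) */
	int			flips;		/* flag flips performed */
};

/*
 * One step of the rename.  Returns -EAGAIN when the caller must roll the
 * transaction and call again, or 0 once the old attr has been removed.
 */
static int sketch_rename_step(struct sketch_ctx *ctx)
{
	assert(ctx != NULL);

	if (ctx->state == SKETCH_ADD_DONE && !ctx->delayed_attrs) {
		/* Without delayed attrs: flip flags, then ask for a roll. */
		ctx->flips++;
		ctx->state = SKETCH_FLIP_FLAG;
		return -EAGAIN;
	}

	/* With delayed attrs we arrive here directly, with no flip or roll. */
	ctx->state = SKETCH_OLD_REMOVED;
	return 0;
}

/* Drive the state machine and count the rolls the caller had to do. */
static int sketch_count_rolls(struct sketch_ctx *ctx)
{
	int rolls = 0;

	while (sketch_rename_step(ctx) == -EAGAIN)
		rolls++;
	return rolls;
}
```

Under these assumptions a context with delayed_attrs set finishes without
a roll and without a flag flip, while one without it needs one of each.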

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index d108866..99f6539 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -805,6 +805,7 @@ xfs_attr_leaf_addname(
 	struct xfs_buf			*bp = NULL;
 	int				error, forkoff;
 	struct xfs_inode		*dp = args->dp;
+	struct xfs_mount		*mp = args->dp->i_mount;
 
 	/* State machine switch */
 	switch (dac->dela_state) {
@@ -870,15 +871,17 @@ xfs_attr_leaf_addname(
 	 * In a separate transaction, set the incomplete flag on the "old" attr
 	 * and clear the incomplete flag on the "new" attr.
 	 */
-	error = xfs_attr3_leaf_flipflags(args);
-	if (error)
-		return error;
-	/*
-	 * Commit the flag value change and start the next trans in series.
-	 */
-	dac->dela_state = XFS_DAS_FLIP_LFLAG;
-	trace_xfs_das_state_return(dac->dela_state);
-	return -EAGAIN;
+	if (!xfs_hasdelattr(mp)) {
+		error = xfs_attr3_leaf_flipflags(args);
+		if (error)
+			return error;
+		/*
+		 * Commit the flag value change and start the next trans in series.
+		 */
+		dac->dela_state = XFS_DAS_FLIP_LFLAG;
+		trace_xfs_das_state_return(dac->dela_state);
+		return -EAGAIN;
+	}
 das_flip_flag:
 	/*
 	 * Dismantle the "old" attribute/value pair by removing a "remote" value
@@ -1077,6 +1080,7 @@ xfs_attr_node_addname(
 	struct xfs_da_state_blk		*blk;
 	int				retval = 0;
 	int				error = 0;
+	struct xfs_mount		*mp = args->dp->i_mount;
 
 	trace_xfs_attr_node_addname(args);
 
@@ -1238,15 +1242,17 @@ xfs_attr_node_addname(
 	 * In a separate transaction, set the incomplete flag on the "old" attr
 	 * and clear the incomplete flag on the "new" attr.
 	 */
-	error = xfs_attr3_leaf_flipflags(args);
-	if (error)
-		goto out;
-	/*
-	 * Commit the flag value change and start the next trans in series
-	 */
-	dac->dela_state = XFS_DAS_FLIP_NFLAG;
-	trace_xfs_das_state_return(dac->dela_state);
-	return -EAGAIN;
+	if (!xfs_hasdelattr(mp)) {
+		error = xfs_attr3_leaf_flipflags(args);
+		if (error)
+			goto out;
+		/*
+		 * Commit the flag value change and start the next trans in series
+		 */
+		dac->dela_state = XFS_DAS_FLIP_NFLAG;
+		trace_xfs_das_state_return(dac->dela_state);
+		return -EAGAIN;
+	}
 das_flip_flag:
 	/*
 	 * Dismantle the "old" attribute/value pair by removing a "remote" value
@@ -1275,7 +1281,6 @@ xfs_attr_node_addname(
 	 * Re-find the "old" attribute entry after any split ops. The INCOMPLETE
 	 * flag means that we will find the "old" attr, not the "new" one.
 	 */
-	args->attr_filter |= XFS_ATTR_INCOMPLETE;
 	state = xfs_da_state_alloc(args);
 	state->inleaf = 0;
 	error = xfs_da3_node_lookup_int(state, &retval);
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 3780141..ec707bd 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -1486,7 +1486,8 @@ xfs_attr3_leaf_add_work(
 	if (tmp)
 		entry->flags |= XFS_ATTR_LOCAL;
 	if (args->op_flags & XFS_DA_OP_RENAME) {
-		entry->flags |= XFS_ATTR_INCOMPLETE;
+		if (!xfs_hasdelattr(mp))
+			entry->flags |= XFS_ATTR_INCOMPLETE;
 		if ((args->blkno2 == args->blkno) &&
 		    (args->index2 <= args->index)) {
 			args->index2++;
-- 
2.7.4



* [PATCH v14 11/15] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (9 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 10/15] xfs: Skip flip flags for delayed attrs Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 12/15] xfs: Remove unused xfs_attr_*_args Allison Henderson
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

From: Allison Collins <allison.henderson@oracle.com>

These routines set up and start a new deferred attribute operation.
These functions are meant to be called by any routine needing to
initiate a deferred attribute operation as opposed to the existing
inline operations. A new helper function, xfs_attr_item_init, is also added.

Finally, enable delayed attributes in xfs_attr_set and xfs_attr_remove.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++--
 fs/xfs/libxfs/xfs_attr.h |  2 ++
 2 files changed, 58 insertions(+), 2 deletions(-)
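
For readers skimming the diff below, the queueing pattern shared by the two
new helpers can be modelled in userspace roughly as follows. This is only a
sketch under stated assumptions: model_args, model_attr_item, model_trans,
model_defer_add and the MODEL_OP_* constants are hypothetical stand-ins for
xfs_da_args, xfs_attr_item, the transaction, xfs_defer_add and
XFS_ATTR_OP_FLAGS_*, and calloc stands in for kmem_zalloc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the values are illustrative only. */
enum { MODEL_OP_SET = 1, MODEL_OP_REMOVE = 2 };

struct model_args {
	const char		*name;
};

struct model_attr_item {
	struct model_args	*args;		/* what to operate on */
	unsigned int		op_flags;	/* set or remove */
	struct model_attr_item	*next;		/* link in the pending list */
};

struct model_trans {
	struct model_attr_item	*head;		/* oldest queued item */
	struct model_attr_item	*tail;		/* newest queued item */
	int			npending;
};

/* Allocate a zeroed item and record the operation and its arguments. */
static int
model_attr_item_init(struct model_args *args, unsigned int op_flags,
		     struct model_attr_item **out)
{
	struct model_attr_item	*item = calloc(1, sizeof(*item));

	if (!item)
		return -ENOMEM;	/* userspace calloc may fail */
	item->args = args;
	item->op_flags = op_flags;
	*out = item;
	return 0;
}

/* Append the item to the transaction's pending list; nothing runs yet. */
static void
model_defer_add(struct model_trans *tp, struct model_attr_item *item)
{
	if (tp->tail)
		tp->tail->next = item;
	else
		tp->head = item;
	tp->tail = item;
	tp->npending++;
}

/* Shared body of the "set deferred" and "remove deferred" entry points. */
static int
model_attr_defer(struct model_trans *tp, struct model_args *args,
		 unsigned int op_flags)
{
	struct model_attr_item	*item;
	int			error;

	error = model_attr_item_init(args, op_flags, &item);
	if (error)
		return error;
	model_defer_add(tp, item);
	return 0;
}
```

In this model the caller only queues the work; the queued items are processed
later by whatever finishes the transaction, which is the role the deferred
operations code plays in the real series.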

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 99f6539..85b63bb 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -25,6 +25,7 @@
 #include "xfs_trans_space.h"
 #include "xfs_trace.h"
 #include "xfs_attr_item.h"
+#include "xfs_attr.h"
 
 /*
  * xfs_attr.c
@@ -604,9 +605,10 @@ xfs_attr_set(
 		if (error != -ENOATTR && error != -EEXIST)
 			goto out_trans_cancel;
 
-		error = xfs_attr_set_args(args);
+		error = xfs_attr_set_deferred(args);
 		if (error)
 			goto out_trans_cancel;
+
 		/* shortform attribute has already been committed */
 		if (!args->trans)
 			goto out_unlock;
@@ -615,7 +617,7 @@ xfs_attr_set(
 		if (error != -EEXIST)
 			goto out_trans_cancel;
 
-		error = xfs_attr_remove_args(args);
+		error = xfs_attr_remove_deferred(args);
 		if (error)
 			goto out_trans_cancel;
 	}
@@ -645,6 +647,58 @@ xfs_attr_set(
 	goto out_unlock;
 }
 
+STATIC int
+xfs_attr_item_init(
+	struct xfs_da_args	*args,
+	unsigned int		op_flags,	/* op flag (set or remove) */
+	struct xfs_attr_item	**attr)		/* new xfs_attr_item */
+{
+
+	struct xfs_attr_item	*new;
+
+	new = kmem_zalloc(sizeof(struct xfs_attr_item), KM_NOFS);
+	new->xattri_op_flags = op_flags;
+	new->xattri_dac.da_args = args;
+
+	*attr = new;
+	return 0;
+}
+
+/* Sets an attribute for an inode as a deferred operation */
+int
+xfs_attr_set_deferred(
+	struct xfs_da_args	*args)
+{
+	struct xfs_attr_item	*new;
+	int			error = 0;
+
+	error = xfs_attr_item_init(args, XFS_ATTR_OP_FLAGS_SET, &new);
+	if (error)
+		return error;
+
+	xfs_defer_add(args->trans, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
+
+	return 0;
+}
+
+/* Removes an attribute for an inode as a deferred operation */
+int
+xfs_attr_remove_deferred(
+	struct xfs_da_args	*args)
+{
+
+	struct xfs_attr_item	*new;
+	int			error;
+
+	error  = xfs_attr_item_init(args, XFS_ATTR_OP_FLAGS_REMOVE, &new);
+	if (error)
+		return error;
+
+	xfs_defer_add(args->trans, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
+
+	return 0;
+}
+
 /*========================================================================
  * External routines when attribute list is inside the inode
  *========================================================================*/
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 7c7af0a..5d3aa0c 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -456,5 +456,7 @@ bool xfs_attr_namecheck(const void *name, size_t length);
 void xfs_delattr_context_init(struct xfs_delattr_context *dac,
 			      struct xfs_da_args *args);
 int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
+int xfs_attr_set_deferred(struct xfs_da_args *args);
+int xfs_attr_remove_deferred(struct xfs_da_args *args);
 
 #endif	/* __XFS_ATTR_H__ */
-- 
2.7.4



* [PATCH v14 12/15] xfs: Remove unused xfs_attr_*_args
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (10 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 11/15] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 13/15] xfs: Add delayed attributes error tag Allison Henderson
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

Remove xfs_attr_set_args, xfs_attr_remove_args, and xfs_attr_trans_roll.
These high-level loops are now driven by the delayed operations code,
and can be removed.

Additionally, collapse the leaf_bp parameter into xfs_attr_set_iter,
since its only caller passes dac->leaf_bp.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 96 ++---------------------------------------
 fs/xfs/libxfs/xfs_attr.h        | 10 ++---
 fs/xfs/libxfs/xfs_attr_remote.c |  1 -
 fs/xfs/xfs_attr_item.c          |  8 ++--
 4 files changed, 9 insertions(+), 106 deletions(-)
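
As a reminder of what the removed loops did, here is a small userspace model
of the "call the iter function until it stops returning -EAGAIN, rolling in
between" pattern. It is a sketch only: model_ctx, model_set_iter, model_roll
and the MODEL_* states are hypothetical names, and model_roll merely counts
calls where the real code rolled or finished the transaction.

```c
#include <assert.h>
#include <errno.h>

enum model_state { MODEL_UNINIT, MODEL_STEP_ONE, MODEL_STEP_TWO };

struct model_ctx {
	enum model_state	state;	/* where to resume on the next call */
	int			rolls;	/* transaction rolls done by the driver */
};

/* Do one step per call; -EAGAIN asks the caller for a fresh transaction. */
static int
model_set_iter(struct model_ctx *ctx)
{
	switch (ctx->state) {
	case MODEL_UNINIT:
		ctx->state = MODEL_STEP_ONE;
		return -EAGAIN;
	case MODEL_STEP_ONE:
		ctx->state = MODEL_STEP_TWO;
		return -EAGAIN;
	case MODEL_STEP_TWO:
		return 0;		/* operation complete */
	}
	return -EINVAL;
}

/* Stand-in for rolling the transaction between steps. */
static int
model_roll(struct model_ctx *ctx)
{
	ctx->rolls++;
	return 0;
}

/* The driver loop: the shape of the helpers this patch removes. */
static int
model_run(struct model_ctx *ctx)
{
	int	error;

	for (;;) {
		error = model_set_iter(ctx);
		if (error != -EAGAIN)
			break;
		error = model_roll(ctx);
		if (error)
			break;
	}
	return error;
}
```

With the deferred operations code acting as the driver, the iter functions
keep returning -EAGAIN in the same way, but no attr-specific loop is needed.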

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 85b63bb..6e5a900 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -268,60 +268,6 @@ xfs_attr_set_shortform(
 }
 
 /*
- * Checks to see if a delayed attribute transaction should be rolled.  If so,
- * also checks for a defer finish.  Transaction is finished and rolled as
- * needed, and returns true of false if the delayed operation should continue.
- */
-STATIC int
-xfs_attr_trans_roll(
-	struct xfs_delattr_context	*dac)
-{
-	struct xfs_da_args		*args = dac->da_args;
-	int				error;
-
-	if (dac->flags & XFS_DAC_DEFER_FINISH) {
-		/*
-		 * The caller wants us to finish all the deferred ops so that we
-		 * avoid pinning the log tail with a large number of deferred
-		 * ops.
-		 */
-		dac->flags &= ~XFS_DAC_DEFER_FINISH;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
-	} else
-		error = xfs_trans_roll_inode(&args->trans, args->dp);
-
-	return error;
-}
-
-/*
- * Set the attribute specified in @args.
- */
-int
-xfs_attr_set_args(
-	struct xfs_da_args	*args)
-{
-	struct xfs_buf			*leaf_bp = NULL;
-	int				error = 0;
-	struct xfs_delattr_context	dac = {
-		.da_args	= args,
-	};
-
-	do {
-		error = xfs_attr_set_iter(&dac, &leaf_bp);
-		if (error != -EAGAIN)
-			break;
-
-		error = xfs_attr_trans_roll(&dac);
-		if (error)
-			return error;
-	} while (true);
-
-	return error;
-}
-
-/*
  * Set the attribute specified in @args.
  * This routine is meant to function as a delayed operation, and may return
  * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
@@ -330,11 +276,11 @@ xfs_attr_set_args(
  */
 int
 xfs_attr_set_iter(
-	struct xfs_delattr_context	*dac,
-	struct xfs_buf			**leaf_bp)
+	struct xfs_delattr_context	*dac)
 {
 	struct xfs_da_args		*args = dac->da_args;
 	struct xfs_inode		*dp = args->dp;
+	struct xfs_buf			**leaf_bp = &dac->leaf_bp;
 	int				error = 0;
 
 	/* State machine switch */
@@ -368,11 +314,7 @@ xfs_attr_set_iter(
 		 * continue.  Otherwise, is it converted from shortform to leaf
 		 * and -EAGAIN is returned.
 		 */
-		error = xfs_attr_set_shortform(args, leaf_bp);
-		if (error == -EAGAIN)
-			dac->flags |= XFS_DAC_DEFER_FINISH;
-
-		return error;
+		return xfs_attr_set_shortform(args, leaf_bp);
 	}
 
 	/*
@@ -409,7 +351,6 @@ xfs_attr_set_iter(
 		 * when we come back, we'll be a node, so we'll fall
 		 * down into the node handling code below
 		 */
-		dac->flags |= XFS_DAC_DEFER_FINISH;
 		trace_xfs_das_state_return(dac->dela_state);
 		return -EAGAIN;
 	case 0:
@@ -453,32 +394,6 @@ xfs_has_attr(
 
 /*
  * Remove the attribute specified in @args.
- */
-int
-xfs_attr_remove_args(
-	struct xfs_da_args	*args)
-{
-	int				error;
-	struct xfs_delattr_context	dac = {
-		.da_args	= args,
-	};
-
-	do {
-		error = xfs_attr_remove_iter(&dac);
-		if (error != -EAGAIN)
-			break;
-
-		error = xfs_attr_trans_roll(&dac);
-		if (error)
-			return error;
-
-	} while (true);
-
-	return error;
-}
-
-/*
- * Remove the attribute specified in @args.
  *
  * This function may return -EAGAIN to signal that the transaction needs to be
  * rolled.  Callers should continue calling this function until they receive a
@@ -897,7 +812,6 @@ xfs_attr_leaf_addname(
 		if (error)
 			return error;
 
-		dac->flags |= XFS_DAC_DEFER_FINISH;
 		trace_xfs_das_state_return(dac->dela_state);
 		return -EAGAIN;
 	}
@@ -1205,7 +1119,6 @@ xfs_attr_node_addname(
 			 * this. dela_state is still unset by this function at
 			 * this point.
 			 */
-			dac->flags |= XFS_DAC_DEFER_FINISH;
 			trace_xfs_das_state_return(dac->dela_state);
 			return -EAGAIN;
 		}
@@ -1219,7 +1132,6 @@ xfs_attr_node_addname(
 		error = xfs_da3_split(state);
 		if (error)
 			goto out;
-		dac->flags |= XFS_DAC_DEFER_FINISH;
 	} else {
 		/*
 		 * Addition succeeded, update Btree hashvals.
@@ -1267,7 +1179,6 @@ xfs_attr_node_addname(
 			if (error)
 				return error;
 
-			dac->flags |= XFS_DAC_DEFER_FINISH;
 			trace_xfs_das_state_return(dac->dela_state);
 			return -EAGAIN;
 		}
@@ -1587,7 +1498,6 @@ xfs_attr_node_removename_iter(
 			if (error)
 				return error;
 
-			dac->flags |= XFS_DAC_DEFER_FINISH;
 			dac->dela_state = XFS_DAS_RM_SHRINK;
 			trace_xfs_das_state_return(dac->dela_state);
 			return -EAGAIN;
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 5d3aa0c..4838094 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -387,9 +387,8 @@ enum xfs_delattr_state {
 /*
  * Defines for xfs_delattr_context.flags
  */
-#define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
-#define XFS_DAC_LEAF_ADDNAME_INIT	0x02 /* xfs_attr_leaf_addname init*/
-#define XFS_DAC_DELAYED_OP_INIT		0x04 /* delayed operations init*/
+#define XFS_DAC_LEAF_ADDNAME_INIT	0x01 /* xfs_attr_leaf_addname init*/
+#define XFS_DAC_DELAYED_OP_INIT		0x02 /* delayed operations init*/
 
 /*
  * Context used for keeping track of delayed attribute operations
@@ -446,11 +445,8 @@ int xfs_inode_hasattr(struct xfs_inode *ip);
 int xfs_attr_get_ilocked(struct xfs_da_args *args);
 int xfs_attr_get(struct xfs_da_args *args);
 int xfs_attr_set(struct xfs_da_args *args);
-int xfs_attr_set_args(struct xfs_da_args *args);
-int xfs_attr_set_iter(struct xfs_delattr_context *dac,
-		      struct xfs_buf **leaf_bp);
+int xfs_attr_set_iter(struct xfs_delattr_context *dac);
 int xfs_has_attr(struct xfs_da_args *args);
-int xfs_attr_remove_args(struct xfs_da_args *args);
 int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
 bool xfs_attr_namecheck(const void *name, size_t length);
 void xfs_delattr_context_init(struct xfs_delattr_context *dac,
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 25639c0..a5ff5e0 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -762,7 +762,6 @@ xfs_attr_rmtval_remove(
 	 * by the parent
 	 */
 	if (!done) {
-		dac->flags |= XFS_DAC_DEFER_FINISH;
 		trace_xfs_das_state_return(dac->dela_state);
 		return -EAGAIN;
 	}
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
index c3b94a7..3185350 100644
--- a/fs/xfs/xfs_attr_item.c
+++ b/fs/xfs/xfs_attr_item.c
@@ -291,7 +291,6 @@ int
 xfs_trans_attr(
 	struct xfs_delattr_context	*dac,
 	struct xfs_attrd_log_item	*attrdp,
-	struct xfs_buf			**leaf_bp,
 	uint32_t			op_flags)
 {
 	struct xfs_da_args		*args = dac->da_args;
@@ -304,7 +303,7 @@ xfs_trans_attr(
 	switch (op_flags) {
 	case XFS_ATTR_OP_FLAGS_SET:
 		args->op_flags |= XFS_DA_OP_ADDNAME;
-		error = xfs_attr_set_iter(dac, leaf_bp);
+		error = xfs_attr_set_iter(dac);
 		break;
 	case XFS_ATTR_OP_FLAGS_REMOVE:
 		ASSERT(XFS_IFORK_Q(args->dp));
@@ -428,8 +427,7 @@ xfs_attr_finish_item(
 	 */
 	dac->da_args->trans = tp;
 
-	error = xfs_trans_attr(dac, done_item, &dac->leaf_bp,
-			       attr->xattri_op_flags);
+	error = xfs_trans_attr(dac, done_item, attr->xattri_op_flags);
 	if (error != -EAGAIN)
 		kmem_free(attr);
 
@@ -625,7 +623,7 @@ xfs_attri_item_recover(
 	xfs_trans_ijoin(args.trans, ip, 0);
 
 	error = xfs_trans_attr(&attr.xattri_dac, done_item,
-			       &attr.xattri_dac.leaf_bp, attrp->alfi_op_flags);
+			       attrp->alfi_op_flags);
 	if (error == -EAGAIN) {
 		/*
 		 * There's more work to do, so make a new xfs_attr_item and add
-- 
2.7.4



* [PATCH v14 13/15] xfs: Add delayed attributes error tag
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (11 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 12/15] xfs: Remove unused xfs_attr_*_args Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 14/15] xfs: Add delattr mount option Allison Henderson
  2020-12-18  7:29 ` [PATCH v14 15/15] xfs: Merge xfs_delattr_context into xfs_attr_item Allison Henderson
  14 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

From: Allison Collins <allison.henderson@oracle.com>

This patch adds an error tag that we can use to test delayed attribute
recovery and replay.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_errortag.h | 4 +++-
 fs/xfs/xfs_attr_item.c       | 8 ++++++++
 fs/xfs/xfs_error.c           | 3 +++
 3 files changed, 14 insertions(+), 1 deletion(-)
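
The injection point added to xfs_trans_attr can be pictured with the
following userspace model. It is a simplified sketch, not the kernel
implementation: model_randfactor, model_test_error and model_trans_attr are
hypothetical names, rand() stands in for the kernel's random source, and the
assumption that a factor of 0 means "tag disabled" is mine. The "1 means
always, 2 means 1/2 time" behaviour follows the comment in xfs_errortag.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

enum { MODEL_ERRTAG_DELAYED_ATTR = 0, MODEL_ERRTAG_MAX = 1 };

/* Per-tag random factor; assumed here that 0 leaves the tag disabled. */
static unsigned int model_randfactor[MODEL_ERRTAG_MAX];

/* Fire if the real condition is true, or if the tag is armed and hits. */
static bool
model_test_error(bool expr, int tag)
{
	unsigned int	factor = model_randfactor[tag];

	if (expr)
		return true;
	if (factor == 0)
		return false;
	/* factor 1 always fires, factor 2 fires about half the time, etc. */
	return (unsigned int)rand() % factor == 0;
}

/* Model of the injection point: fail with -EIO before doing any work. */
static int
model_trans_attr(void)
{
	if (model_test_error(false, MODEL_ERRTAG_DELAYED_ATTR))
		return -EIO;
	return 0;
}
```

Because the new tag's default factor is 1, arming it makes every delayed
attribute operation fail at this point, which is what a recovery and replay
test wants.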

diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index 53b305d..cb38cbf 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -56,7 +56,8 @@
 #define XFS_ERRTAG_FORCE_SUMMARY_RECALC			33
 #define XFS_ERRTAG_IUNLINK_FALLBACK			34
 #define XFS_ERRTAG_BUF_IOERROR				35
-#define XFS_ERRTAG_MAX					36
+#define XFS_ERRTAG_DELAYED_ATTR				36
+#define XFS_ERRTAG_MAX					37
 
 /*
  * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -97,5 +98,6 @@
 #define XFS_RANDOM_FORCE_SUMMARY_RECALC			1
 #define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
 #define XFS_RANDOM_BUF_IOERROR				XFS_RANDOM_DEFAULT
+#define XFS_RANDOM_DELAYED_ATTR				1
 
 #endif /* __XFS_ERRORTAG_H_ */
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
index 3185350..e1cfef1 100644
--- a/fs/xfs/xfs_attr_item.c
+++ b/fs/xfs/xfs_attr_item.c
@@ -40,6 +40,8 @@
 #include "xfs_trans_space.h"
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
+#include "xfs_error.h"
+#include "xfs_errortag.h"
 
 static const struct xfs_item_ops xfs_attri_item_ops;
 static const struct xfs_item_ops xfs_attrd_item_ops;
@@ -300,6 +302,11 @@ xfs_trans_attr(
 	if (error)
 		return error;
 
+	if (XFS_TEST_ERROR(false, args->dp->i_mount, XFS_ERRTAG_DELAYED_ATTR)) {
+		error = -EIO;
+		goto out;
+	}
+
 	switch (op_flags) {
 	case XFS_ATTR_OP_FLAGS_SET:
 		args->op_flags |= XFS_DA_OP_ADDNAME;
@@ -314,6 +321,7 @@ xfs_trans_attr(
 		break;
 	}
 
+out:
 	/*
 	 * Mark the transaction dirty, even on error. This ensures the
 	 * transaction is aborted, which:
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 7f6e208..fc551cb 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -54,6 +54,7 @@ static unsigned int xfs_errortag_random_default[] = {
 	XFS_RANDOM_FORCE_SUMMARY_RECALC,
 	XFS_RANDOM_IUNLINK_FALLBACK,
 	XFS_RANDOM_BUF_IOERROR,
+	XFS_RANDOM_DELAYED_ATTR,
 };
 
 struct xfs_errortag_attr {
@@ -164,6 +165,7 @@ XFS_ERRORTAG_ATTR_RW(force_repair,	XFS_ERRTAG_FORCE_SCRUB_REPAIR);
 XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
 XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
 XFS_ERRORTAG_ATTR_RW(buf_ioerror,	XFS_ERRTAG_BUF_IOERROR);
+XFS_ERRORTAG_ATTR_RW(delayed_attr,	XFS_ERRTAG_DELAYED_ATTR);
 
 static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -202,6 +204,7 @@ static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(bad_summary),
 	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
 	XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
+	XFS_ERRORTAG_ATTR_LIST(delayed_attr),
 	NULL,
 };
 
-- 
2.7.4



* [PATCH v14 14/15] xfs: Add delattr mount option
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (12 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 13/15] xfs: Add delayed attributes error tag Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2021-01-05  5:46   ` Darrick J. Wong
  2020-12-18  7:29 ` [PATCH v14 15/15] xfs: Merge xfs_delattr_context into xfs_attr_item Allison Henderson
  14 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This patch adds a mount option to enable delayed attributes. Eventually
this can be removed when delayed attrs become permanent.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.h | 2 +-
 fs/xfs/xfs_mount.h       | 1 +
 fs/xfs/xfs_super.c       | 6 +++++-
 fs/xfs/xfs_xattr.c       | 2 ++
 4 files changed, 9 insertions(+), 2 deletions(-)
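
A minimal userspace model of how the option ends up gating the feature,
assuming nothing beyond what the diff below shows: parsing "delattr" sets a
bit in the mount flags, and the xfs_hasdelattr check tests that bit.
model_mount, model_parse_param and model_hasdelattr are hypothetical names;
only the bit position (1ULL << 28) is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MOUNT_DELATTR	(1ULL << 28)	/* same bit as the patch */

struct model_mount {
	uint64_t	m_flags;
};

/* Recognise the flag-style option; unknown keys are rejected. */
static int
model_parse_param(struct model_mount *mp, const char *key)
{
	if (strcmp(key, "delattr") == 0) {
		mp->m_flags |= MODEL_MOUNT_DELATTR;
		return 0;
	}
	return -EINVAL;
}

/* Conversion to bool turns any nonzero masked value into true. */
static inline bool
model_hasdelattr(const struct model_mount *mp)
{
	return mp->m_flags & MODEL_MOUNT_DELATTR;
}
```

With the patch applied, the option would presumably be passed as
"-o delattr" at mount time; without it the attr code keeps its existing
behaviour.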

diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 4838094..edd008d 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -30,7 +30,7 @@ struct xfs_attr_list_context;
 
 static inline bool xfs_hasdelattr(struct xfs_mount *mp)
 {
-	return false;
+	return mp->m_flags & XFS_MOUNT_DELATTR;
 }
 
 /*
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index dfa429b..4794f27 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -254,6 +254,7 @@ typedef struct xfs_mount {
 #define XFS_MOUNT_NOATTR2	(1ULL << 25)	/* disable use of attr2 format */
 #define XFS_MOUNT_DAX_ALWAYS	(1ULL << 26)
 #define XFS_MOUNT_DAX_NEVER	(1ULL << 27)
+#define XFS_MOUNT_DELATTR	(1ULL << 28)	/* enable delayed attributes */
 
 /*
  * Max and min values for mount-option defined I/O
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 813be87..72169ee 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -92,7 +92,7 @@ enum {
 	Opt_filestreams, Opt_quota, Opt_noquota, Opt_usrquota, Opt_grpquota,
 	Opt_prjquota, Opt_uquota, Opt_gquota, Opt_pquota,
 	Opt_uqnoenforce, Opt_gqnoenforce, Opt_pqnoenforce, Opt_qnoenforce,
-	Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum,
+	Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum, Opt_delattr
 };
 
 static const struct fs_parameter_spec xfs_fs_parameters[] = {
@@ -137,6 +137,7 @@ static const struct fs_parameter_spec xfs_fs_parameters[] = {
 	fsparam_flag("nodiscard",	Opt_nodiscard),
 	fsparam_flag("dax",		Opt_dax),
 	fsparam_enum("dax",		Opt_dax_enum, dax_param_enums),
+	fsparam_flag("delattr",		Opt_delattr),
 	{}
 };
 
@@ -1292,6 +1293,9 @@ xfs_fs_parse_param(
 		xfs_mount_set_dax_mode(mp, result.uint_32);
 		return 0;
 #endif
+	case Opt_delattr:
+		mp->m_flags |= XFS_MOUNT_DELATTR;
+		return 0;
 	/* Following mount options will be removed in September 2025 */
 	case Opt_ikeep:
 		xfs_warn(mp, "%s mount option is deprecated.", param->key);
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index 9b0c790..8ec61df 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -8,6 +8,8 @@
 #include "xfs_shared.h"
 #include "xfs_format.h"
 #include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
 #include "xfs_da_format.h"
 #include "xfs_inode.h"
 #include "xfs_da_btree.h"
-- 
2.7.4



* [PATCH v14 15/15] xfs: Merge xfs_delattr_context into xfs_attr_item
  2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
                   ` (13 preceding siblings ...)
  2020-12-18  7:29 ` [PATCH v14 14/15] xfs: Add delattr mount option Allison Henderson
@ 2020-12-18  7:29 ` Allison Henderson
  2021-01-05  5:47   ` Darrick J. Wong
  14 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-18  7:29 UTC (permalink / raw)
  To: linux-xfs

This is a cleanup patch that merges xfs_delattr_context into
xfs_attr_item.  Now that the refactoring is complete and the delayed
operation infrastructure is in place, we can combine these to eliminate
the extra struct.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 138 ++++++++++++++++++++--------------------
 fs/xfs/libxfs/xfs_attr.h        |  40 +++++-------
 fs/xfs/libxfs/xfs_attr_remote.c |  34 +++++-----
 fs/xfs/libxfs/xfs_attr_remote.h |   6 +-
 fs/xfs/xfs_attr_item.c          |  46 ++++++--------
 5 files changed, 127 insertions(+), 137 deletions(-)
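
The shape of the merge can be sketched in userspace as below. The xattri_*
field names are the ones that appear in the diff; everything else
(model_args, the model_* struct names, the field types, and the model_merge
helper) is hypothetical and only there to show the before and after layouts
side by side. Several real fields (leaf_bp, da_state, the list link) are
omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct model_args {
	int			dummy;
};

/* Before: state machine context kept in its own struct ... */
struct model_delattr_context {
	struct model_args	*da_args;
	int			dela_state;
	unsigned int		flags;
	unsigned int		blkcnt;
};

/* ... and embedded in the item as xattri_dac. */
struct model_attr_item_before {
	struct model_delattr_context	xattri_dac;
	unsigned int			xattri_op_flags;
};

/* After: the context fields live directly in the item. */
struct model_attr_item_after {
	struct model_args	*xattri_da_args;
	int			xattri_dela_state;
	unsigned int		xattri_flags;
	unsigned int		xattri_blkcnt;
	unsigned int		xattri_op_flags;
};

/* Illustrative only: copy each old field to its merged counterpart. */
static struct model_attr_item_after
model_merge(const struct model_attr_item_before *old)
{
	struct model_attr_item_after	merged = {
		.xattri_da_args		= old->xattri_dac.da_args,
		.xattri_dela_state	= old->xattri_dac.dela_state,
		.xattri_flags		= old->xattri_dac.flags,
		.xattri_blkcnt		= old->xattri_dac.blkcnt,
		.xattri_op_flags	= old->xattri_op_flags,
	};

	return merged;
}
```

The practical effect for callers is one less level of indirection: functions
that used to take the context pointer take the item pointer instead.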

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 6e5a900..badcdae 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -46,7 +46,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
  * Internal routines when attribute list is one block.
  */
 STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
-STATIC int xfs_attr_leaf_addname(struct xfs_delattr_context *dac);
+STATIC int xfs_attr_leaf_addname(struct xfs_attr_item *attr);
 STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
 STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
 
@@ -54,8 +54,8 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
  * Internal routines when attribute list is more than one block.
  */
 STATIC int xfs_attr_node_get(xfs_da_args_t *args);
-STATIC int xfs_attr_node_addname(struct xfs_delattr_context *dac);
-STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
+STATIC int xfs_attr_node_addname(struct xfs_attr_item *attr);
+STATIC int xfs_attr_node_removename_iter(struct xfs_attr_item *attr);
 STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
 				 struct xfs_da_state **state);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
@@ -276,27 +276,27 @@ xfs_attr_set_shortform(
  */
 int
 xfs_attr_set_iter(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	struct xfs_inode		*dp = args->dp;
-	struct xfs_buf			**leaf_bp = &dac->leaf_bp;
+	struct xfs_buf			**leaf_bp = &attr->xattri_leaf_bp;
 	int				error = 0;
 
 	/* State machine switch */
-	switch (dac->dela_state) {
+	switch (attr->xattri_dela_state) {
 	case XFS_DAS_FLIP_LFLAG:
 	case XFS_DAS_FOUND_LBLK:
 	case XFS_DAS_RM_LBLK:
-		return xfs_attr_leaf_addname(dac);
+		return xfs_attr_leaf_addname(attr);
 	case XFS_DAS_FOUND_NBLK:
 	case XFS_DAS_FLIP_NFLAG:
 	case XFS_DAS_ALLOC_NODE:
-		return xfs_attr_node_addname(dac);
+		return xfs_attr_node_addname(attr);
 	case XFS_DAS_UNINIT:
 		break;
 	default:
-		ASSERT(dac->dela_state != XFS_DAS_RM_SHRINK);
+		ASSERT(attr->xattri_dela_state != XFS_DAS_RM_SHRINK);
 		break;
 	}
 
@@ -328,7 +328,7 @@ xfs_attr_set_iter(
 	}
 
 	if (!xfs_bmap_one_block(dp, XFS_ATTR_FORK))
-		return xfs_attr_node_addname(dac);
+		return xfs_attr_node_addname(attr);
 
 	error = xfs_attr_leaf_try_add(args, *leaf_bp);
 	switch (error) {
@@ -351,11 +351,11 @@ xfs_attr_set_iter(
 		 * when we come back, we'll be a node, so we'll fall
 		 * down into the node handling code below
 		 */
-		trace_xfs_das_state_return(dac->dela_state);
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 		return -EAGAIN;
 	case 0:
-		dac->dela_state = XFS_DAS_FOUND_LBLK;
-		trace_xfs_das_state_return(dac->dela_state);
+		attr->xattri_dela_state = XFS_DAS_FOUND_LBLK;
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 		return -EAGAIN;
 	}
 	return error;
@@ -401,13 +401,13 @@ xfs_has_attr(
  */
 int
 xfs_attr_remove_iter(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	struct xfs_inode		*dp = args->dp;
 
 	/* If we are shrinking a node, resume shrink */
-	if (dac->dela_state == XFS_DAS_RM_SHRINK)
+	if (attr->xattri_dela_state == XFS_DAS_RM_SHRINK)
 		goto node;
 
 	if (!xfs_inode_hasattr(dp))
@@ -422,7 +422,7 @@ xfs_attr_remove_iter(
 		return xfs_attr_leaf_removename(args);
 node:
 	/* If we are not short form or leaf, then proceed to remove node */
-	return  xfs_attr_node_removename_iter(dac);
+	return  xfs_attr_node_removename_iter(attr);
 }
 
 /*
@@ -573,7 +573,7 @@ xfs_attr_item_init(
 
 	new = kmem_zalloc(sizeof(struct xfs_attr_item), KM_NOFS);
 	new->xattri_op_flags = op_flags;
-	new->xattri_dac.da_args = args;
+	new->xattri_da_args = args;
 
 	*attr = new;
 	return 0;
@@ -768,16 +768,16 @@ xfs_attr_leaf_try_add(
  */
 STATIC int
 xfs_attr_leaf_addname(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	struct xfs_buf			*bp = NULL;
 	int				error, forkoff;
 	struct xfs_inode		*dp = args->dp;
 	struct xfs_mount		*mp = args->dp->i_mount;
 
 	/* State machine switch */
-	switch (dac->dela_state) {
+	switch (attr->xattri_dela_state) {
 	case XFS_DAS_FLIP_LFLAG:
 		goto das_flip_flag;
 	case XFS_DAS_RM_LBLK:
@@ -794,10 +794,10 @@ xfs_attr_leaf_addname(
 	 */
 
 	/* Open coded xfs_attr_rmtval_set without trans handling */
-	if ((dac->flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
-		dac->flags |= XFS_DAC_LEAF_ADDNAME_INIT;
+	if ((attr->xattri_flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
+		attr->xattri_flags |= XFS_DAC_LEAF_ADDNAME_INIT;
 		if (args->rmtblkno > 0) {
-			error = xfs_attr_rmtval_find_space(dac);
+			error = xfs_attr_rmtval_find_space(attr);
 			if (error)
 				return error;
 		}
@@ -807,12 +807,12 @@ xfs_attr_leaf_addname(
 	 * Roll through the "value", allocating blocks on disk as
 	 * required.
 	 */
-	if (dac->blkcnt > 0) {
-		error = xfs_attr_rmtval_set_blk(dac);
+	if (attr->xattri_blkcnt > 0) {
+		error = xfs_attr_rmtval_set_blk(attr);
 		if (error)
 			return error;
 
-		trace_xfs_das_state_return(dac->dela_state);
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 		return -EAGAIN;
 	}
 
@@ -846,8 +846,8 @@ xfs_attr_leaf_addname(
 		/*
 		 * Commit the flag value change and start the next trans in series.
 		 */
-		dac->dela_state = XFS_DAS_FLIP_LFLAG;
-		trace_xfs_das_state_return(dac->dela_state);
+		attr->xattri_dela_state = XFS_DAS_FLIP_LFLAG;
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 		return -EAGAIN;
 	}
 das_flip_flag:
@@ -862,12 +862,12 @@ xfs_attr_leaf_addname(
 		return error;
 
 	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
-	dac->dela_state = XFS_DAS_RM_LBLK;
+	attr->xattri_dela_state = XFS_DAS_RM_LBLK;
 das_rm_lblk:
 	if (args->rmtblkno) {
-		error = xfs_attr_rmtval_remove(dac);
+		error = xfs_attr_rmtval_remove(attr);
 		if (error == -EAGAIN)
-			trace_xfs_das_state_return(dac->dela_state);
+			trace_xfs_das_state_return(attr->xattri_dela_state);
 		if (error)
 			return error;
 	}
@@ -1041,9 +1041,9 @@ xfs_attr_node_hasname(
  */
 STATIC int
 xfs_attr_node_addname(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	struct xfs_da_state		*state = NULL;
 	struct xfs_da_state_blk		*blk;
 	int				retval = 0;
@@ -1053,7 +1053,7 @@ xfs_attr_node_addname(
 	trace_xfs_attr_node_addname(args);
 
 	/* State machine switch */
-	switch (dac->dela_state) {
+	switch (attr->xattri_dela_state) {
 	case XFS_DAS_FLIP_NFLAG:
 		goto das_flip_flag;
 	case XFS_DAS_FOUND_NBLK:
@@ -1119,7 +1119,7 @@ xfs_attr_node_addname(
 			 * this. dela_state is still unset by this function at
 			 * this point.
 			 */
-			trace_xfs_das_state_return(dac->dela_state);
+			trace_xfs_das_state_return(attr->xattri_dela_state);
 			return -EAGAIN;
 		}
 
@@ -1151,8 +1151,8 @@ xfs_attr_node_addname(
 	xfs_da_state_free(state);
 	state = NULL;
 
-	dac->dela_state = XFS_DAS_FOUND_NBLK;
-	trace_xfs_das_state_return(dac->dela_state);
+	attr->xattri_dela_state = XFS_DAS_FOUND_NBLK;
+	trace_xfs_das_state_return(attr->xattri_dela_state);
 	return -EAGAIN;
 das_found_nblk:
 
@@ -1164,7 +1164,7 @@ xfs_attr_node_addname(
 	 */
 	if (args->rmtblkno > 0) {
 		/* Open coded xfs_attr_rmtval_set without trans handling */
-		error = xfs_attr_rmtval_find_space(dac);
+		error = xfs_attr_rmtval_find_space(attr);
 		if (error)
 			return error;
 
@@ -1172,14 +1172,14 @@ xfs_attr_node_addname(
 		 * Roll through the "value", allocating blocks on disk as
 		 * required.  Set the state in case of -EAGAIN return code
 		 */
-		dac->dela_state = XFS_DAS_ALLOC_NODE;
+		attr->xattri_dela_state = XFS_DAS_ALLOC_NODE;
 das_alloc_node:
-		if (dac->blkcnt > 0) {
-			error = xfs_attr_rmtval_set_blk(dac);
+		if (attr->xattri_blkcnt > 0) {
+			error = xfs_attr_rmtval_set_blk(attr);
 			if (error)
 				return error;
 
-			trace_xfs_das_state_return(dac->dela_state);
+			trace_xfs_das_state_return(attr->xattri_dela_state);
 			return -EAGAIN;
 		}
 
@@ -1214,8 +1214,8 @@ xfs_attr_node_addname(
 		/*
 		 * Commit the flag value change and start the next trans in series
 		 */
-		dac->dela_state = XFS_DAS_FLIP_NFLAG;
-		trace_xfs_das_state_return(dac->dela_state);
+		attr->xattri_dela_state = XFS_DAS_FLIP_NFLAG;
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 		return -EAGAIN;
 	}
 das_flip_flag:
@@ -1230,13 +1230,13 @@ xfs_attr_node_addname(
 		return error;
 
 	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
-	dac->dela_state = XFS_DAS_RM_NBLK;
+	attr->xattri_dela_state = XFS_DAS_RM_NBLK;
 das_rm_nblk:
 	if (args->rmtblkno) {
-		error = xfs_attr_rmtval_remove(dac);
+		error = xfs_attr_rmtval_remove(attr);
 
 		if (error == -EAGAIN)
-			trace_xfs_das_state_return(dac->dela_state);
+			trace_xfs_das_state_return(attr->xattri_dela_state);
 
 		if (error)
 			return error;
@@ -1344,10 +1344,10 @@ xfs_attr_leaf_mark_incomplete(
  */
 STATIC
 int xfs_attr_node_removename_setup(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
-	struct xfs_da_state		**state = &dac->da_state;
+	struct xfs_da_args		*args = attr->xattri_da_args;
+	struct xfs_da_state		**state = &attr->xattri_da_state;
 	int				error;
 
 	error = xfs_attr_node_hasname(args, state);
@@ -1371,7 +1371,7 @@ int xfs_attr_node_removename_setup(
 
 STATIC int
 xfs_attr_node_remove_rmt (
-	struct xfs_delattr_context	*dac,
+	struct xfs_attr_item		*attr,
 	struct xfs_da_state		*state)
 {
 	int				error = 0;
@@ -1379,9 +1379,9 @@ xfs_attr_node_remove_rmt (
 	/*
 	 * May return -EAGAIN to request that the caller recall this function
 	 */
-	error = xfs_attr_rmtval_remove(dac);
+	error = xfs_attr_rmtval_remove(attr);
 	if (error == -EAGAIN)
-		trace_xfs_das_state_return(dac->dela_state);
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 	if (error)
 		return error;
 
@@ -1425,10 +1425,10 @@ xfs_attr_node_remove_cleanup(
  */
 STATIC int
 xfs_attr_node_remove_step(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
-	struct xfs_da_state		*state = dac->da_state;
+	struct xfs_da_args		*args = attr->xattri_da_args;
+	struct xfs_da_state		*state = attr->xattri_da_state;
 	int				error = 0;
 	/*
 	 * If there is an out-of-line value, de-allocate the blocks.
@@ -1439,7 +1439,7 @@ xfs_attr_node_remove_step(
 		/*
 		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
 		 */
-		error = xfs_attr_node_remove_rmt(dac, state);
+		error = xfs_attr_node_remove_rmt(attr, state);
 		if (error)
 			return error;
 	}
@@ -1460,29 +1460,29 @@ xfs_attr_node_remove_step(
  */
 STATIC int
 xfs_attr_node_removename_iter(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	struct xfs_da_state		*state = NULL;
 	int				retval, error;
 	struct xfs_inode		*dp = args->dp;
 
 	trace_xfs_attr_node_removename(args);
 
-	if (!dac->da_state) {
-		error = xfs_attr_node_removename_setup(dac);
+	if (!attr->xattri_da_state) {
+		error = xfs_attr_node_removename_setup(attr);
 		if (error)
 			goto out;
 	}
-	state = dac->da_state;
+	state = attr->xattri_da_state;
 
-	switch (dac->dela_state) {
+	switch (attr->xattri_dela_state) {
 	case XFS_DAS_UNINIT:
 		/*
 		 * repeatedly remove remote blocks, remove the entry and join.
 		 * returns -EAGAIN or 0 for completion of the step.
 		 */
-		error = xfs_attr_node_remove_step(dac);
+		error = xfs_attr_node_remove_step(attr);
 		if (error)
 			break;
 
@@ -1498,8 +1498,8 @@ xfs_attr_node_removename_iter(
 			if (error)
 				return error;
 
-			dac->dela_state = XFS_DAS_RM_SHRINK;
-			trace_xfs_das_state_return(dac->dela_state);
+			attr->xattri_dela_state = XFS_DAS_RM_SHRINK;
+			trace_xfs_das_state_return(attr->xattri_dela_state);
 			return -EAGAIN;
 		}
 
@@ -1519,7 +1519,7 @@ xfs_attr_node_removename_iter(
 	}
 
 	if (error == -EAGAIN) {
-		trace_xfs_das_state_return(dac->dela_state);
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 		return error;
 	}
 out:
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index edd008d..d1a59d0 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -364,7 +364,7 @@ struct xfs_attr_list_context {
  */
 
 /*
- * Enum values for xfs_delattr_context.da_state
+ * Enum values for xfs_attr_item.xattri_dela_state
  *
  * These values are used by delayed attribute operations to keep track  of where
  * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
@@ -385,7 +385,7 @@ enum xfs_delattr_state {
 };
 
 /*
- * Defines for xfs_delattr_context.flags
+ * Defines for xfs_attr_item.xattri_flags
  */
 #define XFS_DAC_LEAF_ADDNAME_INIT	0x01 /* xfs_attr_leaf_addname init*/
 #define XFS_DAC_DELAYED_OP_INIT		0x02 /* delayed operations init*/
@@ -393,32 +393,25 @@ enum xfs_delattr_state {
 /*
  * Context used for keeping track of delayed attribute operations
  */
-struct xfs_delattr_context {
-	struct xfs_da_args      *da_args;
+struct xfs_attr_item {
+	struct xfs_da_args		*xattri_da_args;
 
 	/*
 	 * Used by xfs_attr_set to hold a leaf buffer across a transaction roll
 	 */
-	struct xfs_buf		*leaf_bp;
+	struct xfs_buf			*xattri_leaf_bp;
 
 	/* Used in xfs_attr_rmtval_set_blk to roll through allocating blocks */
-	struct xfs_bmbt_irec	map;
-	xfs_dablk_t		lblkno;
-	int			blkcnt;
+	struct xfs_bmbt_irec		xattri_map;
+	xfs_dablk_t			xattri_lblkno;
+	int				xattri_blkcnt;
 
 	/* Used in xfs_attr_node_removename to roll through removing blocks */
-	struct xfs_da_state     *da_state;
+	struct xfs_da_state		*xattri_da_state;
 
 	/* Used to keep track of current state of delayed operation */
-	unsigned int            flags;
-	enum xfs_delattr_state  dela_state;
-};
-
-/*
- * List of attrs to commit later.
- */
-struct xfs_attr_item {
-	struct xfs_delattr_context	xattri_dac;
+	unsigned int			xattri_flags;
+	enum xfs_delattr_state		xattri_dela_state;
 
 	/*
 	 * Indicates if the attr operation is a set or a remove
@@ -426,7 +419,10 @@ struct xfs_attr_item {
 	 */
 	uint32_t			xattri_op_flags;
 
-	/* used to log this item to an intent */
+	/*
+	 * used to log this item to an intent containing a list of attrs to
+	 * commit later
+	 */
 	struct list_head		xattri_list;
 };
 
@@ -445,12 +441,10 @@ int xfs_inode_hasattr(struct xfs_inode *ip);
 int xfs_attr_get_ilocked(struct xfs_da_args *args);
 int xfs_attr_get(struct xfs_da_args *args);
 int xfs_attr_set(struct xfs_da_args *args);
-int xfs_attr_set_iter(struct xfs_delattr_context *dac);
+int xfs_attr_set_iter(struct xfs_attr_item *attr);
 int xfs_has_attr(struct xfs_da_args *args);
-int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
+int xfs_attr_remove_iter(struct xfs_attr_item *attr);
 bool xfs_attr_namecheck(const void *name, size_t length);
-void xfs_delattr_context_init(struct xfs_delattr_context *dac,
-			      struct xfs_da_args *args);
 int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 int xfs_attr_set_deferred(struct xfs_da_args *args);
 int xfs_attr_remove_deferred(struct xfs_da_args *args);
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index a5ff5e0..42cc9cc 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -634,14 +634,14 @@ xfs_attr_rmtval_set(
  */
 int
 xfs_attr_rmtval_find_space(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
-	struct xfs_bmbt_irec		*map = &dac->map;
+	struct xfs_da_args		*args = attr->xattri_da_args;
+	struct xfs_bmbt_irec		*map = &attr->xattri_map;
 	int				error;
 
-	dac->lblkno = 0;
-	dac->blkcnt = 0;
+	attr->xattri_lblkno = 0;
+	attr->xattri_blkcnt = 0;
 	args->rmtblkcnt = 0;
 	args->rmtblkno = 0;
 	memset(map, 0, sizeof(struct xfs_bmbt_irec));
@@ -650,8 +650,8 @@ xfs_attr_rmtval_find_space(
 	if (error)
 		return error;
 
-	dac->blkcnt = args->rmtblkcnt;
-	dac->lblkno = args->rmtblkno;
+	attr->xattri_blkcnt = args->rmtblkcnt;
+	attr->xattri_lblkno = args->rmtblkno;
 
 	return 0;
 }
@@ -664,17 +664,17 @@ xfs_attr_rmtval_find_space(
  */
 int
 xfs_attr_rmtval_set_blk(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	struct xfs_inode		*dp = args->dp;
-	struct xfs_bmbt_irec		*map = &dac->map;
+	struct xfs_bmbt_irec		*map = &attr->xattri_map;
 	int nmap;
 	int error;
 
 	nmap = 1;
-	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)dac->lblkno,
-				dac->blkcnt, XFS_BMAPI_ATTRFORK, args->total,
+	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)attr->xattri_lblkno,
+				attr->xattri_blkcnt, XFS_BMAPI_ATTRFORK, args->total,
 				map, &nmap);
 	if (error)
 		return error;
@@ -684,8 +684,8 @@ xfs_attr_rmtval_set_blk(
 	       (map->br_startblock != HOLESTARTBLOCK));
 
 	/* roll attribute extent map forwards */
-	dac->lblkno += map->br_blockcount;
-	dac->blkcnt -= map->br_blockcount;
+	attr->xattri_lblkno += map->br_blockcount;
+	attr->xattri_blkcnt -= map->br_blockcount;
 
 	return 0;
 }
@@ -738,9 +738,9 @@ xfs_attr_rmtval_invalidate(
  */
 int
 xfs_attr_rmtval_remove(
-	struct xfs_delattr_context	*dac)
+	struct xfs_attr_item		*attr)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	int				error, done;
 
 	/*
@@ -762,7 +762,7 @@ xfs_attr_rmtval_remove(
 	 * by the parent
 	 */
 	if (!done) {
-		trace_xfs_das_state_return(dac->dela_state);
+		trace_xfs_das_state_return(attr->xattri_dela_state);
 		return -EAGAIN;
 	}
 
diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
index 6ae91af..d3aa27d 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.h
+++ b/fs/xfs/libxfs/xfs_attr_remote.h
@@ -13,9 +13,9 @@ int xfs_attr_rmtval_set(struct xfs_da_args *args);
 int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
 		xfs_buf_flags_t incore_flags);
 int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
-int xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
+int xfs_attr_rmtval_remove(struct xfs_attr_item *attr);
 int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
 int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
-int xfs_attr_rmtval_set_blk(struct xfs_delattr_context *dac);
-int xfs_attr_rmtval_find_space(struct xfs_delattr_context *dac);
+int xfs_attr_rmtval_set_blk(struct xfs_attr_item *attr);
+int xfs_attr_rmtval_find_space(struct xfs_attr_item *attr);
 #endif /* __XFS_ATTR_REMOTE_H__ */
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
index e1cfef1..bbca949 100644
--- a/fs/xfs/xfs_attr_item.c
+++ b/fs/xfs/xfs_attr_item.c
@@ -291,11 +291,11 @@ xfs_attrd_item_release(
  */
 int
 xfs_trans_attr(
-	struct xfs_delattr_context	*dac,
+	struct xfs_attr_item		*attr,
 	struct xfs_attrd_log_item	*attrdp,
 	uint32_t			op_flags)
 {
-	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_args		*args = attr->xattri_da_args;
 	int				error;
 
 	error = xfs_qm_dqattach_locked(args->dp, 0);
@@ -310,11 +310,11 @@ xfs_trans_attr(
 	switch (op_flags) {
 	case XFS_ATTR_OP_FLAGS_SET:
 		args->op_flags |= XFS_DA_OP_ADDNAME;
-		error = xfs_attr_set_iter(dac);
+		error = xfs_attr_set_iter(attr);
 		break;
 	case XFS_ATTR_OP_FLAGS_REMOVE:
 		ASSERT(XFS_IFORK_Q(args->dp));
-		error = xfs_attr_remove_iter(dac);
+		error = xfs_attr_remove_iter(attr);
 		break;
 	default:
 		error = -EFSCORRUPTED;
@@ -358,16 +358,16 @@ xfs_attr_log_item(
 	 * structure with fields from this xfs_attr_item
 	 */
 	attrp = &attrip->attri_format;
-	attrp->alfi_ino = attr->xattri_dac.da_args->dp->i_ino;
+	attrp->alfi_ino = attr->xattri_da_args->dp->i_ino;
 	attrp->alfi_op_flags = attr->xattri_op_flags;
-	attrp->alfi_value_len = attr->xattri_dac.da_args->valuelen;
-	attrp->alfi_name_len = attr->xattri_dac.da_args->namelen;
-	attrp->alfi_attr_flags = attr->xattri_dac.da_args->attr_filter;
-
-	attrip->attri_name = (void *)attr->xattri_dac.da_args->name;
-	attrip->attri_value = attr->xattri_dac.da_args->value;
-	attrip->attri_name_len = attr->xattri_dac.da_args->namelen;
-	attrip->attri_value_len = attr->xattri_dac.da_args->valuelen;
+	attrp->alfi_value_len = attr->xattri_da_args->valuelen;
+	attrp->alfi_name_len = attr->xattri_da_args->namelen;
+	attrp->alfi_attr_flags = attr->xattri_da_args->attr_filter;
+
+	attrip->attri_name = (void *)attr->xattri_da_args->name;
+	attrip->attri_value = attr->xattri_da_args->value;
+	attrip->attri_name_len = attr->xattri_da_args->namelen;
+	attrip->attri_value_len = attr->xattri_da_args->valuelen;
 }
 
 /* Get an ATTRI. */
@@ -408,10 +408,8 @@ xfs_attr_finish_item(
 	struct xfs_attr_item		*attr;
 	struct xfs_attrd_log_item	*done_item = NULL;
 	int				error;
-	struct xfs_delattr_context	*dac;
 
 	attr = container_of(item, struct xfs_attr_item, xattri_list);
-	dac = &attr->xattri_dac;
 	if (done)
 		done_item = ATTRD_ITEM(done);
 
@@ -423,19 +421,18 @@ xfs_attr_finish_item(
 	 * in a standard delay op, so we need to catch this here and rejoin the
 	 * leaf to the new transaction
 	 */
-	if (attr->xattri_dac.leaf_bp &&
-	    attr->xattri_dac.leaf_bp->b_transp != tp) {
-		xfs_trans_bjoin(tp, attr->xattri_dac.leaf_bp);
-		xfs_trans_bhold(tp, attr->xattri_dac.leaf_bp);
+	if (attr->xattri_leaf_bp && attr->xattri_leaf_bp->b_transp != tp) {
+		xfs_trans_bjoin(tp, attr->xattri_leaf_bp);
+		xfs_trans_bhold(tp, attr->xattri_leaf_bp);
 	}
 
 	/*
 	 * Always reset trans after EAGAIN cycle
 	 * since the transaction is new
 	 */
-	dac->da_args->trans = tp;
+	attr->xattri_da_args->trans = tp;
 
-	error = xfs_trans_attr(dac, done_item, attr->xattri_op_flags);
+	error = xfs_trans_attr(attr, done_item, attr->xattri_op_flags);
 	if (error != -EAGAIN)
 		kmem_free(attr);
 
@@ -570,7 +567,7 @@ xfs_attri_item_recover(
 	struct xfs_attrd_log_item	*done_item = NULL;
 	struct xfs_attr_item		attr = {
 		.xattri_op_flags	= attrip->attri_format.alfi_op_flags,
-		.xattri_dac.da_args	= &args,
+		.xattri_da_args		= &args,
 	};
 
 	/*
@@ -630,8 +627,7 @@ xfs_attri_item_recover(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(args.trans, ip, 0);
 
-	error = xfs_trans_attr(&attr.xattri_dac, done_item,
-			       attrp->alfi_op_flags);
+	error = xfs_trans_attr(&attr, done_item, attrp->alfi_op_flags);
 	if (error == -EAGAIN) {
 		/*
 		 * There's more work to do, so make a new xfs_attr_item and add
@@ -648,7 +644,7 @@ xfs_attri_item_recover(
 		memcpy(new_args, &args, sizeof(struct xfs_da_args));
 		memcpy(new_attr, &attr, sizeof(struct xfs_attr_item));
 
-		new_attr->xattri_dac.da_args = new_args;
+		new_attr->xattri_da_args = new_args;
 		memset(&new_attr->xattri_list, 0, sizeof(struct list_head));
 
 		xfs_defer_add(args.trans, XFS_DEFER_OPS_TYPE_ATTR,
-- 
2.7.4



* Re: [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step
  2020-12-18  7:29 ` [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step Allison Henderson
@ 2020-12-21  6:45   ` Chandan Babu R
  2020-12-21 23:48     ` Allison Henderson
  2020-12-22 16:50   ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Chandan Babu R @ 2020-12-21  6:45 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, 18 Dec 2020 00:29:03 -0700, Allison Henderson wrote:
> From: Allison Collins <allison.henderson@oracle.com>
> 
> This patch as a new helper function xfs_attr_node_remove_step.  This

The above should probably be "This patch adds a new ...".

> will help simplify and modularize the calling function
> xfs_attr_node_remove.

The calling function is xfs_attr_node_removename.

Other than the above-mentioned nits, the changes look good to me.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 46 ++++++++++++++++++++++++++++++++++------------
>  1 file changed, 34 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index fd8e641..8b55a8d 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -1228,19 +1228,14 @@ xfs_attr_node_remove_rmt(
>   * the root node (a special case of an intermediate node).
>   */
>  STATIC int
> -xfs_attr_node_removename(
> -	struct xfs_da_args	*args)
> +xfs_attr_node_remove_step(
> +	struct xfs_da_args	*args,
> +	struct xfs_da_state	*state)
>  {
> -	struct xfs_da_state	*state;
>  	struct xfs_da_state_blk	*blk;
>  	int			retval, error;
>  	struct xfs_inode	*dp = args->dp;
>  
> -	trace_xfs_attr_node_removename(args);
> -
> -	error = xfs_attr_node_removename_setup(args, &state);
> -	if (error)
> -		goto out;
>  
>  	/*
>  	 * If there is an out-of-line value, de-allocate the blocks.
> @@ -1250,7 +1245,7 @@ xfs_attr_node_removename(
>  	if (args->rmtblkno > 0) {
>  		error = xfs_attr_node_remove_rmt(args, state);
>  		if (error)
> -			goto out;
> +			return error;
>  	}
>  
>  	/*
> @@ -1267,18 +1262,45 @@ xfs_attr_node_removename(
>  	if (retval && (state->path.active > 1)) {
>  		error = xfs_da3_join(state);
>  		if (error)
> -			goto out;
> +			return error;
>  		error = xfs_defer_finish(&args->trans);
>  		if (error)
> -			goto out;
> +			return error;
>  		/*
>  		 * Commit the Btree join operation and start a new trans.
>  		 */
>  		error = xfs_trans_roll_inode(&args->trans, dp);
>  		if (error)
> -			goto out;
> +			return error;
>  	}
>  
> +	return error;
> +}
> +
> +/*
> + * Remove a name from a B-tree attribute list.
> + *
> + * This routine will find the blocks of the name to remove, remove them and
> + * shrink the tree if needed.
> + */
> +STATIC int
> +xfs_attr_node_removename(
> +	struct xfs_da_args	*args)
> +{
> +	struct xfs_da_state	*state = NULL;
> +	int			error;
> +	struct xfs_inode	*dp = args->dp;
> +
> +	trace_xfs_attr_node_removename(args);
> +
> +	error = xfs_attr_node_removename_setup(args, &state);
> +	if (error)
> +		goto out;
> +
> +	error = xfs_attr_node_remove_step(args, state);
> +	if (error)
> +		goto out;
> +
>  	/*
>  	 * If the result is small enough, push it all into the inode.
>  	 */
> 


-- 
chandan





* Re: [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup
  2020-12-18  7:29 ` [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup Allison Henderson
@ 2020-12-21  6:45   ` Chandan Babu R
  2020-12-21 23:47     ` Allison Henderson
  2020-12-22 16:50   ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Chandan Babu R @ 2020-12-21  6:45 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, 18 Dec 2020 00:29:04 -0700, Allison Henderson wrote:
> This patch pulls a new helper function xfs_attr_node_remove_cleanup out
> of xfs_attr_node_remove_step.  This helps to modularize
> xfs_attr_node_remove_step which will help make the delayed attribute
> code easier to follow

The changes look good to me.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 29 ++++++++++++++++++++---------
>  1 file changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 8b55a8d..e93d76a 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -1220,6 +1220,25 @@ xfs_attr_node_remove_rmt(
>  	return xfs_attr_refillstate(state);
>  }
>  
> +STATIC int
> +xfs_attr_node_remove_cleanup(
> +	struct xfs_da_args	*args,
> +	struct xfs_da_state	*state)
> +{
> +	struct xfs_da_state_blk	*blk;
> +	int			retval;
> +
> +	/*
> +	 * Remove the name and update the hashvals in the tree.
> +	 */
> +	blk = &state->path.blk[state->path.active-1];
> +	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
> +	retval = xfs_attr3_leaf_remove(blk->bp, args);
> +	xfs_da3_fixhashpath(state, &state->path);
> +
> +	return retval;
> +}
> +
>  /*
>   * Remove a name from a B-tree attribute list.
>   *
> @@ -1232,7 +1251,6 @@ xfs_attr_node_remove_step(
>  	struct xfs_da_args	*args,
>  	struct xfs_da_state	*state)
>  {
> -	struct xfs_da_state_blk	*blk;
>  	int			retval, error;
>  	struct xfs_inode	*dp = args->dp;
>  
> @@ -1247,14 +1265,7 @@ xfs_attr_node_remove_step(
>  		if (error)
>  			return error;
>  	}
> -
> -	/*
> -	 * Remove the name and update the hashvals in the tree.
> -	 */
> -	blk = &state->path.blk[ state->path.active-1 ];
> -	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
> -	retval = xfs_attr3_leaf_remove(blk->bp, args);
> -	xfs_da3_fixhashpath(state, &state->path);
> +	retval = xfs_attr_node_remove_cleanup(args, state);
>  
>  	/*
>  	 * Check to see if the tree needs to be collapsed.
> 


-- 
chandan





* Re: [PATCH v14 03/15] xfs: Hoist transaction handling in xfs_attr_node_remove_step
  2020-12-18  7:29 ` [PATCH v14 03/15] xfs: Hoist transaction handling in xfs_attr_node_remove_step Allison Henderson
@ 2020-12-21  6:45   ` Chandan Babu R
  2020-12-21 21:51     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Chandan Babu R @ 2020-12-21  6:45 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, 18 Dec 2020 00:29:05 -0700, Allison Henderson wrote:
> This patch hoists transaction handling in xfs_attr_node_remove to

... "transaction handling in xfs_attr_node_removename"

> xfs_attr_node_remove_step.  This will help keep transaction handling in
> higher level functions instead of buried in subfunctions when we
> introduce delay attributes
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 43 ++++++++++++++++++++++---------------------
>  1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index e93d76a..1969b88 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -1251,7 +1251,7 @@ xfs_attr_node_remove_step(
>  	struct xfs_da_args	*args,
>  	struct xfs_da_state	*state)
>  {
> -	int			retval, error;
> +	int			error;
>  	struct xfs_inode	*dp = args->dp;

The declaration of the "dp" variable can be removed, since no references to it
remain once the following hunk is removed.

>  
>  
> @@ -1265,25 +1265,6 @@ xfs_attr_node_remove_step(
>  		if (error)
>  			return error;
>  	}
> -	retval = xfs_attr_node_remove_cleanup(args, state);
> -
> -	/*
> -	 * Check to see if the tree needs to be collapsed.
> -	 */
> -	if (retval && (state->path.active > 1)) {
> -		error = xfs_da3_join(state);
> -		if (error)
> -			return error;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			return error;
> -		/*
> -		 * Commit the Btree join operation and start a new trans.
> -		 */
> -		error = xfs_trans_roll_inode(&args->trans, dp);
> -		if (error)
> -			return error;
> -	}
>  
>  	return error;
>  }
> @@ -1299,7 +1280,7 @@ xfs_attr_node_removename(
>  	struct xfs_da_args	*args)
>  {
>  	struct xfs_da_state	*state = NULL;
> -	int			error;
> +	int			retval, error;
>  	struct xfs_inode	*dp = args->dp;
>  
>  	trace_xfs_attr_node_removename(args);
> @@ -1312,6 +1293,26 @@ xfs_attr_node_removename(
>  	if (error)
>  		goto out;
>  
> +	retval = xfs_attr_node_remove_cleanup(args, state);
> +
> +	/*
> +	 * Check to see if the tree needs to be collapsed.
> +	 */
> +	if (retval && (state->path.active > 1)) {
> +		error = xfs_da3_join(state);
> +		if (error)
> +			return error;

When xfs_da3_join() returns a non-zero value, the code fails to free the
memory pointed to by "state", because it returns directly instead of jumping
to the "out" label. The same comment applies to the two return statements
below.
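
To illustrate the concern, here is a minimal standalone sketch of the
single-exit cleanup pattern. Everything in it is hypothetical (the demo_*
names, the allocation counter, the failing "join" step); it is not the XFS
code, only a model of why an early "return error" skips the free while
"goto out" does not.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xfs_da_state; not the kernel type. */
struct demo_state {
	int	depth;
};

/* Counts live allocations so that a leak would be observable. */
static int demo_live_states;

static struct demo_state *demo_state_alloc(void)
{
	struct demo_state	*state = calloc(1, sizeof(*state));

	if (state)
		demo_live_states++;
	return state;
}

static void demo_state_free(struct demo_state *state)
{
	free(state);
	demo_live_states--;
}

/* Hypothetical step that fails on request, standing in for a join. */
static int demo_join(struct demo_state *state, int fail)
{
	(void)state;
	return fail ? -EIO : 0;
}

/* Every failure after the allocation funnels through the "out" label. */
static int demo_removename(int fail_join)
{
	struct demo_state	*state;
	int			error;

	state = demo_state_alloc();
	if (!state)
		return -ENOMEM;

	error = demo_join(state, fail_join);
	if (error)
		goto out;	/* a bare "return error" here would leak state */

	/* ... later steps would also use "goto out" on failure ... */
out:
	demo_state_free(state);
	return error;
}
```

With this shape, both demo_removename(0) and demo_removename(1) leave
demo_live_states at zero; replacing the "goto out" with "return error" would
leave it at one on the failing path.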

> +		error = xfs_defer_finish(&args->trans);
> +		if (error)
> +			return error;
> +		/*
> +		 * Commit the Btree join operation and start a new trans.
> +		 */
> +		error = xfs_trans_roll_inode(&args->trans, dp);
> +		if (error)
> +			return error;
> +	}
> +
>  	/*
>  	 * If the result is small enough, push it all into the inode.
>  	 */
> 


-- 
chandan





* Re: [PATCH v14 03/15] xfs: Hoist transaction handling in xfs_attr_node_remove_step
  2020-12-21  6:45   ` Chandan Babu R
@ 2020-12-21 21:51     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-21 21:51 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs



On 12/20/20 11:45 PM, Chandan Babu R wrote:
> On Fri, 18 Dec 2020 00:29:05 -0700, Allison Henderson wrote:
>> This patch hoists transaction handling in xfs_attr_node_remove to
> 
> ... "transaction handling in xfs_attr_node_removename"
> 
>> xfs_attr_node_remove_step.  This will help keep transaction handling in
>> higher level functions instead of buried in subfunctions when we
>> introduce delay attributes
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 43 ++++++++++++++++++++++---------------------
>>   1 file changed, 22 insertions(+), 21 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index e93d76a..1969b88 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -1251,7 +1251,7 @@ xfs_attr_node_remove_step(
>>   	struct xfs_da_args	*args,
>>   	struct xfs_da_state	*state)
>>   {
>> -	int			retval, error;
>> +	int			error;
>>   	struct xfs_inode	*dp = args->dp;
> 
> The declaration of "dp" variable can be removed since there are no references
> to it left after the removal of the following hunk.
Ok, will remove

> 
>>   
>>   
>> @@ -1265,25 +1265,6 @@ xfs_attr_node_remove_step(
>>   		if (error)
>>   			return error;
>>   	}
>> -	retval = xfs_attr_node_remove_cleanup(args, state);
>> -
>> -	/*
>> -	 * Check to see if the tree needs to be collapsed.
>> -	 */
>> -	if (retval && (state->path.active > 1)) {
>> -		error = xfs_da3_join(state);
>> -		if (error)
>> -			return error;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			return error;
>> -		/*
>> -		 * Commit the Btree join operation and start a new trans.
>> -		 */
>> -		error = xfs_trans_roll_inode(&args->trans, dp);
>> -		if (error)
>> -			return error;
>> -	}
>>   
>>   	return error;
>>   }
>> @@ -1299,7 +1280,7 @@ xfs_attr_node_removename(
>>   	struct xfs_da_args	*args)
>>   {
>>   	struct xfs_da_state	*state = NULL;
>> -	int			error;
>> +	int			retval, error;
>>   	struct xfs_inode	*dp = args->dp;
>>   
>>   	trace_xfs_attr_node_removename(args);
>> @@ -1312,6 +1293,26 @@ xfs_attr_node_removename(
>>   	if (error)
>>   		goto out;
>>   
>> +	retval = xfs_attr_node_remove_cleanup(args, state);
>> +
>> +	/*
>> +	 * Check to see if the tree needs to be collapsed.
>> +	 */
>> +	if (retval && (state->path.active > 1)) {
>> +		error = xfs_da3_join(state);
>> +		if (error)
>> +			return error;
> 
> When a non-zero value is returned by xfs_da3_join(), the code would fail to
> free the memory pointed to by "state". Same review comment applies to the two
> return statements below.
Ok, these need to be "goto out".  Will fix, thx!
Allison

> 
>> +		error = xfs_defer_finish(&args->trans);
>> +		if (error)
>> +			return error;
>> +		/*
>> +		 * Commit the Btree join operation and start a new trans.
>> +		 */
>> +		error = xfs_trans_roll_inode(&args->trans, dp);
>> +		if (error)
>> +			return error;
>> +	}
>> +
>>   	/*
>>   	 * If the result is small enough, push it all into the inode.
>>   	 */
>>
> 
> 


* Re: [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup
  2020-12-21  6:45   ` Chandan Babu R
@ 2020-12-21 23:47     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-21 23:47 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs



On 12/20/20 11:45 PM, Chandan Babu R wrote:
> On Fri, 18 Dec 2020 00:29:04 -0700, Allison Henderson wrote:
>> This patch pulls a new helper function xfs_attr_node_remove_cleanup out
>> of xfs_attr_node_remove_step.  This helps to modularize
>> xfs_attr_node_remove_step which will help make the delayed attribute
>> code easier to follow
> 
> The changes look good to me.
> 
> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Ok, thank you!
Allison

> 
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 29 ++++++++++++++++++++---------
>>   1 file changed, 20 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 8b55a8d..e93d76a 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -1220,6 +1220,25 @@ xfs_attr_node_remove_rmt(
>>   	return xfs_attr_refillstate(state);
>>   }
>>   
>> +STATIC int
>> +xfs_attr_node_remove_cleanup(
>> +	struct xfs_da_args	*args,
>> +	struct xfs_da_state	*state)
>> +{
>> +	struct xfs_da_state_blk	*blk;
>> +	int			retval;
>> +
>> +	/*
>> +	 * Remove the name and update the hashvals in the tree.
>> +	 */
>> +	blk = &state->path.blk[state->path.active-1];
>> +	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
>> +	retval = xfs_attr3_leaf_remove(blk->bp, args);
>> +	xfs_da3_fixhashpath(state, &state->path);
>> +
>> +	return retval;
>> +}
>> +
>>   /*
>>    * Remove a name from a B-tree attribute list.
>>    *
>> @@ -1232,7 +1251,6 @@ xfs_attr_node_remove_step(
>>   	struct xfs_da_args	*args,
>>   	struct xfs_da_state	*state)
>>   {
>> -	struct xfs_da_state_blk	*blk;
>>   	int			retval, error;
>>   	struct xfs_inode	*dp = args->dp;
>>   
>> @@ -1247,14 +1265,7 @@ xfs_attr_node_remove_step(
>>   		if (error)
>>   			return error;
>>   	}
>> -
>> -	/*
>> -	 * Remove the name and update the hashvals in the tree.
>> -	 */
>> -	blk = &state->path.blk[ state->path.active-1 ];
>> -	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
>> -	retval = xfs_attr3_leaf_remove(blk->bp, args);
>> -	xfs_da3_fixhashpath(state, &state->path);
>> +	retval = xfs_attr_node_remove_cleanup(args, state);
>>   
>>   	/*
>>   	 * Check to see if the tree needs to be collapsed.
>>
> 
> 


* Re: [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step
  2020-12-21  6:45   ` Chandan Babu R
@ 2020-12-21 23:48     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-21 23:48 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs



On 12/20/20 11:45 PM, Chandan Babu R wrote:
> On Fri, 18 Dec 2020 00:29:03 -0700, Allison Henderson wrote:
>> From: Allison Collins <allison.henderson@oracle.com>
>>
>> This patch as a new helper function xfs_attr_node_remove_step.  This
> 
> The above should probably be "This patch adds a new ...".
> 
>> will help simplify and modularize the calling function
>> xfs_attr_node_remove.
> 
> The calling function is xfs_attr_node_removename.
> 
> Other than the above mentioned nits, the changes look good to me,
> 
> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Ok, will fix nits
Thank you!

Allison
> 
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 46 ++++++++++++++++++++++++++++++++++------------
>>   1 file changed, 34 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index fd8e641..8b55a8d 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -1228,19 +1228,14 @@ xfs_attr_node_remove_rmt(
>>    * the root node (a special case of an intermediate node).
>>    */
>>   STATIC int
>> -xfs_attr_node_removename(
>> -	struct xfs_da_args	*args)
>> +xfs_attr_node_remove_step(
>> +	struct xfs_da_args	*args,
>> +	struct xfs_da_state	*state)
>>   {
>> -	struct xfs_da_state	*state;
>>   	struct xfs_da_state_blk	*blk;
>>   	int			retval, error;
>>   	struct xfs_inode	*dp = args->dp;
>>   
>> -	trace_xfs_attr_node_removename(args);
>> -
>> -	error = xfs_attr_node_removename_setup(args, &state);
>> -	if (error)
>> -		goto out;
>>   
>>   	/*
>>   	 * If there is an out-of-line value, de-allocate the blocks.
>> @@ -1250,7 +1245,7 @@ xfs_attr_node_removename(
>>   	if (args->rmtblkno > 0) {
>>   		error = xfs_attr_node_remove_rmt(args, state);
>>   		if (error)
>> -			goto out;
>> +			return error;
>>   	}
>>   
>>   	/*
>> @@ -1267,18 +1262,45 @@ xfs_attr_node_removename(
>>   	if (retval && (state->path.active > 1)) {
>>   		error = xfs_da3_join(state);
>>   		if (error)
>> -			goto out;
>> +			return error;
>>   		error = xfs_defer_finish(&args->trans);
>>   		if (error)
>> -			goto out;
>> +			return error;
>>   		/*
>>   		 * Commit the Btree join operation and start a new trans.
>>   		 */
>>   		error = xfs_trans_roll_inode(&args->trans, dp);
>>   		if (error)
>> -			goto out;
>> +			return error;
>>   	}
>>   
>> +	return error;
>> +}
>> +
>> +/*
>> + * Remove a name from a B-tree attribute list.
>> + *
>> + * This routine will find the blocks of the name to remove, remove them and
>> + * shrink the tree if needed.
>> + */
>> +STATIC int
>> +xfs_attr_node_removename(
>> +	struct xfs_da_args	*args)
>> +{
>> +	struct xfs_da_state	*state = NULL;
>> +	int			error;
>> +	struct xfs_inode	*dp = args->dp;
>> +
>> +	trace_xfs_attr_node_removename(args);
>> +
>> +	error = xfs_attr_node_removename_setup(args, &state);
>> +	if (error)
>> +		goto out;
>> +
>> +	error = xfs_attr_node_remove_step(args, state);
>> +	if (error)
>> +		goto out;
>> +
>>   	/*
>>   	 * If the result is small enough, push it all into the inode.
>>   	 */
>>
> 
> 


* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-18  7:29 ` [PATCH v14 04/15] xfs: Add delay ready attr remove routines Allison Henderson
@ 2020-12-22  7:22   ` Chandan Babu R
  2020-12-22 15:41     ` Allison Henderson
  2020-12-22 17:11   ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Chandan Babu R @ 2020-12-22  7:22 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, 18 Dec 2020 00:29:06 -0700, Allison Henderson wrote:
> This patch modifies the attr remove routines to be delay ready. This
> means they no longer roll or commit transactions, but instead return
> -EAGAIN to have the calling routine roll and refresh the transaction. In
> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> uses a sort of state machine like switch to keep track of where it was
> when EAGAIN was returned. xfs_attr_node_removename has also been
> modified to use the switch, and a new version of xfs_attr_remove_args
> consists of a simple loop to refresh the transaction until the operation
> is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> transaction where ever the existing code used to.
> 
> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> version __xfs_attr_rmtval_remove. We will rename
> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> done.
> 
> xfs_attr_rmtval_remove itself is still in use by the set routines (used
> during a rename).  For reasons of preserving existing function, we
> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> set.  Similar to how xfs_attr_remove_args does here.  Once we transition
> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> used and will be removed.
> 
> This patch also adds a new struct xfs_delattr_context, which we will use
> to keep track of the current state of an attribute operation. The new
> xfs_delattr_state enum is used to track various operations that are in
> progress so that we know not to repeat them, and resume where we left
> off before EAGAIN was returned to cycle out the transaction. Other
> members take the place of local variables that need to retain their
> values across multiple function recalls.  See xfs_attr.h for a more
> detailed diagram of the states.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c        | 218 +++++++++++++++++++++++++++++-----------
>  fs/xfs/libxfs/xfs_attr.h        | 100 ++++++++++++++++++
>  fs/xfs/libxfs/xfs_attr_leaf.c   |   2 +-
>  fs/xfs/libxfs/xfs_attr_remote.c |  48 +++++----
>  fs/xfs/libxfs/xfs_attr_remote.h |   2 +-
>  fs/xfs/xfs_attr_inactive.c      |   2 +-
>  6 files changed, 288 insertions(+), 84 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 1969b88..b6330f9 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -53,7 +53,7 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>   */
>  STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>  STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
> -STATIC int xfs_attr_node_removename(xfs_da_args_t *args);
> +STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
>  STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>  				 struct xfs_da_state **state);
>  STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
> @@ -264,6 +264,34 @@ xfs_attr_set_shortform(
>  }
>  
>  /*
> + * Checks to see if a delayed attribute transaction should be rolled.  If so,
> + * also checks for a defer finish.  Transaction is finished and rolled as
> + * needed, and returns true of false if the delayed operation should continue.
> + */
> +int
> +xfs_attr_trans_roll(
> +	struct xfs_delattr_context	*dac)
> +{
> +	struct xfs_da_args		*args = dac->da_args;
> +	int				error;
> +
> +	if (dac->flags & XFS_DAC_DEFER_FINISH) {
> +		/*
> +		 * The caller wants us to finish all the deferred ops so that we
> +		 * avoid pinning the log tail with a large number of deferred
> +		 * ops.
> +		 */
> +		dac->flags &= ~XFS_DAC_DEFER_FINISH;
> +		error = xfs_defer_finish(&args->trans);
> +		if (error)
> +			return error;
> +	} else
> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> +
> +	return error;
> +}
> +
> +/*
>   * Set the attribute specified in @args.
>   */
>  int
> @@ -364,23 +392,58 @@ xfs_has_attr(
>   */
>  int
>  xfs_attr_remove_args(
> -	struct xfs_da_args      *args)
> +	struct xfs_da_args	*args)
>  {
> -	struct xfs_inode	*dp = args->dp;
> -	int			error;
> +	int				error;
> +	struct xfs_delattr_context	dac = {
> +		.da_args	= args,
> +	};
> +
> +	do {
> +		error = xfs_attr_remove_iter(&dac);
> +		if (error != -EAGAIN)
> +			break;
> +
> +		error = xfs_attr_trans_roll(&dac);
> +		if (error)
> +			return error;
> +
> +	} while (true);
> +
> +	return error;
> +}
>  
> -	if (!xfs_inode_hasattr(dp)) {
> -		error = -ENOATTR;
> -	} else if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
> +/*
> + * Remove the attribute specified in @args.
> + *
> + * This function may return -EAGAIN to signal that the transaction needs to be
> + * rolled.  Callers should continue calling this function until they receive a
> + * return value other than -EAGAIN.
> + */
> +int
> +xfs_attr_remove_iter(
> +	struct xfs_delattr_context	*dac)
> +{
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_inode		*dp = args->dp;
> +
> +	/* If we are shrinking a node, resume shrink */
> +	if (dac->dela_state == XFS_DAS_RM_SHRINK)
> +		goto node;
> +
> +	if (!xfs_inode_hasattr(dp))
> +		return -ENOATTR;
> +
> +	if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
>  		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> -		error = xfs_attr_shortform_remove(args);
> -	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> -		error = xfs_attr_leaf_removename(args);
> -	} else {
> -		error = xfs_attr_node_removename(args);
> +		return xfs_attr_shortform_remove(args);
>  	}
>  
> -	return error;
> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +		return xfs_attr_leaf_removename(args);
> +node:
> +	/* If we are not short form or leaf, then proceed to remove node */
> +	return  xfs_attr_node_removename_iter(dac);
>  }
>  
>  /*
> @@ -1178,10 +1241,11 @@ xfs_attr_leaf_mark_incomplete(
>   */
>  STATIC
>  int xfs_attr_node_removename_setup(
> -	struct xfs_da_args	*args,
> -	struct xfs_da_state	**state)
> +	struct xfs_delattr_context	*dac)
>  {

In xfs_attr_node_removename_setup(), if either of
xfs_attr_leaf_mark_incomplete() or xfs_attr_rmtval_invalidate() returns with a
non-zero value, the memory pointed to by dac->da_state is not freed. This
happens because the caller (i.e. xfs_attr_node_removename_iter()) checks for
the non-NULL value of its local variable "state" to actually free the
corresponding memory.
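
To illustrate the path described above, here is a minimal user-space sketch of
the pattern. It is not the XFS code: the struct and function names
(delattr_context, removename_setup, removename_iter, state_alloc, state_free,
live_states) are hypothetical stand-ins, and the later_step_error parameter
simply simulates xfs_attr_leaf_mark_incomplete() or
xfs_attr_rmtval_invalidate() failing after the lookup has already allocated
the state. The caller's local "state" is only assigned after setup succeeds,
so on the error path it is still NULL and the free is skipped.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real XFS definitions. */
struct da_state {
	int	unused;
};

struct delattr_context {
	struct da_state	*da_state;
};

/* Number of states currently allocated; used to observe the leak. */
static int live_states;

static struct da_state *state_alloc(void)
{
	struct da_state	*s = calloc(1, sizeof(*s));

	if (s)
		live_states++;
	return s;
}

static void state_free(struct da_state *s)
{
	free(s);
	live_states--;
}

/*
 * Models the setup helper: the lookup allocates the state into the context,
 * then a later step may fail (simulated by later_step_error).
 */
static int removename_setup(struct delattr_context *dac, int later_step_error)
{
	dac->da_state = state_alloc();
	if (!dac->da_state)
		return -ENOMEM;

	return later_step_error;
}

/*
 * Models the caller: on setup failure it jumps to "out" while its local
 * "state" is still NULL, so the state hanging off the context is not freed.
 */
static int removename_iter(struct delattr_context *dac, int later_step_error)
{
	struct da_state	*state = NULL;
	int		error;

	if (!dac->da_state) {
		error = removename_setup(dac, later_step_error);
		if (error)
			goto out;
	}
	state = dac->da_state;
	error = 0;
out:
	if (state) {
		state_free(state);
		dac->da_state = NULL;
	}
	return error;
}
```

Calling removename_iter() with a non-zero later_step_error returns that error
while live_states stays at 1 and dac->da_state is still set, which is the leak
in question; the success path frees the state as expected.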

> -	int			error;
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_state		**state = &dac->da_state;
> +	int				error;
>  
>  	error = xfs_attr_node_hasname(args, state);
>  	if (error != -EEXIST)
> @@ -1203,13 +1267,16 @@ int xfs_attr_node_removename_setup(
>  }
>  
>  STATIC int
> -xfs_attr_node_remove_rmt(
> -	struct xfs_da_args	*args,
> -	struct xfs_da_state	*state)
> +xfs_attr_node_remove_rmt (
> +	struct xfs_delattr_context	*dac,
> +	struct xfs_da_state		*state)
>  {
> -	int			error = 0;
> +	int				error = 0;
>  
> -	error = xfs_attr_rmtval_remove(args);
> +	/*
> +	 * May return -EAGAIN to request that the caller recall this function
> +	 */
> +	error = __xfs_attr_rmtval_remove(dac);
>  	if (error)
>  		return error;
>  
> @@ -1240,28 +1307,34 @@ xfs_attr_node_remove_cleanup(
>  }
>  
>  /*
> - * Remove a name from a B-tree attribute list.
> + * Step through removeing a name from a B-tree attribute list.
>   *
>   * This will involve walking down the Btree, and may involve joining
>   * leaf nodes and even joining intermediate nodes up to and including
>   * the root node (a special case of an intermediate node).
> + *
> + * This routine is meant to function as either an inline or delayed operation,
> + * and may return -EAGAIN when the transaction needs to be rolled.  Calling
> + * functions will need to handle this, and recall the function until a
> + * successful error code is returned.
>   */
>  STATIC int
>  xfs_attr_node_remove_step(
> -	struct xfs_da_args	*args,
> -	struct xfs_da_state	*state)
> +	struct xfs_delattr_context	*dac)
>  {
> -	int			error;
> -	struct xfs_inode	*dp = args->dp;
> -
> -
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_state		*state = dac->da_state;
> +	int				error = 0;
>  	/*
>  	 * If there is an out-of-line value, de-allocate the blocks.
>  	 * This is done before we remove the attribute so that we don't
>  	 * overflow the maximum size of a transaction and/or hit a deadlock.
>  	 */
>  	if (args->rmtblkno > 0) {
> -		error = xfs_attr_node_remove_rmt(args, state);
> +		/*
> +		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
> +		 */
> +		error = xfs_attr_node_remove_rmt(dac, state);
>  		if (error)
>  			return error;
>  	}
> @@ -1274,51 +1347,74 @@ xfs_attr_node_remove_step(
>   *
>   * This routine will find the blocks of the name to remove, remove them and
>   * shrink the tree if needed.
> + *
> + * This routine is meant to function as either an inline or delayed operation,
> + * and may return -EAGAIN when the transaction needs to be rolled.  Calling
> + * functions will need to handle this, and recall the function until a
> + * successful error code is returned.
>   */
>  STATIC int
> -xfs_attr_node_removename(
> -	struct xfs_da_args	*args)
> +xfs_attr_node_removename_iter(
> +	struct xfs_delattr_context	*dac)
>  {
> -	struct xfs_da_state	*state = NULL;
> -	int			retval, error;
> -	struct xfs_inode	*dp = args->dp;
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_state		*state = NULL;
> +	int				retval, error;
> +	struct xfs_inode		*dp = args->dp;
>  
>  	trace_xfs_attr_node_removename(args);
>  
> -	error = xfs_attr_node_removename_setup(args, &state);
> -	if (error)
> -		goto out;
> +	if (!dac->da_state) {
> +		error = xfs_attr_node_removename_setup(dac);
> +		if (error)
> +			goto out;
> +	}
> +	state = dac->da_state;
>  
> -	error = xfs_attr_node_remove_step(args, state);
> -	if (error)
> -		goto out;
> +	switch (dac->dela_state) {
> +	case XFS_DAS_UNINIT:
> +		/*
> +		 * repeatedly remove remote blocks, remove the entry and join.
> +		 * returns -EAGAIN or 0 for completion of the step.
> +		 */
> +		error = xfs_attr_node_remove_step(dac);
> +		if (error)
> +			break;
>  
> -	retval = xfs_attr_node_remove_cleanup(args, state);
> +		retval = xfs_attr_node_remove_cleanup(args, state);
>  
> -	/*
> -	 * Check to see if the tree needs to be collapsed.
> -	 */
> -	if (retval && (state->path.active > 1)) {
> -		error = xfs_da3_join(state);
> -		if (error)
> -			return error;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			return error;
>  		/*
> -		 * Commit the Btree join operation and start a new trans.
> +		 * Check to see if the tree needs to be collapsed. Set the flag
> +		 * to indicate that the calling function needs to move the
> +		 * shrink operation
>  		 */
> -		error = xfs_trans_roll_inode(&args->trans, dp);
> -		if (error)
> -			return error;
> -	}
> +		if (retval && (state->path.active > 1)) {
> +			error = xfs_da3_join(state);
> +			if (error)
> +				return error;
>  
> -	/*
> -	 * If the result is small enough, push it all into the inode.
> -	 */
> -	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> -		error = xfs_attr_node_shrink(args, state);
> +			dac->flags |= XFS_DAC_DEFER_FINISH;
> +			dac->dela_state = XFS_DAS_RM_SHRINK;
> +			return -EAGAIN;
> +		}
> +
> +		/* fallthrough */
> +	case XFS_DAS_RM_SHRINK:
> +		/*
> +		 * If the result is small enough, push it all into the inode.
> +		 */
> +		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +			error = xfs_attr_node_shrink(args, state);
> +
> +		break;
> +	default:
> +		ASSERT(0);
> +		error = -EINVAL;
> +		goto out;
> +	}
>  
> +	if (error == -EAGAIN)
> +		return error;
>  out:
>  	if (state)
>  		xfs_da_state_free(state);
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 3e97a93..3154ef4 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -74,6 +74,102 @@ struct xfs_attr_list_context {
>  };
>  
>  
> +/*
> + * ========================================================================
> + * Structure used to pass context around among the delayed routines.
> + * ========================================================================
> + */
> +
> +/*
> + * Below is a state machine diagram for attr remove operations. The  XFS_DAS_*
> + * states indicate places where the function would return -EAGAIN, and then
> + * immediately resume from after being recalled by the calling function. States
> + * marked as a "subroutine state" indicate that they belong to a subroutine, and
> + * so the calling function needs to pass them back to that subroutine to allow
> + * it to finish where it left off. But they otherwise do not have a role in the
> + * calling function other than just passing through.
> + *
> + * xfs_attr_remove_iter()
> + *              │
> + *              v
> + *        found attr blks? ───n──┐
> + *              │                v
> + *              │         find and invalidate
> + *              y         the blocks. mark
> + *              │         attr incomplete
> + *              ├────────────────┘
> + *              │
> + *              v
> + *      remove a block with
> + *    xfs_attr_node_remove_step <────┐
> + *              │                    │
> + *              v                    │
> + *      still have blks ──y──> return -EAGAIN.
> + *        to remove?          re-enter with one
> + *              │            less blk to remove
> + *              n
> + *              │
> + *              v
> + *       remove leaf and
> + *       update hash with
> + *   xfs_attr_node_remove_cleanup
> + *              │
> + *              v
> + *           need to
> + *        shrink tree? ─n─┐
> + *              │         │
> + *              y         │
> + *              │         │
> + *              v         │
> + *          join leaf     │
> + *              │         │
> + *              v         │
> + *      XFS_DAS_RM_SHRINK │
> + *              │         │
> + *              v         │
> + *       do the shrink    │
> + *              │         │
> + *              v         │
> + *          free state <──┘
> + *              │
> + *              v
> + *            done
> + *
> + */
> +
> +/*
> + * Enum values for xfs_delattr_context.da_state
> + *
> + * These values are used by delayed attribute operations to keep track  of where
> + * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
> + * calling function to roll the transaction, and then recall the subroutine to
> + * finish the operation.  The enum is then used by the subroutine to jump back
> + * to where it was and resume executing where it left off.
> + */
> +enum xfs_delattr_state {
> +	XFS_DAS_UNINIT		= 0,  /* No state has been set yet */
> +	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
> +};
> +
> +/*
> + * Defines for xfs_delattr_context.flags
> + */
> +#define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
> +
> +/*
> + * Context used for keeping track of delayed attribute operations
> + */
> +struct xfs_delattr_context {
> +	struct xfs_da_args      *da_args;
> +
> +	/* Used in xfs_attr_node_removename to roll through removing blocks */
> +	struct xfs_da_state     *da_state;
> +
> +	/* Used to keep track of current state of delayed operation */
> +	unsigned int            flags;
> +	enum xfs_delattr_state  dela_state;
> +};
> +
>  /*========================================================================
>   * Function prototypes for the kernel.
>   *========================================================================*/
> @@ -91,6 +187,10 @@ int xfs_attr_set(struct xfs_da_args *args);
>  int xfs_attr_set_args(struct xfs_da_args *args);
>  int xfs_has_attr(struct xfs_da_args *args);
>  int xfs_attr_remove_args(struct xfs_da_args *args);
> +int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
> +int xfs_attr_trans_roll(struct xfs_delattr_context *dac);
>  bool xfs_attr_namecheck(const void *name, size_t length);
> +void xfs_delattr_context_init(struct xfs_delattr_context *dac,
> +			      struct xfs_da_args *args);
>  
>  #endif	/* __XFS_ATTR_H__ */
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index d6ef69a..3780141 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -19,8 +19,8 @@
>  #include "xfs_bmap_btree.h"
>  #include "xfs_bmap.h"
>  #include "xfs_attr_sf.h"
> -#include "xfs_attr_remote.h"
>  #include "xfs_attr.h"
> +#include "xfs_attr_remote.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_error.h"
>  #include "xfs_trace.h"
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index 48d8e9c..f09820c 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -674,10 +674,12 @@ xfs_attr_rmtval_invalidate(
>   */
>  int
>  xfs_attr_rmtval_remove(
> -	struct xfs_da_args      *args)
> +	struct xfs_da_args		*args)
>  {
> -	int			error;
> -	int			retval;
> +	int				error;
> +	struct xfs_delattr_context	dac  = {
> +		.da_args	= args,
> +	};
>  
>  	trace_xfs_attr_rmtval_remove(args);
>  
> @@ -685,31 +687,29 @@ xfs_attr_rmtval_remove(
>  	 * Keep de-allocating extents until the remote-value region is gone.
>  	 */
>  	do {
> -		retval = __xfs_attr_rmtval_remove(args);
> -		if (retval && retval != -EAGAIN)
> -			return retval;
> +		error = __xfs_attr_rmtval_remove(&dac);
> +		if (error != -EAGAIN)
> +			break;
>  
> -		/*
> -		 * Close out trans and start the next one in the chain.
> -		 */
> -		error = xfs_trans_roll_inode(&args->trans, args->dp);
> +		error = xfs_attr_trans_roll(&dac);
>  		if (error)
>  			return error;
> -	} while (retval == -EAGAIN);
> +	} while (true);
>  
> -	return 0;
> +	return error;
>  }
>  
>  /*
>   * Remove the value associated with an attribute by deleting the out-of-line
> - * buffer that it is stored on. Returns EAGAIN for the caller to refresh the
> + * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the
>   * transaction and re-call the function
>   */
>  int
>  __xfs_attr_rmtval_remove(
> -	struct xfs_da_args	*args)
> +	struct xfs_delattr_context	*dac)
>  {
> -	int			error, done;
> +	struct xfs_da_args		*args = dac->da_args;
> +	int				error, done;
>  
>  	/*
>  	 * Unmap value blocks for this attr.
> @@ -719,12 +719,20 @@ __xfs_attr_rmtval_remove(
>  	if (error)
>  		return error;
>  
> -	error = xfs_defer_finish(&args->trans);
> -	if (error)
> -		return error;
> -
> -	if (!done)
> +	/*
> +	 * We dont need an explicit state here to pick up where we left off.  We
> +	 * can figure it out using the !done return code.  Calling function only
> +	 * needs to keep recalling this routine until we indicate to stop by
> +	 * returning anything other than -EAGAIN. The actual value of
> +	 * attr->xattri_dela_state may be some value reminicent of the calling
> +	 * function, but it's value is irrelevant with in the context of this
> +	 * function.  Once we are done here, the next state is set as needed
> +	 * by the parent
> +	 */
> +	if (!done) {
> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>  		return -EAGAIN;
> +	}
>  
>  	return error;
>  }
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
> index 9eee615..002fd30 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.h
> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
> @@ -14,5 +14,5 @@ int xfs_attr_rmtval_remove(struct xfs_da_args *args);
>  int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
>  		xfs_buf_flags_t incore_flags);
>  int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
> -int __xfs_attr_rmtval_remove(struct xfs_da_args *args);
> +int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
>  #endif /* __XFS_ATTR_REMOTE_H__ */
> diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> index bfad669..aaa7e66 100644
> --- a/fs/xfs/xfs_attr_inactive.c
> +++ b/fs/xfs/xfs_attr_inactive.c
> @@ -15,10 +15,10 @@
>  #include "xfs_da_format.h"
>  #include "xfs_da_btree.h"
>  #include "xfs_inode.h"
> +#include "xfs_attr.h"
>  #include "xfs_attr_remote.h"
>  #include "xfs_trans.h"
>  #include "xfs_bmap.h"
> -#include "xfs_attr.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_quota.h"
>  #include "xfs_dir2.h"
> 


-- 
chandan





* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-22  7:22   ` Chandan Babu R
@ 2020-12-22 15:41     ` Allison Henderson
  2020-12-23  4:05       ` Chandan Babu R
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-22 15:41 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs



On 12/22/20 12:22 AM, Chandan Babu R wrote:
> On Fri, 18 Dec 2020 00:29:06 -0700, Allison Henderson wrote:
>> This patch modifies the attr remove routines to be delay ready. This
>> means they no longer roll or commit transactions, but instead return
>> -EAGAIN to have the calling routine roll and refresh the transaction. In
>> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
>> uses a sort of state machine like switch to keep track of where it was
>> when EAGAIN was returned. xfs_attr_node_removename has also been
>> modified to use the switch, and a new version of xfs_attr_remove_args
>> consists of a simple loop to refresh the transaction until the operation
>> is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
>> transaction where ever the existing code used to.
>>
>> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
>> version __xfs_attr_rmtval_remove. We will rename
>> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
>> done.
>>
>> xfs_attr_rmtval_remove itself is still in use by the set routines (used
>> during a rename).  For reasons of preserving existing function, we
>> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
>> set.  Similar to how xfs_attr_remove_args does here.  Once we transition
>> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
>> used and will be removed.
>>
>> This patch also adds a new struct xfs_delattr_context, which we will use
>> to keep track of the current state of an attribute operation. The new
>> xfs_delattr_state enum is used to track various operations that are in
>> progress so that we know not to repeat them, and resume where we left
>> off before EAGAIN was returned to cycle out the transaction. Other
>> members take the place of local variables that need to retain their
>> values across multiple function recalls.  See xfs_attr.h for a more
>> detailed diagram of the states.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c        | 218 +++++++++++++++++++++++++++++-----------
>>   fs/xfs/libxfs/xfs_attr.h        | 100 ++++++++++++++++++
>>   fs/xfs/libxfs/xfs_attr_leaf.c   |   2 +-
>>   fs/xfs/libxfs/xfs_attr_remote.c |  48 +++++----
>>   fs/xfs/libxfs/xfs_attr_remote.h |   2 +-
>>   fs/xfs/xfs_attr_inactive.c      |   2 +-
>>   6 files changed, 288 insertions(+), 84 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 1969b88..b6330f9 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -53,7 +53,7 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>>    */
>>   STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>>   STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
>> -STATIC int xfs_attr_node_removename(xfs_da_args_t *args);
>> +STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
>>   STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>>   				 struct xfs_da_state **state);
>>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>> @@ -264,6 +264,34 @@ xfs_attr_set_shortform(
>>   }
>>   
>>   /*
>> + * Checks to see if a delayed attribute transaction should be rolled.  If so,
>> + * also checks for a defer finish.  Transaction is finished and rolled as
>> + * needed, and returns true of false if the delayed operation should continue.
>> + */
>> +int
>> +xfs_attr_trans_roll(
>> +	struct xfs_delattr_context	*dac)
>> +{
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	int				error;
>> +
>> +	if (dac->flags & XFS_DAC_DEFER_FINISH) {
>> +		/*
>> +		 * The caller wants us to finish all the deferred ops so that we
>> +		 * avoid pinning the log tail with a large number of deferred
>> +		 * ops.
>> +		 */
>> +		dac->flags &= ~XFS_DAC_DEFER_FINISH;
>> +		error = xfs_defer_finish(&args->trans);
>> +		if (error)
>> +			return error;
>> +	} else
>> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
>> +
>> +	return error;
>> +}
>> +
>> +/*
>>    * Set the attribute specified in @args.
>>    */
>>   int
>> @@ -364,23 +392,58 @@ xfs_has_attr(
>>    */
>>   int
>>   xfs_attr_remove_args(
>> -	struct xfs_da_args      *args)
>> +	struct xfs_da_args	*args)
>>   {
>> -	struct xfs_inode	*dp = args->dp;
>> -	int			error;
>> +	int				error;
>> +	struct xfs_delattr_context	dac = {
>> +		.da_args	= args,
>> +	};
>> +
>> +	do {
>> +		error = xfs_attr_remove_iter(&dac);
>> +		if (error != -EAGAIN)
>> +			break;
>> +
>> +		error = xfs_attr_trans_roll(&dac);
>> +		if (error)
>> +			return error;
>> +
>> +	} while (true);
>> +
>> +	return error;
>> +}
>>   
>> -	if (!xfs_inode_hasattr(dp)) {
>> -		error = -ENOATTR;
>> -	} else if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
>> +/*
>> + * Remove the attribute specified in @args.
>> + *
>> + * This function may return -EAGAIN to signal that the transaction needs to be
>> + * rolled.  Callers should continue calling this function until they receive a
>> + * return value other than -EAGAIN.
>> + */
>> +int
>> +xfs_attr_remove_iter(
>> +	struct xfs_delattr_context	*dac)
>> +{
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_inode		*dp = args->dp;
>> +
>> +	/* If we are shrinking a node, resume shrink */
>> +	if (dac->dela_state == XFS_DAS_RM_SHRINK)
>> +		goto node;
>> +
>> +	if (!xfs_inode_hasattr(dp))
>> +		return -ENOATTR;
>> +
>> +	if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
>>   		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
>> -		error = xfs_attr_shortform_remove(args);
>> -	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
>> -		error = xfs_attr_leaf_removename(args);
>> -	} else {
>> -		error = xfs_attr_node_removename(args);
>> +		return xfs_attr_shortform_remove(args);
>>   	}
>>   
>> -	return error;
>> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> +		return xfs_attr_leaf_removename(args);
>> +node:
>> +	/* If we are not short form or leaf, then proceed to remove node */
>> +	return  xfs_attr_node_removename_iter(dac);
>>   }
>>   
>>   /*
>> @@ -1178,10 +1241,11 @@ xfs_attr_leaf_mark_incomplete(
>>    */
>>   STATIC
>>   int xfs_attr_node_removename_setup(
>> -	struct xfs_da_args	*args,
>> -	struct xfs_da_state	**state)
>> +	struct xfs_delattr_context	*dac)
>>   {
> 
> In xfs_attr_node_removename_setup(), if either of
> xfs_attr_leaf_mark_incomplete() or xfs_attr_rmtval_invalidate() returns with a
> non-zero value, the memory pointed to by dac->da_state is not freed. This
> happens because the caller (i.e. xfs_attr_node_removename_iter()) checks for
> the non-NULL value of its local variable "state" to actually free the
> corresponding memory.
> 
Ok, for this one I think it makes more sense to put an extra free in
the helper rather than have the caller handle it.  Will fix.
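
As a rough illustration of what "an extra free in the helper" could look like,
here is a user-space sketch. It is an assumption about the shape of the fix,
not the change from a later revision of this series: the names
(delattr_context, removename_setup_fixed, state_alloc, state_free,
live_states) are hypothetical, and later_step_error stands in for a failure
from xfs_attr_leaf_mark_incomplete() or xfs_attr_rmtval_invalidate().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real XFS definitions. */
struct da_state {
	int	unused;
};

struct delattr_context {
	struct da_state	*da_state;
};

/* Number of states currently allocated; lets a test confirm nothing leaks. */
static int live_states;

static struct da_state *state_alloc(void)
{
	struct da_state	*s = calloc(1, sizeof(*s));

	if (s)
		live_states++;
	return s;
}

static void state_free(struct da_state *s)
{
	free(s);
	live_states--;
}

/*
 * Setup helper that cleans up after itself: if a step after the allocation
 * fails, free the state and clear the context pointer before returning, so
 * the caller never sees a half-initialised context.
 */
static int removename_setup_fixed(struct delattr_context *dac,
				  int later_step_error)
{
	dac->da_state = state_alloc();
	if (!dac->da_state)
		return -ENOMEM;

	if (later_step_error) {
		state_free(dac->da_state);
		dac->da_state = NULL;
		return later_step_error;
	}

	return 0;
}
```

Clearing dac->da_state on failure also keeps the caller's "if
(!dac->da_state)" check meaningful if the operation is retried with the same
context.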

Do you have a tool that's tracing this out, or are you doing it by hand?
Because if it's a tool, I should probably be using it too :-)

Thanks!
Allison


>> -	int			error;
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_state		**state = &dac->da_state;
>> +	int				error;
>>   
>>   	error = xfs_attr_node_hasname(args, state);
>>   	if (error != -EEXIST)
>> @@ -1203,13 +1267,16 @@ int xfs_attr_node_removename_setup(
>>   }
>>   
>>   STATIC int
>> -xfs_attr_node_remove_rmt(
>> -	struct xfs_da_args	*args,
>> -	struct xfs_da_state	*state)
>> +xfs_attr_node_remove_rmt (
>> +	struct xfs_delattr_context	*dac,
>> +	struct xfs_da_state		*state)
>>   {
>> -	int			error = 0;
>> +	int				error = 0;
>>   
>> -	error = xfs_attr_rmtval_remove(args);
>> +	/*
>> +	 * May return -EAGAIN to request that the caller recall this function
>> +	 */
>> +	error = __xfs_attr_rmtval_remove(dac);
>>   	if (error)
>>   		return error;
>>   
>> @@ -1240,28 +1307,34 @@ xfs_attr_node_remove_cleanup(
>>   }
>>   
>>   /*
>> - * Remove a name from a B-tree attribute list.
>> + * Step through removeing a name from a B-tree attribute list.
>>    *
>>    * This will involve walking down the Btree, and may involve joining
>>    * leaf nodes and even joining intermediate nodes up to and including
>>    * the root node (a special case of an intermediate node).
>> + *
>> + * This routine is meant to function as either an inline or delayed operation,
>> + * and may return -EAGAIN when the transaction needs to be rolled.  Calling
>> + * functions will need to handle this, and recall the function until a
>> + * successful error code is returned.
>>    */
>>   STATIC int
>>   xfs_attr_node_remove_step(
>> -	struct xfs_da_args	*args,
>> -	struct xfs_da_state	*state)
>> +	struct xfs_delattr_context	*dac)
>>   {
>> -	int			error;
>> -	struct xfs_inode	*dp = args->dp;
>> -
>> -
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_state		*state = dac->da_state;
>> +	int				error = 0;
>>   	/*
>>   	 * If there is an out-of-line value, de-allocate the blocks.
>>   	 * This is done before we remove the attribute so that we don't
>>   	 * overflow the maximum size of a transaction and/or hit a deadlock.
>>   	 */
>>   	if (args->rmtblkno > 0) {
>> -		error = xfs_attr_node_remove_rmt(args, state);
>> +		/*
>> +		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
>> +		 */
>> +		error = xfs_attr_node_remove_rmt(dac, state);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1274,51 +1347,74 @@ xfs_attr_node_remove_step(
>>    *
>>    * This routine will find the blocks of the name to remove, remove them and
>>    * shrink the tree if needed.
>> + *
>> + * This routine is meant to function as either an inline or delayed operation,
>> + * and may return -EAGAIN when the transaction needs to be rolled.  Calling
>> + * functions will need to handle this, and recall the function until a
>> + * successful error code is returned.
>>    */
>>   STATIC int
>> -xfs_attr_node_removename(
>> -	struct xfs_da_args	*args)
>> +xfs_attr_node_removename_iter(
>> +	struct xfs_delattr_context	*dac)
>>   {
>> -	struct xfs_da_state	*state = NULL;
>> -	int			retval, error;
>> -	struct xfs_inode	*dp = args->dp;
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_state		*state = NULL;
>> +	int				retval, error;
>> +	struct xfs_inode		*dp = args->dp;
>>   
>>   	trace_xfs_attr_node_removename(args);
>>   
>> -	error = xfs_attr_node_removename_setup(args, &state);
>> -	if (error)
>> -		goto out;
>> +	if (!dac->da_state) {
>> +		error = xfs_attr_node_removename_setup(dac);
>> +		if (error)
>> +			goto out;
>> +	}
>> +	state = dac->da_state;
>>   
>> -	error = xfs_attr_node_remove_step(args, state);
>> -	if (error)
>> -		goto out;
>> +	switch (dac->dela_state) {
>> +	case XFS_DAS_UNINIT:
>> +		/*
>> +		 * repeatedly remove remote blocks, remove the entry and join.
>> +		 * returns -EAGAIN or 0 for completion of the step.
>> +		 */
>> +		error = xfs_attr_node_remove_step(dac);
>> +		if (error)
>> +			break;
>>   
>> -	retval = xfs_attr_node_remove_cleanup(args, state);
>> +		retval = xfs_attr_node_remove_cleanup(args, state);
>>   
>> -	/*
>> -	 * Check to see if the tree needs to be collapsed.
>> -	 */
>> -	if (retval && (state->path.active > 1)) {
>> -		error = xfs_da3_join(state);
>> -		if (error)
>> -			return error;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			return error;
>>   		/*
>> -		 * Commit the Btree join operation and start a new trans.
>> +		 * Check to see if the tree needs to be collapsed. Set the flag
>> +		 * to indicate that the calling function needs to move the
>> +		 * shrink operation
>>   		 */
>> -		error = xfs_trans_roll_inode(&args->trans, dp);
>> -		if (error)
>> -			return error;
>> -	}
>> +		if (retval && (state->path.active > 1)) {
>> +			error = xfs_da3_join(state);
>> +			if (error)
>> +				return error;
>>   
>> -	/*
>> -	 * If the result is small enough, push it all into the inode.
>> -	 */
>> -	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> -		error = xfs_attr_node_shrink(args, state);
>> +			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +			dac->dela_state = XFS_DAS_RM_SHRINK;
>> +			return -EAGAIN;
>> +		}
>> +
>> +		/* fallthrough */
>> +	case XFS_DAS_RM_SHRINK:
>> +		/*
>> +		 * If the result is small enough, push it all into the inode.
>> +		 */
>> +		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> +			error = xfs_attr_node_shrink(args, state);
>> +
>> +		break;
>> +	default:
>> +		ASSERT(0);
>> +		error = -EINVAL;
>> +		goto out;
>> +	}
>>   
>> +	if (error == -EAGAIN)
>> +		return error;
>>   out:
>>   	if (state)
>>   		xfs_da_state_free(state);
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 3e97a93..3154ef4 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -74,6 +74,102 @@ struct xfs_attr_list_context {
>>   };
>>   
>>   
>> +/*
>> + * ========================================================================
>> + * Structure used to pass context around among the delayed routines.
>> + * ========================================================================
>> + */
>> +
>> +/*
>> + * Below is a state machine diagram for attr remove operations. The XFS_DAS_*
>> + * states mark places where the function returns -EAGAIN and then resumes
>> + * from that point when recalled by the calling function. States
>> + * marked as a "subroutine state" indicate that they belong to a subroutine, and
>> + * so the calling function needs to pass them back to that subroutine to allow
>> + * it to finish where it left off. But they otherwise do not have a role in the
>> + * calling function other than just passing through.
>> + *
>> + * xfs_attr_remove_iter()
>> + *              │
>> + *              v
>> + *        found attr blks? ───n──┐
>> + *              │                v
>> + *              │         find and invalidate
>> + *              y         the blocks. mark
>> + *              │         attr incomplete
>> + *              ├────────────────┘
>> + *              │
>> + *              v
>> + *      remove a block with
>> + *    xfs_attr_node_remove_step <────┐
>> + *              │                    │
>> + *              v                    │
>> + *      still have blks ──y──> return -EAGAIN.
>> + *        to remove?          re-enter with one
>> + *              │            less blk to remove
>> + *              n
>> + *              │
>> + *              v
>> + *       remove leaf and
>> + *       update hash with
>> + *   xfs_attr_node_remove_cleanup
>> + *              │
>> + *              v
>> + *           need to
>> + *        shrink tree? ─n─┐
>> + *              │         │
>> + *              y         │
>> + *              │         │
>> + *              v         │
>> + *          join leaf     │
>> + *              │         │
>> + *              v         │
>> + *      XFS_DAS_RM_SHRINK │
>> + *              │         │
>> + *              v         │
>> + *       do the shrink    │
>> + *              │         │
>> + *              v         │
>> + *          free state <──┘
>> + *              │
>> + *              v
>> + *            done
>> + *
>> + */
>> +
>> +/*
>> + * Enum values for xfs_delattr_context.dela_state
>> + *
>> + * These values are used by delayed attribute operations to keep track of where
>> + * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
>> + * calling function to roll the transaction, and then recall the subroutine to
>> + * finish the operation.  The enum is then used by the subroutine to jump back
>> + * to where it was and resume executing where it left off.
>> + */
>> +enum xfs_delattr_state {
>> +	XFS_DAS_UNINIT		= 0,  /* No state has been set yet */
>> +	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
>> +};
>> +
>> +/*
>> + * Defines for xfs_delattr_context.flags
>> + */
>> +#define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
>> +
>> +/*
>> + * Context used for keeping track of delayed attribute operations
>> + */
>> +struct xfs_delattr_context {
>> +	struct xfs_da_args      *da_args;
>> +
>> +	/* Used in xfs_attr_node_removename to roll through removing blocks */
>> +	struct xfs_da_state     *da_state;
>> +
>> +	/* Used to keep track of current state of delayed operation */
>> +	unsigned int            flags;
>> +	enum xfs_delattr_state  dela_state;
>> +};
>> +
>>   /*========================================================================
>>    * Function prototypes for the kernel.
>>    *========================================================================*/
>> @@ -91,6 +187,10 @@ int xfs_attr_set(struct xfs_da_args *args);
>>   int xfs_attr_set_args(struct xfs_da_args *args);
>>   int xfs_has_attr(struct xfs_da_args *args);
>>   int xfs_attr_remove_args(struct xfs_da_args *args);
>> +int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
>> +int xfs_attr_trans_roll(struct xfs_delattr_context *dac);
>>   bool xfs_attr_namecheck(const void *name, size_t length);
>> +void xfs_delattr_context_init(struct xfs_delattr_context *dac,
>> +			      struct xfs_da_args *args);
>>   
>>   #endif	/* __XFS_ATTR_H__ */
>> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
>> index d6ef69a..3780141 100644
>> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
>> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
>> @@ -19,8 +19,8 @@
>>   #include "xfs_bmap_btree.h"
>>   #include "xfs_bmap.h"
>>   #include "xfs_attr_sf.h"
>> -#include "xfs_attr_remote.h"
>>   #include "xfs_attr.h"
>> +#include "xfs_attr_remote.h"
>>   #include "xfs_attr_leaf.h"
>>   #include "xfs_error.h"
>>   #include "xfs_trace.h"
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
>> index 48d8e9c..f09820c 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.c
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
>> @@ -674,10 +674,12 @@ xfs_attr_rmtval_invalidate(
>>    */
>>   int
>>   xfs_attr_rmtval_remove(
>> -	struct xfs_da_args      *args)
>> +	struct xfs_da_args		*args)
>>   {
>> -	int			error;
>> -	int			retval;
>> +	int				error;
>> +	struct xfs_delattr_context	dac  = {
>> +		.da_args	= args,
>> +	};
>>   
>>   	trace_xfs_attr_rmtval_remove(args);
>>   
>> @@ -685,31 +687,29 @@ xfs_attr_rmtval_remove(
>>   	 * Keep de-allocating extents until the remote-value region is gone.
>>   	 */
>>   	do {
>> -		retval = __xfs_attr_rmtval_remove(args);
>> -		if (retval && retval != -EAGAIN)
>> -			return retval;
>> +		error = __xfs_attr_rmtval_remove(&dac);
>> +		if (error != -EAGAIN)
>> +			break;
>>   
>> -		/*
>> -		 * Close out trans and start the next one in the chain.
>> -		 */
>> -		error = xfs_trans_roll_inode(&args->trans, args->dp);
>> +		error = xfs_attr_trans_roll(&dac);
>>   		if (error)
>>   			return error;
>> -	} while (retval == -EAGAIN);
>> +	} while (true);
>>   
>> -	return 0;
>> +	return error;
>>   }
>>   
>>   /*
>>    * Remove the value associated with an attribute by deleting the out-of-line
>> - * buffer that it is stored on. Returns EAGAIN for the caller to refresh the
>> + * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the
>>    * transaction and re-call the function
>>    */
>>   int
>>   __xfs_attr_rmtval_remove(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_delattr_context	*dac)
>>   {
>> -	int			error, done;
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	int				error, done;
>>   
>>   	/*
>>   	 * Unmap value blocks for this attr.
>> @@ -719,12 +719,20 @@ __xfs_attr_rmtval_remove(
>>   	if (error)
>>   		return error;
>>   
>> -	error = xfs_defer_finish(&args->trans);
>> -	if (error)
>> -		return error;
>> -
>> -	if (!done)
>> +	/*
>> +	 * We don't need an explicit state here to pick up where we left off.  We
>> +	 * can figure it out using the !done return code.  Calling function only
>> +	 * needs to keep recalling this routine until we indicate to stop by
>> +	 * returning anything other than -EAGAIN. The actual value of
>> +	 * dac->dela_state may be some value reminiscent of the calling
>> +	 * function, but its value is irrelevant within the context of this
>> +	 * function.  Once we are done here, the next state is set as needed
>> +	 * by the parent.
>> +	 */
>> +	if (!done) {
>> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>>   		return -EAGAIN;
>> +	}
>>   
>>   	return error;
>>   }
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
>> index 9eee615..002fd30 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.h
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
>> @@ -14,5 +14,5 @@ int xfs_attr_rmtval_remove(struct xfs_da_args *args);
>>   int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
>>   		xfs_buf_flags_t incore_flags);
>>   int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
>> -int __xfs_attr_rmtval_remove(struct xfs_da_args *args);
>> +int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
>>   #endif /* __XFS_ATTR_REMOTE_H__ */
>> diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
>> index bfad669..aaa7e66 100644
>> --- a/fs/xfs/xfs_attr_inactive.c
>> +++ b/fs/xfs/xfs_attr_inactive.c
>> @@ -15,10 +15,10 @@
>>   #include "xfs_da_format.h"
>>   #include "xfs_da_btree.h"
>>   #include "xfs_inode.h"
>> +#include "xfs_attr.h"
>>   #include "xfs_attr_remote.h"
>>   #include "xfs_trans.h"
>>   #include "xfs_bmap.h"
>> -#include "xfs_attr.h"
>>   #include "xfs_attr_leaf.h"
>>   #include "xfs_quota.h"
>>   #include "xfs_dir2.h"
>>
> 
> 
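
For readers following the thread, the resume-on-EAGAIN pattern that the
quoted patch introduces can be sketched in plain userspace C. This is
only an illustration under stated assumptions: demo_state, demo_ctx,
demo_remove_iter and demo_count_rolls are hypothetical names, the block
counter and the need_join flag stand in for the real remote-block and
btree checks, and nothing here calls actual XFS code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for XFS_DAS_UNINIT / XFS_DAS_RM_SHRINK. */
enum demo_state {
	DEMO_UNINIT = 0,	/* no state recorded yet */
	DEMO_RM_SHRINK,		/* join done, shrink still pending */
};

/* Hypothetical stand-in for struct xfs_delattr_context. */
struct demo_ctx {
	enum demo_state	state;
	int		blocks_left;	/* remote blocks still to unmap */
	bool		need_join;	/* pretend the tree needs a join */
	bool		shrunk;		/* set once the final step has run */
};

/*
 * One resumable pass.  Returns -EAGAIN when the caller should roll the
 * transaction and call again, 0 when the operation is complete.
 */
static int demo_remove_iter(struct demo_ctx *ctx)
{
	switch (ctx->state) {
	case DEMO_UNINIT:
		/* No explicit state needed: the counter says where we are. */
		if (ctx->blocks_left > 0) {
			ctx->blocks_left--;
			if (ctx->blocks_left > 0)
				return -EAGAIN;
		}
		if (ctx->need_join) {
			/* Record the resume point before asking for a roll. */
			ctx->need_join = false;
			ctx->state = DEMO_RM_SHRINK;
			return -EAGAIN;
		}
		/* fallthrough */
	case DEMO_RM_SHRINK:
		ctx->shrunk = true;
		return 0;
	default:
		return -EINVAL;
	}
}

/* Call the iterator until it stops asking for a retry; count the retries. */
static int demo_count_rolls(struct demo_ctx *ctx)
{
	int rolls = 0;

	while (demo_remove_iter(ctx) == -EAGAIN)
		rolls++;
	return rolls;
}
```

With three blocks and a pending join, the iterator asks for a retry
twice while unmapping and once after the join, and the fourth call
lands directly on the shrink case.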


* Re: [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step
  2020-12-18  7:29 ` [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step Allison Henderson
  2020-12-21  6:45   ` Chandan Babu R
@ 2020-12-22 16:50   ` Brian Foster
  1 sibling, 0 replies; 48+ messages in thread
From: Brian Foster @ 2020-12-22 16:50 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Dec 18, 2020 at 12:29:03AM -0700, Allison Henderson wrote:
> From: Allison Collins <allison.henderson@oracle.com>
> 
> This patch adds a new helper function xfs_attr_node_remove_step.  This
> will help simplify and modularize the calling function
> xfs_attr_node_removename.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_attr.c | 46 ++++++++++++++++++++++++++++++++++------------
>  1 file changed, 34 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index fd8e641..8b55a8d 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -1228,19 +1228,14 @@ xfs_attr_node_remove_rmt(
>   * the root node (a special case of an intermediate node).
>   */
>  STATIC int
> -xfs_attr_node_removename(
> -	struct xfs_da_args	*args)
> +xfs_attr_node_remove_step(
> +	struct xfs_da_args	*args,
> +	struct xfs_da_state	*state)
>  {
> -	struct xfs_da_state	*state;
>  	struct xfs_da_state_blk	*blk;
>  	int			retval, error;
>  	struct xfs_inode	*dp = args->dp;
>  
> -	trace_xfs_attr_node_removename(args);
> -
> -	error = xfs_attr_node_removename_setup(args, &state);
> -	if (error)
> -		goto out;
>  
>  	/*
>  	 * If there is an out-of-line value, de-allocate the blocks.
> @@ -1250,7 +1245,7 @@ xfs_attr_node_removename(
>  	if (args->rmtblkno > 0) {
>  		error = xfs_attr_node_remove_rmt(args, state);
>  		if (error)
> -			goto out;
> +			return error;
>  	}
>  
>  	/*
> @@ -1267,18 +1262,45 @@ xfs_attr_node_removename(
>  	if (retval && (state->path.active > 1)) {
>  		error = xfs_da3_join(state);
>  		if (error)
> -			goto out;
> +			return error;
>  		error = xfs_defer_finish(&args->trans);
>  		if (error)
> -			goto out;
> +			return error;
>  		/*
>  		 * Commit the Btree join operation and start a new trans.
>  		 */
>  		error = xfs_trans_roll_inode(&args->trans, dp);
>  		if (error)
> -			goto out;
> +			return error;
>  	}
>  
> +	return error;
> +}
> +
> +/*
> + * Remove a name from a B-tree attribute list.
> + *
> + * This routine will find the blocks of the name to remove, remove them and
> + * shrink the tree if needed.
> + */
> +STATIC int
> +xfs_attr_node_removename(
> +	struct xfs_da_args	*args)
> +{
> +	struct xfs_da_state	*state = NULL;
> +	int			error;
> +	struct xfs_inode	*dp = args->dp;
> +
> +	trace_xfs_attr_node_removename(args);
> +
> +	error = xfs_attr_node_removename_setup(args, &state);
> +	if (error)
> +		goto out;
> +
> +	error = xfs_attr_node_remove_step(args, state);
> +	if (error)
> +		goto out;
> +
>  	/*
>  	 * If the result is small enough, push it all into the inode.
>  	 */
> -- 
> 2.7.4
> 



* Re: [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup
  2020-12-18  7:29 ` [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup Allison Henderson
  2020-12-21  6:45   ` Chandan Babu R
@ 2020-12-22 16:50   ` Brian Foster
  1 sibling, 0 replies; 48+ messages in thread
From: Brian Foster @ 2020-12-22 16:50 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Dec 18, 2020 at 12:29:04AM -0700, Allison Henderson wrote:
> This patch pulls a new helper function xfs_attr_node_remove_cleanup out
> of xfs_attr_node_remove_step.  This helps to modularize
> xfs_attr_node_remove_step which will help make the delayed attribute
> code easier to follow.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_attr.c | 29 ++++++++++++++++++++---------
>  1 file changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 8b55a8d..e93d76a 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -1220,6 +1220,25 @@ xfs_attr_node_remove_rmt(
>  	return xfs_attr_refillstate(state);
>  }
>  
> +STATIC int
> +xfs_attr_node_remove_cleanup(
> +	struct xfs_da_args	*args,
> +	struct xfs_da_state	*state)
> +{
> +	struct xfs_da_state_blk	*blk;
> +	int			retval;
> +
> +	/*
> +	 * Remove the name and update the hashvals in the tree.
> +	 */
> +	blk = &state->path.blk[state->path.active-1];
> +	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
> +	retval = xfs_attr3_leaf_remove(blk->bp, args);
> +	xfs_da3_fixhashpath(state, &state->path);
> +
> +	return retval;
> +}
> +
>  /*
>   * Remove a name from a B-tree attribute list.
>   *
> @@ -1232,7 +1251,6 @@ xfs_attr_node_remove_step(
>  	struct xfs_da_args	*args,
>  	struct xfs_da_state	*state)
>  {
> -	struct xfs_da_state_blk	*blk;
>  	int			retval, error;
>  	struct xfs_inode	*dp = args->dp;
>  
> @@ -1247,14 +1265,7 @@ xfs_attr_node_remove_step(
>  		if (error)
>  			return error;
>  	}
> -
> -	/*
> -	 * Remove the name and update the hashvals in the tree.
> -	 */
> -	blk = &state->path.blk[ state->path.active-1 ];
> -	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
> -	retval = xfs_attr3_leaf_remove(blk->bp, args);
> -	xfs_da3_fixhashpath(state, &state->path);
> +	retval = xfs_attr_node_remove_cleanup(args, state);
>  
>  	/*
>  	 * Check to see if the tree needs to be collapsed.
> -- 
> 2.7.4
> 



* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-18  7:29 ` [PATCH v14 04/15] xfs: Add delay ready attr remove routines Allison Henderson
  2020-12-22  7:22   ` Chandan Babu R
@ 2020-12-22 17:11   ` Brian Foster
  2020-12-22 17:20     ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Brian Foster @ 2020-12-22 17:11 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
> This patch modifies the attr remove routines to be delay ready. This
> means they no longer roll or commit transactions, but instead return
> -EAGAIN to have the calling routine roll and refresh the transaction. In
> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> uses a state-machine-like switch to keep track of where it was
> when -EAGAIN was returned. xfs_attr_node_removename has also been
> modified to use the switch, and a new version of xfs_attr_remove_args
> consists of a simple loop to refresh the transaction until the operation
> is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> transaction wherever the existing code used to.
> 
> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> version __xfs_attr_rmtval_remove. We will rename
> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> done.
> 
> xfs_attr_rmtval_remove itself is still in use by the set routines (used
> during a rename).  To preserve existing behavior, we
> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> set, similar to how xfs_attr_remove_args does here.  Once we transition
> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> used and will be removed.
> 
> This patch also adds a new struct xfs_delattr_context, which we will use
> to keep track of the current state of an attribute operation. The new
> xfs_delattr_state enum is used to track various operations that are in
> progress so that we know not to repeat them, and resume where we left
> off before EAGAIN was returned to cycle out the transaction. Other
> members take the place of local variables that need to retain their
> values across multiple function recalls.  See xfs_attr.h for a more
> detailed diagram of the states.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---

I started with a couple small comments on this patch but inevitably
started thinking more about the factoring again and ended up with a
couple patches on top. The first is more of some small tweaks and
open-coding that IMO makes this patch a bit easier to follow. The
second is more of an RFC so I'll follow up with that in a second email.
I'm curious what folks' thoughts might be on either. Also note that I'm
primarily focusing on code structure and whatnot here, so these are fast
and loose, compile tested only and likely to be broken.

First diff:

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index b6330f953f40..2e466c4ac283 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -58,6 +58,9 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
 				 struct xfs_da_state **state);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
+STATIC
+int xfs_attr_node_removename_setup(
+	struct xfs_delattr_context	*dac);
 
 int
 xfs_inode_hasattr(
@@ -395,12 +398,34 @@ xfs_attr_remove_args(
 	struct xfs_da_args	*args)
 {
 	int				error;
+	struct xfs_inode		*dp = args->dp;
 	struct xfs_delattr_context	dac = {
+		.dela_state	= XFS_DAS_UNINIT,
 		.da_args	= args,
 	};
 
+	if (!xfs_inode_hasattr(dp))
+		return -ENOATTR;
+
+	if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
+		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
+		return xfs_attr_shortform_remove(args);
+	}
+
+	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+		return xfs_attr_leaf_removename(args);
+
+	/* node format requires multiple transactions... */
+
+	trace_xfs_attr_node_removename(args);
+	if (!dac.da_state) {
+		error = xfs_attr_node_removename_setup(&dac);
+		if (error)
+			return error;
+	}
+
 	do {
-		error = xfs_attr_remove_iter(&dac);
+		error = xfs_attr_node_removename_iter(&dac);
 		if (error != -EAGAIN)
 			break;
 
@@ -413,39 +438,6 @@ xfs_attr_remove_args(
 	return error;
 }
 
-/*
- * Remove the attribute specified in @args.
- *
- * This function may return -EAGAIN to signal that the transaction needs to be
- * rolled.  Callers should continue calling this function until they receive a
- * return value other than -EAGAIN.
- */
-int
-xfs_attr_remove_iter(
-	struct xfs_delattr_context	*dac)
-{
-	struct xfs_da_args		*args = dac->da_args;
-	struct xfs_inode		*dp = args->dp;
-
-	/* If we are shrinking a node, resume shrink */
-	if (dac->dela_state == XFS_DAS_RM_SHRINK)
-		goto node;
-
-	if (!xfs_inode_hasattr(dp))
-		return -ENOATTR;
-
-	if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
-		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
-		return xfs_attr_shortform_remove(args);
-	}
-
-	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
-		return xfs_attr_leaf_removename(args);
-node:
-	/* If we are not short form or leaf, then proceed to remove node */
-	return  xfs_attr_node_removename_iter(dac);
-}
-
 /*
  * Note: If args->value is NULL the attribute will be removed, just like the
  * Linux ->setattr API.
@@ -1266,46 +1258,6 @@ int xfs_attr_node_removename_setup(
 	return 0;
 }
 
-STATIC int
-xfs_attr_node_remove_rmt (
-	struct xfs_delattr_context	*dac,
-	struct xfs_da_state		*state)
-{
-	int				error = 0;
-
-	/*
-	 * May return -EAGAIN to request that the caller recall this function
-	 */
-	error = __xfs_attr_rmtval_remove(dac);
-	if (error)
-		return error;
-
-	/*
-	 * Refill the state structure with buffers, the prior calls released our
-	 * buffers.
-	 */
-	return xfs_attr_refillstate(state);
-}
-
-STATIC int
-xfs_attr_node_remove_cleanup(
-	struct xfs_da_args	*args,
-	struct xfs_da_state	*state)
-{
-	struct xfs_da_state_blk	*blk;
-	int			retval;
-
-	/*
-	 * Remove the name and update the hashvals in the tree.
-	 */
-	blk = &state->path.blk[state->path.active-1];
-	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
-	retval = xfs_attr3_leaf_remove(blk->bp, args);
-	xfs_da3_fixhashpath(state, &state->path);
-
-	return retval;
-}
-
 /*
  * Step through removeing a name from a B-tree attribute list.
  *
@@ -1320,25 +1272,54 @@ xfs_attr_node_remove_cleanup(
  */
 STATIC int
 xfs_attr_node_remove_step(
-	struct xfs_delattr_context	*dac)
+	struct xfs_delattr_context	*dac,
+	bool				*joined)
 {
 	struct xfs_da_args		*args = dac->da_args;
 	struct xfs_da_state		*state = dac->da_state;
-	int				error = 0;
+	struct xfs_da_state_blk		*blk;
+	int				error = 0, retval, done;
+
 	/*
-	 * If there is an out-of-line value, de-allocate the blocks.
-	 * This is done before we remove the attribute so that we don't
-	 * overflow the maximum size of a transaction and/or hit a deadlock.
+	 * If there is an out-of-line value, de-allocate the blocks.  This is
+	 * done before we remove the attribute so that we don't overflow the
+	 * maximum size of a transaction and/or hit a deadlock.
 	 */
 	if (args->rmtblkno > 0) {
-		/*
-		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
-		 */
-		error = xfs_attr_node_remove_rmt(dac, state);
+		error = xfs_bunmapi(args->trans, args->dp, args->rmtblkno,
+				args->rmtblkcnt, XFS_BMAPI_ATTRFORK, 1, &done);
+		if (error)
+			return error;
+		if (!done) {
+			dac->flags |= XFS_DAC_DEFER_FINISH;
+			return -EAGAIN;
+		}
+
+		error = xfs_attr_refillstate(state);
 		if (error)
 			return error;
 	}
 
+	/*
+	 * Remove the name and update the hashvals in the tree.
+	 */
+	blk = &state->path.blk[state->path.active-1];
+	ASSERT(blk->magic == XFS_ATTR_LEAF_MAGIC);
+	retval = xfs_attr3_leaf_remove(blk->bp, args);
+	xfs_da3_fixhashpath(state, &state->path);
+
+	/*
+	 * Check to see if the tree needs to be collapsed. Set the flag to
+	 * indicate that the calling function needs to move the shrink
+	 * operation
+	 */
+	if (retval && (state->path.active > 1)) {
+		error = xfs_da3_join(state);
+		if (error)
+			return error;
+		*joined = true;
+	}
+
 	return error;
 }
 
@@ -1358,18 +1339,10 @@ xfs_attr_node_removename_iter(
 	struct xfs_delattr_context	*dac)
 {
 	struct xfs_da_args		*args = dac->da_args;
-	struct xfs_da_state		*state = NULL;
-	int				retval, error;
+	struct xfs_da_state		*state = dac->da_state;
+	int				error;
 	struct xfs_inode		*dp = args->dp;
-
-	trace_xfs_attr_node_removename(args);
-
-	if (!dac->da_state) {
-		error = xfs_attr_node_removename_setup(dac);
-		if (error)
-			goto out;
-	}
-	state = dac->da_state;
+	bool				joined = false;
 
 	switch (dac->dela_state) {
 	case XFS_DAS_UNINIT:
@@ -1377,27 +1350,14 @@ xfs_attr_node_removename_iter(
 		 * repeatedly remove remote blocks, remove the entry and join.
 		 * returns -EAGAIN or 0 for completion of the step.
 		 */
-		error = xfs_attr_node_remove_step(dac);
+		error = xfs_attr_node_remove_step(dac, &joined);
 		if (error)
-			break;
-
-		retval = xfs_attr_node_remove_cleanup(args, state);
-
-		/*
-		 * Check to see if the tree needs to be collapsed. Set the flag
-		 * to indicate that the calling function needs to move the
-		 * shrink operation
-		 */
-		if (retval && (state->path.active > 1)) {
-			error = xfs_da3_join(state);
-			if (error)
-				return error;
-
+			goto out;
+		if (joined) {
 			dac->flags |= XFS_DAC_DEFER_FINISH;
 			dac->dela_state = XFS_DAS_RM_SHRINK;
 			return -EAGAIN;
 		}
-
 		/* fallthrough */
 	case XFS_DAS_RM_SHRINK:
 		/*
@@ -1405,7 +1365,6 @@ xfs_attr_node_removename_iter(
 		 */
 		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
 			error = xfs_attr_node_shrink(args, state);
-
 		break;
 	default:
 		ASSERT(0);
@@ -1413,10 +1372,8 @@ xfs_attr_node_removename_iter(
 		goto out;
 	}
 
-	if (error == -EAGAIN)
-		return error;
 out:
-	if (state)
+	if (state && error != -EAGAIN)
 		xfs_da_state_free(state);
 	return error;
 }
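
To make the caller's side of the contract concrete, here is a minimal
userspace sketch of the retry loop that xfs_attr_remove_args runs around
the iterator. It is a sketch under stated assumptions, not the code from
the series: demo_dac, demo_iter, demo_trans_roll and demo_remove_args
are hypothetical names, and the roll helper is assumed to finish
deferred work when the DEFER_FINISH flag is set and to clear the flag
afterwards, which this thread describes only loosely.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical mirror of XFS_DAC_DEFER_FINISH. */
#define DEMO_DEFER_FINISH	0x01

/* Hypothetical stand-in for struct xfs_delattr_context. */
struct demo_dac {
	unsigned int	flags;
	int		steps_left;	/* passes that still want a retry */
	int		finishes;	/* times deferred work was finished */
	int		rolls;		/* times the transaction was rolled */
};

/* Stand-in for the iterator: asks for a retry until steps_left hits 0. */
static int demo_iter(struct demo_dac *dac)
{
	if (dac->steps_left == 0)
		return 0;
	dac->steps_left--;
	dac->flags |= DEMO_DEFER_FINISH;
	return -EAGAIN;
}

/*
 * Stand-in for the roll helper.  Assumed behavior: honor and clear the
 * DEFER_FINISH flag, then roll the transaction.
 */
static int demo_trans_roll(struct demo_dac *dac)
{
	if (dac->flags & DEMO_DEFER_FINISH) {
		dac->flags &= ~DEMO_DEFER_FINISH;
		dac->finishes++;
	}
	dac->rolls++;
	return 0;
}

/* Same loop shape as the one in the diff above. */
static int demo_remove_args(struct demo_dac *dac)
{
	int error;

	do {
		error = demo_iter(dac);
		if (error != -EAGAIN)
			break;

		error = demo_trans_roll(dac);
		if (error)
			return error;
	} while (1);

	return error;
}
```

The point of the shape is that only the loop touches the transaction;
the iterator just reports whether it needs another pass.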



* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-22 17:11   ` Brian Foster
@ 2020-12-22 17:20     ` Brian Foster
  2020-12-22 18:44       ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2020-12-22 17:20 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
> On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
> > This patch modifies the attr remove routines to be delay ready. This
> > means they no longer roll or commit transactions, but instead return
> > -EAGAIN to have the calling routine roll and refresh the transaction. In
> > this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> > uses a state-machine-like switch to keep track of where it was
> > when -EAGAIN was returned. xfs_attr_node_removename has also been
> > modified to use the switch, and a new version of xfs_attr_remove_args
> > consists of a simple loop to refresh the transaction until the operation
> > is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> > transaction wherever the existing code used to.
> > 
> > Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> > version __xfs_attr_rmtval_remove. We will rename
> > __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> > done.
> > 
> > xfs_attr_rmtval_remove itself is still in use by the set routines (used
> > during a rename).  To preserve existing behavior, we
> > modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> > set, similar to how xfs_attr_remove_args does here.  Once we transition
> > the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> > used and will be removed.
> > 
> > This patch also adds a new struct xfs_delattr_context, which we will use
> > to keep track of the current state of an attribute operation. The new
> > xfs_delattr_state enum is used to track various operations that are in
> > progress so that we know not to repeat them, and resume where we left
> > off before EAGAIN was returned to cycle out the transaction. Other
> > members take the place of local variables that need to retain their
> > values across multiple function recalls.  See xfs_attr.h for a more
> > detailed diagram of the states.
> > 
> > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > ---
> 
> I started with a couple small comments on this patch but inevitably
> started thinking more about the factoring again and ended up with a
> couple patches on top. The first is more of some small tweaks and
> open-coding that IMO makes this patch a bit easier to follow. The
> second is more of an RFC so I'll follow up with that in a second email.
> I'm curious what folks' thoughts might be on either. Also note that I'm
> primarily focusing on code structure and whatnot here, so these are fast
> and loose, compile tested only and likely to be broken.
> 

... and here's the second diff (applies on top of the first).

This one popped up after staring at the previous changes for a bit and
wondering whether using "done flags" might make the whole thing easier
to follow than incremental state transitions. I think the attr remove
path is easy enough to follow with either method, but the attr set path
is a beast and so this is more with that in mind. Initial thoughts?

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 2e466c4ac283..106e3c070131 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -1271,14 +1271,12 @@ int xfs_attr_node_removename_setup(
  * successful error code is returned.
  */
 STATIC int
-xfs_attr_node_remove_step(
-	struct xfs_delattr_context	*dac,
-	bool				*joined)
+xfs_attr_node_remove_rmt_step(
+	struct xfs_delattr_context	*dac)
 {
 	struct xfs_da_args		*args = dac->da_args;
 	struct xfs_da_state		*state = dac->da_state;
-	struct xfs_da_state_blk		*blk;
-	int				error = 0, retval, done;
+	int				error, done;
 
 	/*
 	 * If there is an out-of-line value, de-allocate the blocks.  This is
@@ -1300,6 +1298,19 @@ xfs_attr_node_remove_step(
 			return error;
 	}
 
+	dac->dela_state |= XFS_DAS_RMT_DONE;
+	return error;
+}
+
+STATIC int
+xfs_attr_node_remove_join_step(
+	struct xfs_delattr_context	*dac)
+{
+	struct xfs_da_args		*args = dac->da_args;
+	struct xfs_da_state		*state = dac->da_state;
+	struct xfs_da_state_blk		*blk;
+	int				error, retval;
+
 	/*
 	 * Remove the name and update the hashvals in the tree.
 	 */
@@ -1317,9 +1328,12 @@ xfs_attr_node_remove_step(
 		error = xfs_da3_join(state);
 		if (error)
 			return error;
-		*joined = true;
+
+		error = -EAGAIN;
+		dac->flags |= XFS_DAC_DEFER_FINISH;
 	}
 
+	dac->dela_state |= XFS_DAS_JOIN_DONE;
 	return error;
 }
 
@@ -1342,36 +1356,23 @@ xfs_attr_node_removename_iter(
 	struct xfs_da_state		*state = dac->da_state;
 	int				error;
 	struct xfs_inode		*dp = args->dp;
-	bool				joined = false;
 
-	switch (dac->dela_state) {
-	case XFS_DAS_UNINIT:
-		/*
-		 * repeatedly remove remote blocks, remove the entry and join.
-		 * returns -EAGAIN or 0 for completion of the step.
-		 */
-		error = xfs_attr_node_remove_step(dac, &joined);
+	if (!(dac->dela_state & XFS_DAS_RMT_DONE)) {
+		error = xfs_attr_node_remove_rmt_step(dac);
 		if (error)
 			goto out;
-		if (joined) {
-			dac->flags |= XFS_DAC_DEFER_FINISH;
-			dac->dela_state = XFS_DAS_RM_SHRINK;
-			return -EAGAIN;
-		}
-		/* fallthrough */
-	case XFS_DAS_RM_SHRINK:
-		/*
-		 * If the result is small enough, push it all into the inode.
-		 */
-		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
-			error = xfs_attr_node_shrink(args, state);
-		break;
-	default:
-		ASSERT(0);
-		error = -EINVAL;
-		goto out;
 	}
 
+	if (!(dac->dela_state & XFS_DAS_JOIN_DONE)) {
+		error = xfs_attr_node_remove_join_step(dac);
+		if (error)
+			goto out;
+	}
+
+	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+		error = xfs_attr_node_shrink(args, state);
+	ASSERT(error != -EAGAIN);
+
 out:
 	if (state && error != -EAGAIN)
 		xfs_da_state_free(state);
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 3154ef4b7833..67e730cd3267 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -151,6 +151,9 @@ enum xfs_delattr_state {
 	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
 };
 
+#define XFS_DAS_RMT_DONE	0x1
+#define XFS_DAS_JOIN_DONE	0x2
+
 /*
  * Defines for xfs_delattr_context.flags
  */



* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-22 17:20     ` Brian Foster
@ 2020-12-22 18:44       ` Brian Foster
  2020-12-23  5:20         ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2020-12-22 18:44 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Tue, Dec 22, 2020 at 12:20:20PM -0500, Brian Foster wrote:
> On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
> > On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
> > > This patch modifies the attr remove routines to be delay ready. This
> > > means they no longer roll or commit transactions, but instead return
> > > -EAGAIN to have the calling routine roll and refresh the transaction. In
> > > this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> > > uses a sort of state machine like switch to keep track of where it was
> > > when EAGAIN was returned. xfs_attr_node_removename has also been
> > > modified to use the switch, and a new version of xfs_attr_remove_args
> > > consists of a simple loop to refresh the transaction until the operation
> > > is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> > > transaction where ever the existing code used to.
> > > 
> > > Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> > > version __xfs_attr_rmtval_remove. We will rename
> > > __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> > > done.
> > > 
> > > xfs_attr_rmtval_remove itself is still in use by the set routines (used
> > > during a rename).  For reasons of preserving existing function, we
> > > modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> > > set.  Similar to how xfs_attr_remove_args does here.  Once we transition
> > > the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> > > used and will be removed.
> > > 
> > > This patch also adds a new struct xfs_delattr_context, which we will use
> > > to keep track of the current state of an attribute operation. The new
> > > xfs_delattr_state enum is used to track various operations that are in
> > > progress so that we know not to repeat them, and resume where we left
> > > off before EAGAIN was returned to cycle out the transaction. Other
> > > members take the place of local variables that need to retain their
> > > values across multiple function recalls.  See xfs_attr.h for a more
> > > detailed diagram of the states.
> > > 
> > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > ---
> > 
> > I started with a couple small comments on this patch but inevitably
> > started thinking more about the factoring again and ended up with a
> > couple patches on top. The first is more of some small tweaks and
> > open-coding that IMO makes this patch a bit easier to follow. The
> > second is more of an RFC so I'll follow up with that in a second email.
> > I'm curious what folks' thoughts might be on either. Also note that I'm
> > primarily focusing on code structure and whatnot here, so these are fast
> > and loose, compile tested only and likely to be broken.
> > 
> 
> ... and here's the second diff (applies on top of the first).
> 
> This one popped up after staring at the previous changes for a bit and
> wondering whether using "done flags" might make the whole thing easier
> to follow than incremental state transitions. I think the attr remove
> path is easy enough to follow with either method, but the attr set path
> is a beast and so this is more with that in mind. Initial thoughts?
> 

Eh, the more I stare at the attr set code, the less sure I am that this by
itself is much of an improvement. It helps in some areas, but there are so many
transaction rolls embedded throughout at different levels that a larger
rework of the code is probably still necessary. Anyway, this was just a
random thought for now.

Brian

> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 2e466c4ac283..106e3c070131 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -1271,14 +1271,12 @@ int xfs_attr_node_removename_setup(
>   * successful error code is returned.
>   */
>  STATIC int
> -xfs_attr_node_remove_step(
> -	struct xfs_delattr_context	*dac,
> -	bool				*joined)
> +xfs_attr_node_remove_rmt_step(
> +	struct xfs_delattr_context	*dac)
>  {
>  	struct xfs_da_args		*args = dac->da_args;
>  	struct xfs_da_state		*state = dac->da_state;
> -	struct xfs_da_state_blk		*blk;
> -	int				error = 0, retval, done;
> +	int				error, done;
>  
>  	/*
>  	 * If there is an out-of-line value, de-allocate the blocks.  This is
> @@ -1300,6 +1298,19 @@ xfs_attr_node_remove_step(
>  			return error;
>  	}
>  
> +	dac->dela_state |= XFS_DAS_RMT_DONE;
> +	return error;
> +}
> +
> +STATIC int
> +xfs_attr_node_remove_join_step(
> +	struct xfs_delattr_context	*dac)
> +{
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_state		*state = dac->da_state;
> +	struct xfs_da_state_blk		*blk;
> +	int				error, retval;
> +
>  	/*
>  	 * Remove the name and update the hashvals in the tree.
>  	 */
> @@ -1317,9 +1328,12 @@ xfs_attr_node_remove_step(
>  		error = xfs_da3_join(state);
>  		if (error)
>  			return error;
> -		*joined = true;
> +
> +		error = -EAGAIN;
> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>  	}
>  
> +	dac->dela_state |= XFS_DAS_JOIN_DONE;
>  	return error;
>  }
>  
> @@ -1342,36 +1356,23 @@ xfs_attr_node_removename_iter(
>  	struct xfs_da_state		*state = dac->da_state;
>  	int				error;
>  	struct xfs_inode		*dp = args->dp;
> -	bool				joined = false;
>  
> -	switch (dac->dela_state) {
> -	case XFS_DAS_UNINIT:
> -		/*
> -		 * repeatedly remove remote blocks, remove the entry and join.
> -		 * returns -EAGAIN or 0 for completion of the step.
> -		 */
> -		error = xfs_attr_node_remove_step(dac, &joined);
> +	if (!(dac->dela_state & XFS_DAS_RMT_DONE)) {
> +		error = xfs_attr_node_remove_rmt_step(dac);
>  		if (error)
>  			goto out;
> -		if (joined) {
> -			dac->flags |= XFS_DAC_DEFER_FINISH;
> -			dac->dela_state = XFS_DAS_RM_SHRINK;
> -			return -EAGAIN;
> -		}
> -		/* fallthrough */
> -	case XFS_DAS_RM_SHRINK:
> -		/*
> -		 * If the result is small enough, push it all into the inode.
> -		 */
> -		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> -			error = xfs_attr_node_shrink(args, state);
> -		break;
> -	default:
> -		ASSERT(0);
> -		error = -EINVAL;
> -		goto out;
>  	}
>  
> +	if (!(dac->dela_state & XFS_DAS_JOIN_DONE)) {
> +		error = xfs_attr_node_remove_join_step(dac);
> +		if (error)
> +			goto out;
> +	}
> +
> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +		error = xfs_attr_node_shrink(args, state);
> +	ASSERT(error != -EAGAIN);
> +
>  out:
>  	if (state && error != -EAGAIN)
>  		xfs_da_state_free(state);
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 3154ef4b7833..67e730cd3267 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -151,6 +151,9 @@ enum xfs_delattr_state {
>  	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
>  };
>  
> +#define XFS_DAS_RMT_DONE	0x1
> +#define XFS_DAS_JOIN_DONE	0x2
> +
>  /*
>   * Defines for xfs_delattr_context.flags
>   */
> 
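
For readers following along, the "done flags" idea in the diff above can be
sketched in isolation. This is only a minimal userspace illustration, not the
XFS code: struct op_ctx, the STEP_* flags and the helper names are all
hypothetical stand-ins, and "rolling the transaction" is reduced to counting
how many times the step function asked to be called again.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for XFS_DAS_RMT_DONE / XFS_DAS_JOIN_DONE. */
#define STEP_RMT_DONE	0x1
#define STEP_JOIN_DONE	0x2

/* Hypothetical context; loosely plays the role of xfs_delattr_context. */
struct op_ctx {
	unsigned int	done;		/* bitmask of completed steps */
	int		rmt_blocks;	/* remote blocks still to remove */
	int		need_join;	/* nonzero if the tree must be joined */
};

/* Remove one remote block per call; request a roll while blocks remain. */
static int remove_rmt_step(struct op_ctx *ctx)
{
	if (ctx->rmt_blocks > 0) {
		ctx->rmt_blocks--;
		if (ctx->rmt_blocks > 0)
			return -EAGAIN;
	}
	ctx->done |= STEP_RMT_DONE;
	return 0;
}

/* Join if needed. The step is marked done even when it requests a roll. */
static int remove_join_step(struct op_ctx *ctx)
{
	int error = 0;

	if (ctx->need_join) {
		ctx->need_join = 0;
		error = -EAGAIN;
	}
	ctx->done |= STEP_JOIN_DONE;
	return error;
}

/* Re-entrant step function: skips whatever the flags say is finished. */
static int remove_iter(struct op_ctx *ctx)
{
	int error;

	if (!(ctx->done & STEP_RMT_DONE)) {
		error = remove_rmt_step(ctx);
		if (error)
			return error;
	}

	if (!(ctx->done & STEP_JOIN_DONE)) {
		error = remove_join_step(ctx);
		if (error)
			return error;
	}

	/* Stand-in for the final shrink, which must not request a roll. */
	error = 0;
	assert(error != -EAGAIN);
	return error;
}

/* Driver loop: call the step function again until it stops asking. */
static int remove_all(struct op_ctx *ctx, int *rolls)
{
	int error;

	*rolls = 0;
	while ((error = remove_iter(ctx)) == -EAGAIN)
		(*rolls)++;	/* stand-in for rolling the transaction */
	return error;
}
```

With three remote blocks and a pending join, the driver loop in this sketch
rolls three times: twice while blocks remain and once after the join. The
point of the flags is that each re-entry skips finished steps without the
caller tracking a single "current state" value.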



* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-22 15:41     ` Allison Henderson
@ 2020-12-23  4:05       ` Chandan Babu R
  0 siblings, 0 replies; 48+ messages in thread
From: Chandan Babu R @ 2020-12-23  4:05 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Tue, 22 Dec 2020 08:41:49 -0700, Allison Henderson wrote:
> 
> On 12/22/20 12:22 AM, Chandan Babu R wrote:
> > On Fri, 18 Dec 2020 00:29:06 -0700, Allison Henderson wrote:
> >> This patch modifies the attr remove routines to be delay ready. This
> >> means they no longer roll or commit transactions, but instead return
> >> -EAGAIN to have the calling routine roll and refresh the transaction. In
> >> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> >> uses a sort of state machine like switch to keep track of where it was
> >> when EAGAIN was returned. xfs_attr_node_removename has also been
> >> modified to use the switch, and a new version of xfs_attr_remove_args
> >> consists of a simple loop to refresh the transaction until the operation
> >> is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> >> transaction where ever the existing code used to.
> >>
> >> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> >> version __xfs_attr_rmtval_remove. We will rename
> >> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> >> done.
> >>
> >> xfs_attr_rmtval_remove itself is still in use by the set routines (used
> >> during a rename).  For reasons of preserving existing function, we
> >> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> >> set.  Similar to how xfs_attr_remove_args does here.  Once we transition
> >> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> >> used and will be removed.
> >>
> >> This patch also adds a new struct xfs_delattr_context, which we will use
> >> to keep track of the current state of an attribute operation. The new
> >> xfs_delattr_state enum is used to track various operations that are in
> >> progress so that we know not to repeat them, and resume where we left
> >> off before EAGAIN was returned to cycle out the transaction. Other
> >> members take the place of local variables that need to retain their
> >> values across multiple function recalls.  See xfs_attr.h for a more
> >> detailed diagram of the states.
> >>
> >> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> >> ---
> >>   fs/xfs/libxfs/xfs_attr.c        | 218 +++++++++++++++++++++++++++++-----------
> >>   fs/xfs/libxfs/xfs_attr.h        | 100 ++++++++++++++++++
> >>   fs/xfs/libxfs/xfs_attr_leaf.c   |   2 +-
> >>   fs/xfs/libxfs/xfs_attr_remote.c |  48 +++++----
> >>   fs/xfs/libxfs/xfs_attr_remote.h |   2 +-
> >>   fs/xfs/xfs_attr_inactive.c      |   2 +-
> >>   6 files changed, 288 insertions(+), 84 deletions(-)
> >>
> >> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> >> index 1969b88..b6330f9 100644
> >> --- a/fs/xfs/libxfs/xfs_attr.c
> >> +++ b/fs/xfs/libxfs/xfs_attr.c
> >> @@ -53,7 +53,7 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
> >>    */
> >>   STATIC int xfs_attr_node_get(xfs_da_args_t *args);
> >>   STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
> >> -STATIC int xfs_attr_node_removename(xfs_da_args_t *args);
> >> +STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
> >>   STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
> >>   				 struct xfs_da_state **state);
> >>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
> >> @@ -264,6 +264,34 @@ xfs_attr_set_shortform(
> >>   }
> >>   
> >>   /*
> >> + * Checks to see if a delayed attribute transaction should be rolled.  If so,
> >> + * also checks for a defer finish.  Transaction is finished and rolled as
> >> + * needed, and returns true of false if the delayed operation should continue.
> >> + */
> >> +int
> >> +xfs_attr_trans_roll(
> >> +	struct xfs_delattr_context	*dac)
> >> +{
> >> +	struct xfs_da_args		*args = dac->da_args;
> >> +	int				error;
> >> +
> >> +	if (dac->flags & XFS_DAC_DEFER_FINISH) {
> >> +		/*
> >> +		 * The caller wants us to finish all the deferred ops so that we
> >> +		 * avoid pinning the log tail with a large number of deferred
> >> +		 * ops.
> >> +		 */
> >> +		dac->flags &= ~XFS_DAC_DEFER_FINISH;
> >> +		error = xfs_defer_finish(&args->trans);
> >> +		if (error)
> >> +			return error;
> >> +	} else
> >> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> >> +
> >> +	return error;
> >> +}
> >> +
> >> +/*
> >>    * Set the attribute specified in @args.
> >>    */
> >>   int
> >> @@ -364,23 +392,58 @@ xfs_has_attr(
> >>    */
> >>   int
> >>   xfs_attr_remove_args(
> >> -	struct xfs_da_args      *args)
> >> +	struct xfs_da_args	*args)
> >>   {
> >> -	struct xfs_inode	*dp = args->dp;
> >> -	int			error;
> >> +	int				error;
> >> +	struct xfs_delattr_context	dac = {
> >> +		.da_args	= args,
> >> +	};
> >> +
> >> +	do {
> >> +		error = xfs_attr_remove_iter(&dac);
> >> +		if (error != -EAGAIN)
> >> +			break;
> >> +
> >> +		error = xfs_attr_trans_roll(&dac);
> >> +		if (error)
> >> +			return error;
> >> +
> >> +	} while (true);
> >> +
> >> +	return error;
> >> +}
> >>   
> >> -	if (!xfs_inode_hasattr(dp)) {
> >> -		error = -ENOATTR;
> >> -	} else if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
> >> +/*
> >> + * Remove the attribute specified in @args.
> >> + *
> >> + * This function may return -EAGAIN to signal that the transaction needs to be
> >> + * rolled.  Callers should continue calling this function until they receive a
> >> + * return value other than -EAGAIN.
> >> + */
> >> +int
> >> +xfs_attr_remove_iter(
> >> +	struct xfs_delattr_context	*dac)
> >> +{
> >> +	struct xfs_da_args		*args = dac->da_args;
> >> +	struct xfs_inode		*dp = args->dp;
> >> +
> >> +	/* If we are shrinking a node, resume shrink */
> >> +	if (dac->dela_state == XFS_DAS_RM_SHRINK)
> >> +		goto node;
> >> +
> >> +	if (!xfs_inode_hasattr(dp))
> >> +		return -ENOATTR;
> >> +
> >> +	if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
> >>   		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> >> -		error = xfs_attr_shortform_remove(args);
> >> -	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> >> -		error = xfs_attr_leaf_removename(args);
> >> -	} else {
> >> -		error = xfs_attr_node_removename(args);
> >> +		return xfs_attr_shortform_remove(args);
> >>   	}
> >>   
> >> -	return error;
> >> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> >> +		return xfs_attr_leaf_removename(args);
> >> +node:
> >> +	/* If we are not short form or leaf, then proceed to remove node */
> >> +	return  xfs_attr_node_removename_iter(dac);
> >>   }
> >>   
> >>   /*
> >> @@ -1178,10 +1241,11 @@ xfs_attr_leaf_mark_incomplete(
> >>    */
> >>   STATIC
> >>   int xfs_attr_node_removename_setup(
> >> -	struct xfs_da_args	*args,
> >> -	struct xfs_da_state	**state)
> >> +	struct xfs_delattr_context	*dac)
> >>   {
> > 
> > In xfs_attr_node_removename_setup(), if either of
> > xfs_attr_leaf_mark_incomplete() or xfs_attr_rmtval_invalidate() returns with a
> > non-zero value, the memory pointed to by dac->da_state is not freed. This
> > happens because the caller (i.e. xfs_attr_node_removename_iter()) checks for
> > the non-NULL value of its local variable "state" to actually free the
> > corresponding memory.
> > 
> Ok, for this one I think it makes more sense to put an extra free in
> the helper rather than have the caller handle it.  Will fix.
> 
> Do you have a tool that's tracing this out, or is it just by hand?
> Because if it's a tool, I should probably be using it too :-)
>

Unfortunately, I found this by reading through the code changes. Tools to
figure these out would be great to have, since they would let us focus mostly
on the larger picture.
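
The leak pattern discussed here can be shown with a small standalone sketch.
This is an assumption-laden illustration rather than the XFS code: struct
ctx, setup(), iter() and the live_states counter are hypothetical names, and
the free_on_error switch exists only to show the behaviour with and without
the extra free in the helper.

```c
#include <errno.h>
#include <stdlib.h>

struct state {
	int	dummy;
};

/* Hypothetical context; ctx->state plays the role of dac->da_state. */
struct ctx {
	struct state	*state;
};

static int live_states;		/* allocations that have not been freed */

static struct state *state_alloc(void)
{
	struct state *s = calloc(1, sizeof(*s));

	if (s)
		live_states++;
	return s;
}

static void state_free(struct state *s)
{
	free(s);
	live_states--;
}

/*
 * Helper that allocates into the context and may then fail.  With
 * free_on_error set it cleans up after itself, which is the fix
 * proposed in the reply above.
 */
static int setup(struct ctx *ctx, int fail, int free_on_error)
{
	ctx->state = state_alloc();
	if (!ctx->state)
		return -ENOMEM;

	if (fail) {
		if (free_on_error) {
			state_free(ctx->state);
			ctx->state = NULL;
		}
		return -EIO;
	}
	return 0;
}

/* Caller that only frees through its local pointer. */
static int iter(struct ctx *ctx, int fail, int free_on_error)
{
	struct state	*state = NULL;
	int		error;

	if (!ctx->state) {
		error = setup(ctx, fail, free_on_error);
		if (error)
			goto out;	/* local 'state' is still NULL here */
	}
	state = ctx->state;
	error = 0;
out:
	if (state) {
		state_free(state);
		ctx->state = NULL;
	}
	return error;
}
```

When setup() fails without freeing, iter() jumps to out before its local
pointer is assigned, so the allocation held in ctx->state is never released.
Freeing inside the helper closes that gap without changing the caller.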

> Thanks!
> Allison
> 
> 
> >> -	int			error;
> >> +	struct xfs_da_args		*args = dac->da_args;
> >> +	struct xfs_da_state		**state = &dac->da_state;
> >> +	int				error;
> >>   
> >>   	error = xfs_attr_node_hasname(args, state);
> >>   	if (error != -EEXIST)
> >> @@ -1203,13 +1267,16 @@ int xfs_attr_node_removename_setup(
> >>   }
> >>   
> >>   STATIC int
> >> -xfs_attr_node_remove_rmt(
> >> -	struct xfs_da_args	*args,
> >> -	struct xfs_da_state	*state)
> >> +xfs_attr_node_remove_rmt (
> >> +	struct xfs_delattr_context	*dac,
> >> +	struct xfs_da_state		*state)
> >>   {
> >> -	int			error = 0;
> >> +	int				error = 0;
> >>   
> >> -	error = xfs_attr_rmtval_remove(args);
> >> +	/*
> >> +	 * May return -EAGAIN to request that the caller recall this function
> >> +	 */
> >> +	error = __xfs_attr_rmtval_remove(dac);
> >>   	if (error)
> >>   		return error;
> >>   
> >> @@ -1240,28 +1307,34 @@ xfs_attr_node_remove_cleanup(
> >>   }
> >>   
> >>   /*
> >> - * Remove a name from a B-tree attribute list.
> >> + * Step through removeing a name from a B-tree attribute list.
> >>    *
> >>    * This will involve walking down the Btree, and may involve joining
> >>    * leaf nodes and even joining intermediate nodes up to and including
> >>    * the root node (a special case of an intermediate node).
> >> + *
> >> + * This routine is meant to function as either an inline or delayed operation,
> >> + * and may return -EAGAIN when the transaction needs to be rolled.  Calling
> >> + * functions will need to handle this, and recall the function until a
> >> + * successful error code is returned.
> >>    */
> >>   STATIC int
> >>   xfs_attr_node_remove_step(
> >> -	struct xfs_da_args	*args,
> >> -	struct xfs_da_state	*state)
> >> +	struct xfs_delattr_context	*dac)
> >>   {
> >> -	int			error;
> >> -	struct xfs_inode	*dp = args->dp;
> >> -
> >> -
> >> +	struct xfs_da_args		*args = dac->da_args;
> >> +	struct xfs_da_state		*state = dac->da_state;
> >> +	int				error = 0;
> >>   	/*
> >>   	 * If there is an out-of-line value, de-allocate the blocks.
> >>   	 * This is done before we remove the attribute so that we don't
> >>   	 * overflow the maximum size of a transaction and/or hit a deadlock.
> >>   	 */
> >>   	if (args->rmtblkno > 0) {
> >> -		error = xfs_attr_node_remove_rmt(args, state);
> >> +		/*
> >> +		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
> >> +		 */
> >> +		error = xfs_attr_node_remove_rmt(dac, state);
> >>   		if (error)
> >>   			return error;
> >>   	}
> >> @@ -1274,51 +1347,74 @@ xfs_attr_node_remove_step(
> >>    *
> >>    * This routine will find the blocks of the name to remove, remove them and
> >>    * shrink the tree if needed.
> >> + *
> >> + * This routine is meant to function as either an inline or delayed operation,
> >> + * and may return -EAGAIN when the transaction needs to be rolled.  Calling
> >> + * functions will need to handle this, and recall the function until a
> >> + * successful error code is returned.
> >>    */
> >>   STATIC int
> >> -xfs_attr_node_removename(
> >> -	struct xfs_da_args	*args)
> >> +xfs_attr_node_removename_iter(
> >> +	struct xfs_delattr_context	*dac)
> >>   {
> >> -	struct xfs_da_state	*state = NULL;
> >> -	int			retval, error;
> >> -	struct xfs_inode	*dp = args->dp;
> >> +	struct xfs_da_args		*args = dac->da_args;
> >> +	struct xfs_da_state		*state = NULL;
> >> +	int				retval, error;
> >> +	struct xfs_inode		*dp = args->dp;
> >>   
> >>   	trace_xfs_attr_node_removename(args);
> >>   
> >> -	error = xfs_attr_node_removename_setup(args, &state);
> >> -	if (error)
> >> -		goto out;
> >> +	if (!dac->da_state) {
> >> +		error = xfs_attr_node_removename_setup(dac);
> >> +		if (error)
> >> +			goto out;
> >> +	}
> >> +	state = dac->da_state;
> >>   
> >> -	error = xfs_attr_node_remove_step(args, state);
> >> -	if (error)
> >> -		goto out;
> >> +	switch (dac->dela_state) {
> >> +	case XFS_DAS_UNINIT:
> >> +		/*
> >> +		 * repeatedly remove remote blocks, remove the entry and join.
> >> +		 * returns -EAGAIN or 0 for completion of the step.
> >> +		 */
> >> +		error = xfs_attr_node_remove_step(dac);
> >> +		if (error)
> >> +			break;
> >>   
> >> -	retval = xfs_attr_node_remove_cleanup(args, state);
> >> +		retval = xfs_attr_node_remove_cleanup(args, state);
> >>   
> >> -	/*
> >> -	 * Check to see if the tree needs to be collapsed.
> >> -	 */
> >> -	if (retval && (state->path.active > 1)) {
> >> -		error = xfs_da3_join(state);
> >> -		if (error)
> >> -			return error;
> >> -		error = xfs_defer_finish(&args->trans);
> >> -		if (error)
> >> -			return error;
> >>   		/*
> >> -		 * Commit the Btree join operation and start a new trans.
> >> +		 * Check to see if the tree needs to be collapsed. Set the flag
> >> +		 * to indicate that the calling function needs to move the
> >> +		 * shrink operation
> >>   		 */
> >> -		error = xfs_trans_roll_inode(&args->trans, dp);
> >> -		if (error)
> >> -			return error;
> >> -	}
> >> +		if (retval && (state->path.active > 1)) {
> >> +			error = xfs_da3_join(state);
> >> +			if (error)
> >> +				return error;
> >>   
> >> -	/*
> >> -	 * If the result is small enough, push it all into the inode.
> >> -	 */
> >> -	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> >> -		error = xfs_attr_node_shrink(args, state);
> >> +			dac->flags |= XFS_DAC_DEFER_FINISH;
> >> +			dac->dela_state = XFS_DAS_RM_SHRINK;
> >> +			return -EAGAIN;
> >> +		}
> >> +
> >> +		/* fallthrough */
> >> +	case XFS_DAS_RM_SHRINK:
> >> +		/*
> >> +		 * If the result is small enough, push it all into the inode.
> >> +		 */
> >> +		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> >> +			error = xfs_attr_node_shrink(args, state);
> >> +
> >> +		break;
> >> +	default:
> >> +		ASSERT(0);
> >> +		error = -EINVAL;
> >> +		goto out;
> >> +	}
> >>   
> >> +	if (error == -EAGAIN)
> >> +		return error;
> >>   out:
> >>   	if (state)
> >>   		xfs_da_state_free(state);
> >> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> >> index 3e97a93..3154ef4 100644
> >> --- a/fs/xfs/libxfs/xfs_attr.h
> >> +++ b/fs/xfs/libxfs/xfs_attr.h
> >> @@ -74,6 +74,102 @@ struct xfs_attr_list_context {
> >>   };
> >>   
> >>   
> >> +/*
> >> + * ========================================================================
> >> + * Structure used to pass context around among the delayed routines.
> >> + * ========================================================================
> >> + */
> >> +
> >> +/*
> >> + * Below is a state machine diagram for attr remove operations. The  XFS_DAS_*
> >> + * states indicate places where the function would return -EAGAIN, and then
> >> + * immediately resume from after being recalled by the calling function. States
> >> + * marked as a "subroutine state" indicate that they belong to a subroutine, and
> >> + * so the calling function needs to pass them back to that subroutine to allow
> >> + * it to finish where it left off. But they otherwise do not have a role in the
> >> + * calling function other than just passing through.
> >> + *
> >> + * xfs_attr_remove_iter()
> >> + *              │
> >> + *              v
> >> + *        found attr blks? ───n──┐
> >> + *              │                v
> >> + *              │         find and invalidate
> >> + *              y         the blocks. mark
> >> + *              │         attr incomplete
> >> + *              ├────────────────┘
> >> + *              │
> >> + *              v
> >> + *      remove a block with
> >> + *    xfs_attr_node_remove_step <────┐
> >> + *              │                    │
> >> + *              v                    │
> >> + *      still have blks ──y──> return -EAGAIN.
> >> + *        to remove?          re-enter with one
> >> + *              │            less blk to remove
> >> + *              n
> >> + *              │
> >> + *              v
> >> + *       remove leaf and
> >> + *       update hash with
> >> + *   xfs_attr_node_remove_cleanup
> >> + *              │
> >> + *              v
> >> + *           need to
> >> + *        shrink tree? ─n─┐
> >> + *              │         │
> >> + *              y         │
> >> + *              │         │
> >> + *              v         │
> >> + *          join leaf     │
> >> + *              │         │
> >> + *              v         │
> >> + *      XFS_DAS_RM_SHRINK │
> >> + *              │         │
> >> + *              v         │
> >> + *       do the shrink    │
> >> + *              │         │
> >> + *              v         │
> >> + *          free state <──┘
> >> + *              │
> >> + *              v
> >> + *            done
> >> + *
> >> + */
> >> +
> >> +/*
> >> + * Enum values for xfs_delattr_context.da_state
> >> + *
> >> + * These values are used by delayed attribute operations to keep track  of where
> >> + * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
> >> + * calling function to roll the transaction, and then recall the subroutine to
> >> + * finish the operation.  The enum is then used by the subroutine to jump back
> >> + * to where it was and resume executing where it left off.
> >> + */
> >> +enum xfs_delattr_state {
> >> +	XFS_DAS_UNINIT		= 0,  /* No state has been set yet */
> >> +	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
> >> +};
> >> +
> >> +/*
> >> + * Defines for xfs_delattr_context.flags
> >> + */
> >> +#define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
> >> +
> >> +/*
> >> + * Context used for keeping track of delayed attribute operations
> >> + */
> >> +struct xfs_delattr_context {
> >> +	struct xfs_da_args      *da_args;
> >> +
> >> +	/* Used in xfs_attr_node_removename to roll through removing blocks */
> >> +	struct xfs_da_state     *da_state;
> >> +
> >> +	/* Used to keep track of current state of delayed operation */
> >> +	unsigned int            flags;
> >> +	enum xfs_delattr_state  dela_state;
> >> +};
> >> +
> >>   /*========================================================================
> >>    * Function prototypes for the kernel.
> >>    *========================================================================*/
> >> @@ -91,6 +187,10 @@ int xfs_attr_set(struct xfs_da_args *args);
> >>   int xfs_attr_set_args(struct xfs_da_args *args);
> >>   int xfs_has_attr(struct xfs_da_args *args);
> >>   int xfs_attr_remove_args(struct xfs_da_args *args);
> >> +int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
> >> +int xfs_attr_trans_roll(struct xfs_delattr_context *dac);
> >>   bool xfs_attr_namecheck(const void *name, size_t length);
> >> +void xfs_delattr_context_init(struct xfs_delattr_context *dac,
> >> +			      struct xfs_da_args *args);
> >>   
> >>   #endif	/* __XFS_ATTR_H__ */
> >> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> >> index d6ef69a..3780141 100644
> >> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> >> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> >> @@ -19,8 +19,8 @@
> >>   #include "xfs_bmap_btree.h"
> >>   #include "xfs_bmap.h"
> >>   #include "xfs_attr_sf.h"
> >> -#include "xfs_attr_remote.h"
> >>   #include "xfs_attr.h"
> >> +#include "xfs_attr_remote.h"
> >>   #include "xfs_attr_leaf.h"
> >>   #include "xfs_error.h"
> >>   #include "xfs_trace.h"
> >> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> >> index 48d8e9c..f09820c 100644
> >> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> >> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> >> @@ -674,10 +674,12 @@ xfs_attr_rmtval_invalidate(
> >>    */
> >>   int
> >>   xfs_attr_rmtval_remove(
> >> -	struct xfs_da_args      *args)
> >> +	struct xfs_da_args		*args)
> >>   {
> >> -	int			error;
> >> -	int			retval;
> >> +	int				error;
> >> +	struct xfs_delattr_context	dac  = {
> >> +		.da_args	= args,
> >> +	};
> >>   
> >>   	trace_xfs_attr_rmtval_remove(args);
> >>   
> >> @@ -685,31 +687,29 @@ xfs_attr_rmtval_remove(
> >>   	 * Keep de-allocating extents until the remote-value region is gone.
> >>   	 */
> >>   	do {
> >> -		retval = __xfs_attr_rmtval_remove(args);
> >> -		if (retval && retval != -EAGAIN)
> >> -			return retval;
> >> +		error = __xfs_attr_rmtval_remove(&dac);
> >> +		if (error != -EAGAIN)
> >> +			break;
> >>   
> >> -		/*
> >> -		 * Close out trans and start the next one in the chain.
> >> -		 */
> >> -		error = xfs_trans_roll_inode(&args->trans, args->dp);
> >> +		error = xfs_attr_trans_roll(&dac);
> >>   		if (error)
> >>   			return error;
> >> -	} while (retval == -EAGAIN);
> >> +	} while (true);
> >>   
> >> -	return 0;
> >> +	return error;
> >>   }
> >>   
> >>   /*
> >>    * Remove the value associated with an attribute by deleting the out-of-line
> >> - * buffer that it is stored on. Returns EAGAIN for the caller to refresh the
> >> + * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the
> >>    * transaction and re-call the function
> >>    */
> >>   int
> >>   __xfs_attr_rmtval_remove(
> >> -	struct xfs_da_args	*args)
> >> +	struct xfs_delattr_context	*dac)
> >>   {
> >> -	int			error, done;
> >> +	struct xfs_da_args		*args = dac->da_args;
> >> +	int				error, done;
> >>   
> >>   	/*
> >>   	 * Unmap value blocks for this attr.
> >> @@ -719,12 +719,20 @@ __xfs_attr_rmtval_remove(
> >>   	if (error)
> >>   		return error;
> >>   
> >> -	error = xfs_defer_finish(&args->trans);
> >> -	if (error)
> >> -		return error;
> >> -
> >> -	if (!done)
> >> +	/*
> >> +	 * We dont need an explicit state here to pick up where we left off.  We
> >> +	 * can figure it out using the !done return code.  Calling function only
> >> +	 * needs to keep recalling this routine until we indicate to stop by
> >> +	 * returning anything other than -EAGAIN. The actual value of
> >> +	 * attr->xattri_dela_state may be some value reminicent of the calling
> >> +	 * function, but it's value is irrelevant with in the context of this
> >> +	 * function.  Once we are done here, the next state is set as needed
> >> +	 * by the parent
> >> +	 */
> >> +	if (!done) {
> >> +		dac->flags |= XFS_DAC_DEFER_FINISH;
> >>   		return -EAGAIN;
> >> +	}
> >>   
> >>   	return error;
> >>   }
> >> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
> >> index 9eee615..002fd30 100644
> >> --- a/fs/xfs/libxfs/xfs_attr_remote.h
> >> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
> >> @@ -14,5 +14,5 @@ int xfs_attr_rmtval_remove(struct xfs_da_args *args);
> >>   int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
> >>   		xfs_buf_flags_t incore_flags);
> >>   int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
> >> -int __xfs_attr_rmtval_remove(struct xfs_da_args *args);
> >> +int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
> >>   #endif /* __XFS_ATTR_REMOTE_H__ */
> >> diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> >> index bfad669..aaa7e66 100644
> >> --- a/fs/xfs/xfs_attr_inactive.c
> >> +++ b/fs/xfs/xfs_attr_inactive.c
> >> @@ -15,10 +15,10 @@
> >>   #include "xfs_da_format.h"
> >>   #include "xfs_da_btree.h"
> >>   #include "xfs_inode.h"
> >> +#include "xfs_attr.h"
> >>   #include "xfs_attr_remote.h"
> >>   #include "xfs_trans.h"
> >>   #include "xfs_bmap.h"
> >> -#include "xfs_attr.h"
> >>   #include "xfs_attr_leaf.h"
> >>   #include "xfs_quota.h"
> >>   #include "xfs_dir2.h"
> >>
> > 
> > 
> 


-- 
chandan





* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-22 18:44       ` Brian Foster
@ 2020-12-23  5:20         ` Allison Henderson
  2020-12-23 14:16           ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-23  5:20 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 12/22/20 11:44 AM, Brian Foster wrote:
> On Tue, Dec 22, 2020 at 12:20:20PM -0500, Brian Foster wrote:
>> On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
>>> On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
>>>> This patch modifies the attr remove routines to be delay ready. This
>>>> means they no longer roll or commit transactions, but instead return
>>>> -EAGAIN to have the calling routine roll and refresh the transaction. In
>>>> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
>>>> uses a sort of state machine like switch to keep track of where it was
>>>> when EAGAIN was returned. xfs_attr_node_removename has also been
>>>> modified to use the switch, and a new version of xfs_attr_remove_args
>>>> consists of a simple loop to refresh the transaction until the operation
>>>> is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
>>>> transaction where ever the existing code used to.
>>>>
>>>> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
>>>> version __xfs_attr_rmtval_remove. We will rename
>>>> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
>>>> done.
>>>>
>>>> xfs_attr_rmtval_remove itself is still in use by the set routines (used
>>>> during a rename).  For reasons of preserving existing function, we
>>>> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
>>>> set.  Similar to how xfs_attr_remove_args does here.  Once we transition
>>>> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
>>>> used and will be removed.
>>>>
>>>> This patch also adds a new struct xfs_delattr_context, which we will use
>>>> to keep track of the current state of an attribute operation. The new
>>>> xfs_delattr_state enum is used to track various operations that are in
>>>> progress so that we know not to repeat them, and resume where we left
>>>> off before EAGAIN was returned to cycle out the transaction. Other
>>>> members take the place of local variables that need to retain their
>>>> values across multiple function recalls.  See xfs_attr.h for a more
>>>> detailed diagram of the states.
>>>>
>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>> ---
>>>
>>> I started with a couple small comments on this patch but inevitably
>>> started thinking more about the factoring again and ended up with a
>>> couple patches on top. The first is more of some small tweaks and
>>> open-coding that IMO makes this patch a bit easier to follow. The
>>> second is more of an RFC so I'll follow up with that in a second email.
>>> I'm curious what folks' thoughts might be on either. Also note that I'm
>>> primarily focusing on code structure and whatnot here, so these are fast
>>> and loose, compile tested only and likely to be broken.
>>>
>>
>> ... and here's the second diff (applies on top of the first).
>>
>> This one popped up after staring at the previous changes for a bit and
>> wondering whether using "done flags" might make the whole thing easier
>> to follow than incremental state transitions. I think the attr remove
>> path is easy enough to follow with either method, but the attr set path
>> is a beast and so this is more with that in mind. Initial thoughts?
>>
> 
> Eh, the more I stare at the attr set code I'm not sure this by itself is
> much of an improvement. It helps in some areas, but there are so many
> transaction rolls embedded throughout at different levels that a larger
> rework of the code is probably still necessary. Anyways, this was just a
> random thought for now..
> 
> Brian

No worries, I know the feeling :-)  The set works and all, but I do 
think there is a struggle around trying to find a particularly pleasant 
looking presentation of it.  Especially when we get into the set path, 
it's a bit more complex.  I may pick through the patches you have here 
and pick up the whitespace cleanups and other style adjustments if 
people prefer it that way.  The good news is, a lot of the *_args 
routines are supposed to disappear at the end of the set, so there's not 
really a need to invest too much in them I suppose. It may help to jump 
to the "Set up infastructure" patch too.  I've expanded the diagram to 
try and illustrate the code flow a bit, so that may help with 
following it.
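
As a rough illustration of the switch-based pattern under discussion, here 
is a minimal userspace sketch.  It is not the kernel code: every demo_* 
name is made up for the example, and "rolling the transaction" is reduced 
to bumping a counter.  The iter routine records where to resume, returns 
-EAGAIN whenever the caller has to roll, and the wrapper just loops until 
it sees anything other than -EAGAIN.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the delayed attr state enum and context. */
enum demo_state { DEMO_UNINIT, DEMO_SHRINK };

struct demo_ctx {
	enum demo_state	state;		/* where to resume on re-entry */
	int		rmt_blocks;	/* remote blocks still to free */
	int		rolls;		/* how often the caller "rolled" */
};

/* One unit of work per call; -EAGAIN means "roll and call me again". */
static int demo_remove_iter(struct demo_ctx *ctx)
{
	switch (ctx->state) {
	case DEMO_UNINIT:
		if (ctx->rmt_blocks > 0) {
			/* Free one remote block per pass. */
			ctx->rmt_blocks--;
			return -EAGAIN;
		}
		/* Remote blocks are gone; resume at the shrink step. */
		ctx->state = DEMO_SHRINK;
		return -EAGAIN;
	case DEMO_SHRINK:
		/* Final step, nothing left that needs a roll. */
		return 0;
	default:
		assert(0);
		return -EINVAL;
	}
}

/* Driver loop: the only place that handles the "transaction". */
static int demo_remove(struct demo_ctx *ctx)
{
	int error;

	do {
		error = demo_remove_iter(ctx);
		if (error != -EAGAIN)
			break;
		ctx->rolls++;	/* stands in for rolling the transaction */
	} while (1);

	return error;
}
```

Calling demo_remove() on a context with two remote blocks walks through 
both states and leaves all of the rolling to the loop, which is the 
property the real routines are after.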

Allison

> 
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 2e466c4ac283..106e3c070131 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -1271,14 +1271,12 @@ int xfs_attr_node_removename_setup(
>>    * successful error code is returned.
>>    */
>>   STATIC int
>> -xfs_attr_node_remove_step(
>> -	struct xfs_delattr_context	*dac,
>> -	bool				*joined)
>> +xfs_attr_node_remove_rmt_step(
>> +	struct xfs_delattr_context	*dac)
>>   {
>>   	struct xfs_da_args		*args = dac->da_args;
>>   	struct xfs_da_state		*state = dac->da_state;
>> -	struct xfs_da_state_blk		*blk;
>> -	int				error = 0, retval, done;
>> +	int				error, done;
>>   
>>   	/*
>>   	 * If there is an out-of-line value, de-allocate the blocks.  This is
>> @@ -1300,6 +1298,19 @@ xfs_attr_node_remove_step(
>>   			return error;
>>   	}
>>   
>> +	dac->dela_state |= XFS_DAS_RMT_DONE;
>> +	return error;
>> +}
>> +
>> +STATIC int
>> +xfs_attr_node_remove_join_step(
>> +	struct xfs_delattr_context	*dac)
>> +{
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_state		*state = dac->da_state;
>> +	struct xfs_da_state_blk		*blk;
>> +	int				error, retval;
>> +
>>   	/*
>>   	 * Remove the name and update the hashvals in the tree.
>>   	 */
>> @@ -1317,9 +1328,12 @@ xfs_attr_node_remove_step(
>>   		error = xfs_da3_join(state);
>>   		if (error)
>>   			return error;
>> -		*joined = true;
>> +
>> +		error = -EAGAIN;
>> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>>   	}
>>   
>> +	dac->dela_state |= XFS_DAS_JOIN_DONE;
>>   	return error;
>>   }
>>   
>> @@ -1342,36 +1356,23 @@ xfs_attr_node_removename_iter(
>>   	struct xfs_da_state		*state = dac->da_state;
>>   	int				error;
>>   	struct xfs_inode		*dp = args->dp;
>> -	bool				joined = false;
>>   
>> -	switch (dac->dela_state) {
>> -	case XFS_DAS_UNINIT:
>> -		/*
>> -		 * repeatedly remove remote blocks, remove the entry and join.
>> -		 * returns -EAGAIN or 0 for completion of the step.
>> -		 */
>> -		error = xfs_attr_node_remove_step(dac, &joined);
>> +	if (!(dac->dela_state & XFS_DAS_RMT_DONE)) {
>> +		error = xfs_attr_node_remove_rmt_step(dac);
>>   		if (error)
>>   			goto out;
>> -		if (joined) {
>> -			dac->flags |= XFS_DAC_DEFER_FINISH;
>> -			dac->dela_state = XFS_DAS_RM_SHRINK;
>> -			return -EAGAIN;
>> -		}
>> -		/* fallthrough */
>> -	case XFS_DAS_RM_SHRINK:
>> -		/*
>> -		 * If the result is small enough, push it all into the inode.
>> -		 */
>> -		if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> -			error = xfs_attr_node_shrink(args, state);
>> -		break;
>> -	default:
>> -		ASSERT(0);
>> -		error = -EINVAL;
>> -		goto out;
>>   	}
>>   
>> +	if (!(dac->dela_state & XFS_DAS_JOIN_DONE)) {
>> +		error = xfs_attr_node_remove_join_step(dac);
>> +		if (error)
>> +			goto out;
>> +	}
>> +
>> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> +		error = xfs_attr_node_shrink(args, state);
>> +	ASSERT(error != -EAGAIN);
>> +
>>   out:
>>   	if (state && error != -EAGAIN)
>>   		xfs_da_state_free(state);
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 3154ef4b7833..67e730cd3267 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -151,6 +151,9 @@ enum xfs_delattr_state {
>>   	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
>>   };
>>   
>> +#define XFS_DAS_RMT_DONE	0x1
>> +#define XFS_DAS_JOIN_DONE	0x2
>> +
>>   /*
>>    * Defines for xfs_delattr_context.flags
>>    */
>>
> 

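For comparison, the "done flags" idea in the diff above reduces to roughly 
the following userspace sketch.  Again, this is only an illustration under 
assumptions: the demo_* names and the need_join field are invented for the 
example and do not exist in the kernel.  Each step sets its own bit once it 
has finished, and the iter routine skips any step whose bit is already set.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits, one per completed step. */
#define DEMO_RMT_DONE	0x1
#define DEMO_JOIN_DONE	0x2

struct demo_flag_ctx {
	unsigned int	done;		/* bitmask of finished steps */
	int		rmt_blocks;	/* remote blocks still to free */
	int		need_join;	/* nonzero if the tree must be joined */
};

/* Free one remote block per call; mark the step done when none remain. */
static int demo_rmt_step(struct demo_flag_ctx *ctx)
{
	if (ctx->rmt_blocks > 0) {
		ctx->rmt_blocks--;
		return -EAGAIN;
	}
	ctx->done |= DEMO_RMT_DONE;
	return 0;
}

/* The join finishes in one call but may still require a roll afterwards. */
static int demo_join_step(struct demo_flag_ctx *ctx)
{
	int error = 0;

	if (ctx->need_join)
		error = -EAGAIN;
	ctx->done |= DEMO_JOIN_DONE;
	return error;
}

/* Skip steps that already ran; -EAGAIN propagates straight to the caller. */
static int demo_flag_iter(struct demo_flag_ctx *ctx)
{
	int error;

	if (!(ctx->done & DEMO_RMT_DONE)) {
		error = demo_rmt_step(ctx);
		if (error)
			return error;
	}

	if (!(ctx->done & DEMO_JOIN_DONE)) {
		error = demo_join_step(ctx);
		if (error)
			return error;
	}

	/* The final shrink step would go here. */
	return 0;
}
```

The trade-off is that the flags only say which steps are finished, not 
where to resume inside a step, so a step that needs several passes has to 
keep its own progress (rmt_blocks in this sketch).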

* Re: [PATCH v14 05/15] xfs: Add delay ready attr set routines
  2020-12-18  7:29 ` [PATCH v14 05/15] xfs: Add delay ready attr set routines Allison Henderson
@ 2020-12-23  8:00   ` Chandan Babu R
  2020-12-23 16:31     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Chandan Babu R @ 2020-12-23  8:00 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, 18 Dec 2020 00:29:07 -0700, Allison Henderson wrote:
> This patch modifies the attr set routines to be delay ready. This means
> they no longer roll or commit transactions, but instead return -EAGAIN
> to have the calling routine roll and refresh the transaction.  In this
> series, xfs_attr_set_args has become xfs_attr_set_iter, which uses a
> state machine like switch to keep track of where it was when EAGAIN was
> returned. See xfs_attr.h for a more detailed diagram of the states.
> 
> Two new helper functions have been added: xfs_attr_rmtval_set_init and

I don't see xfs_attr_rmtval_set_init() being added in this patch. Maybe it
needs to be removed from the description.

> xfs_attr_rmtval_set_blk.  They provide a subset of logic similar to
> xfs_attr_rmtval_set, but they store the current block in the delay attr
> context to allow the caller to roll the transaction between allocations.
> This helps to simplify and consolidate code used by
> xfs_attr_leaf_addname and xfs_attr_node_addname. xfs_attr_set_args has
> now become a simple loop to refresh the transaction until the operation
> is completed.  Lastly, xfs_attr_rmtval_remove is no longer used, and is
> removed.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c        | 357 ++++++++++++++++++++++++++--------------
>  fs/xfs/libxfs/xfs_attr.h        | 235 +++++++++++++++++++++++++-
>  fs/xfs/libxfs/xfs_attr_remote.c |  98 +++++++----
>  fs/xfs/libxfs/xfs_attr_remote.h |   5 +-
>  fs/xfs/xfs_trace.h              |   1 -
>  5 files changed, 541 insertions(+), 155 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index b6330f9..cd72512 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -44,7 +44,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
>   * Internal routines when attribute list is one block.
>   */
>  STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
> -STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args);
> +STATIC int xfs_attr_leaf_addname(struct xfs_delattr_context *dac);
>  STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
>  STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>  
> @@ -52,12 +52,15 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>   * Internal routines when attribute list is more than one block.
>   */
>  STATIC int xfs_attr_node_get(xfs_da_args_t *args);
> -STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
> +STATIC int xfs_attr_node_addname(struct xfs_delattr_context *dac);
>  STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
>  STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>  				 struct xfs_da_state **state);
>  STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>  STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
> +STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *args, struct xfs_buf *bp);
> +STATIC int xfs_attr_set_iter(struct xfs_delattr_context *dac,
> +			     struct xfs_buf **leaf_bp);
>  
>  int
>  xfs_inode_hasattr(
> @@ -218,8 +221,11 @@ xfs_attr_is_shortform(
>  
>  /*
>   * Attempts to set an attr in shortform, or converts short form to leaf form if
> - * there is not enough room.  If the attr is set, the transaction is committed
> - * and set to NULL.
> + * there is not enough room.  This function is meant to operate as a helper
> + * routine to the delayed attribute functions.  It returns -EAGAIN to indicate
> + * that the calling function should roll the transaction, and then proceed to
> + * add the attr in leaf form.  This subroutine does not expect to be recalled
> + * again like the other delayed attr routines do.
>   */
>  STATIC int
>  xfs_attr_set_shortform(
> @@ -227,16 +233,16 @@ xfs_attr_set_shortform(
>  	struct xfs_buf		**leaf_bp)
>  {
>  	struct xfs_inode	*dp = args->dp;
> -	int			error, error2 = 0;
> +	int			error = 0;
>  
>  	/*
>  	 * Try to add the attr to the attribute list in the inode.
>  	 */
>  	error = xfs_attr_try_sf_addname(dp, args);
> +
> +	/* Should only be 0, -EEXIST or -ENOSPC */
>  	if (error != -ENOSPC) {
> -		error2 = xfs_trans_commit(args->trans);
> -		args->trans = NULL;
> -		return error ? error : error2;
> +		return error;
>  	}
>  	/*
>  	 * It won't fit in the shortform, transform to a leaf block.  GROT:
> @@ -249,18 +255,15 @@ xfs_attr_set_shortform(
>  	/*
>  	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
>  	 * push cannot grab the half-baked leaf buffer and run into problems
> -	 * with the write verifier. Once we're done rolling the transaction we
> -	 * can release the hold and add the attr to the leaf.
> +	 * with the write verifier.
>  	 */
>  	xfs_trans_bhold(args->trans, *leaf_bp);
> -	error = xfs_defer_finish(&args->trans);
> -	xfs_trans_bhold_release(args->trans, *leaf_bp);
> -	if (error) {
> -		xfs_trans_brelse(args->trans, *leaf_bp);
> -		return error;
> -	}
>  
> -	return 0;
> +	/*
> +	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
> +	 * fork to leaf format and will restart with the leaf add.
> +	 */
> +	return -EAGAIN;
>  }
>  
>  /*
> @@ -268,7 +271,7 @@ xfs_attr_set_shortform(
>   * also checks for a defer finish.  Transaction is finished and rolled as
>   * needed, and returns true of false if the delayed operation should continue.
>   */
> -int
> +STATIC int
>  xfs_attr_trans_roll(
>  	struct xfs_delattr_context	*dac)
>  {
> @@ -298,34 +301,95 @@ int
>  xfs_attr_set_args(
>  	struct xfs_da_args	*args)
>  {
> -	struct xfs_inode	*dp = args->dp;
> -	struct xfs_buf          *leaf_bp = NULL;
> -	int			error = 0;
> +	struct xfs_buf			*leaf_bp = NULL;
> +	int				error = 0;
> +	struct xfs_delattr_context	dac = {
> +		.da_args	= args,
> +	};
> +
> +	do {
> +		error = xfs_attr_set_iter(&dac, &leaf_bp);
> +		if (error != -EAGAIN)
> +			break;
> +
> +		error = xfs_attr_trans_roll(&dac);
> +		if (error)
> +			return error;
> +	} while (true);
> +
> +	return error;
> +}
> +
> +/*
> + * Set the attribute specified in @args.
> + * This routine is meant to function as a delayed operation, and may return
> + * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
> + * to handle this, and recall the function until a successful error code is
> + * returned.
> + */
> +STATIC int
> +xfs_attr_set_iter(
> +	struct xfs_delattr_context	*dac,
> +	struct xfs_buf			**leaf_bp)
> +{
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_inode		*dp = args->dp;
> +	int				error = 0;
> +
> +	/* State machine switch */
> +	switch (dac->dela_state) {
> +	case XFS_DAS_FLIP_LFLAG:
> +	case XFS_DAS_FOUND_LBLK:
> +	case XFS_DAS_RM_LBLK:
> +		return xfs_attr_leaf_addname(dac);
> +	case XFS_DAS_FOUND_NBLK:
> +	case XFS_DAS_FLIP_NFLAG:
> +	case XFS_DAS_ALLOC_NODE:
> +		return xfs_attr_node_addname(dac);
> +	case XFS_DAS_UNINIT:
> +		break;
> +	default:
> +		ASSERT(dac->dela_state != XFS_DAS_RM_SHRINK);
> +		break;
> +	}
>  
>  	/*
>  	 * If the attribute list is already in leaf format, jump straight to
>  	 * leaf handling.  Otherwise, try to add the attribute to the shortform
>  	 * list; if there's no room then convert the list to leaf format and try
> -	 * again.
> +	 * again. No need to set state as we will be in leaf form when we come
> +	 * back
>  	 */
>  	if (xfs_attr_is_shortform(dp)) {
>  
>  		/*
> -		 * If the attr was successfully set in shortform, the
> -		 * transaction is committed and set to NULL.  Otherwise, is it
> -		 * converted from shortform to leaf, and the transaction is
> -		 * retained.
> +		 * If the attr was successfully set in shortform, no need to
> +		 * continue.  Otherwise, is it converted from shortform to leaf
> +		 * and -EAGAIN is returned.
>  		 */
> -		error = xfs_attr_set_shortform(args, &leaf_bp);
> -		if (error || !args->trans)
> -			return error;
> +		error = xfs_attr_set_shortform(args, leaf_bp);
> +		if (error == -EAGAIN)
> +			dac->flags |= XFS_DAC_DEFER_FINISH;
> +
> +		return error;
>  	}
>  
> -	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> -		error = xfs_attr_leaf_addname(args);
> -		if (error != -ENOSPC)
> -			return error;
> +	/*
> +	 * After a shortform to leaf conversion, we need to hold the leaf and
> +	 * cycle out the transaction.  When we get back, we need to release
> +	 * the leaf to release the hold on the leaf buffer.
> +	 */
> +	if (*leaf_bp != NULL) {
> +		xfs_trans_bhold_release(args->trans, *leaf_bp);
> +		*leaf_bp = NULL;
> +	}
> +
> +	if (!xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +		return xfs_attr_node_addname(dac);
>  
> +	error = xfs_attr_leaf_try_add(args, *leaf_bp);
> +	switch (error) {
> +	case -ENOSPC:
>  		/*
>  		 * Promote the attribute list to the Btree format.
>  		 */
> @@ -334,25 +398,22 @@ xfs_attr_set_args(
>  			return error;
>  
>  		/*
> -		 * Finish any deferred work items and roll the transaction once
> -		 * more.  The goal here is to call node_addname with the inode
> -		 * and transaction in the same state (inode locked and joined,
> -		 * transaction clean) no matter how we got to this step.
> -		 */
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			return error;
> -
> -		/*
> -		 * Commit the current trans (including the inode) and
> -		 * start a new one.
> +		 * Finish any deferred work items and roll the
> +		 * transaction once more.  The goal here is to call
> +		 * node_addname with the inode and transaction in the
> +		 * same state (inode locked and joined, transaction
> +		 * clean) no matter how we got to this step.
> +		 *
> +		 * At this point, we are still in XFS_DAS_UNINIT, but
> +		 * when we come back, we'll be a node, so we'll fall
> +		 * down into the node handling code below

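Shouldn't this read "node handling code above"? In xfs_attr_set_iter(), the
call to xfs_attr_node_addname() now comes before this point.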

Apart from the above nits I don't see any issues w.r.t the logical correctness
of the code. Hence,

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

>  		 */
> -		error = xfs_trans_roll_inode(&args->trans, dp);
> -		if (error)
> -			return error;
> +		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		return -EAGAIN;
> +	case 0:
> +		dac->dela_state = XFS_DAS_FOUND_LBLK;
> +		return -EAGAIN;
>  	}
> -
> -	error = xfs_attr_node_addname(args);
>  	return error;
>  }
>  
> @@ -728,28 +789,30 @@ xfs_attr_leaf_try_add(
>   *
>   * This leaf block cannot have a "remote" value, we only call this routine
>   * if bmap_one_block() says there is only one block (ie: no remote blks).
> + *
> + * This routine is meant to function as a delayed operation, and may return
> + * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
> + * to handle this, and recall the function until a successful error code is
> + * returned.
>   */
>  STATIC int
>  xfs_attr_leaf_addname(
> -	struct xfs_da_args	*args)
> +	struct xfs_delattr_context	*dac)
>  {
> -	int			error, forkoff;
> -	struct xfs_buf		*bp = NULL;
> -	struct xfs_inode	*dp = args->dp;
> -
> -	trace_xfs_attr_leaf_addname(args);
> -
> -	error = xfs_attr_leaf_try_add(args, bp);
> -	if (error)
> -		return error;
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_buf			*bp = NULL;
> +	int				error, forkoff;
> +	struct xfs_inode		*dp = args->dp;
>  
> -	/*
> -	 * Commit the transaction that added the attr name so that
> -	 * later routines can manage their own transactions.
> -	 */
> -	error = xfs_trans_roll_inode(&args->trans, dp);
> -	if (error)
> -		return error;
> +	/* State machine switch */
> +	switch (dac->dela_state) {
> +	case XFS_DAS_FLIP_LFLAG:
> +		goto das_flip_flag;
> +	case XFS_DAS_RM_LBLK:
> +		goto das_rm_lblk;
> +	default:
> +		break;
> +	}
>  
>  	/*
>  	 * If there was an out-of-line value, allocate the blocks we
> @@ -757,12 +820,34 @@ xfs_attr_leaf_addname(
>  	 * after we create the attribute so that we don't overflow the
>  	 * maximum size of a transaction and/or hit a deadlock.
>  	 */
> -	if (args->rmtblkno > 0) {
> -		error = xfs_attr_rmtval_set(args);
> +
> +	/* Open coded xfs_attr_rmtval_set without trans handling */
> +	if ((dac->flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
> +		dac->flags |= XFS_DAC_LEAF_ADDNAME_INIT;
> +		if (args->rmtblkno > 0) {
> +			error = xfs_attr_rmtval_find_space(dac);
> +			if (error)
> +				return error;
> +		}
> +	}
> +
> +	/*
> +	 * Roll through the "value", allocating blocks on disk as
> +	 * required.
> +	 */
> +	if (dac->blkcnt > 0) {
> +		error = xfs_attr_rmtval_set_blk(dac);
>  		if (error)
>  			return error;
> +
> +		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		return -EAGAIN;
>  	}
>  
> +	error = xfs_attr_rmtval_set_value(args);
> +	if (error)
> +		return error;
> +
>  	if (!(args->op_flags & XFS_DA_OP_RENAME)) {
>  		/*
>  		 * Added a "remote" value, just clear the incomplete flag.
> @@ -782,29 +867,30 @@ xfs_attr_leaf_addname(
>  	 * In a separate transaction, set the incomplete flag on the "old" attr
>  	 * and clear the incomplete flag on the "new" attr.
>  	 */
> -
>  	error = xfs_attr3_leaf_flipflags(args);
>  	if (error)
>  		return error;
>  	/*
>  	 * Commit the flag value change and start the next trans in series.
>  	 */
> -	error = xfs_trans_roll_inode(&args->trans, args->dp);
> -	if (error)
> -		return error;
> -
> +	dac->dela_state = XFS_DAS_FLIP_LFLAG;
> +	return -EAGAIN;
> +das_flip_flag:
>  	/*
>  	 * Dismantle the "old" attribute/value pair by removing a "remote" value
>  	 * (if it exists).
>  	 */
>  	xfs_attr_restore_rmt_blk(args);
>  
> -	if (args->rmtblkno) {
> -		error = xfs_attr_rmtval_invalidate(args);
> -		if (error)
> -			return error;
> +	error = xfs_attr_rmtval_invalidate(args);
> +	if (error)
> +		return error;
>  
> -		error = xfs_attr_rmtval_remove(args);
> +	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
> +	dac->dela_state = XFS_DAS_RM_LBLK;
> +das_rm_lblk:
> +	if (args->rmtblkno) {
> +		error = __xfs_attr_rmtval_remove(dac);
>  		if (error)
>  			return error;
>  	}
> @@ -970,23 +1056,38 @@ xfs_attr_node_hasname(
>   *
>   * "Remote" attribute values confuse the issue and atomic rename operations
>   * add a whole extra layer of confusion on top of that.
> + *
> + * This routine is meant to function as a delayed operation, and may return
> + * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
> + * to handle this, and recall the function until a successful error code is
> + * returned.
>   */
>  STATIC int
>  xfs_attr_node_addname(
> -	struct xfs_da_args	*args)
> +	struct xfs_delattr_context	*dac)
>  {
> -	struct xfs_da_state	*state;
> -	struct xfs_da_state_blk	*blk;
> -	struct xfs_inode	*dp;
> -	int			retval, error;
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_state		*state = NULL;
> +	struct xfs_da_state_blk		*blk;
> +	int				retval = 0;
> +	int				error = 0;
>  
>  	trace_xfs_attr_node_addname(args);
>  
> -	/*
> -	 * Fill in bucket of arguments/results/context to carry around.
> -	 */
> -	dp = args->dp;
> -restart:
> +	/* State machine switch */
> +	switch (dac->dela_state) {
> +	case XFS_DAS_FLIP_NFLAG:
> +		goto das_flip_flag;
> +	case XFS_DAS_FOUND_NBLK:
> +		goto das_found_nblk;
> +	case XFS_DAS_ALLOC_NODE:
> +		goto das_alloc_node;
> +	case XFS_DAS_RM_NBLK:
> +		goto das_rm_nblk;
> +	default:
> +		break;
> +	}
> +
>  	/*
>  	 * Search to see if name already exists, and get back a pointer
>  	 * to where it should go.
> @@ -1032,19 +1133,16 @@ xfs_attr_node_addname(
>  			error = xfs_attr3_leaf_to_node(args);
>  			if (error)
>  				goto out;
> -			error = xfs_defer_finish(&args->trans);
> -			if (error)
> -				goto out;
>  
>  			/*
> -			 * Commit the node conversion and start the next
> -			 * trans in the chain.
> +			 * Now that we have converted the leaf to a node, we can
> +			 * roll the transaction, and try xfs_attr3_leaf_add
> +			 * again on re-entry.  No need to set dela_state to do
> +			 * this. dela_state is still unset by this function at
> +			 * this point.
>  			 */
> -			error = xfs_trans_roll_inode(&args->trans, dp);
> -			if (error)
> -				goto out;
> -
> -			goto restart;
> +			dac->flags |= XFS_DAC_DEFER_FINISH;
> +			return -EAGAIN;
>  		}
>  
>  		/*
> @@ -1056,9 +1154,7 @@ xfs_attr_node_addname(
>  		error = xfs_da3_split(state);
>  		if (error)
>  			goto out;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			goto out;
> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>  	} else {
>  		/*
>  		 * Addition succeeded, update Btree hashvals.
> @@ -1066,6 +1162,11 @@ xfs_attr_node_addname(
>  		xfs_da3_fixhashpath(state, &state->path);
>  	}
>  
> +	if (!args->rmtblkno && !(args->op_flags & XFS_DA_OP_RENAME)) {
> +		retval = error;
> +		goto out;
> +	}
> +
>  	/*
>  	 * Kill the state structure, we're done with it and need to
>  	 * allow the buffers to come back later.
> @@ -1073,13 +1174,9 @@ xfs_attr_node_addname(
>  	xfs_da_state_free(state);
>  	state = NULL;
>  
> -	/*
> -	 * Commit the leaf addition or btree split and start the next
> -	 * trans in the chain.
> -	 */
> -	error = xfs_trans_roll_inode(&args->trans, dp);
> -	if (error)
> -		goto out;
> +	dac->dela_state = XFS_DAS_FOUND_NBLK;
> +	return -EAGAIN;
> +das_found_nblk:
>  
>  	/*
>  	 * If there was an out-of-line value, allocate the blocks we
> @@ -1088,7 +1185,27 @@ xfs_attr_node_addname(
>  	 * maximum size of a transaction and/or hit a deadlock.
>  	 */
>  	if (args->rmtblkno > 0) {
> -		error = xfs_attr_rmtval_set(args);
> +		/* Open coded xfs_attr_rmtval_set without trans handling */
> +		error = xfs_attr_rmtval_find_space(dac);
> +		if (error)
> +			return error;
> +
> +		/*
> +		 * Roll through the "value", allocating blocks on disk as
> +		 * required.  Set the state in case of -EAGAIN return code
> +		 */
> +		dac->dela_state = XFS_DAS_ALLOC_NODE;
> +das_alloc_node:
> +		if (dac->blkcnt > 0) {
> +			error = xfs_attr_rmtval_set_blk(dac);
> +			if (error)
> +				return error;
> +
> +			dac->flags |= XFS_DAC_DEFER_FINISH;
> +			return -EAGAIN;
> +		}
> +
> +		error = xfs_attr_rmtval_set_value(args);
>  		if (error)
>  			return error;
>  	}
> @@ -1118,22 +1235,24 @@ xfs_attr_node_addname(
>  	/*
>  	 * Commit the flag value change and start the next trans in series
>  	 */
> -	error = xfs_trans_roll_inode(&args->trans, args->dp);
> -	if (error)
> -		goto out;
> -
> +	dac->dela_state = XFS_DAS_FLIP_NFLAG;
> +	return -EAGAIN;
> +das_flip_flag:
>  	/*
>  	 * Dismantle the "old" attribute/value pair by removing a "remote" value
>  	 * (if it exists).
>  	 */
>  	xfs_attr_restore_rmt_blk(args);
>  
> -	if (args->rmtblkno) {
> -		error = xfs_attr_rmtval_invalidate(args);
> -		if (error)
> -			return error;
> +	error = xfs_attr_rmtval_invalidate(args);
> +	if (error)
> +		return error;
>  
> -		error = xfs_attr_rmtval_remove(args);
> +	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
> +	dac->dela_state = XFS_DAS_RM_NBLK;
> +das_rm_nblk:
> +	if (args->rmtblkno) {
> +		error = __xfs_attr_rmtval_remove(dac);
>  		if (error)
>  			return error;
>  	}
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 3154ef4..e101238 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -135,6 +135,227 @@ struct xfs_attr_list_context {
>   *              v
>   *            done
>   *
> + *
> + * Below is a state machine diagram for attr set operations.
> + *
> + * It seems the challenge with understanding this system comes from trying to
> + * absorb the state machine all at once, when really one should only be looking
> + * at it within the context of a single function.  Once a state sensitive
> + * function is called, the idea is that it "takes ownership" of the
> + * statemachine. It isn't concerned with the states that may have belonged to
> + * its calling parent.  Only the states relevant to itself or any other
> + * subroutines therein.  Once a calling function hands off the statemachine to
> + * a subroutine, it needs to respect the simple rule that it doesn't "own" the
> + * statemachine anymore, and it's the responsibility of that calling function to
> + * propagate the -EAGAIN back up the call stack.  Upon reentry, it is committed
> + * to re-calling that subroutine until it returns something other than -EAGAIN.
> + * Once that subroutine signals completion (by returning anything other than
> + * -EAGAIN), the calling function can resume using the statemachine.
> + *
> + *  xfs_attr_set_iter()
> + *              │
> + *              v
> + *   ┌─y─ has an attr fork?
> + *   │          |
> + *   │          n
> + *   │          |
> + *   │          V
> + *   │       add a fork
> + *   │          │
> + *   └──────────┤
> + *              │
> + *              V
> + *   ┌─n── is shortform?
> + *   │          |
> + *   │          y
> + *   │          |
> + *   │          V
> + *   │ xfs_attr_set_shortform
> + *   │          |
> + *   │          V
> + *   │      had enough ──y──> done
> + *   │        space?
> + *   │          │
> + *   │          n
> + *   │          │
> + *   │          V
> + *   │     return -EAGAIN
> + *   │   Re-enter in leaf form
> + *   │          │
> + *   └──────────┤
> + *              │
> + *              V
> + *       release leaf buffer
> + *          if needed
> + *              │
> + *              V
> + *   ┌───n── fork has
> + *   │      only 1 blk?
> + *   │          │
> + *   │          y
> + *   │          │
> + *   │          v
> + *   │ xfs_attr_leaf_try_add()
> + *   │                  │
> + *   │                  v
> + *   │              had enough
> + *   │       ┌────n── space?
> + *   │       │          │
> + *   │       v          │
> + *   │ return -EAGAIN   │
> + *   │  re-enter in     y
> + *   │   node form      │
> + *   │       │          │
> + *   ├───────┘          │
> + *   │                  v
> + *   │  XFS_DAS_FOUND_LBLK ──┐
> + *   │                       │
> + *   │  XFS_DAS_FLIP_LFLAG ──┤
> + *   │  (subroutine state)   │
> + *   │                       │
> + *   │                       └─>xfs_attr_leaf_addname()
> + *   │                                │
> + *   │                                v
> + *   │                     ┌──first time through?
> + *   │                     │          │
> + *   │                     │          y
> + *   │                     │          │
> + *   │                     n          v
> + *   │                     │    if we have rmt blks
> + *   │                     │    find space for them
> + *   │                     │          │
> + *   │                     └──────────┤
> + *   │                                │
> + *   │                                v
> + *   │                           still have
> + *   │                     ┌─n─ blks to alloc? <──┐
> + *   │                     │          │           │
> + *   │                     │          y           │
> + *   │                     │          │           │
> + *   │                     │          v           │
> + *   │                     │     alloc one blk    │
> + *   │                     │     return -EAGAIN ──┘
> + *   │                     │    re-enter with one
> + *   │                     │    less blk to alloc
> + *   │                     │
> + *   │                     │
> + *   │                     └───> set the rmt
> + *   │                              value
> + *   │                                │
> + *   │                                v
> + *   │                              was this
> + *   │                             a rename? ──n─┐
> + *   │                                │          │
> + *   │                                y          │
> + *   │                                │          │
> + *   │                                v          │
> + *   │                          flip incomplete  │
> + *   │                              flag         │
> + *   │                                │          │
> + *   │                                v          │
> + *   │                        XFS_DAS_FLIP_LFLAG │
> + *   │                                │          │
> + *   │                                v          │
> + *   │                              remove       │
> + *   │          XFS_DAS_RM_LBLK ─> old name      │
> + *   │                   ^            │          │
> + *   │                   │            v          │
> + *   │                   └──────y── more to      │
> + *   │                              remove       │
> + *   │                                │          │
> + *   │                                n          │
> + *   │                                │          │
> + *   │                                v          │
> + *   │                               done <──────┘
> + *   └──> XFS_DAS_FOUND_NBLK ──┐
> + *        (subroutine state)   │
> + *                             │
> + *        XFS_DAS_ALLOC_NODE ──┤
> + *        (subroutine state)   │
> + *                             │
> + *        XFS_DAS_FLIP_NFLAG ──┤
> + *        (subroutine state)   │
> + *                             │
> + *                             └─>xfs_attr_node_addname()
> + *                                     │
> + *                                     v
> + *                               determine if this
> + *                              is create or rename
> + *                            find space to store attr
> + *                                     │
> + *                                     v
> + *               ┌──────n──── fits in a node leaf?
> + *               │               ^     │
> + *       single leaf node?       │     │
> + *         │            │        │     y
> + *         n            y        │     │
> + *         │            │        │     v
> + *         v            v        │   update
> + *     split if   grow the leaf ─┘  hashvals
> + *      needed     return -EAGAIN      │
> + *         │      retry leaf add       │
> + *         │        on reentry         │
> + *         │                           │
> + *         └───────────────────────────┤
> + *                                     v
> + *                                need to alloc ──n──> done
> + *                                or flip flag?
> + *                                     │
> + *                                     y
> + *                                     │
> + *                                     v
> + *                             XFS_DAS_FOUND_NBLK
> + *                                     │
> + *                                     v
> + *                       ┌─────n──  need to
> + *                       │        alloc blks?
> + *                       │             │
> + *                       │             y
> + *                       │             │
> + *                       │             v
> + *                       │        find space
> + *                       │             │
> + *                       │             v
> + *                       │  ┌─>XFS_DAS_ALLOC_NODE
> + *                       │  │          │
> + *                       │  │          v
> + *                       │  │      alloc blk
> + *                       │  │          │
> + *                       │  │          v
> + *                       │  └──y── need to alloc
> + *                       │         more blocks?
> + *                       │             │
> + *                       │             n
> + *                       │             │
> + *                       │             v
> + *                       │      set the rmt value
> + *                       │             │
> + *                       │             v
> + *                       │          was this
> + *                       └────────> a rename? ──n─┐
> + *                                     │          │
> + *                                     y          │
> + *                                     │          │
> + *                                     v          │
> + *                               flip incomplete  │
> + *                                   flag         │
> + *                                     │          │
> + *                                     v          │
> + *                             XFS_DAS_FLIP_NFLAG │
> + *                                     │          │
> + *                                     v          │
> + *                                   remove       │
> + *               XFS_DAS_RM_NBLK ─> old name      │
> + *                        ^            │          │
> + *                        │            v          │
> + *                        └──────y── more to      │
> + *                                   remove       │
> + *                                     │          │
> + *                                     n          │
> + *                                     │          │
> + *                                     v          │
> + *                                    done <──────┘
> + *
>   */
>  
>  /*
> @@ -149,12 +370,20 @@ struct xfs_attr_list_context {
>  enum xfs_delattr_state {
>  	XFS_DAS_UNINIT		= 0,  /* No state has been set yet */
>  	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
> +	XFS_DAS_FOUND_LBLK,	      /* We found leaf blk for attr */
> +	XFS_DAS_FOUND_NBLK,	      /* We found node blk for attr */
> +	XFS_DAS_FLIP_LFLAG,	      /* Flipped leaf INCOMPLETE attr flag */
> +	XFS_DAS_RM_LBLK,	      /* A rename is removing leaf blocks */
> +	XFS_DAS_ALLOC_NODE,	      /* We are allocating node blocks */
> +	XFS_DAS_FLIP_NFLAG,	      /* Flipped node INCOMPLETE attr flag */
> +	XFS_DAS_RM_NBLK,	      /* A rename is removing node blocks */
>  };
>  
>  /*
>   * Defines for xfs_delattr_context.flags
>   */
>  #define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
> +#define XFS_DAC_LEAF_ADDNAME_INIT	0x02 /* xfs_attr_leaf_addname init*/
>  
>  /*
>   * Context used for keeping track of delayed attribute operations
> @@ -162,6 +391,11 @@ enum xfs_delattr_state {
>  struct xfs_delattr_context {
>  	struct xfs_da_args      *da_args;
>  
> +	/* Used in xfs_attr_rmtval_set_blk to roll through allocating blocks */
> +	struct xfs_bmbt_irec	map;
> +	xfs_dablk_t		lblkno;
> +	int			blkcnt;
> +
>  	/* Used in xfs_attr_node_removename to roll through removing blocks */
>  	struct xfs_da_state     *da_state;
>  
> @@ -188,7 +422,6 @@ int xfs_attr_set_args(struct xfs_da_args *args);
>  int xfs_has_attr(struct xfs_da_args *args);
>  int xfs_attr_remove_args(struct xfs_da_args *args);
>  int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
> -int xfs_attr_trans_roll(struct xfs_delattr_context *dac);
>  bool xfs_attr_namecheck(const void *name, size_t length);
>  void xfs_delattr_context_init(struct xfs_delattr_context *dac,
>  			      struct xfs_da_args *args);
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index f09820c..6af86bf 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -441,7 +441,7 @@ xfs_attr_rmtval_get(
>   * Find a "hole" in the attribute address space large enough for us to drop the
>   * new attribute's value into
>   */
> -STATIC int
> +int
>  xfs_attr_rmt_find_hole(
>  	struct xfs_da_args	*args)
>  {
> @@ -468,7 +468,7 @@ xfs_attr_rmt_find_hole(
>  	return 0;
>  }
>  
> -STATIC int
> +int
>  xfs_attr_rmtval_set_value(
>  	struct xfs_da_args	*args)
>  {
> @@ -628,6 +628,69 @@ xfs_attr_rmtval_set(
>  }
>  
>  /*
> + * Find a hole for the attr and store it in the delayed attr context.  This
> + * initializes the context to roll through allocating an attr extent for a
> + * delayed attr operation
> + */
> +int
> +xfs_attr_rmtval_find_space(
> +	struct xfs_delattr_context	*dac)
> +{
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_bmbt_irec		*map = &dac->map;
> +	int				error;
> +
> +	dac->lblkno = 0;
> +	dac->blkcnt = 0;
> +	args->rmtblkcnt = 0;
> +	args->rmtblkno = 0;
> +	memset(map, 0, sizeof(struct xfs_bmbt_irec));
> +
> +	error = xfs_attr_rmt_find_hole(args);
> +	if (error)
> +		return error;
> +
> +	dac->blkcnt = args->rmtblkcnt;
> +	dac->lblkno = args->rmtblkno;
> +
> +	return 0;
> +}
> +
> +/*
> + * Write one block of the value associated with an attribute into the
> + * out-of-line buffer that we have defined for it. This is similar to a subset
> + * of xfs_attr_rmtval_set, but records the current block to the delayed attr
> + * context, and leaves transaction handling to the caller.
> + */
> +int
> +xfs_attr_rmtval_set_blk(
> +	struct xfs_delattr_context	*dac)
> +{
> +	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_inode		*dp = args->dp;
> +	struct xfs_bmbt_irec		*map = &dac->map;
> +	int nmap;
> +	int error;
> +
> +	nmap = 1;
> +	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)dac->lblkno,
> +				dac->blkcnt, XFS_BMAPI_ATTRFORK, args->total,
> +				map, &nmap);
> +	if (error)
> +		return error;
> +
> +	ASSERT(nmap == 1);
> +	ASSERT((map->br_startblock != DELAYSTARTBLOCK) &&
> +	       (map->br_startblock != HOLESTARTBLOCK));
> +
> +	/* roll attribute extent map forwards */
> +	dac->lblkno += map->br_blockcount;
> +	dac->blkcnt -= map->br_blockcount;
> +
> +	return 0;
> +}
> +
> +/*
>   * Remove the value associated with an attribute by deleting the
>   * out-of-line buffer that it is stored on.
>   */
> @@ -669,37 +732,6 @@ xfs_attr_rmtval_invalidate(
>  }
>  
>  /*
> - * Remove the value associated with an attribute by deleting the
> - * out-of-line buffer that it is stored on.
> - */
> -int
> -xfs_attr_rmtval_remove(
> -	struct xfs_da_args		*args)
> -{
> -	int				error;
> -	struct xfs_delattr_context	dac  = {
> -		.da_args	= args,
> -	};
> -
> -	trace_xfs_attr_rmtval_remove(args);
> -
> -	/*
> -	 * Keep de-allocating extents until the remote-value region is gone.
> -	 */
> -	do {
> -		error = __xfs_attr_rmtval_remove(&dac);
> -		if (error != -EAGAIN)
> -			break;
> -
> -		error = xfs_attr_trans_roll(&dac);
> -		if (error)
> -			return error;
> -	} while (true);
> -
> -	return error;
> -}
> -
> -/*
>   * Remove the value associated with an attribute by deleting the out-of-line
>   * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the
>   * transaction and re-call the function
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
> index 002fd30..8ad68d5 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.h
> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
> @@ -10,9 +10,12 @@ int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
>  
>  int xfs_attr_rmtval_get(struct xfs_da_args *args);
>  int xfs_attr_rmtval_set(struct xfs_da_args *args);
> -int xfs_attr_rmtval_remove(struct xfs_da_args *args);
>  int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
>  		xfs_buf_flags_t incore_flags);
>  int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
>  int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
> +int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
> +int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
> +int xfs_attr_rmtval_set_blk(struct xfs_delattr_context *dac);
> +int xfs_attr_rmtval_find_space(struct xfs_delattr_context *dac);
>  #endif /* __XFS_ATTR_REMOTE_H__ */
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index 5a263ae..9074b8b 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -1943,7 +1943,6 @@ DEFINE_ATTR_EVENT(xfs_attr_refillstate);
>  
>  DEFINE_ATTR_EVENT(xfs_attr_rmtval_get);
>  DEFINE_ATTR_EVENT(xfs_attr_rmtval_set);
> -DEFINE_ATTR_EVENT(xfs_attr_rmtval_remove);
>  
>  #define DEFINE_DA_EVENT(name) \
>  DEFINE_EVENT(xfs_da_class, name, \
> 


-- 
chandan





* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-23  5:20         ` Allison Henderson
@ 2020-12-23 14:16           ` Brian Foster
  2020-12-24  8:23             ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2020-12-23 14:16 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Tue, Dec 22, 2020 at 10:20:16PM -0700, Allison Henderson wrote:
> 
> 
> On 12/22/20 11:44 AM, Brian Foster wrote:
> > On Tue, Dec 22, 2020 at 12:20:20PM -0500, Brian Foster wrote:
> > > On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
> > > > On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
> > > > > This patch modifies the attr remove routines to be delay ready. This
> > > > > means they no longer roll or commit transactions, but instead return
> > > > > -EAGAIN to have the calling routine roll and refresh the transaction. In
> > > > > this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> > > > > uses a sort of state machine like switch to keep track of where it was
> > > > > when EAGAIN was returned. xfs_attr_node_removename has also been
> > > > > modified to use the switch, and a new version of xfs_attr_remove_args
> > > > > consists of a simple loop to refresh the transaction until the operation
> > > > > is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> > > > > > transaction wherever the existing code used to.
> > > > > 
> > > > > Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> > > > > version __xfs_attr_rmtval_remove. We will rename
> > > > > __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> > > > > done.
> > > > > 
> > > > > xfs_attr_rmtval_remove itself is still in use by the set routines (used
> > > > > during a rename).  For reasons of preserving existing function, we
> > > > > modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> > > > > set.  Similar to how xfs_attr_remove_args does here.  Once we transition
> > > > > the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> > > > > used and will be removed.
> > > > > 
> > > > > This patch also adds a new struct xfs_delattr_context, which we will use
> > > > > to keep track of the current state of an attribute operation. The new
> > > > > xfs_delattr_state enum is used to track various operations that are in
> > > > > progress so that we know not to repeat them, and resume where we left
> > > > > off before EAGAIN was returned to cycle out the transaction. Other
> > > > > members take the place of local variables that need to retain their
> > > > > values across multiple function recalls.  See xfs_attr.h for a more
> > > > > detailed diagram of the states.
> > > > > 
> > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > ---
> > > > 
> > > > I started with a couple small comments on this patch but inevitably
> > > > started thinking more about the factoring again and ended up with a
> > > > couple patches on top. The first is more of some small tweaks and
> > > > open-coding that IMO makes this patch a bit easier to follow. The
> > > > second is more of an RFC so I'll follow up with that in a second email.
> > > > I'm curious what folks' thoughts might be on either. Also note that I'm
> > > > primarily focusing on code structure and whatnot here, so these are fast
> > > > and loose, compile tested only and likely to be broken.
> > > > 
> > > 
> > > ... and here's the second diff (applies on top of the first).
> > > 
> > > This one popped up after staring at the previous changes for a bit and
> > > wondering whether using "done flags" might make the whole thing easier
> > > to follow than incremental state transitions. I think the attr remove
> > > path is easy enough to follow with either method, but the attr set path
> > > is a beast and so this is more with that in mind. Initial thoughts?
> > > 
> > 
> > Eh, the more I stare at the attr set code I'm not sure this by itself is
> > much of an improvement. It helps in some areas, but there are so many
> > transaction rolls embedded throughout at different levels that a larger
> > rework of the code is probably still necessary. Anyways, this was just a
> > random thought for now..
> > 
> > Brian
> 
> No worries, I know the feeling :-)  The set works and all, but I do think
> there is a struggle around trying to find a particularly pleasant looking
> presentation of it.  Especially when we get into the set path, it's a bit
> more complex.  I may pick through the patches you have here and pick up the
> whitespace cleanups and other style adjustments if people prefer it that
> way.  The good news is, a lot of the *_args routines are supposed to
> disappear at the end of the set, so there's not really a need to invest too
> much in them I suppose. It may help to jump to the "Set up infastructure"
> patch too.  I've expanded the diagram to try and help illustrate the code
> flow a bit, so that may help with following the code flow.
> 

I'm sure.. :P Note that the first patch was more about smaller tweaks and
refactoring with the existing model in mind. For the set path, the
challenge IMO is to make the code generally more readable. I think the
remove path accomplishes this for the most part because the states and
whatnot are fairly low overhead on top of the existing complexity. This
changes considerably for the set path, not so much due to the mechanism
but because the baseline code is so fragmented and complex from the
start. I am slightly concerned that bolting state management onto the
current code as such might make it harder to grok and clean up after the
fact, but I could be wrong about that (my hope was certainly for the
opposite).

Regardless, that had me shifting focus a bit and playing around with the
current upstream code as opposed to shifting around your code. ISTM that
there is some commonality across the various set codepaths and perhaps
there is potential to simplify things notably _before_ applying the
state management scheme. I've appended a new diff below (based on
for-next) that starts to demonstrate what I mean. Note again that this
is similarly fast and loose as I've knowingly thrown away some quirks of
the code (i.e. leaf buffer bhold) for the purpose of quickly trying to
explore/POC whether the factoring might be sane and plausible.

In summary, this combines the "try addname" part of each xattr format to
fall under a single transaction rolling loop such that I think the
resulting function could become one high level state. I ran out of time
for working through the rest, but from a read through it seems there's
at least a chance we could continue with similar refactoring and
reduction to a fewer number of generic states (vs. more format-specific
states). For example, the remaining parts of the set operation all seem
to have something along the lines of the following high level
components:

- remote value block allocation (and value set)
- if rename == false, clear flag and done
- if rename == true, flip flags
	- remove old xattr (i.e., similar to xattr remove)

... where much of that code looks remarkably similar across the
different leaf/node code branches. So I'm curious what you and others
following along might think about something like this as an intermediate
step...
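
To make the "fewer generic states" idea a bit more concrete, here is a rough
userspace sketch of how the components above might collapse into one
format-independent state machine. It follows the rename handling shown in the
state diagram (a rename flips the flags and removes the old attr, a plain
create finishes once the value is set). Everything in it is hypothetical: the
SET_* states, struct set_ctx, set_iter() and set_args() are made-up names for
illustration, not anything in the series or upstream, and the "work" in each
state is reduced to a counter.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical generic states; these are NOT the XFS_DAS_* values. */
enum set_state {
	SET_TRY_ADD,	/* add the name in whatever format the fork is in */
	SET_ALLOC_RMT,	/* allocate remote value blocks, one per pass */
	SET_FLIP_FLAGS,	/* rename only: flip the INCOMPLETE flags */
	SET_RM_OLD,	/* rename only: remove old attr, one block per pass */
	SET_DONE,
};

/* Hypothetical context; stands in for state carried across rolls. */
struct set_ctx {
	enum set_state	state;
	bool		rename;
	int		rmt_blks_left;	/* remote blocks still to allocate */
	int		old_blks_left;	/* old remote blocks still to remove */
	int		rolls;		/* how often the caller rolled */
};

/*
 * Run one step of the operation.  Returns -EAGAIN when the caller should
 * roll the transaction and call again, 0 when the operation is complete.
 */
static int set_iter(struct set_ctx *ctx)
{
	for (;;) {
		switch (ctx->state) {
		case SET_TRY_ADD:
			/* name added (or format converted); roll before more work */
			ctx->state = SET_ALLOC_RMT;
			return -EAGAIN;
		case SET_ALLOC_RMT:
			if (ctx->rmt_blks_left > 0) {
				ctx->rmt_blks_left--;
				return -EAGAIN;
			}
			/* value is set; only a rename has more to do */
			ctx->state = ctx->rename ? SET_FLIP_FLAGS : SET_DONE;
			continue;
		case SET_FLIP_FLAGS:
			ctx->state = SET_RM_OLD;
			return -EAGAIN;
		case SET_RM_OLD:
			if (ctx->old_blks_left > 0) {
				ctx->old_blks_left--;
				return -EAGAIN;
			}
			ctx->state = SET_DONE;
			continue;
		case SET_DONE:
			return 0;
		default:
			return -EINVAL;
		}
	}
}

/* Caller loop; the counter stands in for defer finish + transaction roll. */
static int set_args(struct set_ctx *ctx)
{
	int error;

	while ((error = set_iter(ctx)) == -EAGAIN)
		ctx->rolls++;
	return error;
}
```

With a zero-initialized context, a plain create with no remote blocks needs a
single roll, while a rename with two remote blocks to allocate and one old
block to remove goes through every state. The point is only that none of the
states need to know whether the fork is in leaf or node format.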

Brian

--- 8< ---

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index fd8e6418a0d3..eff8833d5303 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -58,6 +58,8 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
 				 struct xfs_da_state **state);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
+STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *, struct xfs_buf *);
+STATIC int xfs_attr_node_addname_work(struct xfs_da_args *);
 
 int
 xfs_inode_hasattr(
@@ -216,116 +218,93 @@ xfs_attr_is_shortform(
 		ip->i_afp->if_nextents == 0);
 }
 
-/*
- * Attempts to set an attr in shortform, or converts short form to leaf form if
- * there is not enough room.  If the attr is set, the transaction is committed
- * and set to NULL.
- */
-STATIC int
-xfs_attr_set_shortform(
+int
+xfs_attr_set_fmt(
 	struct xfs_da_args	*args,
-	struct xfs_buf		**leaf_bp)
+	bool			*done)
 {
 	struct xfs_inode	*dp = args->dp;
-	int			error, error2 = 0;
+	struct xfs_buf		*leaf_bp = NULL;
+	int			error = 0;
 
-	/*
-	 * Try to add the attr to the attribute list in the inode.
-	 */
-	error = xfs_attr_try_sf_addname(dp, args);
-	if (error != -ENOSPC) {
-		error2 = xfs_trans_commit(args->trans);
-		args->trans = NULL;
-		return error ? error : error2;
+	if (xfs_attr_is_shortform(dp)) {
+		error = xfs_attr_try_sf_addname(dp, args);
+		if (!error)
+			*done = true;
+		if (error != -ENOSPC)
+			return error;
+
+		error = xfs_attr_shortform_to_leaf(args, &leaf_bp);
+		if (error)
+			return error;
+		return -EAGAIN;
 	}
-	/*
-	 * It won't fit in the shortform, transform to a leaf block.  GROT:
-	 * another possible req'mt for a double-split btree op.
-	 */
-	error = xfs_attr_shortform_to_leaf(args, leaf_bp);
-	if (error)
-		return error;
 
-	/*
-	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
-	 * push cannot grab the half-baked leaf buffer and run into problems
-	 * with the write verifier. Once we're done rolling the transaction we
-	 * can release the hold and add the attr to the leaf.
-	 */
-	xfs_trans_bhold(args->trans, *leaf_bp);
-	error = xfs_defer_finish(&args->trans);
-	xfs_trans_bhold_release(args->trans, *leaf_bp);
-	if (error) {
-		xfs_trans_brelse(args->trans, *leaf_bp);
-		return error;
+	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
+		struct xfs_buf	*bp = NULL;
+
+		error = xfs_attr_leaf_try_add(args, bp);
+		if (error != -ENOSPC)
+			return error;
+
+		error = xfs_attr3_leaf_to_node(args);
+		if (error)
+			return error;
+		return -EAGAIN;
 	}
 
-	return 0;
+	return xfs_attr_node_addname(args);
 }
 
 /*
  * Set the attribute specified in @args.
  */
 int
-xfs_attr_set_args(
+__xfs_attr_set_args(
 	struct xfs_da_args	*args)
 {
 	struct xfs_inode	*dp = args->dp;
-	struct xfs_buf          *leaf_bp = NULL;
 	int			error = 0;
 
-	/*
-	 * If the attribute list is already in leaf format, jump straight to
-	 * leaf handling.  Otherwise, try to add the attribute to the shortform
-	 * list; if there's no room then convert the list to leaf format and try
-	 * again.
-	 */
-	if (xfs_attr_is_shortform(dp)) {
-
-		/*
-		 * If the attr was successfully set in shortform, the
-		 * transaction is committed and set to NULL.  Otherwise, is it
-		 * converted from shortform to leaf, and the transaction is
-		 * retained.
-		 */
-		error = xfs_attr_set_shortform(args, &leaf_bp);
-		if (error || !args->trans)
-			return error;
-	}
-
 	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
 		error = xfs_attr_leaf_addname(args);
-		if (error != -ENOSPC)
-			return error;
-
-		/*
-		 * Promote the attribute list to the Btree format.
-		 */
-		error = xfs_attr3_leaf_to_node(args);
 		if (error)
 			return error;
+	}
+
+	error = xfs_attr_node_addname_work(args);
+	return error;
+}
+
+int
+xfs_attr_set_args(
+	struct xfs_da_args	*args)
+
+{
+	int			error;
+	bool			done = false;
+
+	do {
+		error = xfs_attr_set_fmt(args, &done);
+		if (error != -EAGAIN)
+			break;
 
-		/*
-		 * Finish any deferred work items and roll the transaction once
-		 * more.  The goal here is to call node_addname with the inode
-		 * and transaction in the same state (inode locked and joined,
-		 * transaction clean) no matter how we got to this step.
-		 */
 		error = xfs_defer_finish(&args->trans);
 		if (error)
-			return error;
+			break;
+		error = xfs_trans_roll_inode(&args->trans, args->dp);
+	} while (!error);
 
-		/*
-		 * Commit the current trans (including the inode) and
-		 * start a new one.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
-	}
+	if (error || done)
+		return error;
 
-	error = xfs_attr_node_addname(args);
-	return error;
+	error = xfs_defer_finish(&args->trans);
+	if (!error)
+		error = xfs_trans_roll_inode(&args->trans, args->dp);
+	if (error)
+		return error;
+
+	return __xfs_attr_set_args(args);
 }
 
 /*
@@ -676,18 +655,6 @@ xfs_attr_leaf_addname(
 
 	trace_xfs_attr_leaf_addname(args);
 
-	error = xfs_attr_leaf_try_add(args, bp);
-	if (error)
-		return error;
-
-	/*
-	 * Commit the transaction that added the attr name so that
-	 * later routines can manage their own transactions.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
-	if (error)
-		return error;
-
 	/*
 	 * If there was an out-of-line value, allocate the blocks we
 	 * identified for its storage and copy the value.  This is done
@@ -923,7 +890,7 @@ xfs_attr_node_addname(
 	 * Fill in bucket of arguments/results/context to carry around.
 	 */
 	dp = args->dp;
-restart:
+
 	/*
 	 * Search to see if name already exists, and get back a pointer
 	 * to where it should go.
@@ -967,21 +934,10 @@ xfs_attr_node_addname(
 			xfs_da_state_free(state);
 			state = NULL;
 			error = xfs_attr3_leaf_to_node(args);
-			if (error)
-				goto out;
-			error = xfs_defer_finish(&args->trans);
 			if (error)
 				goto out;
 
-			/*
-			 * Commit the node conversion and start the next
-			 * trans in the chain.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
-			if (error)
-				goto out;
-
-			goto restart;
+			return -EAGAIN;
 		}
 
 		/*
@@ -993,9 +949,6 @@ xfs_attr_node_addname(
 		error = xfs_da3_split(state);
 		if (error)
 			goto out;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			goto out;
 	} else {
 		/*
 		 * Addition succeeded, update Btree hashvals.
@@ -1010,13 +963,23 @@ xfs_attr_node_addname(
 	xfs_da_state_free(state);
 	state = NULL;
 
-	/*
-	 * Commit the leaf addition or btree split and start the next
-	 * trans in the chain.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
+	return 0;
+
+out:
+	if (state)
+		xfs_da_state_free(state);
 	if (error)
-		goto out;
+		return error;
+	return retval;
+}
+
+STATIC int
+xfs_attr_node_addname_work(
+	struct xfs_da_args	*args)
+{
+	struct xfs_da_state	*state;
+	struct xfs_da_state_blk	*blk;
+	int			retval, error;
 
 	/*
 	 * If there was an out-of-line value, allocate the blocks we



* Re: [PATCH v14 05/15] xfs: Add delay ready attr set routines
  2020-12-23  8:00   ` Chandan Babu R
@ 2020-12-23 16:31     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2020-12-23 16:31 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs



On 12/23/20 1:00 AM, Chandan Babu R wrote:
> On Fri, 18 Dec 2020 00:29:07 -0700, Allison Henderson wrote:
>> This patch modifies the attr set routines to be delay ready. This means
>> they no longer roll or commit transactions, but instead return -EAGAIN
>> to have the calling routine roll and refresh the transaction.  In this
>> series, xfs_attr_set_args has become xfs_attr_set_iter, which uses a
>> state machine like switch to keep track of where it was when EAGAIN was
>> returned. See xfs_attr.h for a more detailed diagram of the states.
>>
>> Two new helper functions have been added: xfs_attr_rmtval_set_init and
> 
> I don't see xfs_attr_rmtval_set_init() being added in this patch. Maybe it
> needs to be removed from the description.

Yeah, I think we dropped it a couple revisions ago.  Will remove.
> 
>> xfs_attr_rmtval_set_blk.  They provide a subset of logic similar to
>> xfs_attr_rmtval_set, but they store the current block in the delay attr
>> context to allow the caller to roll the transaction between allocations.
>> This helps to simplify and consolidate code used by
>> xfs_attr_leaf_addname and xfs_attr_node_addname. xfs_attr_set_args has
>> now become a simple loop to refresh the transaction until the operation
>> is completed.  Lastly, xfs_attr_rmtval_remove is no longer used, and is
>> removed.
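
For anyone following along, the "store the current block in the context and
let the caller roll" idea can be sketched in userspace roughly as below. This
is only an illustration under assumptions: struct alloc_ctx, alloc_one_extent()
and alloc_iter() are hypothetical names, the allocator is faked with a
max_extent cap, and only the lblkno/blkcnt bookkeeping mirrors what
xfs_attr_rmtval_set_blk does in the patch.

```c
#include <errno.h>

/* Hypothetical stand-in for the lblkno/blkcnt fields of the delayed attr context. */
struct alloc_ctx {
	unsigned int	lblkno;	/* next logical block to map */
	int		blkcnt;	/* blocks still to allocate */
};

/*
 * Fake allocator: maps at most max_extent blocks per call.  max_extent is
 * assumed to be > 0, otherwise no progress would ever be made.
 */
static int alloc_one_extent(struct alloc_ctx *ctx, int max_extent)
{
	int got = ctx->blkcnt < max_extent ? ctx->blkcnt : max_extent;

	/* roll the extent map forwards, as the patch does after the bmapi call */
	ctx->lblkno += got;
	ctx->blkcnt -= got;
	return 0;
}

/*
 * One pass of the allocation.  Returns -EAGAIN to ask the caller to roll the
 * transaction and call again, 0 once every block has been mapped.
 */
static int alloc_iter(struct alloc_ctx *ctx, int max_extent)
{
	int error;

	if (ctx->blkcnt > 0) {
		error = alloc_one_extent(ctx, max_extent);
		if (error)
			return error;
		return -EAGAIN;
	}
	return 0;	/* nothing left to allocate; the value can be written */
}
```

A caller would loop on alloc_iter() and roll its transaction each time
-EAGAIN comes back. Because the progress lives in the context rather than in
local variables, re-entering the function picks up at the next unmapped block.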
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c        | 357 ++++++++++++++++++++++++++--------------
>>   fs/xfs/libxfs/xfs_attr.h        | 235 +++++++++++++++++++++++++-
>>   fs/xfs/libxfs/xfs_attr_remote.c |  98 +++++++----
>>   fs/xfs/libxfs/xfs_attr_remote.h |   5 +-
>>   fs/xfs/xfs_trace.h              |   1 -
>>   5 files changed, 541 insertions(+), 155 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index b6330f9..cd72512 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -44,7 +44,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
>>    * Internal routines when attribute list is one block.
>>    */
>>   STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
>> -STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args);
>> +STATIC int xfs_attr_leaf_addname(struct xfs_delattr_context *dac);
>>   STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
>>   STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>>   
>> @@ -52,12 +52,15 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>>    * Internal routines when attribute list is more than one block.
>>    */
>>   STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>> -STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
>> +STATIC int xfs_attr_node_addname(struct xfs_delattr_context *dac);
>>   STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
>>   STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>>   				 struct xfs_da_state **state);
>>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>>   STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>> +STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *args, struct xfs_buf *bp);
>> +STATIC int xfs_attr_set_iter(struct xfs_delattr_context *dac,
>> +			     struct xfs_buf **leaf_bp);
>>   
>>   int
>>   xfs_inode_hasattr(
>> @@ -218,8 +221,11 @@ xfs_attr_is_shortform(
>>   
>>   /*
>>    * Attempts to set an attr in shortform, or converts short form to leaf form if
>> - * there is not enough room.  If the attr is set, the transaction is committed
>> - * and set to NULL.
>> + * there is not enough room.  This function is meant to operate as a helper
>> + * routine to the delayed attribute functions.  It returns -EAGAIN to indicate
>> + * that the calling function should roll the transaction, and then proceed to
>> + * add the attr in leaf form.  This subroutine does not expect to be recalled
>> + * again like the other delayed attr routines do.
>>    */
>>   STATIC int
>>   xfs_attr_set_shortform(
>> @@ -227,16 +233,16 @@ xfs_attr_set_shortform(
>>   	struct xfs_buf		**leaf_bp)
>>   {
>>   	struct xfs_inode	*dp = args->dp;
>> -	int			error, error2 = 0;
>> +	int			error = 0;
>>   
>>   	/*
>>   	 * Try to add the attr to the attribute list in the inode.
>>   	 */
>>   	error = xfs_attr_try_sf_addname(dp, args);
>> +
>> +	/* Should only be 0, -EEXIST or -ENOSPC */
>>   	if (error != -ENOSPC) {
>> -		error2 = xfs_trans_commit(args->trans);
>> -		args->trans = NULL;
>> -		return error ? error : error2;
>> +		return error;
>>   	}
>>   	/*
>>   	 * It won't fit in the shortform, transform to a leaf block.  GROT:
>> @@ -249,18 +255,15 @@ xfs_attr_set_shortform(
>>   	/*
>>   	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
>>   	 * push cannot grab the half-baked leaf buffer and run into problems
>> -	 * with the write verifier. Once we're done rolling the transaction we
>> -	 * can release the hold and add the attr to the leaf.
>> +	 * with the write verifier.
>>   	 */
>>   	xfs_trans_bhold(args->trans, *leaf_bp);
>> -	error = xfs_defer_finish(&args->trans);
>> -	xfs_trans_bhold_release(args->trans, *leaf_bp);
>> -	if (error) {
>> -		xfs_trans_brelse(args->trans, *leaf_bp);
>> -		return error;
>> -	}
>>   
>> -	return 0;
>> +	/*
>> +	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
>> +	 * fork to leaf format and will restart with the leaf add.
>> +	 */
>> +	return -EAGAIN;
>>   }
>>   
>>   /*
>> @@ -268,7 +271,7 @@ xfs_attr_set_shortform(
>>    * also checks for a defer finish.  Transaction is finished and rolled as
>>    * needed, and returns true of false if the delayed operation should continue.
>>    */
>> -int
>> +STATIC int
>>   xfs_attr_trans_roll(
>>   	struct xfs_delattr_context	*dac)
>>   {
>> @@ -298,34 +301,95 @@ int
>>   xfs_attr_set_args(
>>   	struct xfs_da_args	*args)
>>   {
>> -	struct xfs_inode	*dp = args->dp;
>> -	struct xfs_buf          *leaf_bp = NULL;
>> -	int			error = 0;
>> +	struct xfs_buf			*leaf_bp = NULL;
>> +	int				error = 0;
>> +	struct xfs_delattr_context	dac = {
>> +		.da_args	= args,
>> +	};
>> +
>> +	do {
>> +		error = xfs_attr_set_iter(&dac, &leaf_bp);
>> +		if (error != -EAGAIN)
>> +			break;
>> +
>> +		error = xfs_attr_trans_roll(&dac);
>> +		if (error)
>> +			return error;
>> +	} while (true);
>> +
>> +	return error;
>> +}
>> +
>> +/*
>> + * Set the attribute specified in @args.
>> + * This routine is meant to function as a delayed operation, and may return
>> + * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
>> + * to handle this, and recall the function until a successful error code is
>> + * returned.
>> + */
>> +STATIC int
>> +xfs_attr_set_iter(
>> +	struct xfs_delattr_context	*dac,
>> +	struct xfs_buf			**leaf_bp)
>> +{
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_inode		*dp = args->dp;
>> +	int				error = 0;
>> +
>> +	/* State machine switch */
>> +	switch (dac->dela_state) {
>> +	case XFS_DAS_FLIP_LFLAG:
>> +	case XFS_DAS_FOUND_LBLK:
>> +	case XFS_DAS_RM_LBLK:
>> +		return xfs_attr_leaf_addname(dac);
>> +	case XFS_DAS_FOUND_NBLK:
>> +	case XFS_DAS_FLIP_NFLAG:
>> +	case XFS_DAS_ALLOC_NODE:
>> +		return xfs_attr_node_addname(dac);
>> +	case XFS_DAS_UNINIT:
>> +		break;
>> +	default:
>> +		ASSERT(dac->dela_state != XFS_DAS_RM_SHRINK);
>> +		break;
>> +	}
>>   
>>   	/*
>>   	 * If the attribute list is already in leaf format, jump straight to
>>   	 * leaf handling.  Otherwise, try to add the attribute to the shortform
>>   	 * list; if there's no room then convert the list to leaf format and try
>> -	 * again.
>> +	 * again. No need to set state as we will be in leaf form when we come
>> +	 * back.
>>   	 */
>>   	if (xfs_attr_is_shortform(dp)) {
>>   
>>   		/*
>> -		 * If the attr was successfully set in shortform, the
>> -		 * transaction is committed and set to NULL.  Otherwise, is it
>> -		 * converted from shortform to leaf, and the transaction is
>> -		 * retained.
>> +		 * If the attr was successfully set in shortform, no need to
>> +		 * continue.  Otherwise, it is converted from shortform to leaf
>> +		 * and -EAGAIN is returned.
>>   		 */
>> -		error = xfs_attr_set_shortform(args, &leaf_bp);
>> -		if (error || !args->trans)
>> -			return error;
>> +		error = xfs_attr_set_shortform(args, leaf_bp);
>> +		if (error == -EAGAIN)
>> +			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +
>> +		return error;
>>   	}
>>   
>> -	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
>> -		error = xfs_attr_leaf_addname(args);
>> -		if (error != -ENOSPC)
>> -			return error;
>> +	/*
>> +	 * After a shortform to leaf conversion, we need to hold the leaf and
>> +	 * cycle out the transaction.  When we get back, we need to release
>> +	 * the hold on the leaf buffer.
>> +	 */
>> +	if (*leaf_bp != NULL) {
>> +		xfs_trans_bhold_release(args->trans, *leaf_bp);
>> +		*leaf_bp = NULL;
>> +	}
>> +
>> +	if (!xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> +		return xfs_attr_node_addname(dac);
>>   
>> +	error = xfs_attr_leaf_try_add(args, *leaf_bp);
>> +	switch (error) {
>> +	case -ENOSPC:
>>   		/*
>>   		 * Promote the attribute list to the Btree format.
>>   		 */
>> @@ -334,25 +398,22 @@ xfs_attr_set_args(
>>   			return error;
>>   
>>   		/*
>> -		 * Finish any deferred work items and roll the transaction once
>> -		 * more.  The goal here is to call node_addname with the inode
>> -		 * and transaction in the same state (inode locked and joined,
>> -		 * transaction clean) no matter how we got to this step.
>> -		 */
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			return error;
>> -
>> -		/*
>> -		 * Commit the current trans (including the inode) and
>> -		 * start a new one.
>> +		 * Finish any deferred work items and roll the
>> +		 * transaction once more.  The goal here is to call
>> +		 * node_addname with the inode and transaction in the
>> +		 * same state (inode locked and joined, transaction
>> +		 * clean) no matter how we got to this step.
>> +		 *
>> +		 * At this point, we are still in XFS_DAS_UNINIT, but
>> +		 * when we come back, we'll be a node, so we'll fall
>> +		 * down into the node handling code below
> 
>   ... node handling code above?
It used to be a goto that jumped below.  Will update
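
For readers following along, here is a minimal, self-contained sketch of why
"above" is the better word once the goto is gone.  It is not code from the
patch: toy_ctx, toy_set_iter and toy_node_addname are hypothetical stand-ins,
and the "is_node" field is only an assumption used to model the fork format.
The point is that the dispatch now sits at the top of the function, so a
re-entry after -EAGAIN reaches the node handling before it gets anywhere near
the leaf code that asked for the roll.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the patch's dela_state values. */
enum toy_state { TOY_UNINIT = 0, TOY_FOUND_BLK, TOY_FLIP_FLAG };

struct toy_ctx {
	enum toy_state	state;
	int		is_node;	/* toy model of "fork is in node form" */
	int		steps;		/* pieces of work completed so far */
};

/* Node path: resumes at the label that matches the saved state. */
static int toy_node_addname(struct toy_ctx *ctx)
{
	assert(ctx->is_node);

	switch (ctx->state) {
	case TOY_FLIP_FLAG:
		goto flip_flag;
	case TOY_FOUND_BLK:
		goto found_blk;
	default:
		break;
	}

	ctx->steps++;			/* add the name */
	ctx->state = TOY_FOUND_BLK;
	return -EAGAIN;
found_blk:
	ctx->steps++;			/* set the value */
	ctx->state = TOY_FLIP_FLAG;
	return -EAGAIN;
flip_flag:
	ctx->steps++;			/* flip the flag */
	return 0;
}

static int toy_set_iter(struct toy_ctx *ctx)
{
	/* Dispatch first: any saved state belongs to the subroutine. */
	switch (ctx->state) {
	case TOY_FOUND_BLK:
	case TOY_FLIP_FLAG:
		return toy_node_addname(ctx);
	default:
		break;
	}

	/* Still TOY_UNINIT, but already a node: handled up here. */
	if (ctx->is_node)
		return toy_node_addname(ctx);

	/* Leaf had no room: convert, leave the state unset, ask for a roll. */
	ctx->is_node = 1;
	return -EAGAIN;
}
```

Calling toy_set_iter() repeatedly on a zeroed context returns -EAGAIN three
times and then 0; the state is still TOY_UNINIT after the first call, which
mirrors the "no need to set state" remark in the quoted comment.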

> 
> Apart from the above nits I don't see any issues w.r.t the logical correctness
> of the code. Hence,
> 
> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Ok, thank you!
Allison

> 
>>   		 */
>> -		error = xfs_trans_roll_inode(&args->trans, dp);
>> -		if (error)
>> -			return error;
>> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		return -EAGAIN;
>> +	case 0:
>> +		dac->dela_state = XFS_DAS_FOUND_LBLK;
>> +		return -EAGAIN;
>>   	}
>> -
>> -	error = xfs_attr_node_addname(args);
>>   	return error;
>>   }
>>   
>> @@ -728,28 +789,30 @@ xfs_attr_leaf_try_add(
>>    *
>>    * This leaf block cannot have a "remote" value, we only call this routine
>>    * if bmap_one_block() says there is only one block (ie: no remote blks).
>> + *
>> + * This routine is meant to function as a delayed operation, and may return
>> + * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
>> + * to handle this, and recall the function until a successful error code is
>> + * returned.
>>    */
>>   STATIC int
>>   xfs_attr_leaf_addname(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_delattr_context	*dac)
>>   {
>> -	int			error, forkoff;
>> -	struct xfs_buf		*bp = NULL;
>> -	struct xfs_inode	*dp = args->dp;
>> -
>> -	trace_xfs_attr_leaf_addname(args);
>> -
>> -	error = xfs_attr_leaf_try_add(args, bp);
>> -	if (error)
>> -		return error;
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_buf			*bp = NULL;
>> +	int				error, forkoff;
>> +	struct xfs_inode		*dp = args->dp;
>>   
>> -	/*
>> -	 * Commit the transaction that added the attr name so that
>> -	 * later routines can manage their own transactions.
>> -	 */
>> -	error = xfs_trans_roll_inode(&args->trans, dp);
>> -	if (error)
>> -		return error;
>> +	/* State machine switch */
>> +	switch (dac->dela_state) {
>> +	case XFS_DAS_FLIP_LFLAG:
>> +		goto das_flip_flag;
>> +	case XFS_DAS_RM_LBLK:
>> +		goto das_rm_lblk;
>> +	default:
>> +		break;
>> +	}
>>   
>>   	/*
>>   	 * If there was an out-of-line value, allocate the blocks we
>> @@ -757,12 +820,34 @@ xfs_attr_leaf_addname(
>>   	 * after we create the attribute so that we don't overflow the
>>   	 * maximum size of a transaction and/or hit a deadlock.
>>   	 */
>> -	if (args->rmtblkno > 0) {
>> -		error = xfs_attr_rmtval_set(args);
>> +
>> +	/* Open coded xfs_attr_rmtval_set without trans handling */
>> +	if ((dac->flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
>> +		dac->flags |= XFS_DAC_LEAF_ADDNAME_INIT;
>> +		if (args->rmtblkno > 0) {
>> +			error = xfs_attr_rmtval_find_space(dac);
>> +			if (error)
>> +				return error;
>> +		}
>> +	}
>> +
>> +	/*
>> +	 * Roll through the "value", allocating blocks on disk as
>> +	 * required.
>> +	 */
>> +	if (dac->blkcnt > 0) {
>> +		error = xfs_attr_rmtval_set_blk(dac);
>>   		if (error)
>>   			return error;
>> +
>> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		return -EAGAIN;
>>   	}
>>   
>> +	error = xfs_attr_rmtval_set_value(args);
>> +	if (error)
>> +		return error;
>> +
>>   	if (!(args->op_flags & XFS_DA_OP_RENAME)) {
>>   		/*
>>   		 * Added a "remote" value, just clear the incomplete flag.
>> @@ -782,29 +867,30 @@ xfs_attr_leaf_addname(
>>   	 * In a separate transaction, set the incomplete flag on the "old" attr
>>   	 * and clear the incomplete flag on the "new" attr.
>>   	 */
>> -
>>   	error = xfs_attr3_leaf_flipflags(args);
>>   	if (error)
>>   		return error;
>>   	/*
>>   	 * Commit the flag value change and start the next trans in series.
>>   	 */
>> -	error = xfs_trans_roll_inode(&args->trans, args->dp);
>> -	if (error)
>> -		return error;
>> -
>> +	dac->dela_state = XFS_DAS_FLIP_LFLAG;
>> +	return -EAGAIN;
>> +das_flip_flag:
>>   	/*
>>   	 * Dismantle the "old" attribute/value pair by removing a "remote" value
>>   	 * (if it exists).
>>   	 */
>>   	xfs_attr_restore_rmt_blk(args);
>>   
>> -	if (args->rmtblkno) {
>> -		error = xfs_attr_rmtval_invalidate(args);
>> -		if (error)
>> -			return error;
>> +	error = xfs_attr_rmtval_invalidate(args);
>> +	if (error)
>> +		return error;
>>   
>> -		error = xfs_attr_rmtval_remove(args);
>> +	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
>> +	dac->dela_state = XFS_DAS_RM_LBLK;
>> +das_rm_lblk:
>> +	if (args->rmtblkno) {
>> +		error = __xfs_attr_rmtval_remove(dac);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -970,23 +1056,38 @@ xfs_attr_node_hasname(
>>    *
>>    * "Remote" attribute values confuse the issue and atomic rename operations
>>    * add a whole extra layer of confusion on top of that.
>> + *
>> + * This routine is meant to function as a delayed operation, and may return
>> + * -EAGAIN when the transaction needs to be rolled.  Calling functions will need
>> + * to handle this, and recall the function until a successful error code is
>> + * returned.
>>    */
>>   STATIC int
>>   xfs_attr_node_addname(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_delattr_context	*dac)
>>   {
>> -	struct xfs_da_state	*state;
>> -	struct xfs_da_state_blk	*blk;
>> -	struct xfs_inode	*dp;
>> -	int			retval, error;
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_state		*state = NULL;
>> +	struct xfs_da_state_blk		*blk;
>> +	int				retval = 0;
>> +	int				error = 0;
>>   
>>   	trace_xfs_attr_node_addname(args);
>>   
>> -	/*
>> -	 * Fill in bucket of arguments/results/context to carry around.
>> -	 */
>> -	dp = args->dp;
>> -restart:
>> +	/* State machine switch */
>> +	switch (dac->dela_state) {
>> +	case XFS_DAS_FLIP_NFLAG:
>> +		goto das_flip_flag;
>> +	case XFS_DAS_FOUND_NBLK:
>> +		goto das_found_nblk;
>> +	case XFS_DAS_ALLOC_NODE:
>> +		goto das_alloc_node;
>> +	case XFS_DAS_RM_NBLK:
>> +		goto das_rm_nblk;
>> +	default:
>> +		break;
>> +	}
>> +
>>   	/*
>>   	 * Search to see if name already exists, and get back a pointer
>>   	 * to where it should go.
>> @@ -1032,19 +1133,16 @@ xfs_attr_node_addname(
>>   			error = xfs_attr3_leaf_to_node(args);
>>   			if (error)
>>   				goto out;
>> -			error = xfs_defer_finish(&args->trans);
>> -			if (error)
>> -				goto out;
>>   
>>   			/*
>> -			 * Commit the node conversion and start the next
>> -			 * trans in the chain.
>> +			 * Now that we have converted the leaf to a node, we can
>> +			 * roll the transaction, and try xfs_attr3_leaf_add
>> +			 * again on re-entry.  No need to set dela_state to do
>> +			 * this. dela_state is still unset by this function at
>> +			 * this point.
>>   			 */
>> -			error = xfs_trans_roll_inode(&args->trans, dp);
>> -			if (error)
>> -				goto out;
>> -
>> -			goto restart;
>> +			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +			return -EAGAIN;
>>   		}
>>   
>>   		/*
>> @@ -1056,9 +1154,7 @@ xfs_attr_node_addname(
>>   		error = xfs_da3_split(state);
>>   		if (error)
>>   			goto out;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			goto out;
>> +		dac->flags |= XFS_DAC_DEFER_FINISH;
>>   	} else {
>>   		/*
>>   		 * Addition succeeded, update Btree hashvals.
>> @@ -1066,6 +1162,11 @@ xfs_attr_node_addname(
>>   		xfs_da3_fixhashpath(state, &state->path);
>>   	}
>>   
>> +	if (!args->rmtblkno && !(args->op_flags & XFS_DA_OP_RENAME)) {
>> +		retval = error;
>> +		goto out;
>> +	}
>> +
>>   	/*
>>   	 * Kill the state structure, we're done with it and need to
>>   	 * allow the buffers to come back later.
>> @@ -1073,13 +1174,9 @@ xfs_attr_node_addname(
>>   	xfs_da_state_free(state);
>>   	state = NULL;
>>   
>> -	/*
>> -	 * Commit the leaf addition or btree split and start the next
>> -	 * trans in the chain.
>> -	 */
>> -	error = xfs_trans_roll_inode(&args->trans, dp);
>> -	if (error)
>> -		goto out;
>> +	dac->dela_state = XFS_DAS_FOUND_NBLK;
>> +	return -EAGAIN;
>> +das_found_nblk:
>>   
>>   	/*
>>   	 * If there was an out-of-line value, allocate the blocks we
>> @@ -1088,7 +1185,27 @@ xfs_attr_node_addname(
>>   	 * maximum size of a transaction and/or hit a deadlock.
>>   	 */
>>   	if (args->rmtblkno > 0) {
>> -		error = xfs_attr_rmtval_set(args);
>> +		/* Open coded xfs_attr_rmtval_set without trans handling */
>> +		error = xfs_attr_rmtval_find_space(dac);
>> +		if (error)
>> +			return error;
>> +
>> +		/*
>> +		 * Roll through the "value", allocating blocks on disk as
>> +		 * required.  Set the state in case of -EAGAIN return code
>> +		 */
>> +		dac->dela_state = XFS_DAS_ALLOC_NODE;
>> +das_alloc_node:
>> +		if (dac->blkcnt > 0) {
>> +			error = xfs_attr_rmtval_set_blk(dac);
>> +			if (error)
>> +				return error;
>> +
>> +			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +			return -EAGAIN;
>> +		}
>> +
>> +		error = xfs_attr_rmtval_set_value(args);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1118,22 +1235,24 @@ xfs_attr_node_addname(
>>   	/*
>>   	 * Commit the flag value change and start the next trans in series
>>   	 */
>> -	error = xfs_trans_roll_inode(&args->trans, args->dp);
>> -	if (error)
>> -		goto out;
>> -
>> +	dac->dela_state = XFS_DAS_FLIP_NFLAG;
>> +	return -EAGAIN;
>> +das_flip_flag:
>>   	/*
>>   	 * Dismantle the "old" attribute/value pair by removing a "remote" value
>>   	 * (if it exists).
>>   	 */
>>   	xfs_attr_restore_rmt_blk(args);
>>   
>> -	if (args->rmtblkno) {
>> -		error = xfs_attr_rmtval_invalidate(args);
>> -		if (error)
>> -			return error;
>> +	error = xfs_attr_rmtval_invalidate(args);
>> +	if (error)
>> +		return error;
>>   
>> -		error = xfs_attr_rmtval_remove(args);
>> +	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
>> +	dac->dela_state = XFS_DAS_RM_NBLK;
>> +das_rm_nblk:
>> +	if (args->rmtblkno) {
>> +		error = __xfs_attr_rmtval_remove(dac);
>>   		if (error)
>>   			return error;
>>   	}
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 3154ef4..e101238 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -135,6 +135,227 @@ struct xfs_attr_list_context {
>>    *              v
>>    *            done
>>    *
>> + *
>> + * Below is a state machine diagram for attr set operations.
>> + *
>> + * It seems the challenge with understanding this system comes from trying to
>> + * absorb the state machine all at once, when really one should only be looking
>> + * at it within the context of a single function.  Once a state sensitive
>> + * function is called, the idea is that it "takes ownership" of the state
>> + * machine.  It isn't concerned with the states that may have belonged to its
>> + * calling parent, only with the states relevant to itself or to any
>> + * subroutines therein.  Once a calling function hands off the state machine to
>> + * a subroutine, it needs to respect the simple rule that it doesn't "own" the
>> + * state machine anymore, and it's the responsibility of that calling function
>> + * to propagate the -EAGAIN back up the call stack.  Upon reentry, it is
>> + * committed to re-calling that subroutine until it returns something other
>> + * than -EAGAIN.  Once that subroutine signals completion (by returning
>> + * anything other than -EAGAIN), the calling function can resume using the
>> + * state machine.
>> + *
>> + *  xfs_attr_set_iter()
>> + *              │
>> + *              v
>> + *   ┌─y─ has an attr fork?
>> + *   │          |
>> + *   │          n
>> + *   │          |
>> + *   │          V
>> + *   │       add a fork
>> + *   │          │
>> + *   └──────────┤
>> + *              │
>> + *              V
>> + *   ┌─n── is shortform?
>> + *   │          |
>> + *   │          y
>> + *   │          |
>> + *   │          V
>> + *   │ xfs_attr_set_shortform
>> + *   │          |
>> + *   │          V
>> + *   │      had enough ──y──> done
>> + *   │        space?
>> + *   │          │
>> + *   │          n
>> + *   │          │
>> + *   │          V
>> + *   │     return -EAGAIN
>> + *   │   Re-enter in leaf form
>> + *   │          │
>> + *   └──────────┤
>> + *              │
>> + *              V
>> + *       release leaf buffer
>> + *          if needed
>> + *              │
>> + *              V
>> + *   ┌───n── fork has
>> + *   │      only 1 blk?
>> + *   │          │
>> + *   │          y
>> + *   │          │
>> + *   │          v
>> + *   │ xfs_attr_leaf_try_add()
>> + *   │                  │
>> + *   │                  v
>> + *   │              had enough
>> + *   │       ┌────n── space?
>> + *   │       │          │
>> + *   │       v          │
>> + *   │ return -EAGAIN   │
>> + *   │  re-enter in     y
>> + *   │   node form      │
>> + *   │       │          │
>> + *   ├───────┘          │
>> + *   │                  v
>> + *   │  XFS_DAS_FOUND_LBLK ──┐
>> + *   │                       │
>> + *   │  XFS_DAS_FLIP_LFLAG ──┤
>> + *   │  (subroutine state)   │
>> + *   │                       │
>> + *   │                       └─>xfs_attr_leaf_addname()
>> + *   │                                │
>> + *   │                                v
>> + *   │                     ┌──first time through?
>> + *   │                     │          │
>> + *   │                     │          y
>> + *   │                     │          │
>> + *   │                     n          v
>> + *   │                     │    if we have rmt blks
>> + *   │                     │    find space for them
>> + *   │                     │          │
>> + *   │                     └──────────┤
>> + *   │                                │
>> + *   │                                v
>> + *   │                           still have
>> + *   │                     ┌─n─ blks to alloc? <──┐
>> + *   │                     │          │           │
>> + *   │                     │          y           │
>> + *   │                     │          │           │
>> + *   │                     │          v           │
>> + *   │                     │     alloc one blk    │
>> + *   │                     │     return -EAGAIN ──┘
>> + *   │                     │    re-enter with one
>> + *   │                     │    less blk to alloc
>> + *   │                     │
>> + *   │                     │
>> + *   │                     └───> set the rmt
>> + *   │                              value
>> + *   │                                │
>> + *   │                                v
>> + *   │                              was this
>> + *   │                             a rename? ──n─┐
>> + *   │                                │          │
>> + *   │                                y          │
>> + *   │                                │          │
>> + *   │                                v          │
>> + *   │                          flip incomplete  │
>> + *   │                              flag         │
>> + *   │                                │          │
>> + *   │                                v          │
>> + *   │                        XFS_DAS_FLIP_LFLAG │
>> + *   │                                │          │
>> + *   │                                v          │
>> + *   │                              remove       │
>> + *   │          XFS_DAS_RM_LBLK ─> old name      │
>> + *   │                   ^            │          │
>> + *   │                   │            v          │
>> + *   │                   └──────y── more to      │
>> + *   │                              remove       │
>> + *   │                                │          │
>> + *   │                                n          │
>> + *   │                                │          │
>> + *   │                                v          │
>> + *   │                               done <──────┘
>> + *   └──> XFS_DAS_FOUND_NBLK ──┐
>> + *        (subroutine state)   │
>> + *                             │
>> + *        XFS_DAS_ALLOC_NODE ──┤
>> + *        (subroutine state)   │
>> + *                             │
>> + *        XFS_DAS_FLIP_NFLAG ──┤
>> + *        (subroutine state)   │
>> + *                             │
>> + *                             └─>xfs_attr_node_addname()
>> + *                                     │
>> + *                                     v
>> + *                               determine if this
>> + *                              is create or rename
>> + *                            find space to store attr
>> + *                                     │
>> + *                                     v
>> + *               ┌──────n──── fits in a node leaf?
>> + *               │               ^     │
>> + *       single leaf node?       │     │
>> + *         │            │        │     y
>> + *         n            y        │     │
>> + *         │            │        │     v
>> + *         v            v        │   update
>> + *     split if   grow the leaf ─┘  hashvals
>> + *      needed     return -EAGAIN      │
>> + *         │      retry leaf add       │
>> + *         │        on reentry         │
>> + *         │                           │
>> + *         └───────────────────────────┤
>> + *                                     v
>> + *                                need to alloc ──n──> done
>> + *                                or flip flag?
>> + *                                     │
>> + *                                     y
>> + *                                     │
>> + *                                     v
>> + *                             XFS_DAS_FOUND_NBLK
>> + *                                     │
>> + *                                     v
>> + *                       ┌─────n──  need to
>> + *                       │        alloc blks?
>> + *                       │             │
>> + *                       │             y
>> + *                       │             │
>> + *                       │             v
>> + *                       │        find space
>> + *                       │             │
>> + *                       │             v
>> + *                       │  ┌─>XFS_DAS_ALLOC_NODE
>> + *                       │  │          │
>> + *                       │  │          v
>> + *                       │  │      alloc blk
>> + *                       │  │          │
>> + *                       │  │          v
>> + *                       │  └──y── need to alloc
>> + *                       │         more blocks?
>> + *                       │             │
>> + *                       │             n
>> + *                       │             │
>> + *                       │             v
>> + *                       │      set the rmt value
>> + *                       │             │
>> + *                       │             v
>> + *                       │          was this
>> + *                       └────────> a rename? ──n─┐
>> + *                                     │          │
>> + *                                     y          │
>> + *                                     │          │
>> + *                                     v          │
>> + *                               flip incomplete  │
>> + *                                   flag         │
>> + *                                     │          │
>> + *                                     v          │
>> + *                             XFS_DAS_FLIP_NFLAG │
>> + *                                     │          │
>> + *                                     v          │
>> + *                                   remove       │
>> + *               XFS_DAS_RM_NBLK ─> old name      │
>> + *                        ^            │          │
>> + *                        │            v          │
>> + *                        └──────y── more to      │
>> + *                                   remove       │
>> + *                                     │          │
>> + *                                     n          │
>> + *                                     │          │
>> + *                                     v          │
>> + *                                    done <──────┘
>> + *
>>    */
>>   
>>   /*
>> @@ -149,12 +370,20 @@ struct xfs_attr_list_context {
>>   enum xfs_delattr_state {
>>   	XFS_DAS_UNINIT		= 0,  /* No state has been set yet */
>>   	XFS_DAS_RM_SHRINK,	      /* We are shrinking the tree */
>> +	XFS_DAS_FOUND_LBLK,	      /* We found leaf blk for attr */
>> +	XFS_DAS_FOUND_NBLK,	      /* We found node blk for attr */
>> +	XFS_DAS_FLIP_LFLAG,	      /* Flipped leaf INCOMPLETE attr flag */
>> +	XFS_DAS_RM_LBLK,	      /* A rename is removing leaf blocks */
>> +	XFS_DAS_ALLOC_NODE,	      /* We are allocating node blocks */
>> +	XFS_DAS_FLIP_NFLAG,	      /* Flipped node INCOMPLETE attr flag */
>> +	XFS_DAS_RM_NBLK,	      /* A rename is removing node blocks */
>>   };
>>   
>>   /*
>>    * Defines for xfs_delattr_context.flags
>>    */
>>   #define XFS_DAC_DEFER_FINISH		0x01 /* finish the transaction */
>> +#define XFS_DAC_LEAF_ADDNAME_INIT	0x02 /* xfs_attr_leaf_addname init */
>>   
>>   /*
>>    * Context used for keeping track of delayed attribute operations
>> @@ -162,6 +391,11 @@ enum xfs_delattr_state {
>>   struct xfs_delattr_context {
>>   	struct xfs_da_args      *da_args;
>>   
>> +	/* Used in xfs_attr_rmtval_set_blk to roll through allocating blocks */
>> +	struct xfs_bmbt_irec	map;
>> +	xfs_dablk_t		lblkno;
>> +	int			blkcnt;
>> +
>>   	/* Used in xfs_attr_node_removename to roll through removing blocks */
>>   	struct xfs_da_state     *da_state;
>>   
>> @@ -188,7 +422,6 @@ int xfs_attr_set_args(struct xfs_da_args *args);
>>   int xfs_has_attr(struct xfs_da_args *args);
>>   int xfs_attr_remove_args(struct xfs_da_args *args);
>>   int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
>> -int xfs_attr_trans_roll(struct xfs_delattr_context *dac);
>>   bool xfs_attr_namecheck(const void *name, size_t length);
>>   void xfs_delattr_context_init(struct xfs_delattr_context *dac,
>>   			      struct xfs_da_args *args);
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
>> index f09820c..6af86bf 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.c
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
>> @@ -441,7 +441,7 @@ xfs_attr_rmtval_get(
>>    * Find a "hole" in the attribute address space large enough for us to drop the
>>    * new attribute's value into
>>    */
>> -STATIC int
>> +int
>>   xfs_attr_rmt_find_hole(
>>   	struct xfs_da_args	*args)
>>   {
>> @@ -468,7 +468,7 @@ xfs_attr_rmt_find_hole(
>>   	return 0;
>>   }
>>   
>> -STATIC int
>> +int
>>   xfs_attr_rmtval_set_value(
>>   	struct xfs_da_args	*args)
>>   {
>> @@ -628,6 +628,69 @@ xfs_attr_rmtval_set(
>>   }
>>   
>>   /*
>> + * Find a hole for the attr and store it in the delayed attr context.  This
>> + * initializes the context to roll through allocating an attr extent for a
>> + * delayed attr operation
>> + */
>> +int
>> +xfs_attr_rmtval_find_space(
>> +	struct xfs_delattr_context	*dac)
>> +{
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_bmbt_irec		*map = &dac->map;
>> +	int				error;
>> +
>> +	dac->lblkno = 0;
>> +	dac->blkcnt = 0;
>> +	args->rmtblkcnt = 0;
>> +	args->rmtblkno = 0;
>> +	memset(map, 0, sizeof(struct xfs_bmbt_irec));
>> +
>> +	error = xfs_attr_rmt_find_hole(args);
>> +	if (error)
>> +		return error;
>> +
>> +	dac->blkcnt = args->rmtblkcnt;
>> +	dac->lblkno = args->rmtblkno;
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Write one block of the value associated with an attribute into the
>> + * out-of-line buffer that we have defined for it. This is similar to a subset
>> + * of xfs_attr_rmtval_set, but records the current block to the delayed attr
>> + * context, and leaves transaction handling to the caller.
>> + */
>> +int
>> +xfs_attr_rmtval_set_blk(
>> +	struct xfs_delattr_context	*dac)
>> +{
>> +	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_inode		*dp = args->dp;
>> +	struct xfs_bmbt_irec		*map = &dac->map;
>> +	int nmap;
>> +	int error;
>> +
>> +	nmap = 1;
>> +	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)dac->lblkno,
>> +				dac->blkcnt, XFS_BMAPI_ATTRFORK, args->total,
>> +				map, &nmap);
>> +	if (error)
>> +		return error;
>> +
>> +	ASSERT(nmap == 1);
>> +	ASSERT((map->br_startblock != DELAYSTARTBLOCK) &&
>> +	       (map->br_startblock != HOLESTARTBLOCK));
>> +
>> +	/* roll attribute extent map forwards */
>> +	dac->lblkno += map->br_blockcount;
>> +	dac->blkcnt -= map->br_blockcount;
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>>    * Remove the value associated with an attribute by deleting the
>>    * out-of-line buffer that it is stored on.
>>    */
>> @@ -669,37 +732,6 @@ xfs_attr_rmtval_invalidate(
>>   }
>>   
>>   /*
>> - * Remove the value associated with an attribute by deleting the
>> - * out-of-line buffer that it is stored on.
>> - */
>> -int
>> -xfs_attr_rmtval_remove(
>> -	struct xfs_da_args		*args)
>> -{
>> -	int				error;
>> -	struct xfs_delattr_context	dac  = {
>> -		.da_args	= args,
>> -	};
>> -
>> -	trace_xfs_attr_rmtval_remove(args);
>> -
>> -	/*
>> -	 * Keep de-allocating extents until the remote-value region is gone.
>> -	 */
>> -	do {
>> -		error = __xfs_attr_rmtval_remove(&dac);
>> -		if (error != -EAGAIN)
>> -			break;
>> -
>> -		error = xfs_attr_trans_roll(&dac);
>> -		if (error)
>> -			return error;
>> -	} while (true);
>> -
>> -	return error;
>> -}
>> -
>> -/*
>>    * Remove the value associated with an attribute by deleting the out-of-line
>>    * buffer that it is stored on. Returns -EAGAIN for the caller to refresh the
>>    * transaction and re-call the function
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
>> index 002fd30..8ad68d5 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.h
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
>> @@ -10,9 +10,12 @@ int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
>>   
>>   int xfs_attr_rmtval_get(struct xfs_da_args *args);
>>   int xfs_attr_rmtval_set(struct xfs_da_args *args);
>> -int xfs_attr_rmtval_remove(struct xfs_da_args *args);
>>   int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
>>   		xfs_buf_flags_t incore_flags);
>>   int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
>>   int __xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
>> +int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
>> +int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
>> +int xfs_attr_rmtval_set_blk(struct xfs_delattr_context *dac);
>> +int xfs_attr_rmtval_find_space(struct xfs_delattr_context *dac);
>>   #endif /* __XFS_ATTR_REMOTE_H__ */
>> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
>> index 5a263ae..9074b8b 100644
>> --- a/fs/xfs/xfs_trace.h
>> +++ b/fs/xfs/xfs_trace.h
>> @@ -1943,7 +1943,6 @@ DEFINE_ATTR_EVENT(xfs_attr_refillstate);
>>   
>>   DEFINE_ATTR_EVENT(xfs_attr_rmtval_get);
>>   DEFINE_ATTR_EVENT(xfs_attr_rmtval_set);
>> -DEFINE_ATTR_EVENT(xfs_attr_rmtval_remove);
>>   
>>   #define DEFINE_DA_EVENT(name) \
>>   DEFINE_EVENT(xfs_da_class, name, \
>>
> 
> 
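
As a reading aid for the set path quoted above, the caller-side contract can
be sketched in a few lines of standalone C.  This is only an illustration
under stated assumptions, not the patch itself: demo_dac and the demo_*
functions are made-up names, the "transaction roll" is simulated by a counter,
and the per-call extent size of 2 is arbitrary.  It models the same three
ideas as the patch: the iter function never rolls, it records its progress in
the context (lblkno/blkcnt), and the wrapper loops until something other than
-EAGAIN comes back.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical context, loosely modelled on the lblkno/blkcnt fields. */
struct demo_dac {
	int	lblkno;		/* next logical block to allocate */
	int	blkcnt;		/* blocks still to allocate */
	int	defer_finish;	/* toy stand-in for XFS_DAC_DEFER_FINISH */
	int	rolls;		/* number of simulated transaction rolls */
};

/* Allocate up to max_extent blocks and roll the extent map forwards. */
static int demo_set_blk(struct demo_dac *dac, int max_extent)
{
	int got = dac->blkcnt < max_extent ? dac->blkcnt : max_extent;

	assert(got > 0);
	dac->lblkno += got;
	dac->blkcnt -= got;
	return 0;
}

/* One step of the operation; never rolls the transaction itself. */
static int demo_set_iter(struct demo_dac *dac)
{
	if (dac->blkcnt > 0) {
		int error = demo_set_blk(dac, 2);

		if (error)
			return error;
		dac->defer_finish = 1;
		return -EAGAIN;
	}
	return 0;	/* nothing left to allocate: the value can be set */
}

/* Simulated roll: consume the flag and count the roll. */
static int demo_trans_roll(struct demo_dac *dac)
{
	dac->defer_finish = 0;
	dac->rolls++;
	return 0;
}

/* Wrapper in the style of the quoted xfs_attr_set_args loop. */
static int demo_set_args(struct demo_dac *dac)
{
	int error;

	do {
		error = demo_set_iter(dac);
		if (error != -EAGAIN)
			break;

		error = demo_trans_roll(dac);
		if (error)
			return error;
	} while (1);

	return error;
}
```

With lblkno = 10 and blkcnt = 5, demo_set_args() allocates in chunks of 2, 2
and 1, rolls after each chunk, and returns 0 on the fourth pass through
demo_set_iter().  Keeping the loop in the wrapper is what would let a deferred
operation drive the same iter function later without the wrapper.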


* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-23 14:16           ` Brian Foster
@ 2020-12-24  8:23             ` Allison Henderson
  2021-01-04 17:52               ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2020-12-24  8:23 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 12/23/20 7:16 AM, Brian Foster wrote:
> On Tue, Dec 22, 2020 at 10:20:16PM -0700, Allison Henderson wrote:
>>
>>
>> On 12/22/20 11:44 AM, Brian Foster wrote:
>>> On Tue, Dec 22, 2020 at 12:20:20PM -0500, Brian Foster wrote:
>>>> On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
>>>>> On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
>>>>>> This patch modifies the attr remove routines to be delay ready. This
>>>>>> means they no longer roll or commit transactions, but instead return
>>>>>> -EAGAIN to have the calling routine roll and refresh the transaction. In
>>>>>> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
>>>>>> uses a state-machine-like switch to keep track of where it was
>>>>>> when -EAGAIN was returned. xfs_attr_node_removename has also been
>>>>>> modified to use the switch, and a new version of xfs_attr_remove_args
>>>>>> consists of a simple loop to refresh the transaction until the operation
>>>>>> is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
>>>>>> transaction wherever the existing code used to.
>>>>>>
>>>>>> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
>>>>>> version __xfs_attr_rmtval_remove. We will rename
>>>>>> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
>>>>>> done.
>>>>>>
>>>>>> xfs_attr_rmtval_remove itself is still in use by the set routines (used
>>>>>> during a rename).  For reasons of preserving existing function, we
>>>>>> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
>>>>>> set.  Similar to how xfs_attr_remove_args does here.  Once we transition
>>>>>> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
>>>>>> used and will be removed.
>>>>>>
>>>>>> This patch also adds a new struct xfs_delattr_context, which we will use
>>>>>> to keep track of the current state of an attribute operation. The new
>>>>>> xfs_delattr_state enum is used to track various operations that are in
>>>>>> progress so that we know not to repeat them, and resume where we left
>>>>>> off before EAGAIN was returned to cycle out the transaction. Other
>>>>>> members take the place of local variables that need to retain their
>>>>>> values across multiple function recalls.  See xfs_attr.h for a more
>>>>>> detailed diagram of the states.
>>>>>>
>>>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>>>> ---
>>>>>
>>>>> I started with a couple small comments on this patch but inevitably
>>>>> started thinking more about the factoring again and ended up with a
>>>>> couple patches on top. The first is more of some small tweaks and
>>>>> open-coding that IMO makes this patch a bit easier to follow. The
>>>>> second is more of an RFC so I'll follow up with that in a second email.
>>>>> I'm curious what folks' thoughts might be on either. Also note that I'm
>>>>> primarily focusing on code structure and whatnot here, so these are fast
>>>>> and loose, compile tested only and likely to be broken.
>>>>>
>>>>
>>>> ... and here's the second diff (applies on top of the first).
>>>>
>>>> This one popped up after staring at the previous changes for a bit and
>>>> wondering whether using "done flags" might make the whole thing easier
>>>> to follow than incremental state transitions. I think the attr remove
>>>> path is easy enough to follow with either method, but the attr set path
>>>> is a beast and so this is more with that in mind. Initial thoughts?
>>>>
>>>
>>> Eh, the more I stare at the attr set code I'm not sure this by itself is
>>> much of an improvement. It helps in some areas, but there are so many
>>> transaction rolls embedded throughout at different levels that a larger
>>> rework of the code is probably still necessary. Anyways, this was just a
>>> random thought for now..
>>>
>>> Brian
>>
>> No worries, I know the feeling :-)  The set works and all, but I do think
>> there is a struggle around trying to find a particularly pleasant looking
>> presentation of it.  Especially when we get into the set path, it's a bit
>> more complex.  I may pick through the patches you have here and pick up the
>> whitespace cleanups and other style adjustments if people prefer it that
>> way.  The good news is, a lot of the *_args routines are supposed to
>> disappear at the end of the set, so there's not really a need to invest too
>> much in them I suppose. It may help to jump to the "Set up infastructure"
>> patch too.  I've expanded the diagram to try and help illustrate the code
>> flow a bit, so that may help with following the code flow.
>>
> 
> I'm sure.. :P Note that the first patch was more smaller tweaks and
> refactoring with the existing model in mind. For the set path, the
> challenge IMO is to make the code generally more readable. I think the
> remove path accomplishes this for the most part because the states and
> whatnot are fairly low overhead on top of the existing complexity. This
> changes considerably for the set path, not so much due to the mechanism
> but because the baseline code is so fragmented and complex from the
> start. I am slightly concerned that bolting state management onto the
> current code as such might make it harder to grok and clean up after the
> fact, but I could be wrong about that (my hope was certainly for the
> opposite).
tbh, every time I do another spin of the set, I actually make all my
modifications on top of the extended set, with parent pointers and all,
and make sure all the test cases are still good.  I know pptrs are still
pretty far out from here, but they're actually the best testcase for
this, because they generate so much more activity.  If all that's still
golden, then I'll pull them back down into the lower subsets and work
out all the conflicts on the way back up.  If something went wrong,
diffing the branch heads tracks it down pretty fast.

> 
> Regardless, that had me shifting focus a bit and playing around with the
> current upstream code as opposed to shifting around your code. ISTM that
> there is some commonality across the various set codepaths and perhaps
> there is potential to simplify things notably _before_ applying the
> state management scheme. I've appended a new diff below (based on
> for-next) that starts to demonstrate what I mean. Note again that this
> is similarly fast and loose as I've knowingly thrown away some quirks of
> the code (i.e. leaf buffer bhold) for the purpose of quickly trying to
> explore/POC whether the factoring might be sane and plausible.
> 
> In summary, this combines the "try addname" part of each xattr format to
> fall under a single transaction rolling loop such that I think the
> resulting function could become one high level state. I ran out of time
> for working through the rest, but from a read through it seems there's
> at least a chance we could continue with similar refactoring and
> reduction to a fewer number of generic states (vs. more format-specific
> states). For example, the remaining parts of the set operation all seem
> to have something along the lines of the following high level
> components:
> 
> - remote value block allocation (and value set)
> - if rename == false, clear flag and done
> - if rename == true, flip flags
> 	- remove old xattr (i.e., similar to xattr remove)
> 
> ... where much of that code looks remarkably similar across the
> different leaf/node code branches. So I'm curious what you and others
> following along might think about something like this as an intermediate
> step...

Yes, I had noticed similarities when we first started, though I got the
impression that people mostly wanted to focus on just hoisting the
transactions upwards.  I did look at them at one point, but seem to
recall the similarities having just enough dissimilarities such that
trying to consolidate them tends to introduce about as much plumbing
with if/else's.  In any case, I do think the solution here with the
format handling is creative, and may reduce a state or two, but I'd
really need to see it through the test cases to know if it's going to
work.  From what you've hashed out here, I think I get the idea. It's
hard for me to comment on readability because I've been up and down the
code so much.  I do think it's a little loopy looking, but so is the
state machine.  Maybe a good spot for others to chime in too.
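
For anyone following along, the shape of the loop being discussed can be
modelled in userspace roughly as below.  This is only a toy sketch under
loud assumptions: the toy_* names, the format enum and the roll counter
are all made up for illustration, "room" stands in for the real space
checks, and unlike the diff below it sets done for every format.  It is
not the kernel code, just the -EAGAIN convention in miniature.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three attr fork formats. */
enum toy_fmt { TOY_SHORTFORM, TOY_LEAF, TOY_NODE };

struct toy_args {
	enum toy_fmt	fmt;	/* current format of the attr fork */
	int		room;	/* free slots left in the current format */
	int		rolls;	/* how often the "transaction" was rolled */
};

/*
 * One pass of the "try add" step: add the attr if it fits, otherwise
 * convert to the next format and ask the caller to roll and retry.
 * This helper never rolls anything itself.
 */
static int toy_set_fmt(struct toy_args *args, bool *done)
{
	if (args->room > 0) {
		args->room--;
		*done = true;
		return 0;
	}
	if (args->fmt == TOY_NODE)
		return -ENOSPC;	/* no further format to convert to */

	args->fmt++;		/* shortform -> leaf -> node */
	args->room = 1;		/* assume the new format has space */
	return -EAGAIN;
}

/* Caller-side loop: the only place where the transaction "rolls". */
static int toy_set_args(struct toy_args *args)
{
	bool	done = false;
	int	error;

	do {
		error = toy_set_fmt(args, &done);
		if (error == -EAGAIN)
			args->rolls++;	/* stand-in for defer finish + roll */
	} while (error == -EAGAIN);

	/* Either we failed, or the attr was added. */
	assert(error || done);
	return error;
}
```

Starting from a full shortform fork, toy_set_args() converts once, rolls
once, and then adds the attr on the second pass.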

I actually find it easier to work on it from the top of the set rather
than the bottom.  Just so that the end goal of what it will end up
looking like is a little more clear.  Once the goal is clear, then I
worry about layering it in whatever patch it goes in.  Otherwise it's
harder to see exactly how the conflicts shake out.

Allison
> 
> Brian
> 
> --- 8< ---
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index fd8e6418a0d3..eff8833d5303 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -58,6 +58,8 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>   				 struct xfs_da_state **state);
>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>   STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
> +STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *, struct xfs_buf *);
> +STATIC int xfs_attr_node_addname_work(struct xfs_da_args *);
>   
>   int
>   xfs_inode_hasattr(
> @@ -216,116 +218,93 @@ xfs_attr_is_shortform(
>   		ip->i_afp->if_nextents == 0);
>   }
>   
> -/*
> - * Attempts to set an attr in shortform, or converts short form to leaf form if
> - * there is not enough room.  If the attr is set, the transaction is committed
> - * and set to NULL.
> - */
> -STATIC int
> -xfs_attr_set_shortform(
> +int
> +xfs_attr_set_fmt(
>   	struct xfs_da_args	*args,
> -	struct xfs_buf		**leaf_bp)
> +	bool			*done)
>   {
>   	struct xfs_inode	*dp = args->dp;
> -	int			error, error2 = 0;
> +	struct xfs_buf		*leaf_bp = NULL;
> +	int			error = 0;
>   
> -	/*
> -	 * Try to add the attr to the attribute list in the inode.
> -	 */
> -	error = xfs_attr_try_sf_addname(dp, args);
> -	if (error != -ENOSPC) {
> -		error2 = xfs_trans_commit(args->trans);
> -		args->trans = NULL;
> -		return error ? error : error2;
> +	if (xfs_attr_is_shortform(dp)) {
> +		error = xfs_attr_try_sf_addname(dp, args);
> +		if (!error)
> +			*done = true;
> +		if (error != -ENOSPC)
> +			return error;
> +
> +		error = xfs_attr_shortform_to_leaf(args, &leaf_bp);
> +		if (error)
> +			return error;
> +		return -EAGAIN;
>   	}
> -	/*
> -	 * It won't fit in the shortform, transform to a leaf block.  GROT:
> -	 * another possible req'mt for a double-split btree op.
> -	 */
> -	error = xfs_attr_shortform_to_leaf(args, leaf_bp);
> -	if (error)
> -		return error;
>   
> -	/*
> -	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
> -	 * push cannot grab the half-baked leaf buffer and run into problems
> -	 * with the write verifier. Once we're done rolling the transaction we
> -	 * can release the hold and add the attr to the leaf.
> -	 */
> -	xfs_trans_bhold(args->trans, *leaf_bp);
> -	error = xfs_defer_finish(&args->trans);
> -	xfs_trans_bhold_release(args->trans, *leaf_bp);
> -	if (error) {
> -		xfs_trans_brelse(args->trans, *leaf_bp);
> -		return error;
> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> +		struct xfs_buf	*bp = NULL;
> +
> +		error = xfs_attr_leaf_try_add(args, bp);
> +		if (error != -ENOSPC)
> +			return error;
> +
> +		error = xfs_attr3_leaf_to_node(args);
> +		if (error)
> +			return error;
> +		return -EAGAIN;
>   	}
>   
> -	return 0;
> +	return xfs_attr_node_addname(args);
>   }
>   
>   /*
>    * Set the attribute specified in @args.
>    */
>   int
> -xfs_attr_set_args(
> +__xfs_attr_set_args(
>   	struct xfs_da_args	*args)
>   {
>   	struct xfs_inode	*dp = args->dp;
> -	struct xfs_buf          *leaf_bp = NULL;
>   	int			error = 0;
>   
> -	/*
> -	 * If the attribute list is already in leaf format, jump straight to
> -	 * leaf handling.  Otherwise, try to add the attribute to the shortform
> -	 * list; if there's no room then convert the list to leaf format and try
> -	 * again.
> -	 */
> -	if (xfs_attr_is_shortform(dp)) {
> -
> -		/*
> -		 * If the attr was successfully set in shortform, the
> -		 * transaction is committed and set to NULL.  Otherwise, is it
> -		 * converted from shortform to leaf, and the transaction is
> -		 * retained.
> -		 */
> -		error = xfs_attr_set_shortform(args, &leaf_bp);
> -		if (error || !args->trans)
> -			return error;
> -	}
> -
>   	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
>   		error = xfs_attr_leaf_addname(args);
> -		if (error != -ENOSPC)
> -			return error;
> -
> -		/*
> -		 * Promote the attribute list to the Btree format.
> -		 */
> -		error = xfs_attr3_leaf_to_node(args);
>   		if (error)
>   			return error;
> +	}
> +
> +	error = xfs_attr_node_addname_work(args);
> +	return error;
> +}
> +
> +int
> +xfs_attr_set_args(
> +	struct xfs_da_args	*args)
> +
> +{
> +	int			error;
> +	bool			done = false;
> +
> +	do {
> +		error = xfs_attr_set_fmt(args, &done);
> +		if (error != -EAGAIN)
> +			break;
>   
> -		/*
> -		 * Finish any deferred work items and roll the transaction once
> -		 * more.  The goal here is to call node_addname with the inode
> -		 * and transaction in the same state (inode locked and joined,
> -		 * transaction clean) no matter how we got to this step.
> -		 */
>   		error = xfs_defer_finish(&args->trans);
>   		if (error)
> -			return error;
> +			break;
> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> +	} while (!error);
>   
> -		/*
> -		 * Commit the current trans (including the inode) and
> -		 * start a new one.
> -		 */
> -		error = xfs_trans_roll_inode(&args->trans, dp);
> -		if (error)
> -			return error;
> -	}
> +	if (error || done)
> +		return error;
>   
> -	error = xfs_attr_node_addname(args);
> -	return error;
> +	error = xfs_defer_finish(&args->trans);
> +	if (!error)
> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> +	if (error)
> +		return error;
> +
> +	return __xfs_attr_set_args(args);
>   }
>   
>   /*
> @@ -676,18 +655,6 @@ xfs_attr_leaf_addname(
>   
>   	trace_xfs_attr_leaf_addname(args);
>   
> -	error = xfs_attr_leaf_try_add(args, bp);
> -	if (error)
> -		return error;
> -
> -	/*
> -	 * Commit the transaction that added the attr name so that
> -	 * later routines can manage their own transactions.
> -	 */
> -	error = xfs_trans_roll_inode(&args->trans, dp);
> -	if (error)
> -		return error;
> -
>   	/*
>   	 * If there was an out-of-line value, allocate the blocks we
>   	 * identified for its storage and copy the value.  This is done
> @@ -923,7 +890,7 @@ xfs_attr_node_addname(
>   	 * Fill in bucket of arguments/results/context to carry around.
>   	 */
>   	dp = args->dp;
> -restart:
> +
>   	/*
>   	 * Search to see if name already exists, and get back a pointer
>   	 * to where it should go.
> @@ -967,21 +934,10 @@ xfs_attr_node_addname(
>   			xfs_da_state_free(state);
>   			state = NULL;
>   			error = xfs_attr3_leaf_to_node(args);
> -			if (error)
> -				goto out;
> -			error = xfs_defer_finish(&args->trans);
>   			if (error)
>   				goto out;
>   
> -			/*
> -			 * Commit the node conversion and start the next
> -			 * trans in the chain.
> -			 */
> -			error = xfs_trans_roll_inode(&args->trans, dp);
> -			if (error)
> -				goto out;
> -
> -			goto restart;
> +			return -EAGAIN;
>   		}
>   
>   		/*
> @@ -993,9 +949,6 @@ xfs_attr_node_addname(
>   		error = xfs_da3_split(state);
>   		if (error)
>   			goto out;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			goto out;
>   	} else {
>   		/*
>   		 * Addition succeeded, update Btree hashvals.
> @@ -1010,13 +963,23 @@ xfs_attr_node_addname(
>   	xfs_da_state_free(state);
>   	state = NULL;
>   
> -	/*
> -	 * Commit the leaf addition or btree split and start the next
> -	 * trans in the chain.
> -	 */
> -	error = xfs_trans_roll_inode(&args->trans, dp);
> +	return 0;
> +
> +out:
> +	if (state)
> +		xfs_da_state_free(state);
>   	if (error)
> -		goto out;
> +		return error;
> +	return retval;
> +}
> +
> +STATIC int
> +xfs_attr_node_addname_work(
> +	struct xfs_da_args	*args)
> +{
> +	struct xfs_da_state	*state;
> +	struct xfs_da_state_blk	*blk;
> +	int			retval, error;
>   
>   	/*
>   	 * If there was an out-of-line value, allocate the blocks we
> 


* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2020-12-24  8:23             ` Allison Henderson
@ 2021-01-04 17:52               ` Brian Foster
  2021-01-05 18:10                 ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2021-01-04 17:52 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Thu, Dec 24, 2020 at 01:23:24AM -0700, Allison Henderson wrote:
> 
> 
> On 12/23/20 7:16 AM, Brian Foster wrote:
> > On Tue, Dec 22, 2020 at 10:20:16PM -0700, Allison Henderson wrote:
> > > 
> > > 
> > > On 12/22/20 11:44 AM, Brian Foster wrote:
> > > > On Tue, Dec 22, 2020 at 12:20:20PM -0500, Brian Foster wrote:
> > > > > On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
> > > > > > On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
> > > > > > > This patch modifies the attr remove routines to be delay ready. This
> > > > > > > means they no longer roll or commit transactions, but instead return
> > > > > > > -EAGAIN to have the calling routine roll and refresh the transaction. In
> > > > > > > this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> > > > > > > uses a sort of state machine like switch to keep track of where it was
> > > > > > > when EAGAIN was returned. xfs_attr_node_removename has also been
> > > > > > > modified to use the switch, and a new version of xfs_attr_remove_args
> > > > > > > consists of a simple loop to refresh the transaction until the operation
> > > > > > > is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> > > > > > > > transaction wherever the existing code used to.
> > > > > > > 
> > > > > > > Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> > > > > > > version __xfs_attr_rmtval_remove. We will rename
> > > > > > > __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> > > > > > > done.
> > > > > > > 
> > > > > > > xfs_attr_rmtval_remove itself is still in use by the set routines (used
> > > > > > > during a rename).  For reasons of preserving existing function, we
> > > > > > > modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> > > > > > > set.  Similar to how xfs_attr_remove_args does here.  Once we transition
> > > > > > > the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> > > > > > > used and will be removed.
> > > > > > > 
> > > > > > > This patch also adds a new struct xfs_delattr_context, which we will use
> > > > > > > to keep track of the current state of an attribute operation. The new
> > > > > > > xfs_delattr_state enum is used to track various operations that are in
> > > > > > > progress so that we know not to repeat them, and resume where we left
> > > > > > > off before EAGAIN was returned to cycle out the transaction. Other
> > > > > > > members take the place of local variables that need to retain their
> > > > > > > values across multiple function recalls.  See xfs_attr.h for a more
> > > > > > > detailed diagram of the states.
> > > > > > > 
> > > > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > > > ---
> > > > > > 
> > > > > > I started with a couple small comments on this patch but inevitably
> > > > > > started thinking more about the factoring again and ended up with a
> > > > > > couple patches on top. The first is more of some small tweaks and
> > > > > > open-coding that IMO makes this patch a bit easier to follow. The
> > > > > > second is more of an RFC so I'll follow up with that in a second email.
> > > > > > I'm curious what folks' thoughts might be on either. Also note that I'm
> > > > > > primarily focusing on code structure and whatnot here, so these are fast
> > > > > > and loose, compile tested only and likely to be broken.
> > > > > > 
> > > > > 
> > > > > ... and here's the second diff (applies on top of the first).
> > > > > 
> > > > > This one popped up after staring at the previous changes for a bit and
> > > > > wondering whether using "done flags" might make the whole thing easier
> > > > > to follow than incremental state transitions. I think the attr remove
> > > > > path is easy enough to follow with either method, but the attr set path
> > > > > is a beast and so this is more with that in mind. Initial thoughts?
> > > > > 
> > > > 
> > > > Eh, the more I stare at the attr set code I'm not sure this by itself is
> > > > much of an improvement. It helps in some areas, but there are so many
> > > > transaction rolls embedded throughout at different levels that a larger
> > > > rework of the code is probably still necessary. Anyways, this was just a
> > > > random thought for now..
> > > > 
> > > > Brian
> > > 
> > > No worries, I know the feeling :-)  The set works and all, but I do think
> > > > there is a struggle around trying to find a particularly pleasant looking
> > > > presentation of it.  Especially when we get into the set path, it's a bit
> > > > more complex.  I may pick through the patches you have here and pick up the
> > > whitespace cleanups and other style adjustments if people prefer it that
> > > way.  The good news is, a lot of the *_args routines are supposed to
> > > disappear at the end of the set, so there's not really a need to invest too
> > > much in them I suppose. It may help to jump to the "Set up infastructure"
> > > > patch too.  I've expanded the diagram to try and help illustrate the code
> > > flow a bit, so that may help with following the code flow.
> > > 
> > 
> > I'm sure.. :P Note that the first patch was more smaller tweaks and
> > refactoring with the existing model in mind. For the set path, the
> > challenge IMO is to make the code generally more readable. I think the
> > remove path accomplishes this for the most part because the states and
> > whatnot are fairly low overhead on top of the existing complexity. This
> > changes considerably for the set path, not so much due to the mechanism
> > but because the baseline code is so fragmented and complex from the
> > start. I am slightly concerned that bolting state management onto the
> > current code as such might make it harder to grok and clean up after the
> > fact, but I could be wrong about that (my hope was certainly for the
> > opposite).
> tbh, every time I do another spin of the set, I actually make all my
> modifications on top of the extended set, with parent pointers and all, and
> make sure all the test cases are still good.  I know pptrs are still pretty
> far out from here, but they're actually the best testcase for this, because
> they generate so much more activity.  If all that's still golden, then I'll
> pull them back down into the lower subsets and work out all the conflicts on
> the way back up.  If something went wrong, diffing the branch heads tracks
> it down pretty fast.
> 

Indeed, that's a good thing. My comment was more around the readability
of the code and subsequent ability to clean it up, reduce the number of
required states, etc...

> > 
> > Regardless, that had me shifting focus a bit and playing around with the
> > current upstream code as opposed to shifting around your code. ISTM that
> > there is some commonality across the various set codepaths and perhaps
> > there is potential to simplify things notably _before_ applying the
> > state management scheme. I've appended a new diff below (based on
> > for-next) that starts to demonstrate what I mean. Note again that this
> > > is similarly fast and loose as I've knowingly thrown away some quirks of
> > the code (i.e. leaf buffer bhold) for the purpose of quickly trying to
> > explore/POC whether the factoring might be sane and plausible.
> > 
> > In summary, this combines the "try addname" part of each xattr format to
> > fall under a single transaction rolling loop such that I think the
> > resulting function could become one high level state. I ran out of time
> > for working through the rest, but from a read through it seems there's
> > at least a chance we could continue with similar refactoring and
> > reduction to a fewer number of generic states (vs. more format-specific
> > states). For example, the remaining parts of the set operation all seem
> > to have something along the lines of the following high level
> > components:
> > 
> > - remote value block allocation (and value set)
> > > - if rename == false, clear flag and done
> > > - if rename == true, flip flags
> > 	- remove old xattr (i.e., similar to xattr remove)
> > 
> > ... where much of that code looks remarkably similar across the
> > different leaf/node code branches. So I'm curious what you and others
> > following along might think about something like this as an intermediate
> > step...
> 
> Yes, I had noticed similarities when we first started, though I got the
> impression that people mostly wanted to focus on just hoisting the
> transactions upwards.  I did look at them at one point, but seem to recall
> the similarities having just enough dissimilarities such that trying to
> consolidate them tends to introduce about as much plumbing with if/else's.
> In any case, I do think the solution here with the format handling is
> creative, and may reduce a state or two, but I'd really need to see it
> through the test cases to know if it's going to work.  From what you've
> hashed out here, I think I get the idea. It's hard for me to comment on
> readability because I've been up and down the code so much.  I do think it's
> a little loopy looking, but so is the state machine.  Maybe a good spot for
> others to chime in too.
> 

Can you elaborate on what you mean by loopy? :P I'm sure you noticed I
borrowed the transaction rolling mechanism from your infra patch..

But yeah, I'm partly to blame for the hoisting approach as well. I was
thinking/hoping that seeing the various states would facilitate
simplification of the code, but my first reaction when looking at the
(much more complex) xattr set path is more confusion than clarity. I see
the code drop into state management, using that to call into
format-specific helpers, then fall into doing some other stuff that
might call into some of the same format-specific add helpers, then
realize I'll probably have to trace up and down through the whole path
to make some sense of the execution flow. That is what has me wondering
whether this would become more simple with fewer, generic and higher
level states like SET_FORMAT (i.e. what I hacked up), SET_NAME,
SET_VALUE (rmt block allocs), SET_FLAG (clear or flip), and then finally
fall into the remove path in the rename case.

We'd ultimately implement the same type of state machine approach, it
would just require more up front cleanup rework than the other way
around, and hopefully land fairly simplified from the outset. Of course
those states are just off the top of my head so might not be feasible,
but I'm also curious if any others following along might have thoughts
one way or the other. I'm sure we could implement things in either order
when it comes down to it...
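
Purely to illustrate the idea, here is a rough userspace sketch of how
such higher level states might be stepped through. Everything in it is
hypothetical: the toy_* names, the context struct and the assumption
that every state ends with a transaction roll are mine for the sake of
the example, and none of it has been checked against the real set path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical generic states, named after the ones floated above. */
enum toy_set_state {
	TOY_SET_FORMAT,
	TOY_SET_NAME,
	TOY_SET_VALUE,
	TOY_SET_FLAG,
	TOY_REMOVE_OLD,
	TOY_DONE,
};

struct toy_ctx {
	enum toy_set_state	state;
	bool			rename;	/* replacing an existing attr? */
	bool			remote;	/* value stored out of line? */
	int			steps;	/* number of iter calls made */
};

/*
 * Do one unit of work, advance to the next state and return -EAGAIN so
 * the caller can roll the transaction. Returns 0 once finished.
 */
static int toy_set_iter(struct toy_ctx *ctx)
{
	ctx->steps++;

	switch (ctx->state) {
	case TOY_SET_FORMAT:
		ctx->state = TOY_SET_NAME;
		return -EAGAIN;
	case TOY_SET_NAME:
		/* Only remote values need the block allocation state. */
		ctx->state = ctx->remote ? TOY_SET_VALUE : TOY_SET_FLAG;
		return -EAGAIN;
	case TOY_SET_VALUE:
		ctx->state = TOY_SET_FLAG;
		return -EAGAIN;
	case TOY_SET_FLAG:
		/* Rename flips flags and falls into the remove path. */
		if (ctx->rename) {
			ctx->state = TOY_REMOVE_OLD;
			return -EAGAIN;
		}
		ctx->state = TOY_DONE;
		return 0;
	case TOY_REMOVE_OLD:
		ctx->state = TOY_DONE;
		return 0;
	case TOY_DONE:
		break;
	}
	return -EINVAL;	/* called again after completion */
}

/* Caller loop: keeps "rolling" until the iterator stops asking. */
static int toy_set(struct toy_ctx *ctx)
{
	int	error;

	do {
		error = toy_set_iter(ctx);
	} while (error == -EAGAIN);

	assert(error || ctx->state == TOY_DONE);
	return error;
}
```

The point is only that the switch reads top to bottom as the operation
itself, with the rename case being the single branch that continues on
into the remove work.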

Brian

> I actually find it easier to work on it from the top of the set rather than
> the bottom.  Just so that the end goal of what it will end up looking like
> is a little more clear.  Once the goal is clear, then I worry about layering
> it in whatever patch it goes in.  Otherwise it's harder to see exactly how
> the conflicts shake out.
> 
> Allison
> > 
> > Brian
> > 
> > --- 8< ---
> > 
> > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > index fd8e6418a0d3..eff8833d5303 100644
> > --- a/fs/xfs/libxfs/xfs_attr.c
> > +++ b/fs/xfs/libxfs/xfs_attr.c
> > @@ -58,6 +58,8 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
> >   				 struct xfs_da_state **state);
> >   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
> >   STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
> > +STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *, struct xfs_buf *);
> > +STATIC int xfs_attr_node_addname_work(struct xfs_da_args *);
> >   int
> >   xfs_inode_hasattr(
> > @@ -216,116 +218,93 @@ xfs_attr_is_shortform(
> >   		ip->i_afp->if_nextents == 0);
> >   }
> > -/*
> > - * Attempts to set an attr in shortform, or converts short form to leaf form if
> > - * there is not enough room.  If the attr is set, the transaction is committed
> > - * and set to NULL.
> > - */
> > -STATIC int
> > -xfs_attr_set_shortform(
> > +int
> > +xfs_attr_set_fmt(
> >   	struct xfs_da_args	*args,
> > -	struct xfs_buf		**leaf_bp)
> > +	bool			*done)
> >   {
> >   	struct xfs_inode	*dp = args->dp;
> > -	int			error, error2 = 0;
> > +	struct xfs_buf		*leaf_bp = NULL;
> > +	int			error = 0;
> > -	/*
> > -	 * Try to add the attr to the attribute list in the inode.
> > -	 */
> > -	error = xfs_attr_try_sf_addname(dp, args);
> > -	if (error != -ENOSPC) {
> > -		error2 = xfs_trans_commit(args->trans);
> > -		args->trans = NULL;
> > -		return error ? error : error2;
> > +	if (xfs_attr_is_shortform(dp)) {
> > +		error = xfs_attr_try_sf_addname(dp, args);
> > +		if (!error)
> > +			*done = true;
> > +		if (error != -ENOSPC)
> > +			return error;
> > +
> > +		error = xfs_attr_shortform_to_leaf(args, &leaf_bp);
> > +		if (error)
> > +			return error;
> > +		return -EAGAIN;
> >   	}
> > -	/*
> > -	 * It won't fit in the shortform, transform to a leaf block.  GROT:
> > -	 * another possible req'mt for a double-split btree op.
> > -	 */
> > -	error = xfs_attr_shortform_to_leaf(args, leaf_bp);
> > -	if (error)
> > -		return error;
> > -	/*
> > -	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
> > -	 * push cannot grab the half-baked leaf buffer and run into problems
> > -	 * with the write verifier. Once we're done rolling the transaction we
> > -	 * can release the hold and add the attr to the leaf.
> > -	 */
> > -	xfs_trans_bhold(args->trans, *leaf_bp);
> > -	error = xfs_defer_finish(&args->trans);
> > -	xfs_trans_bhold_release(args->trans, *leaf_bp);
> > -	if (error) {
> > -		xfs_trans_brelse(args->trans, *leaf_bp);
> > -		return error;
> > +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> > +		struct xfs_buf	*bp = NULL;
> > +
> > +		error = xfs_attr_leaf_try_add(args, bp);
> > +		if (error != -ENOSPC)
> > +			return error;
> > +
> > +		error = xfs_attr3_leaf_to_node(args);
> > +		if (error)
> > +			return error;
> > +		return -EAGAIN;
> >   	}
> > -	return 0;
> > +	return xfs_attr_node_addname(args);
> >   }
> >   /*
> >    * Set the attribute specified in @args.
> >    */
> >   int
> > -xfs_attr_set_args(
> > +__xfs_attr_set_args(
> >   	struct xfs_da_args	*args)
> >   {
> >   	struct xfs_inode	*dp = args->dp;
> > -	struct xfs_buf          *leaf_bp = NULL;
> >   	int			error = 0;
> > -	/*
> > -	 * If the attribute list is already in leaf format, jump straight to
> > -	 * leaf handling.  Otherwise, try to add the attribute to the shortform
> > -	 * list; if there's no room then convert the list to leaf format and try
> > -	 * again.
> > -	 */
> > -	if (xfs_attr_is_shortform(dp)) {
> > -
> > -		/*
> > -		 * If the attr was successfully set in shortform, the
> > -		 * transaction is committed and set to NULL.  Otherwise, is it
> > -		 * converted from shortform to leaf, and the transaction is
> > -		 * retained.
> > -		 */
> > -		error = xfs_attr_set_shortform(args, &leaf_bp);
> > -		if (error || !args->trans)
> > -			return error;
> > -	}
> > -
> >   	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> >   		error = xfs_attr_leaf_addname(args);
> > -		if (error != -ENOSPC)
> > -			return error;
> > -
> > -		/*
> > -		 * Promote the attribute list to the Btree format.
> > -		 */
> > -		error = xfs_attr3_leaf_to_node(args);
> >   		if (error)
> >   			return error;
> > +	}
> > +
> > +	error = xfs_attr_node_addname_work(args);
> > +	return error;
> > +}
> > +
> > +int
> > +xfs_attr_set_args(
> > +	struct xfs_da_args	*args)
> > +
> > +{
> > +	int			error;
> > +	bool			done = false;
> > +
> > +	do {
> > +		error = xfs_attr_set_fmt(args, &done);
> > +		if (error != -EAGAIN)
> > +			break;
> > -		/*
> > -		 * Finish any deferred work items and roll the transaction once
> > -		 * more.  The goal here is to call node_addname with the inode
> > -		 * and transaction in the same state (inode locked and joined,
> > -		 * transaction clean) no matter how we got to this step.
> > -		 */
> >   		error = xfs_defer_finish(&args->trans);
> >   		if (error)
> > -			return error;
> > +			break;
> > +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> > +	} while (!error);
> > -		/*
> > -		 * Commit the current trans (including the inode) and
> > -		 * start a new one.
> > -		 */
> > -		error = xfs_trans_roll_inode(&args->trans, dp);
> > -		if (error)
> > -			return error;
> > -	}
> > +	if (error || done)
> > +		return error;
> > -	error = xfs_attr_node_addname(args);
> > -	return error;
> > +	error = xfs_defer_finish(&args->trans);
> > +	if (!error)
> > +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> > +	if (error)
> > +		return error;
> > +
> > +	return __xfs_attr_set_args(args);
> >   }
> >   /*
> > @@ -676,18 +655,6 @@ xfs_attr_leaf_addname(
> >   	trace_xfs_attr_leaf_addname(args);
> > -	error = xfs_attr_leaf_try_add(args, bp);
> > -	if (error)
> > -		return error;
> > -
> > -	/*
> > -	 * Commit the transaction that added the attr name so that
> > -	 * later routines can manage their own transactions.
> > -	 */
> > -	error = xfs_trans_roll_inode(&args->trans, dp);
> > -	if (error)
> > -		return error;
> > -
> >   	/*
> >   	 * If there was an out-of-line value, allocate the blocks we
> >   	 * identified for its storage and copy the value.  This is done
> > @@ -923,7 +890,7 @@ xfs_attr_node_addname(
> >   	 * Fill in bucket of arguments/results/context to carry around.
> >   	 */
> >   	dp = args->dp;
> > -restart:
> > +
> >   	/*
> >   	 * Search to see if name already exists, and get back a pointer
> >   	 * to where it should go.
> > @@ -967,21 +934,10 @@ xfs_attr_node_addname(
> >   			xfs_da_state_free(state);
> >   			state = NULL;
> >   			error = xfs_attr3_leaf_to_node(args);
> > -			if (error)
> > -				goto out;
> > -			error = xfs_defer_finish(&args->trans);
> >   			if (error)
> >   				goto out;
> > -			/*
> > -			 * Commit the node conversion and start the next
> > -			 * trans in the chain.
> > -			 */
> > -			error = xfs_trans_roll_inode(&args->trans, dp);
> > -			if (error)
> > -				goto out;
> > -
> > -			goto restart;
> > +			return -EAGAIN;
> >   		}
> >   		/*
> > @@ -993,9 +949,6 @@ xfs_attr_node_addname(
> >   		error = xfs_da3_split(state);
> >   		if (error)
> >   			goto out;
> > -		error = xfs_defer_finish(&args->trans);
> > -		if (error)
> > -			goto out;
> >   	} else {
> >   		/*
> >   		 * Addition succeeded, update Btree hashvals.
> > @@ -1010,13 +963,23 @@ xfs_attr_node_addname(
> >   	xfs_da_state_free(state);
> >   	state = NULL;
> > -	/*
> > -	 * Commit the leaf addition or btree split and start the next
> > -	 * trans in the chain.
> > -	 */
> > -	error = xfs_trans_roll_inode(&args->trans, dp);
> > +	return 0;
> > +
> > +out:
> > +	if (state)
> > +		xfs_da_state_free(state);
> >   	if (error)
> > -		goto out;
> > +		return error;
> > +	return retval;
> > +}
> > +
> > +STATIC int
> > +xfs_attr_node_addname_work(
> > +	struct xfs_da_args	*args)
> > +{
> > +	struct xfs_da_state	*state;
> > +	struct xfs_da_state_blk	*blk;
> > +	int			retval, error;
> >   	/*
> >   	 * If there was an out-of-line value, allocate the blocks we
> > 
> 



* Re: [PATCH v14 06/15] xfs: Add state machine tracepoints
  2020-12-18  7:29 ` [PATCH v14 06/15] xfs: Add state machine tracepoints Allison Henderson
@ 2021-01-05  4:50   ` Chandan Babu R
  2021-01-05 21:06     ` Allison Henderson
  2021-01-05  5:28   ` Darrick J. Wong
  1 sibling, 1 reply; 48+ messages in thread
From: Chandan Babu R @ 2021-01-05  4:50 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, 18 Dec 2020 00:29:08 -0700, Allison Henderson wrote:
> This is a quick patch to add a new tracepoint: xfs_das_state_return.  We
> use this to track whenever a new state is set or -EAGAIN is returned.
>

Looks good to me.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c        | 22 +++++++++++++++++++++-
>  fs/xfs/libxfs/xfs_attr_remote.c |  1 +
>  fs/xfs/xfs_trace.h              | 20 ++++++++++++++++++++
>  3 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index cd72512..8ed00bc 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -263,6 +263,7 @@ xfs_attr_set_shortform(
>  	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
>  	 * fork to leaf format and will restart with the leaf add.
>  	 */
> +	trace_xfs_das_state_return(XFS_DAS_UNINIT);
>  	return -EAGAIN;
>  }
>  
> @@ -409,9 +410,11 @@ xfs_attr_set_iter(
>  		 * down into the node handling code below
>  		 */
>  		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return -EAGAIN;
>  	case 0:
>  		dac->dela_state = XFS_DAS_FOUND_LBLK;
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return -EAGAIN;
>  	}
>  	return error;
> @@ -841,6 +844,7 @@ xfs_attr_leaf_addname(
>  			return error;
>  
>  		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return -EAGAIN;
>  	}
>  
> @@ -874,6 +878,7 @@ xfs_attr_leaf_addname(
>  	 * Commit the flag value change and start the next trans in series.
>  	 */
>  	dac->dela_state = XFS_DAS_FLIP_LFLAG;
> +	trace_xfs_das_state_return(dac->dela_state);
>  	return -EAGAIN;
>  das_flip_flag:
>  	/*
> @@ -891,6 +896,8 @@ xfs_attr_leaf_addname(
>  das_rm_lblk:
>  	if (args->rmtblkno) {
>  		error = __xfs_attr_rmtval_remove(dac);
> +		if (error == -EAGAIN)
> +			trace_xfs_das_state_return(dac->dela_state);
>  		if (error)
>  			return error;
>  	}
> @@ -1142,6 +1149,7 @@ xfs_attr_node_addname(
>  			 * this point.
>  			 */
>  			dac->flags |= XFS_DAC_DEFER_FINISH;
> +			trace_xfs_das_state_return(dac->dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1175,6 +1183,7 @@ xfs_attr_node_addname(
>  	state = NULL;
>  
>  	dac->dela_state = XFS_DAS_FOUND_NBLK;
> +	trace_xfs_das_state_return(dac->dela_state);
>  	return -EAGAIN;
>  das_found_nblk:
>  
> @@ -1202,6 +1211,7 @@ xfs_attr_node_addname(
>  				return error;
>  
>  			dac->flags |= XFS_DAC_DEFER_FINISH;
> +			trace_xfs_das_state_return(dac->dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1236,6 +1246,7 @@ xfs_attr_node_addname(
>  	 * Commit the flag value change and start the next trans in series
>  	 */
>  	dac->dela_state = XFS_DAS_FLIP_NFLAG;
> +	trace_xfs_das_state_return(dac->dela_state);
>  	return -EAGAIN;
>  das_flip_flag:
>  	/*
> @@ -1253,6 +1264,10 @@ xfs_attr_node_addname(
>  das_rm_nblk:
>  	if (args->rmtblkno) {
>  		error = __xfs_attr_rmtval_remove(dac);
> +
> +		if (error == -EAGAIN)
> +			trace_xfs_das_state_return(dac->dela_state);
> +
>  		if (error)
>  			return error;
>  	}
> @@ -1396,6 +1411,8 @@ xfs_attr_node_remove_rmt (
>  	 * May return -EAGAIN to request that the caller recall this function
>  	 */
>  	error = __xfs_attr_rmtval_remove(dac);
> +	if (error == -EAGAIN)
> +		trace_xfs_das_state_return(dac->dela_state);
>  	if (error)
>  		return error;
>  
> @@ -1514,6 +1531,7 @@ xfs_attr_node_removename_iter(
>  
>  			dac->flags |= XFS_DAC_DEFER_FINISH;
>  			dac->dela_state = XFS_DAS_RM_SHRINK;
> +			trace_xfs_das_state_return(dac->dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1532,8 +1550,10 @@ xfs_attr_node_removename_iter(
>  		goto out;
>  	}
>  
> -	if (error == -EAGAIN)
> +	if (error == -EAGAIN) {
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return error;
> +	}
>  out:
>  	if (state)
>  		xfs_da_state_free(state);
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index 6af86bf..4840de9 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -763,6 +763,7 @@ __xfs_attr_rmtval_remove(
>  	 */
>  	if (!done) {
>  		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return -EAGAIN;
>  	}
>  
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index 9074b8b..4f6939b4 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -3887,6 +3887,26 @@ DEFINE_EVENT(xfs_timestamp_range_class, name, \
>  DEFINE_TIMESTAMP_RANGE_EVENT(xfs_inode_timestamp_range);
>  DEFINE_TIMESTAMP_RANGE_EVENT(xfs_quota_expiry_range);
>  
> +
> +DECLARE_EVENT_CLASS(xfs_das_state_class,
> +	TP_PROTO(int das),
> +	TP_ARGS(das),
> +	TP_STRUCT__entry(
> +		__field(int, das)
> +	),
> +	TP_fast_assign(
> +		__entry->das = das;
> +	),
> +	TP_printk("state change %d",
> +		  __entry->das)
> +)
> +
> +#define DEFINE_DAS_STATE_EVENT(name) \
> +DEFINE_EVENT(xfs_das_state_class, name, \
> +	TP_PROTO(int das), \
> +	TP_ARGS(das))
> +DEFINE_DAS_STATE_EVENT(xfs_das_state_return);
> +
>  #endif /* _TRACE_XFS_H */
>  
>  #undef TRACE_INCLUDE_PATH
> 


-- 
chandan





* Re: [PATCH v14 06/15] xfs: Add state machine tracepoints
  2020-12-18  7:29 ` [PATCH v14 06/15] xfs: Add state machine tracepoints Allison Henderson
  2021-01-05  4:50   ` Chandan Babu R
@ 2021-01-05  5:28   ` Darrick J. Wong
  2021-01-05 21:07     ` Allison Henderson
  1 sibling, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2021-01-05  5:28 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Dec 18, 2020 at 12:29:08AM -0700, Allison Henderson wrote:
> This is a quick patch to add a new tracepoint: xfs_das_state_return.  We
> use this to track whenever a new state is set or -EAGAIN is returned.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c        | 22 +++++++++++++++++++++-
>  fs/xfs/libxfs/xfs_attr_remote.c |  1 +
>  fs/xfs/xfs_trace.h              | 20 ++++++++++++++++++++
>  3 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index cd72512..8ed00bc 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -263,6 +263,7 @@ xfs_attr_set_shortform(
>  	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
>  	 * fork to leaf format and will restart with the leaf add.
>  	 */
> +	trace_xfs_das_state_return(XFS_DAS_UNINIT);

It would help to record the inode number in the trace data.  When
someone encounters an xattr problem involving things like fsstress,
it'll be /much/ easier to disentangle who's doing what.

>  	return -EAGAIN;
>  }
>  
> @@ -409,9 +410,11 @@ xfs_attr_set_iter(
>  		 * down into the node handling code below
>  		 */
>  		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return -EAGAIN;
>  	case 0:
>  		dac->dela_state = XFS_DAS_FOUND_LBLK;
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return -EAGAIN;
>  	}
>  	return error;
> @@ -841,6 +844,7 @@ xfs_attr_leaf_addname(
>  			return error;
>  
>  		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		trace_xfs_das_state_return(dac->dela_state);

Also, please consider capturing more info about /which/ of these
xfs_das_state_return tracepoints fired, either by introducing more
variants (e.g. xfs_attr_leaf_addname_das_return) or by feeding
__this_address into the trace "call" and printing it in the TP_printk
output (formatting string '%pS').

Each declared tracepoint /does/ have a permanent memory cost, so I would
think hard about trying #2...

--D
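
A rough userspace sketch of the second suggestion above (one shared event that also records the inode number and the call site) might look like the following.  This is not kernel code and not the patch author's implementation: the names das_trace_rec, das_trace_return and das_trace_format and the enum values are hypothetical, and the GCC/Clang builtin __builtin_return_address() is used here only as a stand-in for feeding __this_address into the trace call.  The struct plays the role of TP_STRUCT__entry, the capture function that of TP_fast_assign, and the formatter that of TP_printk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for a few XFS_DAS_* states; values are illustrative. */
enum das_state { DAS_UNINIT = 0, DAS_FOUND_LBLK = 1, DAS_FLIP_LFLAG = 2 };

/* What a single trace record would carry (compare TP_STRUCT__entry). */
struct das_trace_rec {
	uint64_t	ino;	/* inode the attr operation is working on */
	int		das;	/* state being returned to the caller */
	void		*caller;/* address of the call site that fired */
};

/*
 * Capture one record (compare TP_fast_assign).  The function is marked
 * noinline so that the return address identifies the call site rather
 * than some outer frame.
 */
static __attribute__((noinline)) struct das_trace_rec
das_trace_return(uint64_t ino, int das)
{
	struct das_trace_rec	rec;

	rec.ino = ino;
	rec.das = das;
	rec.caller = __builtin_return_address(0);
	return rec;
}

/* Format one record into buf (compare TP_printk). */
static int
das_trace_format(char *buf, size_t len, const struct das_trace_rec *rec)
{
	assert(buf != NULL && rec != NULL);
	return snprintf(buf, len, "ino 0x%llx state change %d caller %p",
			(unsigned long long)rec->ino, rec->das, rec->caller);
}
```

With a single shared event like this, every call site is distinguishable by its recorded address, so no per-function tracepoint variants need to be declared.  In the kernel the address would be printed symbolically with '%pS'; plain '%p' is all that userspace printf offers.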

>  		return -EAGAIN;
>  	}
>  
> @@ -874,6 +878,7 @@ xfs_attr_leaf_addname(
>  	 * Commit the flag value change and start the next trans in series.
>  	 */
>  	dac->dela_state = XFS_DAS_FLIP_LFLAG;
> +	trace_xfs_das_state_return(dac->dela_state);
>  	return -EAGAIN;
>  das_flip_flag:
>  	/*
> @@ -891,6 +896,8 @@ xfs_attr_leaf_addname(
>  das_rm_lblk:
>  	if (args->rmtblkno) {
>  		error = __xfs_attr_rmtval_remove(dac);
> +		if (error == -EAGAIN)
> +			trace_xfs_das_state_return(dac->dela_state);
>  		if (error)
>  			return error;
>  	}
> @@ -1142,6 +1149,7 @@ xfs_attr_node_addname(
>  			 * this point.
>  			 */
>  			dac->flags |= XFS_DAC_DEFER_FINISH;
> +			trace_xfs_das_state_return(dac->dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1175,6 +1183,7 @@ xfs_attr_node_addname(
>  	state = NULL;
>  
>  	dac->dela_state = XFS_DAS_FOUND_NBLK;
> +	trace_xfs_das_state_return(dac->dela_state);
>  	return -EAGAIN;
>  das_found_nblk:
>  
> @@ -1202,6 +1211,7 @@ xfs_attr_node_addname(
>  				return error;
>  
>  			dac->flags |= XFS_DAC_DEFER_FINISH;
> +			trace_xfs_das_state_return(dac->dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1236,6 +1246,7 @@ xfs_attr_node_addname(
>  	 * Commit the flag value change and start the next trans in series
>  	 */
>  	dac->dela_state = XFS_DAS_FLIP_NFLAG;
> +	trace_xfs_das_state_return(dac->dela_state);
>  	return -EAGAIN;
>  das_flip_flag:
>  	/*
> @@ -1253,6 +1264,10 @@ xfs_attr_node_addname(
>  das_rm_nblk:
>  	if (args->rmtblkno) {
>  		error = __xfs_attr_rmtval_remove(dac);
> +
> +		if (error == -EAGAIN)
> +			trace_xfs_das_state_return(dac->dela_state);
> +
>  		if (error)
>  			return error;
>  	}
> @@ -1396,6 +1411,8 @@ xfs_attr_node_remove_rmt (
>  	 * May return -EAGAIN to request that the caller recall this function
>  	 */
>  	error = __xfs_attr_rmtval_remove(dac);
> +	if (error == -EAGAIN)
> +		trace_xfs_das_state_return(dac->dela_state);
>  	if (error)
>  		return error;
>  
> @@ -1514,6 +1531,7 @@ xfs_attr_node_removename_iter(
>  
>  			dac->flags |= XFS_DAC_DEFER_FINISH;
>  			dac->dela_state = XFS_DAS_RM_SHRINK;
> +			trace_xfs_das_state_return(dac->dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1532,8 +1550,10 @@ xfs_attr_node_removename_iter(
>  		goto out;
>  	}
>  
> -	if (error == -EAGAIN)
> +	if (error == -EAGAIN) {
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return error;
> +	}
>  out:
>  	if (state)
>  		xfs_da_state_free(state);
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index 6af86bf..4840de9 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -763,6 +763,7 @@ __xfs_attr_rmtval_remove(
>  	 */
>  	if (!done) {
>  		dac->flags |= XFS_DAC_DEFER_FINISH;
> +		trace_xfs_das_state_return(dac->dela_state);
>  		return -EAGAIN;
>  	}
>  
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index 9074b8b..4f6939b4 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -3887,6 +3887,26 @@ DEFINE_EVENT(xfs_timestamp_range_class, name, \
>  DEFINE_TIMESTAMP_RANGE_EVENT(xfs_inode_timestamp_range);
>  DEFINE_TIMESTAMP_RANGE_EVENT(xfs_quota_expiry_range);
>  
> +
> +DECLARE_EVENT_CLASS(xfs_das_state_class,
> +	TP_PROTO(int das),
> +	TP_ARGS(das),
> +	TP_STRUCT__entry(
> +		__field(int, das)
> +	),
> +	TP_fast_assign(
> +		__entry->das = das;
> +	),
> +	TP_printk("state change %d",
> +		  __entry->das)
> +)
> +
> +#define DEFINE_DAS_STATE_EVENT(name) \
> +DEFINE_EVENT(xfs_das_state_class, name, \
> +	TP_PROTO(int das), \
> +	TP_ARGS(das))
> +DEFINE_DAS_STATE_EVENT(xfs_das_state_return);
> +
>  #endif /* _TRACE_XFS_H */
>  
>  #undef TRACE_INCLUDE_PATH
> -- 
> 2.7.4
> 


* Re: [PATCH v14 08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans
  2020-12-18  7:29 ` [PATCH v14 08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans Allison Henderson
@ 2021-01-05  5:38   ` Darrick J. Wong
  2021-01-05 20:15     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2021-01-05  5:38 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Dec 18, 2020 at 12:29:10AM -0700, Allison Henderson wrote:
> Because xattrs can be over a page in size, we need to handle possible
> krealloc errors to avoid warnings.

Which warnings?

> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/xfs_log_recover.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 97f3130..295a5c6 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2061,7 +2061,10 @@ xlog_recover_add_to_cont_trans(
>  	old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
>  	old_len = item->ri_buf[item->ri_cnt-1].i_len;
>  
> -	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
> +	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL);

Does the removal of NOFAIL increase the likelihood that log recovery
will fail instead of looping around looking for more memory?

Hm, what /are/ we doing here, anyway?  I guess someone logged a gigantic
xattri item, which gets split across multiple log records, and now we're
trying to staple all that back together?  And perhaps the xattri item is
larger than a ... page(?) which causes dmesg warnings when combined with
NOFAIL?

--D
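
For readers following along, the pattern the hunk switches to (grow a buffer, and return -ENOMEM instead of retrying forever) can be sketched in userspace as below.  This is only an illustration under stated assumptions: realloc() stands in for krealloc(), and struct region and region_append() are hypothetical names modelled loosely on the i_addr/i_len pair in the quoted code.  It does not answer the question above about whether failing log recovery is acceptable; it only shows that the old buffer stays valid when the allocation fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one recovered log region (i_addr / i_len). */
struct region {
	char	*addr;
	size_t	len;
};

/*
 * Append len bytes of data to the region, growing the buffer.
 * Returns 0 on success or -ENOMEM if the buffer could not be grown.
 * On failure the region is left untouched: realloc() does not free
 * or modify the old allocation when it returns NULL.
 */
static int
region_append(struct region *r, const void *data, size_t len)
{
	char	*ptr;

	assert(r != NULL);
	if (len == 0)
		return 0;	/* nothing to add; avoid realloc(ptr, 0) */

	ptr = realloc(r->addr, r->len + len);
	if (ptr == NULL)
		return -ENOMEM;

	memcpy(ptr + r->len, data, len);
	r->addr = ptr;
	r->len += len;
	return 0;
}
```

Assigning the result to a temporary first, as the patch also does with ptr, is what keeps the original pointer from being lost on failure.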

> +	if (ptr == NULL)
> +		return -ENOMEM;
> +
>  	memcpy(&ptr[old_len], dp, len);
>  	item->ri_buf[item->ri_cnt-1].i_len += len;
>  	item->ri_buf[item->ri_cnt-1].i_addr = ptr;
> -- 
> 2.7.4
> 


* Re: [PATCH v14 14/15] xfs: Add delattr mount option
  2020-12-18  7:29 ` [PATCH v14 14/15] xfs: Add delattr mount option Allison Henderson
@ 2021-01-05  5:46   ` Darrick J. Wong
  2021-01-05 21:49     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2021-01-05  5:46 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Dec 18, 2020 at 12:29:16AM -0700, Allison Henderson wrote:
> This patch adds a mount option to enable delayed attributes. Eventually
> this can be removed when delayed attrs becomes permanent.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.h | 2 +-
>  fs/xfs/xfs_mount.h       | 1 +
>  fs/xfs/xfs_super.c       | 6 +++++-
>  fs/xfs/xfs_xattr.c       | 2 ++
>  4 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 4838094..edd008d 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -30,7 +30,7 @@ struct xfs_attr_list_context;
>  
>  static inline bool xfs_hasdelattr(struct xfs_mount *mp)

/me had a brain fart just now that ... since struct xfs_delattr_context
is ultimately going to be absorbed into struct xfs_attr_item, we really
should have called the control knob part of this 'logattr' instead of
'delattr', because that's (IMIO) a better explanation of what the mount
option actually does for users.

An even better name would have been "logged attributes replayable"
because then you could use the prefix XFS_LARP for things. :P

Comments? :)

--D


>  {
> -	return false;
> +	return mp->m_flags & XFS_MOUNT_DELATTR;
>  }
>  
>  /*
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index dfa429b..4794f27 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -254,6 +254,7 @@ typedef struct xfs_mount {
>  #define XFS_MOUNT_NOATTR2	(1ULL << 25)	/* disable use of attr2 format */
>  #define XFS_MOUNT_DAX_ALWAYS	(1ULL << 26)
>  #define XFS_MOUNT_DAX_NEVER	(1ULL << 27)
> +#define XFS_MOUNT_DELATTR	(1ULL << 28)	/* enable delayed attributes */
>  
>  /*
>   * Max and min values for mount-option defined I/O
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 813be87..72169ee 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -92,7 +92,7 @@ enum {
>  	Opt_filestreams, Opt_quota, Opt_noquota, Opt_usrquota, Opt_grpquota,
>  	Opt_prjquota, Opt_uquota, Opt_gquota, Opt_pquota,
>  	Opt_uqnoenforce, Opt_gqnoenforce, Opt_pqnoenforce, Opt_qnoenforce,
> -	Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum,
> +	Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum, Opt_delattr
>  };
>  
>  static const struct fs_parameter_spec xfs_fs_parameters[] = {
> @@ -137,6 +137,7 @@ static const struct fs_parameter_spec xfs_fs_parameters[] = {
>  	fsparam_flag("nodiscard",	Opt_nodiscard),
>  	fsparam_flag("dax",		Opt_dax),
>  	fsparam_enum("dax",		Opt_dax_enum, dax_param_enums),
> +	fsparam_flag("delattr",		Opt_delattr),
>  	{}
>  };
>  
> @@ -1292,6 +1293,9 @@ xfs_fs_parse_param(
>  		xfs_mount_set_dax_mode(mp, result.uint_32);
>  		return 0;
>  #endif
> +	case Opt_delattr:
> +		mp->m_flags |= XFS_MOUNT_DELATTR;
> +		return 0;
>  	/* Following mount options will be removed in September 2025 */
>  	case Opt_ikeep:
>  		xfs_warn(mp, "%s mount option is deprecated.", param->key);
> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> index 9b0c790..8ec61df 100644
> --- a/fs/xfs/xfs_xattr.c
> +++ b/fs/xfs/xfs_xattr.c
> @@ -8,6 +8,8 @@
>  #include "xfs_shared.h"
>  #include "xfs_format.h"
>  #include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_mount.h"
>  #include "xfs_da_format.h"
>  #include "xfs_inode.h"
>  #include "xfs_da_btree.h"
> -- 
> 2.7.4
> 


* Re: [PATCH v14 15/15] xfs: Merge xfs_delattr_context into xfs_attr_item
  2020-12-18  7:29 ` [PATCH v14 15/15] xfs: Merge xfs_delattr_context into xfs_attr_item Allison Henderson
@ 2021-01-05  5:47   ` Darrick J. Wong
  2021-01-05 21:07     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2021-01-05  5:47 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Dec 18, 2020 at 12:29:17AM -0700, Allison Henderson wrote:
> This is a cleanup patch that merges xfs_delattr_context into
> xfs_attr_item.  Now that the refactoring is complete and the delayed
> operation infrastructure is in place, we can combine these to eliminate
> the extra struct.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>

Nice consolidation!
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/libxfs/xfs_attr.c        | 138 ++++++++++++++++++++--------------------
>  fs/xfs/libxfs/xfs_attr.h        |  40 +++++-------
>  fs/xfs/libxfs/xfs_attr_remote.c |  34 +++++-----
>  fs/xfs/libxfs/xfs_attr_remote.h |   6 +-
>  fs/xfs/xfs_attr_item.c          |  46 ++++++--------
>  5 files changed, 127 insertions(+), 137 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 6e5a900..badcdae 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -46,7 +46,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
>   * Internal routines when attribute list is one block.
>   */
>  STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
> -STATIC int xfs_attr_leaf_addname(struct xfs_delattr_context *dac);
> +STATIC int xfs_attr_leaf_addname(struct xfs_attr_item *attr);
>  STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
>  STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>  
> @@ -54,8 +54,8 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>   * Internal routines when attribute list is more than one block.
>   */
>  STATIC int xfs_attr_node_get(xfs_da_args_t *args);
> -STATIC int xfs_attr_node_addname(struct xfs_delattr_context *dac);
> -STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
> +STATIC int xfs_attr_node_addname(struct xfs_attr_item *attr);
> +STATIC int xfs_attr_node_removename_iter(struct xfs_attr_item *attr);
>  STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>  				 struct xfs_da_state **state);
>  STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
> @@ -276,27 +276,27 @@ xfs_attr_set_shortform(
>   */
>  int
>  xfs_attr_set_iter(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	struct xfs_inode		*dp = args->dp;
> -	struct xfs_buf			**leaf_bp = &dac->leaf_bp;
> +	struct xfs_buf			**leaf_bp = &attr->xattri_leaf_bp;
>  	int				error = 0;
>  
>  	/* State machine switch */
> -	switch (dac->dela_state) {
> +	switch (attr->xattri_dela_state) {
>  	case XFS_DAS_FLIP_LFLAG:
>  	case XFS_DAS_FOUND_LBLK:
>  	case XFS_DAS_RM_LBLK:
> -		return xfs_attr_leaf_addname(dac);
> +		return xfs_attr_leaf_addname(attr);
>  	case XFS_DAS_FOUND_NBLK:
>  	case XFS_DAS_FLIP_NFLAG:
>  	case XFS_DAS_ALLOC_NODE:
> -		return xfs_attr_node_addname(dac);
> +		return xfs_attr_node_addname(attr);
>  	case XFS_DAS_UNINIT:
>  		break;
>  	default:
> -		ASSERT(dac->dela_state != XFS_DAS_RM_SHRINK);
> +		ASSERT(attr->xattri_dela_state != XFS_DAS_RM_SHRINK);
>  		break;
>  	}
>  
> @@ -328,7 +328,7 @@ xfs_attr_set_iter(
>  	}
>  
>  	if (!xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> -		return xfs_attr_node_addname(dac);
> +		return xfs_attr_node_addname(attr);
>  
>  	error = xfs_attr_leaf_try_add(args, *leaf_bp);
>  	switch (error) {
> @@ -351,11 +351,11 @@ xfs_attr_set_iter(
>  		 * when we come back, we'll be a node, so we'll fall
>  		 * down into the node handling code below
>  		 */
> -		trace_xfs_das_state_return(dac->dela_state);
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  		return -EAGAIN;
>  	case 0:
> -		dac->dela_state = XFS_DAS_FOUND_LBLK;
> -		trace_xfs_das_state_return(dac->dela_state);
> +		attr->xattri_dela_state = XFS_DAS_FOUND_LBLK;
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  		return -EAGAIN;
>  	}
>  	return error;
> @@ -401,13 +401,13 @@ xfs_has_attr(
>   */
>  int
>  xfs_attr_remove_iter(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	struct xfs_inode		*dp = args->dp;
>  
>  	/* If we are shrinking a node, resume shrink */
> -	if (dac->dela_state == XFS_DAS_RM_SHRINK)
> +	if (attr->xattri_dela_state == XFS_DAS_RM_SHRINK)
>  		goto node;
>  
>  	if (!xfs_inode_hasattr(dp))
> @@ -422,7 +422,7 @@ xfs_attr_remove_iter(
>  		return xfs_attr_leaf_removename(args);
>  node:
>  	/* If we are not short form or leaf, then proceed to remove node */
> -	return  xfs_attr_node_removename_iter(dac);
> +	return  xfs_attr_node_removename_iter(attr);
>  }
>  
>  /*
> @@ -573,7 +573,7 @@ xfs_attr_item_init(
>  
>  	new = kmem_zalloc(sizeof(struct xfs_attr_item), KM_NOFS);
>  	new->xattri_op_flags = op_flags;
> -	new->xattri_dac.da_args = args;
> +	new->xattri_da_args = args;
>  
>  	*attr = new;
>  	return 0;
> @@ -768,16 +768,16 @@ xfs_attr_leaf_try_add(
>   */
>  STATIC int
>  xfs_attr_leaf_addname(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	struct xfs_buf			*bp = NULL;
>  	int				error, forkoff;
>  	struct xfs_inode		*dp = args->dp;
>  	struct xfs_mount		*mp = args->dp->i_mount;
>  
>  	/* State machine switch */
> -	switch (dac->dela_state) {
> +	switch (attr->xattri_dela_state) {
>  	case XFS_DAS_FLIP_LFLAG:
>  		goto das_flip_flag;
>  	case XFS_DAS_RM_LBLK:
> @@ -794,10 +794,10 @@ xfs_attr_leaf_addname(
>  	 */
>  
>  	/* Open coded xfs_attr_rmtval_set without trans handling */
> -	if ((dac->flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
> -		dac->flags |= XFS_DAC_LEAF_ADDNAME_INIT;
> +	if ((attr->xattri_flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
> +		attr->xattri_flags |= XFS_DAC_LEAF_ADDNAME_INIT;
>  		if (args->rmtblkno > 0) {
> -			error = xfs_attr_rmtval_find_space(dac);
> +			error = xfs_attr_rmtval_find_space(attr);
>  			if (error)
>  				return error;
>  		}
> @@ -807,12 +807,12 @@ xfs_attr_leaf_addname(
>  	 * Roll through the "value", allocating blocks on disk as
>  	 * required.
>  	 */
> -	if (dac->blkcnt > 0) {
> -		error = xfs_attr_rmtval_set_blk(dac);
> +	if (attr->xattri_blkcnt > 0) {
> +		error = xfs_attr_rmtval_set_blk(attr);
>  		if (error)
>  			return error;
>  
> -		trace_xfs_das_state_return(dac->dela_state);
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  		return -EAGAIN;
>  	}
>  
> @@ -846,8 +846,8 @@ xfs_attr_leaf_addname(
>  		/*
>  		 * Commit the flag value change and start the next trans in series.
>  		 */
> -		dac->dela_state = XFS_DAS_FLIP_LFLAG;
> -		trace_xfs_das_state_return(dac->dela_state);
> +		attr->xattri_dela_state = XFS_DAS_FLIP_LFLAG;
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  		return -EAGAIN;
>  	}
>  das_flip_flag:
> @@ -862,12 +862,12 @@ xfs_attr_leaf_addname(
>  		return error;
>  
>  	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
> -	dac->dela_state = XFS_DAS_RM_LBLK;
> +	attr->xattri_dela_state = XFS_DAS_RM_LBLK;
>  das_rm_lblk:
>  	if (args->rmtblkno) {
> -		error = xfs_attr_rmtval_remove(dac);
> +		error = xfs_attr_rmtval_remove(attr);
>  		if (error == -EAGAIN)
> -			trace_xfs_das_state_return(dac->dela_state);
> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>  		if (error)
>  			return error;
>  	}
> @@ -1041,9 +1041,9 @@ xfs_attr_node_hasname(
>   */
>  STATIC int
>  xfs_attr_node_addname(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	struct xfs_da_state		*state = NULL;
>  	struct xfs_da_state_blk		*blk;
>  	int				retval = 0;
> @@ -1053,7 +1053,7 @@ xfs_attr_node_addname(
>  	trace_xfs_attr_node_addname(args);
>  
>  	/* State machine switch */
> -	switch (dac->dela_state) {
> +	switch (attr->xattri_dela_state) {
>  	case XFS_DAS_FLIP_NFLAG:
>  		goto das_flip_flag;
>  	case XFS_DAS_FOUND_NBLK:
> @@ -1119,7 +1119,7 @@ xfs_attr_node_addname(
>  			 * this. dela_state is still unset by this function at
>  			 * this point.
>  			 */
> -			trace_xfs_das_state_return(dac->dela_state);
> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1151,8 +1151,8 @@ xfs_attr_node_addname(
>  	xfs_da_state_free(state);
>  	state = NULL;
>  
> -	dac->dela_state = XFS_DAS_FOUND_NBLK;
> -	trace_xfs_das_state_return(dac->dela_state);
> +	attr->xattri_dela_state = XFS_DAS_FOUND_NBLK;
> +	trace_xfs_das_state_return(attr->xattri_dela_state);
>  	return -EAGAIN;
>  das_found_nblk:
>  
> @@ -1164,7 +1164,7 @@ xfs_attr_node_addname(
>  	 */
>  	if (args->rmtblkno > 0) {
>  		/* Open coded xfs_attr_rmtval_set without trans handling */
> -		error = xfs_attr_rmtval_find_space(dac);
> +		error = xfs_attr_rmtval_find_space(attr);
>  		if (error)
>  			return error;
>  
> @@ -1172,14 +1172,14 @@ xfs_attr_node_addname(
>  		 * Roll through the "value", allocating blocks on disk as
>  		 * required.  Set the state in case of -EAGAIN return code
>  		 */
> -		dac->dela_state = XFS_DAS_ALLOC_NODE;
> +		attr->xattri_dela_state = XFS_DAS_ALLOC_NODE;
>  das_alloc_node:
> -		if (dac->blkcnt > 0) {
> -			error = xfs_attr_rmtval_set_blk(dac);
> +		if (attr->xattri_blkcnt > 0) {
> +			error = xfs_attr_rmtval_set_blk(attr);
>  			if (error)
>  				return error;
>  
> -			trace_xfs_das_state_return(dac->dela_state);
> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1214,8 +1214,8 @@ xfs_attr_node_addname(
>  		/*
>  		 * Commit the flag value change and start the next trans in series
>  		 */
> -		dac->dela_state = XFS_DAS_FLIP_NFLAG;
> -		trace_xfs_das_state_return(dac->dela_state);
> +		attr->xattri_dela_state = XFS_DAS_FLIP_NFLAG;
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  		return -EAGAIN;
>  	}
>  das_flip_flag:
> @@ -1230,13 +1230,13 @@ xfs_attr_node_addname(
>  		return error;
>  
>  	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
> -	dac->dela_state = XFS_DAS_RM_NBLK;
> +	attr->xattri_dela_state = XFS_DAS_RM_NBLK;
>  das_rm_nblk:
>  	if (args->rmtblkno) {
> -		error = xfs_attr_rmtval_remove(dac);
> +		error = xfs_attr_rmtval_remove(attr);
>  
>  		if (error == -EAGAIN)
> -			trace_xfs_das_state_return(dac->dela_state);
> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>  
>  		if (error)
>  			return error;
> @@ -1344,10 +1344,10 @@ xfs_attr_leaf_mark_incomplete(
>   */
>  STATIC
>  int xfs_attr_node_removename_setup(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> -	struct xfs_da_state		**state = &dac->da_state;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
> +	struct xfs_da_state		**state = &attr->xattri_da_state;
>  	int				error;
>  
>  	error = xfs_attr_node_hasname(args, state);
> @@ -1371,7 +1371,7 @@ int xfs_attr_node_removename_setup(
>  
>  STATIC int
>  xfs_attr_node_remove_rmt (
> -	struct xfs_delattr_context	*dac,
> +	struct xfs_attr_item		*attr,
>  	struct xfs_da_state		*state)
>  {
>  	int				error = 0;
> @@ -1379,9 +1379,9 @@ xfs_attr_node_remove_rmt (
>  	/*
>  	 * May return -EAGAIN to request that the caller recall this function
>  	 */
> -	error = xfs_attr_rmtval_remove(dac);
> +	error = xfs_attr_rmtval_remove(attr);
>  	if (error == -EAGAIN)
> -		trace_xfs_das_state_return(dac->dela_state);
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  	if (error)
>  		return error;
>  
> @@ -1425,10 +1425,10 @@ xfs_attr_node_remove_cleanup(
>   */
>  STATIC int
>  xfs_attr_node_remove_step(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> -	struct xfs_da_state		*state = dac->da_state;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
> +	struct xfs_da_state		*state = attr->xattri_da_state;
>  	int				error = 0;
>  	/*
>  	 * If there is an out-of-line value, de-allocate the blocks.
> @@ -1439,7 +1439,7 @@ xfs_attr_node_remove_step(
>  		/*
>  		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
>  		 */
> -		error = xfs_attr_node_remove_rmt(dac, state);
> +		error = xfs_attr_node_remove_rmt(attr, state);
>  		if (error)
>  			return error;
>  	}
> @@ -1460,29 +1460,29 @@ xfs_attr_node_remove_step(
>   */
>  STATIC int
>  xfs_attr_node_removename_iter(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	struct xfs_da_state		*state = NULL;
>  	int				retval, error;
>  	struct xfs_inode		*dp = args->dp;
>  
>  	trace_xfs_attr_node_removename(args);
>  
> -	if (!dac->da_state) {
> -		error = xfs_attr_node_removename_setup(dac);
> +	if (!attr->xattri_da_state) {
> +		error = xfs_attr_node_removename_setup(attr);
>  		if (error)
>  			goto out;
>  	}
> -	state = dac->da_state;
> +	state = attr->xattri_da_state;
>  
> -	switch (dac->dela_state) {
> +	switch (attr->xattri_dela_state) {
>  	case XFS_DAS_UNINIT:
>  		/*
>  		 * repeatedly remove remote blocks, remove the entry and join.
>  		 * returns -EAGAIN or 0 for completion of the step.
>  		 */
> -		error = xfs_attr_node_remove_step(dac);
> +		error = xfs_attr_node_remove_step(attr);
>  		if (error)
>  			break;
>  
> @@ -1498,8 +1498,8 @@ xfs_attr_node_removename_iter(
>  			if (error)
>  				return error;
>  
> -			dac->dela_state = XFS_DAS_RM_SHRINK;
> -			trace_xfs_das_state_return(dac->dela_state);
> +			attr->xattri_dela_state = XFS_DAS_RM_SHRINK;
> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>  			return -EAGAIN;
>  		}
>  
> @@ -1519,7 +1519,7 @@ xfs_attr_node_removename_iter(
>  	}
>  
>  	if (error == -EAGAIN) {
> -		trace_xfs_das_state_return(dac->dela_state);
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  		return error;
>  	}
>  out:
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index edd008d..d1a59d0 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -364,7 +364,7 @@ struct xfs_attr_list_context {
>   */
>  
>  /*
> - * Enum values for xfs_delattr_context.da_state
> + * Enum values for xfs_attr_item.xattri_da_state
>   *
>   * These values are used by delayed attribute operations to keep track  of where
>   * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
> @@ -385,7 +385,7 @@ enum xfs_delattr_state {
>  };
>  
>  /*
> - * Defines for xfs_delattr_context.flags
> + * Defines for xfs_attr_item.xattri_flags
>   */
>  #define XFS_DAC_LEAF_ADDNAME_INIT	0x01 /* xfs_attr_leaf_addname init*/
>  #define XFS_DAC_DELAYED_OP_INIT		0x02 /* delayed operations init*/
> @@ -393,32 +393,25 @@ enum xfs_delattr_state {
>  /*
>   * Context used for keeping track of delayed attribute operations
>   */
> -struct xfs_delattr_context {
> -	struct xfs_da_args      *da_args;
> +struct xfs_attr_item {
> +	struct xfs_da_args		*xattri_da_args;
>  
>  	/*
>  	 * Used by xfs_attr_set to hold a leaf buffer across a transaction roll
>  	 */
> -	struct xfs_buf		*leaf_bp;
> +	struct xfs_buf			*xattri_leaf_bp;
>  
>  	/* Used in xfs_attr_rmtval_set_blk to roll through allocating blocks */
> -	struct xfs_bmbt_irec	map;
> -	xfs_dablk_t		lblkno;
> -	int			blkcnt;
> +	struct xfs_bmbt_irec		xattri_map;
> +	xfs_dablk_t			xattri_lblkno;
> +	int				xattri_blkcnt;
>  
>  	/* Used in xfs_attr_node_removename to roll through removing blocks */
> -	struct xfs_da_state     *da_state;
> +	struct xfs_da_state		*xattri_da_state;
>  
>  	/* Used to keep track of current state of delayed operation */
> -	unsigned int            flags;
> -	enum xfs_delattr_state  dela_state;
> -};
> -
> -/*
> - * List of attrs to commit later.
> - */
> -struct xfs_attr_item {
> -	struct xfs_delattr_context	xattri_dac;
> +	unsigned int			xattri_flags;
> +	enum xfs_delattr_state		xattri_dela_state;
>  
>  	/*
>  	 * Indicates if the attr operation is a set or a remove
> @@ -426,7 +419,10 @@ struct xfs_attr_item {
>  	 */
>  	uint32_t			xattri_op_flags;
>  
> -	/* used to log this item to an intent */
> +	/*
> +	 * used to log this item to an intent containing a list of attrs to
> +	 * commit later
> +	 */
>  	struct list_head		xattri_list;
>  };
>  
> @@ -445,12 +441,10 @@ int xfs_inode_hasattr(struct xfs_inode *ip);
>  int xfs_attr_get_ilocked(struct xfs_da_args *args);
>  int xfs_attr_get(struct xfs_da_args *args);
>  int xfs_attr_set(struct xfs_da_args *args);
> -int xfs_attr_set_iter(struct xfs_delattr_context *dac);
> +int xfs_attr_set_iter(struct xfs_attr_item *attr);
>  int xfs_has_attr(struct xfs_da_args *args);
> -int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
> +int xfs_attr_remove_iter(struct xfs_attr_item *attr);
>  bool xfs_attr_namecheck(const void *name, size_t length);
> -void xfs_delattr_context_init(struct xfs_delattr_context *dac,
> -			      struct xfs_da_args *args);
>  int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>  int xfs_attr_set_deferred(struct xfs_da_args *args);
>  int xfs_attr_remove_deferred(struct xfs_da_args *args);
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index a5ff5e0..42cc9cc 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -634,14 +634,14 @@ xfs_attr_rmtval_set(
>   */
>  int
>  xfs_attr_rmtval_find_space(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> -	struct xfs_bmbt_irec		*map = &dac->map;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
> +	struct xfs_bmbt_irec		*map = &attr->xattri_map;
>  	int				error;
>  
> -	dac->lblkno = 0;
> -	dac->blkcnt = 0;
> +	attr->xattri_lblkno = 0;
> +	attr->xattri_blkcnt = 0;
>  	args->rmtblkcnt = 0;
>  	args->rmtblkno = 0;
>  	memset(map, 0, sizeof(struct xfs_bmbt_irec));
> @@ -650,8 +650,8 @@ xfs_attr_rmtval_find_space(
>  	if (error)
>  		return error;
>  
> -	dac->blkcnt = args->rmtblkcnt;
> -	dac->lblkno = args->rmtblkno;
> +	attr->xattri_blkcnt = args->rmtblkcnt;
> +	attr->xattri_lblkno = args->rmtblkno;
>  
>  	return 0;
>  }
> @@ -664,17 +664,17 @@ xfs_attr_rmtval_find_space(
>   */
>  int
>  xfs_attr_rmtval_set_blk(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	struct xfs_inode		*dp = args->dp;
> -	struct xfs_bmbt_irec		*map = &dac->map;
> +	struct xfs_bmbt_irec		*map = &attr->xattri_map;
>  	int nmap;
>  	int error;
>  
>  	nmap = 1;
> -	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)dac->lblkno,
> -				dac->blkcnt, XFS_BMAPI_ATTRFORK, args->total,
> +	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)attr->xattri_lblkno,
> +				attr->xattri_blkcnt, XFS_BMAPI_ATTRFORK, args->total,
>  				map, &nmap);
>  	if (error)
>  		return error;
> @@ -684,8 +684,8 @@ xfs_attr_rmtval_set_blk(
>  	       (map->br_startblock != HOLESTARTBLOCK));
>  
>  	/* roll attribute extent map forwards */
> -	dac->lblkno += map->br_blockcount;
> -	dac->blkcnt -= map->br_blockcount;
> +	attr->xattri_lblkno += map->br_blockcount;
> +	attr->xattri_blkcnt -= map->br_blockcount;
>  
>  	return 0;
>  }
> @@ -738,9 +738,9 @@ xfs_attr_rmtval_invalidate(
>   */
>  int
>  xfs_attr_rmtval_remove(
> -	struct xfs_delattr_context	*dac)
> +	struct xfs_attr_item		*attr)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	int				error, done;
>  
>  	/*
> @@ -762,7 +762,7 @@ xfs_attr_rmtval_remove(
>  	 * by the parent
>  	 */
>  	if (!done) {
> -		trace_xfs_das_state_return(dac->dela_state);
> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>  		return -EAGAIN;
>  	}
>  
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
> index 6ae91af..d3aa27d 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.h
> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
> @@ -13,9 +13,9 @@ int xfs_attr_rmtval_set(struct xfs_da_args *args);
>  int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
>  		xfs_buf_flags_t incore_flags);
>  int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
> -int xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
> +int xfs_attr_rmtval_remove(struct xfs_attr_item *attr);
>  int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
>  int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
> -int xfs_attr_rmtval_set_blk(struct xfs_delattr_context *dac);
> -int xfs_attr_rmtval_find_space(struct xfs_delattr_context *dac);
> +int xfs_attr_rmtval_set_blk(struct xfs_attr_item *attr);
> +int xfs_attr_rmtval_find_space(struct xfs_attr_item *attr);
>  #endif /* __XFS_ATTR_REMOTE_H__ */
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> index e1cfef1..bbca949 100644
> --- a/fs/xfs/xfs_attr_item.c
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -291,11 +291,11 @@ xfs_attrd_item_release(
>   */
>  int
>  xfs_trans_attr(
> -	struct xfs_delattr_context	*dac,
> +	struct xfs_attr_item		*attr,
>  	struct xfs_attrd_log_item	*attrdp,
>  	uint32_t			op_flags)
>  {
> -	struct xfs_da_args		*args = dac->da_args;
> +	struct xfs_da_args		*args = attr->xattri_da_args;
>  	int				error;
>  
>  	error = xfs_qm_dqattach_locked(args->dp, 0);
> @@ -310,11 +310,11 @@ xfs_trans_attr(
>  	switch (op_flags) {
>  	case XFS_ATTR_OP_FLAGS_SET:
>  		args->op_flags |= XFS_DA_OP_ADDNAME;
> -		error = xfs_attr_set_iter(dac);
> +		error = xfs_attr_set_iter(attr);
>  		break;
>  	case XFS_ATTR_OP_FLAGS_REMOVE:
>  		ASSERT(XFS_IFORK_Q(args->dp));
> -		error = xfs_attr_remove_iter(dac);
> +		error = xfs_attr_remove_iter(attr);
>  		break;
>  	default:
>  		error = -EFSCORRUPTED;
> @@ -358,16 +358,16 @@ xfs_attr_log_item(
>  	 * structure with fields from this xfs_attr_item
>  	 */
>  	attrp = &attrip->attri_format;
> -	attrp->alfi_ino = attr->xattri_dac.da_args->dp->i_ino;
> +	attrp->alfi_ino = attr->xattri_da_args->dp->i_ino;
>  	attrp->alfi_op_flags = attr->xattri_op_flags;
> -	attrp->alfi_value_len = attr->xattri_dac.da_args->valuelen;
> -	attrp->alfi_name_len = attr->xattri_dac.da_args->namelen;
> -	attrp->alfi_attr_flags = attr->xattri_dac.da_args->attr_filter;
> -
> -	attrip->attri_name = (void *)attr->xattri_dac.da_args->name;
> -	attrip->attri_value = attr->xattri_dac.da_args->value;
> -	attrip->attri_name_len = attr->xattri_dac.da_args->namelen;
> -	attrip->attri_value_len = attr->xattri_dac.da_args->valuelen;
> +	attrp->alfi_value_len = attr->xattri_da_args->valuelen;
> +	attrp->alfi_name_len = attr->xattri_da_args->namelen;
> +	attrp->alfi_attr_flags = attr->xattri_da_args->attr_filter;
> +
> +	attrip->attri_name = (void *)attr->xattri_da_args->name;
> +	attrip->attri_value = attr->xattri_da_args->value;
> +	attrip->attri_name_len = attr->xattri_da_args->namelen;
> +	attrip->attri_value_len = attr->xattri_da_args->valuelen;
>  }
>  
>  /* Get an ATTRI. */
> @@ -408,10 +408,8 @@ xfs_attr_finish_item(
>  	struct xfs_attr_item		*attr;
>  	struct xfs_attrd_log_item	*done_item = NULL;
>  	int				error;
> -	struct xfs_delattr_context	*dac;
>  
>  	attr = container_of(item, struct xfs_attr_item, xattri_list);
> -	dac = &attr->xattri_dac;
>  	if (done)
>  		done_item = ATTRD_ITEM(done);
>  
> @@ -423,19 +421,18 @@ xfs_attr_finish_item(
>  	 * in a standard delay op, so we need to catch this here and rejoin the
>  	 * leaf to the new transaction
>  	 */
> -	if (attr->xattri_dac.leaf_bp &&
> -	    attr->xattri_dac.leaf_bp->b_transp != tp) {
> -		xfs_trans_bjoin(tp, attr->xattri_dac.leaf_bp);
> -		xfs_trans_bhold(tp, attr->xattri_dac.leaf_bp);
> +	if (attr->xattri_leaf_bp && attr->xattri_leaf_bp->b_transp != tp) {
> +		xfs_trans_bjoin(tp, attr->xattri_leaf_bp);
> +		xfs_trans_bhold(tp, attr->xattri_leaf_bp);
>  	}
>  
>  	/*
>  	 * Always reset trans after EAGAIN cycle
>  	 * since the transaction is new
>  	 */
> -	dac->da_args->trans = tp;
> +	attr->xattri_da_args->trans = tp;
>  
> -	error = xfs_trans_attr(dac, done_item, attr->xattri_op_flags);
> +	error = xfs_trans_attr(attr, done_item, attr->xattri_op_flags);
>  	if (error != -EAGAIN)
>  		kmem_free(attr);
>  
> @@ -570,7 +567,7 @@ xfs_attri_item_recover(
>  	struct xfs_attrd_log_item	*done_item = NULL;
>  	struct xfs_attr_item		attr = {
>  		.xattri_op_flags	= attrip->attri_format.alfi_op_flags,
> -		.xattri_dac.da_args	= &args,
> +		.xattri_da_args		= &args,
>  	};
>  
>  	/*
> @@ -630,8 +627,7 @@ xfs_attri_item_recover(
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
>  	xfs_trans_ijoin(args.trans, ip, 0);
>  
> -	error = xfs_trans_attr(&attr.xattri_dac, done_item,
> -			       attrp->alfi_op_flags);
> +	error = xfs_trans_attr(&attr, done_item, attrp->alfi_op_flags);
>  	if (error == -EAGAIN) {
>  		/*
>  		 * There's more work to do, so make a new xfs_attr_item and add
> @@ -648,7 +644,7 @@ xfs_attri_item_recover(
>  		memcpy(new_args, &args, sizeof(struct xfs_da_args));
>  		memcpy(new_attr, &attr, sizeof(struct xfs_attr_item));
>  
> -		new_attr->xattri_dac.da_args = new_args;
> +		new_attr->xattri_da_args = new_args;
>  		memset(&new_attr->xattri_list, 0, sizeof(struct list_head));
>  
>  		xfs_defer_add(args.trans, XFS_DEFER_OPS_TYPE_ATTR,
> -- 
> 2.7.4
> 


* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2021-01-04 17:52               ` Brian Foster
@ 2021-01-05 18:10                 ` Allison Henderson
  2021-01-06 14:25                   ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2021-01-05 18:10 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 1/4/21 10:52 AM, Brian Foster wrote:
> On Thu, Dec 24, 2020 at 01:23:24AM -0700, Allison Henderson wrote:
>>
>>
>> On 12/23/20 7:16 AM, Brian Foster wrote:
>>> On Tue, Dec 22, 2020 at 10:20:16PM -0700, Allison Henderson wrote:
>>>>
>>>>
>>>> On 12/22/20 11:44 AM, Brian Foster wrote:
>>>>> On Tue, Dec 22, 2020 at 12:20:20PM -0500, Brian Foster wrote:
>>>>>> On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
>>>>>>> On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
>>>>>>>> This patch modifies the attr remove routines to be delay ready. This
>>>>>>>> means they no longer roll or commit transactions, but instead return
>>>>>>>> -EAGAIN to have the calling routine roll and refresh the transaction. In
>>>>>>>> this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
>>>>>>>> uses a sort of state machine like switch to keep track of where it was
>>>>>>>> when EAGAIN was returned. xfs_attr_node_removename has also been
>>>>>>>> modified to use the switch, and a new version of xfs_attr_remove_args
>>>>>>>> consists of a simple loop to refresh the transaction until the operation
>>>>>>>> is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
>>>>>>>> transaction wherever the existing code used to.
>>>>>>>>
>>>>>>>> Calls to xfs_attr_rmtval_remove are replaced with the delay ready
>>>>>>>> version __xfs_attr_rmtval_remove. We will rename
>>>>>>>> __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
>>>>>>>> done.
>>>>>>>>
>>>>>>>> xfs_attr_rmtval_remove itself is still in use by the set routines (used
>>>>>>>> during a rename).  For reasons of preserving existing function, we
>>>>>>>> modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
>>>>>>>> set.  Similar to how xfs_attr_remove_args does here.  Once we transition
>>>>>>>> the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
>>>>>>>> used and will be removed.
>>>>>>>>
>>>>>>>> This patch also adds a new struct xfs_delattr_context, which we will use
>>>>>>>> to keep track of the current state of an attribute operation. The new
>>>>>>>> xfs_delattr_state enum is used to track various operations that are in
>>>>>>>> progress so that we know not to repeat them, and resume where we left
>>>>>>>> off before EAGAIN was returned to cycle out the transaction. Other
>>>>>>>> members take the place of local variables that need to retain their
>>>>>>>> values across multiple function recalls.  See xfs_attr.h for a more
>>>>>>>> detailed diagram of the states.
>>>>>>>>
>>>>>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>>>>>> ---
>>>>>>>
>>>>>>> I started with a couple small comments on this patch but inevitably
>>>>>>> started thinking more about the factoring again and ended up with a
>>>>>>> couple patches on top. The first is more of some small tweaks and
>>>>>>> open-coding that IMO makes this patch a bit easier to follow. The
>>>>>>> second is more of an RFC so I'll follow up with that in a second email.
>>>>>>> I'm curious what folks' thoughts might be on either. Also note that I'm
>>>>>>> primarily focusing on code structure and whatnot here, so these are fast
>>>>>>> and loose, compile tested only and likely to be broken.
>>>>>>>
>>>>>>
>>>>>> ... and here's the second diff (applies on top of the first).
>>>>>>
>>>>>> This one popped up after staring at the previous changes for a bit and
>>>>>> wondering whether using "done flags" might make the whole thing easier
>>>>>> to follow than incremental state transitions. I think the attr remove
>>>>>> path is easy enough to follow with either method, but the attr set path
>>>>>> is a beast and so this is more with that in mind. Initial thoughts?
>>>>>>
>>>>>
>>>>> Eh, the more I stare at the attr set code I'm not sure this by itself is
>>>>> much of an improvement. It helps in some areas, but there are so many
>>>>> transaction rolls embedded throughout at different levels that a larger
>>>>> rework of the code is probably still necessary. Anyways, this was just a
>>>>> random thought for now..
>>>>>
>>>>> Brian
>>>>
>>>> No worries, I know the feeling :-)  The set works and all, but I do think
>>>> there is a struggle around trying to find a particularly pleasant looking
>>>> presentation of it.  Especially when we get into the set path, it's a bit
>>>> more complex.  I may pick through the patches you have here and pick up the
>>>> whitespace cleanups and other style adjustments if people prefer it that
>>>> way.  The good news is, a lot of the *_args routines are supposed to
>>>> disappear at the end of the set, so there's not really a need to invest too
>>>> much in them I suppose. It may help to jump to the "Set up infastructure"
>>>> patch too.  I've expanded the diagram to try and help illustrate the code
>>>> flow a bit, so that may help with following the code flow.
>>>>
>>>
>>> I'm sure.. :P Note that the first patch was more smaller tweaks and
>>> refactoring with the existing model in mind. For the set path, the
>>> challenge IMO is to make the code generally more readable. I think the
>>> remove path accomplishes this for the most part because the states and
>>> whatnot are fairly low overhead on top of the existing complexity. This
>>> changes considerably for the set path, not so much due to the mechanism
>>> but because the baseline code is so fragmented and complex from the
>>> start. I am slightly concerned that bolting state management onto the
>>> current code as such might make it harder to grok and clean up after the
>>> fact, but I could be wrong about that (my hope was certainly for the
>>> opposite).
>> tbh, every time I do another spin of the set, I actually make all my
>> modifications on top of the extended set, with parent pointers and all, and
>> make sure all the test cases are still good.  I know pptrs are still pretty
>> far out from here, but they're actually the best testcase for this, because
>> it generates so much more activity.  If all that's still golden, then I'll
>> pull them back down into the lower subsets and work out all the conflicts on
>> the way back up.  If something went wrong, diffing the branch heads tracks
>> it down pretty fast.
>>
> 
> Indeed, that's a good thing. My comment was more around the readability
> of the code and subsequent ability to clean it up, reduce the number of
> required states, etc...
> 
>>>
>>> Regardless, that had me shifting focus a bit and playing around with the
>>> current upstream code as opposed to shifting around your code. ISTM that
>>> there is some commonality across the various set codepaths and perhaps
>>> there is potential to simplify things notably _before_ applying the
>>> state management scheme. I've appended a new diff below (based on
>>> for-next) that starts to demonstrate what I mean. Note again that this
>>> is similarly fast and loose as I've knowingly thrown away some quirks of
>>> the code (i.e. leaf buffer bhold) for the purpose of quickly trying to
>>> explore/POC whether the factoring might be sane and plausible.
>>>
>>> In summary, this combines the "try addname" part of each xattr format to
>>> fall under a single transaction rolling loop such that I think the
>>> resulting function could become one high level state. I ran out of time
>>> for working through the rest, but from a read through it seems there's
>>> at least a chance we could continue with similar refactoring and
>>> reduction to a fewer number of generic states (vs. more format-specific
>>> states). For example, the remaining parts of the set operation all seem
>>> to have something along the lines of the following high level
>>> components:
>>>
>>> - remote value block allocation (and value set)
>>> - if rename == true, clear flag and done
>>> - if rename == false, flip flags
>>> 	- remove old xattr (i.e., similar to xattr remove)
>>>
>>> ... where much of that code looks remarkably similar across the
>>> different leaf/node code branches. So I'm curious what you and others
>>> following along might think about something like this as an intermediate
>>> step...
>>
>> Yes, I had noticed similarities when we first started, though I got the
>> impression that people mostly wanted to focus on just hoisting the
>> transactions upwards.  I did look at them at one point, but seem to recall
>> the similarities having just enough dissimilarities such that trying to
>> consolidate them tends to introduce about as much plumbing with if/else's.
>> In any case, I do think the solution here with the format handling is
>> creative, and may reduce a state or two, but I'd really need to see it
>> through the test cases to know if it's going to work.  From what you've
>> hashed out here, I think I get the idea. It's hard for me to comment on
>> readability because I've been up and down the code so much.  I do think it's
>> a little loopy looking, but so is the state machine.  Maybe a good spot for
>> others to chime in too.
>>
> 
> Can you elaborate on what you mean by loopy? :P I'm sure you noticed I
> borrowed the transaction rolling mechanism from your infra patch..
> 
Well, that loop that is borrowed is meant to disappear at the end of the
set though.  This part with *_set_fmt we would have to keep.  I guess
that really means the *_set_fmt call would probably get consolidated
into the *_iter routine though.  Let me see if I can get something like
this to work on top of the set so it's a bit more clear what it would
look like.  I think this modification would actually look simpler if it
came in after the state machine.  Otherwise you're trying to introduce
the transaction loop early.  Really its purpose is just to get the state
machine working, and then we get rid of it later.

> But yeah, I'm partly to blame for the hoisting approach as well. I was
> thinking/hoping that seeing the various states would facilitate
> simplification of the code, but my first reaction when looking at the
> (much more complex) xattr set path is more confusion than clarity. I see
> the code drop into state management, using that to call into
> format-specific helpers, then fall into doing some other stuff that
> might call into some of the same format-specific add helpers, then
> realize I'll probably have to trace up and down through the whole path
> to make some sense of the execution flow. 

Yeah, I think this question is very preference oriented.  See, initially,
I thought the pattern of pairing states to gotos sort of alleviated the 
anxiety of needing to trace up and down the code:


    /*
     * We're going away for a bit to cycle the transaction,
     * but we're gonna come back ....
     */
    dela_state = XFS_DAS_UNIQUE_STATE;
    return -EAGAIN;

xfs_das_unique_state:
    /* ...and resume execution here */


Granted, sometimes we can use the state of the attr to get away from 
needing this, but now you have to re-read the code in the context of 
whatever form we're in to figure out that we land back in the same place. I
realize this is sort of a unique pattern, so I understand people wanting 
to explore the idea of simplifying it away.  At this point I feel like I 
can follow it either way, so it's really what folks are more comfortable 
with.
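
To make the state/goto pairing a bit more concrete, here is a minimal,
self-contained sketch that builds outside the kernel.  It is only an
illustration of the pattern, not code from the series: the demo_* names,
the two states and the counters are all made up, and the "roll" is just
a counter standing in for cycling the transaction.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states; these are not the XFS_DAS_* values from the series. */
enum demo_state {
	DEMO_UNINIT = 0,
	DEMO_ALLOC_BLKS,
	DEMO_FLIP_FLAG,
};

/* Hypothetical context: anything that must survive a return lives here. */
struct demo_ctx {
	enum demo_state	state;
	int		blkcnt;		/* blocks still to allocate */
	int		flag_flipped;	/* set once the final step has run */
};

/*
 * One step of the operation.  Returns -EAGAIN when the caller should
 * roll the transaction and call again, 0 when the operation is done.
 */
static int demo_set_iter(struct demo_ctx *ctx)
{
	/* Jump back to wherever we were when we last returned -EAGAIN. */
	switch (ctx->state) {
	case DEMO_ALLOC_BLKS:
		goto das_alloc_blks;
	case DEMO_FLIP_FLAG:
		goto das_flip_flag;
	default:
		break;
	}

	/* First call: nothing to resume, fall into the first step. */
	ctx->state = DEMO_ALLOC_BLKS;
das_alloc_blks:
	if (ctx->blkcnt > 0) {
		ctx->blkcnt--;		/* stand-in for allocating one extent */
		return -EAGAIN;
	}

	/* Set the state before returning so we resume at the label below. */
	ctx->state = DEMO_FLIP_FLAG;
	return -EAGAIN;

das_flip_flag:
	assert(ctx->blkcnt == 0);
	ctx->flag_flipped = 1;
	return 0;
}

/* Stand-in for the caller that rolls the transaction between steps. */
static int demo_set(struct demo_ctx *ctx, int *rolls)
{
	int error;

	*rolls = 0;
	while ((error = demo_set_iter(ctx)) == -EAGAIN)
		(*rolls)++;	/* stand-in for rolling the transaction */
	return error;
}
```

Starting from DEMO_UNINIT with blkcnt = 2, demo_set() should take three
rolls: two for the block allocations and one before the flag flip.  The
point is that each label sits right below the assignment that names it,
so you can see where execution resumes without tracing the whole
function.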

> That is what has me wondering
> whether this would become more simple with fewer, generic and higher
> level states like SET_FORMAT (i.e. what I hacked up), SET_NAME,
> SET_VALUE (rmt block allocs), SET_FLAG (clear or flip), and then finally
> fall into the remove path in the rename case.
> 
> We'd ultimately implement the same type of state machine approach, it
> would just require more up front cleanup rework than the other way
> around, and hopefully land fairly simplified from the onset. Of course
> those states are just off the top of my head so might not be feasible,
> but I'm also curious if any others following along might have thoughts
> one way or the other. I'm sure we could implement things in either order
> when it comes down to it...
Yeah, let me see if it's feasible, and what it ends up looking like.
I'm kind of of the opinion that if you have to have a certain degree of
complexity (i.e. setting states, and resuming with gotos), you may as
well leverage what it can do.  Once you absorb that pattern, it's
not so scary the next time you see it.  Simplifying is certainly a good
thing, but if it breaks the pattern that keeps a more complex concept
organized, the simplification might not make as much sense to others.  I
think it's likely a spot for others to chime in.  After looking
at the same code for a while, it's hard to put yourself in the POV of
someone else still trying to work through it.  :-)
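
For comparison, this is roughly how I picture the "fewer, generic
states" idea, as a standalone sketch rather than anything tested against
the set.  The state names follow the ones suggested above (SET_FORMAT,
SET_NAME, SET_VALUE, SET_FLAG), but the demo_* identifiers, the context
fields and what each step does are assumptions made up for illustration,
and the rename/remove leg is left out entirely.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical generic states, one per high level step of a set. */
enum demo_set_state {
	DEMO_SET_FORMAT = 0,	/* make sure the attr fits the fork format */
	DEMO_SET_NAME,		/* add the name entry */
	DEMO_SET_VALUE,		/* allocate remote value blocks */
	DEMO_SET_FLAG,		/* clear or flip the incomplete flag */
	DEMO_SET_DONE,
};

/* Hypothetical context carried across -EAGAIN returns. */
struct demo_set_ctx {
	enum demo_set_state	state;
	bool			fits;		/* attr fits the current format */
	int			rmt_blks;	/* remote blocks left to allocate */
};

/* One step; -EAGAIN asks the caller to roll the transaction and recall. */
static int demo_set_step(struct demo_set_ctx *ctx)
{
	switch (ctx->state) {
	case DEMO_SET_FORMAT:
		if (!ctx->fits) {
			ctx->fits = true;	/* stand-in for a format conversion */
			return -EAGAIN;		/* retry in the same state */
		}
		ctx->state = DEMO_SET_NAME;
		return -EAGAIN;
	case DEMO_SET_NAME:
		/* Only visit SET_VALUE when there is a remote value. */
		ctx->state = ctx->rmt_blks > 0 ? DEMO_SET_VALUE : DEMO_SET_FLAG;
		return -EAGAIN;
	case DEMO_SET_VALUE:
		if (--ctx->rmt_blks > 0)
			return -EAGAIN;		/* more blocks to allocate */
		ctx->state = DEMO_SET_FLAG;
		return -EAGAIN;
	case DEMO_SET_FLAG:
		ctx->state = DEMO_SET_DONE;
		return 0;
	default:
		return -EINVAL;			/* stepping a finished operation */
	}
}

/* Drive the steps to completion; stores how many steps were taken. */
static int demo_set_run(struct demo_set_ctx *ctx, int *steps)
{
	int error;

	*steps = 0;
	do {
		error = demo_set_step(ctx);
		(*steps)++;
	} while (error == -EAGAIN);
	return error;
}
```

Here every resume point is a case label, so there are no gotos to
follow, but each step has to be restartable from the top of its case.
Whether the real set path breaks down this cleanly is exactly what I'd
need to check against the test cases.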

Allison

> 
> Brian
> 
>> I actually find it easier to work on it from the top of the set rather than
>> the bottom.  Just so that the end goal of what it will end up looking like
>> is a little more clear.  Once the goal is clear, then I worry about layering
>> it in what ever patch it goes in.  Otherwise it's harder to see exactly how
>> the conflicts shake out.
>>
>> Allison
>>>
>>> Brian
>>>
>>> --- 8< ---
>>>
>>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>>> index fd8e6418a0d3..eff8833d5303 100644
>>> --- a/fs/xfs/libxfs/xfs_attr.c
>>> +++ b/fs/xfs/libxfs/xfs_attr.c
>>> @@ -58,6 +58,8 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>>>    				 struct xfs_da_state **state);
>>>    STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>>>    STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>>> +STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *, struct xfs_buf *);
>>> +STATIC int xfs_attr_node_addname_work(struct xfs_da_args *);
>>>    int
>>>    xfs_inode_hasattr(
>>> @@ -216,116 +218,93 @@ xfs_attr_is_shortform(
>>>    		ip->i_afp->if_nextents == 0);
>>>    }
>>> -/*
>>> - * Attempts to set an attr in shortform, or converts short form to leaf form if
>>> - * there is not enough room.  If the attr is set, the transaction is committed
>>> - * and set to NULL.
>>> - */
>>> -STATIC int
>>> -xfs_attr_set_shortform(
>>> +int
>>> +xfs_attr_set_fmt(
>>>    	struct xfs_da_args	*args,
>>> -	struct xfs_buf		**leaf_bp)
>>> +	bool			*done)
>>>    {
>>>    	struct xfs_inode	*dp = args->dp;
>>> -	int			error, error2 = 0;
>>> +	struct xfs_buf		*leaf_bp = NULL;
>>> +	int			error = 0;
>>> -	/*
>>> -	 * Try to add the attr to the attribute list in the inode.
>>> -	 */
>>> -	error = xfs_attr_try_sf_addname(dp, args);
>>> -	if (error != -ENOSPC) {
>>> -		error2 = xfs_trans_commit(args->trans);
>>> -		args->trans = NULL;
>>> -		return error ? error : error2;
>>> +	if (xfs_attr_is_shortform(dp)) {
>>> +		error = xfs_attr_try_sf_addname(dp, args);
>>> +		if (!error)
>>> +			*done = true;
>>> +		if (error != -ENOSPC)
>>> +			return error;
>>> +
>>> +		error = xfs_attr_shortform_to_leaf(args, &leaf_bp);
>>> +		if (error)
>>> +			return error;
>>> +		return -EAGAIN;
>>>    	}
>>> -	/*
>>> -	 * It won't fit in the shortform, transform to a leaf block.  GROT:
>>> -	 * another possible req'mt for a double-split btree op.
>>> -	 */
>>> -	error = xfs_attr_shortform_to_leaf(args, leaf_bp);
>>> -	if (error)
>>> -		return error;
>>> -	/*
>>> -	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
>>> -	 * push cannot grab the half-baked leaf buffer and run into problems
>>> -	 * with the write verifier. Once we're done rolling the transaction we
>>> -	 * can release the hold and add the attr to the leaf.
>>> -	 */
>>> -	xfs_trans_bhold(args->trans, *leaf_bp);
>>> -	error = xfs_defer_finish(&args->trans);
>>> -	xfs_trans_bhold_release(args->trans, *leaf_bp);
>>> -	if (error) {
>>> -		xfs_trans_brelse(args->trans, *leaf_bp);
>>> -		return error;
>>> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
>>> +		struct xfs_buf	*bp = NULL;
>>> +
>>> +		error = xfs_attr_leaf_try_add(args, bp);
>>> +		if (error != -ENOSPC)
>>> +			return error;
>>> +
>>> +		error = xfs_attr3_leaf_to_node(args);
>>> +		if (error)
>>> +			return error;
>>> +		return -EAGAIN;
>>>    	}
>>> -	return 0;
>>> +	return xfs_attr_node_addname(args);
>>>    }
>>>    /*
>>>     * Set the attribute specified in @args.
>>>     */
>>>    int
>>> -xfs_attr_set_args(
>>> +__xfs_attr_set_args(
>>>    	struct xfs_da_args	*args)
>>>    {
>>>    	struct xfs_inode	*dp = args->dp;
>>> -	struct xfs_buf          *leaf_bp = NULL;
>>>    	int			error = 0;
>>> -	/*
>>> -	 * If the attribute list is already in leaf format, jump straight to
>>> -	 * leaf handling.  Otherwise, try to add the attribute to the shortform
>>> -	 * list; if there's no room then convert the list to leaf format and try
>>> -	 * again.
>>> -	 */
>>> -	if (xfs_attr_is_shortform(dp)) {
>>> -
>>> -		/*
>>> -		 * If the attr was successfully set in shortform, the
>>> -		 * transaction is committed and set to NULL.  Otherwise, is it
>>> -		 * converted from shortform to leaf, and the transaction is
>>> -		 * retained.
>>> -		 */
>>> -		error = xfs_attr_set_shortform(args, &leaf_bp);
>>> -		if (error || !args->trans)
>>> -			return error;
>>> -	}
>>> -
>>>    	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
>>>    		error = xfs_attr_leaf_addname(args);
>>> -		if (error != -ENOSPC)
>>> -			return error;
>>> -
>>> -		/*
>>> -		 * Promote the attribute list to the Btree format.
>>> -		 */
>>> -		error = xfs_attr3_leaf_to_node(args);
>>>    		if (error)
>>>    			return error;
>>> +	}
>>> +
>>> +	error = xfs_attr_node_addname_work(args);
>>> +	return error;
>>> +}
>>> +
>>> +int
>>> +xfs_attr_set_args(
>>> +	struct xfs_da_args	*args)
>>> +
>>> +{
>>> +	int			error;
>>> +	bool			done = false;
>>> +
>>> +	do {
>>> +		error = xfs_attr_set_fmt(args, &done);
>>> +		if (error != -EAGAIN)
>>> +			break;
>>> -		/*
>>> -		 * Finish any deferred work items and roll the transaction once
>>> -		 * more.  The goal here is to call node_addname with the inode
>>> -		 * and transaction in the same state (inode locked and joined,
>>> -		 * transaction clean) no matter how we got to this step.
>>> -		 */
>>>    		error = xfs_defer_finish(&args->trans);
>>>    		if (error)
>>> -			return error;
>>> +			break;
>>> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
>>> +	} while (!error);
>>> -		/*
>>> -		 * Commit the current trans (including the inode) and
>>> -		 * start a new one.
>>> -		 */
>>> -		error = xfs_trans_roll_inode(&args->trans, dp);
>>> -		if (error)
>>> -			return error;
>>> -	}
>>> +	if (error || done)
>>> +		return error;
>>> -	error = xfs_attr_node_addname(args);
>>> -	return error;
>>> +	error = xfs_defer_finish(&args->trans);
>>> +	if (!error)
>>> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
>>> +	if (error)
>>> +		return error;
>>> +
>>> +	return __xfs_attr_set_args(args);
>>>    }
>>>    /*
>>> @@ -676,18 +655,6 @@ xfs_attr_leaf_addname(
>>>    	trace_xfs_attr_leaf_addname(args);
>>> -	error = xfs_attr_leaf_try_add(args, bp);
>>> -	if (error)
>>> -		return error;
>>> -
>>> -	/*
>>> -	 * Commit the transaction that added the attr name so that
>>> -	 * later routines can manage their own transactions.
>>> -	 */
>>> -	error = xfs_trans_roll_inode(&args->trans, dp);
>>> -	if (error)
>>> -		return error;
>>> -
>>>    	/*
>>>    	 * If there was an out-of-line value, allocate the blocks we
>>>    	 * identified for its storage and copy the value.  This is done
>>> @@ -923,7 +890,7 @@ xfs_attr_node_addname(
>>>    	 * Fill in bucket of arguments/results/context to carry around.
>>>    	 */
>>>    	dp = args->dp;
>>> -restart:
>>> +
>>>    	/*
>>>    	 * Search to see if name already exists, and get back a pointer
>>>    	 * to where it should go.
>>> @@ -967,21 +934,10 @@ xfs_attr_node_addname(
>>>    			xfs_da_state_free(state);
>>>    			state = NULL;
>>>    			error = xfs_attr3_leaf_to_node(args);
>>> -			if (error)
>>> -				goto out;
>>> -			error = xfs_defer_finish(&args->trans);
>>>    			if (error)
>>>    				goto out;
>>> -			/*
>>> -			 * Commit the node conversion and start the next
>>> -			 * trans in the chain.
>>> -			 */
>>> -			error = xfs_trans_roll_inode(&args->trans, dp);
>>> -			if (error)
>>> -				goto out;
>>> -
>>> -			goto restart;
>>> +			return -EAGAIN;
>>>    		}
>>>    		/*
>>> @@ -993,9 +949,6 @@ xfs_attr_node_addname(
>>>    		error = xfs_da3_split(state);
>>>    		if (error)
>>>    			goto out;
>>> -		error = xfs_defer_finish(&args->trans);
>>> -		if (error)
>>> -			goto out;
>>>    	} else {
>>>    		/*
>>>    		 * Addition succeeded, update Btree hashvals.
>>> @@ -1010,13 +963,23 @@ xfs_attr_node_addname(
>>>    	xfs_da_state_free(state);
>>>    	state = NULL;
>>> -	/*
>>> -	 * Commit the leaf addition or btree split and start the next
>>> -	 * trans in the chain.
>>> -	 */
>>> -	error = xfs_trans_roll_inode(&args->trans, dp);
>>> +	return 0;
>>> +
>>> +out:
>>> +	if (state)
>>> +		xfs_da_state_free(state);
>>>    	if (error)
>>> -		goto out;
>>> +		return error;
>>> +	return retval;
>>> +}
>>> +
>>> +STATIC int
>>> +xfs_attr_node_addname_work(
>>> +	struct xfs_da_args	*args)
>>> +{
>>> +	struct xfs_da_state	*state;
>>> +	struct xfs_da_state_blk	*blk;
>>> +	int			retval, error;
>>>    	/*
>>>    	 * If there was an out-of-line value, allocate the blocks we
>>>
>>
> 
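
The reworked xfs_attr_set_args() in the hunk quoted above follows a simple
calling convention: the worker returns -EAGAIN whenever it needs a fresh
transaction, and the caller finishes deferred work, rolls, and calls it again.
Below is a minimal userspace sketch of that convention only, not the kernel
code.  All demo_* names are hypothetical, and demo_roll() merely stands in for
the xfs_defer_finish() / xfs_trans_roll_inode() pair.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states; the real series uses its own XFS_DAS_* values. */
enum demo_state { DEMO_UNINIT, DEMO_LEAF_ADDED, DEMO_DONE };

struct demo_ctx {
	enum demo_state	state;	/* where to resume on the next call */
	int		rolls;	/* how many times the "transaction" rolled */
};

/* One unit of work; -EAGAIN asks the caller to roll and call again. */
static int demo_set_iter(struct demo_ctx *ctx)
{
	switch (ctx->state) {
	case DEMO_UNINIT:
		ctx->state = DEMO_LEAF_ADDED;
		return -EAGAIN;
	case DEMO_LEAF_ADDED:
		ctx->state = DEMO_DONE;
		return -EAGAIN;
	case DEMO_DONE:
		return 0;
	}
	return -EINVAL;
}

/* Stand-in for finishing deferred items and rolling the transaction. */
static int demo_roll(struct demo_ctx *ctx)
{
	ctx->rolls++;
	return 0;
}

/* Caller-side loop: all transaction handling lives here, not in the worker. */
static int demo_set_args(struct demo_ctx *ctx)
{
	int error;

	do {
		error = demo_set_iter(ctx);
		if (error != -EAGAIN)
			break;
		error = demo_roll(ctx);
	} while (!error);

	return error;
}
```

Starting from DEMO_UNINIT, demo_set_args() rolls twice and then returns 0.  The
worker never touches the transaction, which is the property the delayed
operation infrastructure depends on.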


* Re: [PATCH v14 08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans
  2021-01-05  5:38   ` Darrick J. Wong
@ 2021-01-05 20:15     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2021-01-05 20:15 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs



On 1/4/21 10:38 PM, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 12:29:10AM -0700, Allison Henderson wrote:
>> Because xattrs can be over a page in size, we need to handle possible
>> krealloc errors to avoid warnings
> 
> Which warnings?

Sorry, I should have included it here.  The warning is:
[  +0.000016] WARNING: CPU: 1 PID: 20255 at mm/page_alloc.c:3446 get_page_from_freelist+0x100b/0x1690

and if we look at that line number we have this snippet:
         /*
          * We most definitely don't want callers attempting to
          * allocate greater than order-1 page units with __GFP_NOFAIL.
          */
         WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> 
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/xfs_log_recover.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
>> index 97f3130..295a5c6 100644
>> --- a/fs/xfs/xfs_log_recover.c
>> +++ b/fs/xfs/xfs_log_recover.c
>> @@ -2061,7 +2061,10 @@ xlog_recover_add_to_cont_trans(
>>   	old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
>>   	old_len = item->ri_buf[item->ri_cnt-1].i_len;
>>   
>> -	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
>> +	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL);
> 
> Does the removal of NOFAIL increase the likelihood that log recovery
> will fail instead of looping around looking for more memory?

I suppose it would?  But it is better to return the error code than to
proceed with a NULL pointer.  I would think it would quickly be followed by
questions about what else is causing memory pressure to build, though.

> 
> Hm, what /are/ we doing here, anyway?  I guess someone logged a gigantic
> xattri item, which gets split across multiple log records, and now we're
> trying to staple all that back together?  And perhaps the xattri item is
> larger than a ... page(?) which causes dmesg warnings when combined with
> NOFAIL?

Effectively yes, this is coming from one of the new test cases I came up
with to test the replay.  It sets progressively larger attrs, up to 64k
(which I think is where ATTR_MAX_VALUELEN is), and pulls the error tag to
see that it replays correctly.  I figured since we are opening up a means
of logging that much, it's something that we should be testing. :-)

Allison
> 
> --D
> 
>> +	if (ptr == NULL)
>> +		return -ENOMEM;
>> +
>>   	memcpy(&ptr[old_len], dp, len);
>>   	item->ri_buf[item->ri_cnt-1].i_len += len;
>>   	item->ri_buf[item->ri_cnt-1].i_addr = ptr;
>> -- 
>> 2.7.4
>>
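
The change discussed above amounts to growing a buffer with an allocation that
is allowed to fail, then checking the result before copying into it.  Here is
a minimal userspace sketch of that pattern, with realloc() standing in for
krealloc() without __GFP_NOFAIL.  The demo_buf and demo_append names are
hypothetical and are not taken from xfs_log_recover.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one recovered log item buffer. */
struct demo_buf {
	char	*addr;
	size_t	len;
};

/* Append a continuation region; returns 0 or -ENOMEM. */
static int demo_append(struct demo_buf *buf, const void *data, size_t len)
{
	char *ptr;

	if (len == 0)
		return 0;

	ptr = realloc(buf->addr, buf->len + len);
	if (ptr == NULL)
		return -ENOMEM;	/* buf->addr is untouched and still valid */

	memcpy(ptr + buf->len, data, len);
	buf->addr = ptr;
	buf->len += len;
	return 0;
}
```

Assigning the result to a temporary first means the old buffer is not lost when
the allocation fails, so the caller can still free it while unwinding.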


* Re: [PATCH v14 06/15] xfs: Add state machine tracepoints
  2021-01-05  4:50   ` Chandan Babu R
@ 2021-01-05 21:06     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2021-01-05 21:06 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs



On 1/4/21 9:50 PM, Chandan Babu R wrote:
> On Fri, 18 Dec 2020 00:29:08 -0700, Allison Henderson wrote:
>> This is a quick patch to add a new tracepoint: xfs_das_state_return.  We
>> use this to track whenever a new state is set or -EAGAIN is returned.
>>
> 
> Looks good to me.
> 
> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Ok, thank you!

Allison
> 
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c        | 22 +++++++++++++++++++++-
>>   fs/xfs/libxfs/xfs_attr_remote.c |  1 +
>>   fs/xfs/xfs_trace.h              | 20 ++++++++++++++++++++
>>   3 files changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index cd72512..8ed00bc 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -263,6 +263,7 @@ xfs_attr_set_shortform(
>>   	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
>>   	 * fork to leaf format and will restart with the leaf add.
>>   	 */
>> +	trace_xfs_das_state_return(XFS_DAS_UNINIT);
>>   	return -EAGAIN;
>>   }
>>   
>> @@ -409,9 +410,11 @@ xfs_attr_set_iter(
>>   		 * down into the node handling code below
>>   		 */
>>   		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return -EAGAIN;
>>   	case 0:
>>   		dac->dela_state = XFS_DAS_FOUND_LBLK;
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return -EAGAIN;
>>   	}
>>   	return error;
>> @@ -841,6 +844,7 @@ xfs_attr_leaf_addname(
>>   			return error;
>>   
>>   		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return -EAGAIN;
>>   	}
>>   
>> @@ -874,6 +878,7 @@ xfs_attr_leaf_addname(
>>   	 * Commit the flag value change and start the next trans in series.
>>   	 */
>>   	dac->dela_state = XFS_DAS_FLIP_LFLAG;
>> +	trace_xfs_das_state_return(dac->dela_state);
>>   	return -EAGAIN;
>>   das_flip_flag:
>>   	/*
>> @@ -891,6 +896,8 @@ xfs_attr_leaf_addname(
>>   das_rm_lblk:
>>   	if (args->rmtblkno) {
>>   		error = __xfs_attr_rmtval_remove(dac);
>> +		if (error == -EAGAIN)
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1142,6 +1149,7 @@ xfs_attr_node_addname(
>>   			 * this point.
>>   			 */
>>   			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1175,6 +1183,7 @@ xfs_attr_node_addname(
>>   	state = NULL;
>>   
>>   	dac->dela_state = XFS_DAS_FOUND_NBLK;
>> +	trace_xfs_das_state_return(dac->dela_state);
>>   	return -EAGAIN;
>>   das_found_nblk:
>>   
>> @@ -1202,6 +1211,7 @@ xfs_attr_node_addname(
>>   				return error;
>>   
>>   			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1236,6 +1246,7 @@ xfs_attr_node_addname(
>>   	 * Commit the flag value change and start the next trans in series
>>   	 */
>>   	dac->dela_state = XFS_DAS_FLIP_NFLAG;
>> +	trace_xfs_das_state_return(dac->dela_state);
>>   	return -EAGAIN;
>>   das_flip_flag:
>>   	/*
>> @@ -1253,6 +1264,10 @@ xfs_attr_node_addname(
>>   das_rm_nblk:
>>   	if (args->rmtblkno) {
>>   		error = __xfs_attr_rmtval_remove(dac);
>> +
>> +		if (error == -EAGAIN)
>> +			trace_xfs_das_state_return(dac->dela_state);
>> +
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1396,6 +1411,8 @@ xfs_attr_node_remove_rmt (
>>   	 * May return -EAGAIN to request that the caller recall this function
>>   	 */
>>   	error = __xfs_attr_rmtval_remove(dac);
>> +	if (error == -EAGAIN)
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   	if (error)
>>   		return error;
>>   
>> @@ -1514,6 +1531,7 @@ xfs_attr_node_removename_iter(
>>   
>>   			dac->flags |= XFS_DAC_DEFER_FINISH;
>>   			dac->dela_state = XFS_DAS_RM_SHRINK;
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1532,8 +1550,10 @@ xfs_attr_node_removename_iter(
>>   		goto out;
>>   	}
>>   
>> -	if (error == -EAGAIN)
>> +	if (error == -EAGAIN) {
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return error;
>> +	}
>>   out:
>>   	if (state)
>>   		xfs_da_state_free(state);
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
>> index 6af86bf..4840de9 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.c
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
>> @@ -763,6 +763,7 @@ __xfs_attr_rmtval_remove(
>>   	 */
>>   	if (!done) {
>>   		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return -EAGAIN;
>>   	}
>>   
>> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
>> index 9074b8b..4f6939b4 100644
>> --- a/fs/xfs/xfs_trace.h
>> +++ b/fs/xfs/xfs_trace.h
>> @@ -3887,6 +3887,26 @@ DEFINE_EVENT(xfs_timestamp_range_class, name, \
>>   DEFINE_TIMESTAMP_RANGE_EVENT(xfs_inode_timestamp_range);
>>   DEFINE_TIMESTAMP_RANGE_EVENT(xfs_quota_expiry_range);
>>   
>> +
>> +DECLARE_EVENT_CLASS(xfs_das_state_class,
>> +	TP_PROTO(int das),
>> +	TP_ARGS(das),
>> +	TP_STRUCT__entry(
>> +		__field(int, das)
>> +	),
>> +	TP_fast_assign(
>> +		__entry->das = das;
>> +	),
>> +	TP_printk("state change %d",
>> +		  __entry->das)
>> +)
>> +
>> +#define DEFINE_DAS_STATE_EVENT(name) \
>> +DEFINE_EVENT(xfs_das_state_class, name, \
>> +	TP_PROTO(int das), \
>> +	TP_ARGS(das))
>> +DEFINE_DAS_STATE_EVENT(xfs_das_state_return);
>> +
>>   #endif /* _TRACE_XFS_H */
>>   
>>   #undef TRACE_INCLUDE_PATH
>>
> 
> 


* Re: [PATCH v14 06/15] xfs: Add state machine tracepoints
  2021-01-05  5:28   ` Darrick J. Wong
@ 2021-01-05 21:07     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2021-01-05 21:07 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs



On 1/4/21 10:28 PM, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 12:29:08AM -0700, Allison Henderson wrote:
>> This is a quick patch to add a new tracepoint: xfs_das_state_return.  We
>> use this to track whenever a new state is set or -EAGAIN is returned.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c        | 22 +++++++++++++++++++++-
>>   fs/xfs/libxfs/xfs_attr_remote.c |  1 +
>>   fs/xfs/xfs_trace.h              | 20 ++++++++++++++++++++
>>   3 files changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index cd72512..8ed00bc 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -263,6 +263,7 @@ xfs_attr_set_shortform(
>>   	 * We're still in XFS_DAS_UNINIT state here.  We've converted the attr
>>   	 * fork to leaf format and will restart with the leaf add.
>>   	 */
>> +	trace_xfs_das_state_return(XFS_DAS_UNINIT);
> 
> It would help to record the inode number in the trace data.  When
> someone encounters an xattr problem involving things like fsstress,
> it'll be /much/ easier to disentangle who's doing what.
Sure, I can add that in

> 
>>   	return -EAGAIN;
>>   }
>>   
>> @@ -409,9 +410,11 @@ xfs_attr_set_iter(
>>   		 * down into the node handling code below
>>   		 */
>>   		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return -EAGAIN;
>>   	case 0:
>>   		dac->dela_state = XFS_DAS_FOUND_LBLK;
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return -EAGAIN;
>>   	}
>>   	return error;
>> @@ -841,6 +844,7 @@ xfs_attr_leaf_addname(
>>   			return error;
>>   
>>   		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		trace_xfs_das_state_return(dac->dela_state);
> 
> Also, please consider capturing more info about /which/ of these
> xfs_das_state_return tracepoints fired, either by introducing more
> variants (e.g. xfs_attr_leaf_addname_das_return) or by feeding
> __this_address into the trace "call" and printing it in the TP_printk
> output (formatting string '%pS').
> 
> Each declared tracepoint /does/ have a permanent memory cost, so I would
> think hard about trying #2...
Ok, how about a variant for each function then?  I think that would work 
out to 7 variants.

Allison
> 
> --D
> 
>>   		return -EAGAIN;
>>   	}
>>   
>> @@ -874,6 +878,7 @@ xfs_attr_leaf_addname(
>>   	 * Commit the flag value change and start the next trans in series.
>>   	 */
>>   	dac->dela_state = XFS_DAS_FLIP_LFLAG;
>> +	trace_xfs_das_state_return(dac->dela_state);
>>   	return -EAGAIN;
>>   das_flip_flag:
>>   	/*
>> @@ -891,6 +896,8 @@ xfs_attr_leaf_addname(
>>   das_rm_lblk:
>>   	if (args->rmtblkno) {
>>   		error = __xfs_attr_rmtval_remove(dac);
>> +		if (error == -EAGAIN)
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1142,6 +1149,7 @@ xfs_attr_node_addname(
>>   			 * this point.
>>   			 */
>>   			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1175,6 +1183,7 @@ xfs_attr_node_addname(
>>   	state = NULL;
>>   
>>   	dac->dela_state = XFS_DAS_FOUND_NBLK;
>> +	trace_xfs_das_state_return(dac->dela_state);
>>   	return -EAGAIN;
>>   das_found_nblk:
>>   
>> @@ -1202,6 +1211,7 @@ xfs_attr_node_addname(
>>   				return error;
>>   
>>   			dac->flags |= XFS_DAC_DEFER_FINISH;
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1236,6 +1246,7 @@ xfs_attr_node_addname(
>>   	 * Commit the flag value change and start the next trans in series
>>   	 */
>>   	dac->dela_state = XFS_DAS_FLIP_NFLAG;
>> +	trace_xfs_das_state_return(dac->dela_state);
>>   	return -EAGAIN;
>>   das_flip_flag:
>>   	/*
>> @@ -1253,6 +1264,10 @@ xfs_attr_node_addname(
>>   das_rm_nblk:
>>   	if (args->rmtblkno) {
>>   		error = __xfs_attr_rmtval_remove(dac);
>> +
>> +		if (error == -EAGAIN)
>> +			trace_xfs_das_state_return(dac->dela_state);
>> +
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1396,6 +1411,8 @@ xfs_attr_node_remove_rmt (
>>   	 * May return -EAGAIN to request that the caller recall this function
>>   	 */
>>   	error = __xfs_attr_rmtval_remove(dac);
>> +	if (error == -EAGAIN)
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   	if (error)
>>   		return error;
>>   
>> @@ -1514,6 +1531,7 @@ xfs_attr_node_removename_iter(
>>   
>>   			dac->flags |= XFS_DAC_DEFER_FINISH;
>>   			dac->dela_state = XFS_DAS_RM_SHRINK;
>> +			trace_xfs_das_state_return(dac->dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1532,8 +1550,10 @@ xfs_attr_node_removename_iter(
>>   		goto out;
>>   	}
>>   
>> -	if (error == -EAGAIN)
>> +	if (error == -EAGAIN) {
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return error;
>> +	}
>>   out:
>>   	if (state)
>>   		xfs_da_state_free(state);
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
>> index 6af86bf..4840de9 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.c
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
>> @@ -763,6 +763,7 @@ __xfs_attr_rmtval_remove(
>>   	 */
>>   	if (!done) {
>>   		dac->flags |= XFS_DAC_DEFER_FINISH;
>> +		trace_xfs_das_state_return(dac->dela_state);
>>   		return -EAGAIN;
>>   	}
>>   
>> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
>> index 9074b8b..4f6939b4 100644
>> --- a/fs/xfs/xfs_trace.h
>> +++ b/fs/xfs/xfs_trace.h
>> @@ -3887,6 +3887,26 @@ DEFINE_EVENT(xfs_timestamp_range_class, name, \
>>   DEFINE_TIMESTAMP_RANGE_EVENT(xfs_inode_timestamp_range);
>>   DEFINE_TIMESTAMP_RANGE_EVENT(xfs_quota_expiry_range);
>>   
>> +
>> +DECLARE_EVENT_CLASS(xfs_das_state_class,
>> +	TP_PROTO(int das),
>> +	TP_ARGS(das),
>> +	TP_STRUCT__entry(
>> +		__field(int, das)
>> +	),
>> +	TP_fast_assign(
>> +		__entry->das = das;
>> +	),
>> +	TP_printk("state change %d",
>> +		  __entry->das)
>> +)
>> +
>> +#define DEFINE_DAS_STATE_EVENT(name) \
>> +DEFINE_EVENT(xfs_das_state_class, name, \
>> +	TP_PROTO(int das), \
>> +	TP_ARGS(das))
>> +DEFINE_DAS_STATE_EVENT(xfs_das_state_return);
>> +
>>   #endif /* _TRACE_XFS_H */
>>   
>>   #undef TRACE_INCLUDE_PATH
>> -- 
>> 2.7.4
>>
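
The two review requests above (record the inode number, and record which call
site fired the event) can be illustrated with one shared trace helper that
takes both as arguments.  This is a userspace sketch only, not the tracepoint
that was eventually written.  All demo_* names are hypothetical, printf()
stands in for TP_printk, and the standard __func__ identifier stands in for
passing __this_address and printing it with '%pS'.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of the most recent event, kept for inspection. */
struct demo_trace_rec {
	uint64_t	ino;	/* inode the attr operation targets */
	int		das;	/* state value being returned */
	const char	*site;	/* which function fired the event */
};

static struct demo_trace_rec demo_last;

static void demo_trace_record(uint64_t ino, int das, const char *site)
{
	demo_last.ino = ino;
	demo_last.das = das;
	demo_last.site = site;
	printf("ino 0x%llx state change %d caller %s\n",
	       (unsigned long long)ino, das, site);
}

/* One event definition; the call site is captured automatically. */
#define demo_trace_das_state_return(ino, das) \
	demo_trace_record((ino), (das), __func__)

/* Hypothetical caller: sets a state, traces it, asks to be called again. */
static int demo_leaf_addname(uint64_t ino)
{
	int das = 3;	/* arbitrary illustrative state value */

	demo_trace_das_state_return(ino, das);
	return -EAGAIN;
}
```

A single event with a call-site argument avoids the per-tracepoint memory cost
mentioned in the review, while one variant per function makes each site
individually filterable by name.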


* Re: [PATCH v14 15/15] xfs: Merge xfs_delattr_context into xfs_attr_item
  2021-01-05  5:47   ` Darrick J. Wong
@ 2021-01-05 21:07     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2021-01-05 21:07 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs



On 1/4/21 10:47 PM, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 12:29:17AM -0700, Allison Henderson wrote:
>> This is a cleanup patch that merges xfs_delattr_context into
>> xfs_attr_item.  Now that the refactoring is complete and the delayed
>> operation infrastructure is in place, we can combine these to eliminate
>> the extra struct.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> 
> Nice consolidation!
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Alrighty, thank you!

Allison
> 
> --D
> 
>> ---
>>   fs/xfs/libxfs/xfs_attr.c        | 138 ++++++++++++++++++++--------------------
>>   fs/xfs/libxfs/xfs_attr.h        |  40 +++++-------
>>   fs/xfs/libxfs/xfs_attr_remote.c |  34 +++++-----
>>   fs/xfs/libxfs/xfs_attr_remote.h |   6 +-
>>   fs/xfs/xfs_attr_item.c          |  46 ++++++--------
>>   5 files changed, 127 insertions(+), 137 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 6e5a900..badcdae 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -46,7 +46,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
>>    * Internal routines when attribute list is one block.
>>    */
>>   STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
>> -STATIC int xfs_attr_leaf_addname(struct xfs_delattr_context *dac);
>> +STATIC int xfs_attr_leaf_addname(struct xfs_attr_item *attr);
>>   STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
>>   STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>>   
>> @@ -54,8 +54,8 @@ STATIC int xfs_attr_leaf_hasname(struct xfs_da_args *args, struct xfs_buf **bp);
>>    * Internal routines when attribute list is more than one block.
>>    */
>>   STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>> -STATIC int xfs_attr_node_addname(struct xfs_delattr_context *dac);
>> -STATIC int xfs_attr_node_removename_iter(struct xfs_delattr_context *dac);
>> +STATIC int xfs_attr_node_addname(struct xfs_attr_item *attr);
>> +STATIC int xfs_attr_node_removename_iter(struct xfs_attr_item *attr);
>>   STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
>>   				 struct xfs_da_state **state);
>>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>> @@ -276,27 +276,27 @@ xfs_attr_set_shortform(
>>    */
>>   int
>>   xfs_attr_set_iter(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	struct xfs_inode		*dp = args->dp;
>> -	struct xfs_buf			**leaf_bp = &dac->leaf_bp;
>> +	struct xfs_buf			**leaf_bp = &attr->xattri_leaf_bp;
>>   	int				error = 0;
>>   
>>   	/* State machine switch */
>> -	switch (dac->dela_state) {
>> +	switch (attr->xattri_dela_state) {
>>   	case XFS_DAS_FLIP_LFLAG:
>>   	case XFS_DAS_FOUND_LBLK:
>>   	case XFS_DAS_RM_LBLK:
>> -		return xfs_attr_leaf_addname(dac);
>> +		return xfs_attr_leaf_addname(attr);
>>   	case XFS_DAS_FOUND_NBLK:
>>   	case XFS_DAS_FLIP_NFLAG:
>>   	case XFS_DAS_ALLOC_NODE:
>> -		return xfs_attr_node_addname(dac);
>> +		return xfs_attr_node_addname(attr);
>>   	case XFS_DAS_UNINIT:
>>   		break;
>>   	default:
>> -		ASSERT(dac->dela_state != XFS_DAS_RM_SHRINK);
>> +		ASSERT(attr->xattri_dela_state != XFS_DAS_RM_SHRINK);
>>   		break;
>>   	}
>>   
>> @@ -328,7 +328,7 @@ xfs_attr_set_iter(
>>   	}
>>   
>>   	if (!xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> -		return xfs_attr_node_addname(dac);
>> +		return xfs_attr_node_addname(attr);
>>   
>>   	error = xfs_attr_leaf_try_add(args, *leaf_bp);
>>   	switch (error) {
>> @@ -351,11 +351,11 @@ xfs_attr_set_iter(
>>   		 * when we come back, we'll be a node, so we'll fall
>>   		 * down into the node handling code below
>>   		 */
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		return -EAGAIN;
>>   	case 0:
>> -		dac->dela_state = XFS_DAS_FOUND_LBLK;
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		attr->xattri_dela_state = XFS_DAS_FOUND_LBLK;
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		return -EAGAIN;
>>   	}
>>   	return error;
>> @@ -401,13 +401,13 @@ xfs_has_attr(
>>    */
>>   int
>>   xfs_attr_remove_iter(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	struct xfs_inode		*dp = args->dp;
>>   
>>   	/* If we are shrinking a node, resume shrink */
>> -	if (dac->dela_state == XFS_DAS_RM_SHRINK)
>> +	if (attr->xattri_dela_state == XFS_DAS_RM_SHRINK)
>>   		goto node;
>>   
>>   	if (!xfs_inode_hasattr(dp))
>> @@ -422,7 +422,7 @@ xfs_attr_remove_iter(
>>   		return xfs_attr_leaf_removename(args);
>>   node:
>>   	/* If we are not short form or leaf, then proceed to remove node */
>> -	return  xfs_attr_node_removename_iter(dac);
>> +	return  xfs_attr_node_removename_iter(attr);
>>   }
>>   
>>   /*
>> @@ -573,7 +573,7 @@ xfs_attr_item_init(
>>   
>>   	new = kmem_zalloc(sizeof(struct xfs_attr_item), KM_NOFS);
>>   	new->xattri_op_flags = op_flags;
>> -	new->xattri_dac.da_args = args;
>> +	new->xattri_da_args = args;
>>   
>>   	*attr = new;
>>   	return 0;
>> @@ -768,16 +768,16 @@ xfs_attr_leaf_try_add(
>>    */
>>   STATIC int
>>   xfs_attr_leaf_addname(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	struct xfs_buf			*bp = NULL;
>>   	int				error, forkoff;
>>   	struct xfs_inode		*dp = args->dp;
>>   	struct xfs_mount		*mp = args->dp->i_mount;
>>   
>>   	/* State machine switch */
>> -	switch (dac->dela_state) {
>> +	switch (attr->xattri_dela_state) {
>>   	case XFS_DAS_FLIP_LFLAG:
>>   		goto das_flip_flag;
>>   	case XFS_DAS_RM_LBLK:
>> @@ -794,10 +794,10 @@ xfs_attr_leaf_addname(
>>   	 */
>>   
>>   	/* Open coded xfs_attr_rmtval_set without trans handling */
>> -	if ((dac->flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
>> -		dac->flags |= XFS_DAC_LEAF_ADDNAME_INIT;
>> +	if ((attr->xattri_flags & XFS_DAC_LEAF_ADDNAME_INIT) == 0) {
>> +		attr->xattri_flags |= XFS_DAC_LEAF_ADDNAME_INIT;
>>   		if (args->rmtblkno > 0) {
>> -			error = xfs_attr_rmtval_find_space(dac);
>> +			error = xfs_attr_rmtval_find_space(attr);
>>   			if (error)
>>   				return error;
>>   		}
>> @@ -807,12 +807,12 @@ xfs_attr_leaf_addname(
>>   	 * Roll through the "value", allocating blocks on disk as
>>   	 * required.
>>   	 */
>> -	if (dac->blkcnt > 0) {
>> -		error = xfs_attr_rmtval_set_blk(dac);
>> +	if (attr->xattri_blkcnt > 0) {
>> +		error = xfs_attr_rmtval_set_blk(attr);
>>   		if (error)
>>   			return error;
>>   
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		return -EAGAIN;
>>   	}
>>   
>> @@ -846,8 +846,8 @@ xfs_attr_leaf_addname(
>>   		/*
>>   		 * Commit the flag value change and start the next trans in series.
>>   		 */
>> -		dac->dela_state = XFS_DAS_FLIP_LFLAG;
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		attr->xattri_dela_state = XFS_DAS_FLIP_LFLAG;
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		return -EAGAIN;
>>   	}
>>   das_flip_flag:
>> @@ -862,12 +862,12 @@ xfs_attr_leaf_addname(
>>   		return error;
>>   
>>   	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
>> -	dac->dela_state = XFS_DAS_RM_LBLK;
>> +	attr->xattri_dela_state = XFS_DAS_RM_LBLK;
>>   das_rm_lblk:
>>   	if (args->rmtblkno) {
>> -		error = xfs_attr_rmtval_remove(dac);
>> +		error = xfs_attr_rmtval_remove(attr);
>>   		if (error == -EAGAIN)
>> -			trace_xfs_das_state_return(dac->dela_state);
>> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1041,9 +1041,9 @@ xfs_attr_node_hasname(
>>    */
>>   STATIC int
>>   xfs_attr_node_addname(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	struct xfs_da_state		*state = NULL;
>>   	struct xfs_da_state_blk		*blk;
>>   	int				retval = 0;
>> @@ -1053,7 +1053,7 @@ xfs_attr_node_addname(
>>   	trace_xfs_attr_node_addname(args);
>>   
>>   	/* State machine switch */
>> -	switch (dac->dela_state) {
>> +	switch (attr->xattri_dela_state) {
>>   	case XFS_DAS_FLIP_NFLAG:
>>   		goto das_flip_flag;
>>   	case XFS_DAS_FOUND_NBLK:
>> @@ -1119,7 +1119,7 @@ xfs_attr_node_addname(
>>   			 * this. dela_state is still unset by this function at
>>   			 * this point.
>>   			 */
>> -			trace_xfs_das_state_return(dac->dela_state);
>> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1151,8 +1151,8 @@ xfs_attr_node_addname(
>>   	xfs_da_state_free(state);
>>   	state = NULL;
>>   
>> -	dac->dela_state = XFS_DAS_FOUND_NBLK;
>> -	trace_xfs_das_state_return(dac->dela_state);
>> +	attr->xattri_dela_state = XFS_DAS_FOUND_NBLK;
>> +	trace_xfs_das_state_return(attr->xattri_dela_state);
>>   	return -EAGAIN;
>>   das_found_nblk:
>>   
>> @@ -1164,7 +1164,7 @@ xfs_attr_node_addname(
>>   	 */
>>   	if (args->rmtblkno > 0) {
>>   		/* Open coded xfs_attr_rmtval_set without trans handling */
>> -		error = xfs_attr_rmtval_find_space(dac);
>> +		error = xfs_attr_rmtval_find_space(attr);
>>   		if (error)
>>   			return error;
>>   
>> @@ -1172,14 +1172,14 @@ xfs_attr_node_addname(
>>   		 * Roll through the "value", allocating blocks on disk as
>>   		 * required.  Set the state in case of -EAGAIN return code
>>   		 */
>> -		dac->dela_state = XFS_DAS_ALLOC_NODE;
>> +		attr->xattri_dela_state = XFS_DAS_ALLOC_NODE;
>>   das_alloc_node:
>> -		if (dac->blkcnt > 0) {
>> -			error = xfs_attr_rmtval_set_blk(dac);
>> +		if (attr->xattri_blkcnt > 0) {
>> +			error = xfs_attr_rmtval_set_blk(attr);
>>   			if (error)
>>   				return error;
>>   
>> -			trace_xfs_das_state_return(dac->dela_state);
>> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1214,8 +1214,8 @@ xfs_attr_node_addname(
>>   		/*
>>   		 * Commit the flag value change and start the next trans in series
>>   		 */
>> -		dac->dela_state = XFS_DAS_FLIP_NFLAG;
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		attr->xattri_dela_state = XFS_DAS_FLIP_NFLAG;
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		return -EAGAIN;
>>   	}
>>   das_flip_flag:
>> @@ -1230,13 +1230,13 @@ xfs_attr_node_addname(
>>   		return error;
>>   
>>   	/* Set state in case xfs_attr_rmtval_remove returns -EAGAIN */
>> -	dac->dela_state = XFS_DAS_RM_NBLK;
>> +	attr->xattri_dela_state = XFS_DAS_RM_NBLK;
>>   das_rm_nblk:
>>   	if (args->rmtblkno) {
>> -		error = xfs_attr_rmtval_remove(dac);
>> +		error = xfs_attr_rmtval_remove(attr);
>>   
>>   		if (error == -EAGAIN)
>> -			trace_xfs_das_state_return(dac->dela_state);
>> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>>   
>>   		if (error)
>>   			return error;
>> @@ -1344,10 +1344,10 @@ xfs_attr_leaf_mark_incomplete(
>>    */
>>   STATIC
>>   int xfs_attr_node_removename_setup(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> -	struct xfs_da_state		**state = &dac->da_state;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>> +	struct xfs_da_state		**state = &attr->xattri_da_state;
>>   	int				error;
>>   
>>   	error = xfs_attr_node_hasname(args, state);
>> @@ -1371,7 +1371,7 @@ int xfs_attr_node_removename_setup(
>>   
>>   STATIC int
>>   xfs_attr_node_remove_rmt (
>> -	struct xfs_delattr_context	*dac,
>> +	struct xfs_attr_item		*attr,
>>   	struct xfs_da_state		*state)
>>   {
>>   	int				error = 0;
>> @@ -1379,9 +1379,9 @@ xfs_attr_node_remove_rmt (
>>   	/*
>>   	 * May return -EAGAIN to request that the caller recall this function
>>   	 */
>> -	error = xfs_attr_rmtval_remove(dac);
>> +	error = xfs_attr_rmtval_remove(attr);
>>   	if (error == -EAGAIN)
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   	if (error)
>>   		return error;
>>   
>> @@ -1425,10 +1425,10 @@ xfs_attr_node_remove_cleanup(
>>    */
>>   STATIC int
>>   xfs_attr_node_remove_step(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> -	struct xfs_da_state		*state = dac->da_state;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>> +	struct xfs_da_state		*state = attr->xattri_da_state;
>>   	int				error = 0;
>>   	/*
>>   	 * If there is an out-of-line value, de-allocate the blocks.
>> @@ -1439,7 +1439,7 @@ xfs_attr_node_remove_step(
>>   		/*
>>   		 * May return -EAGAIN. Remove blocks until args->rmtblkno == 0
>>   		 */
>> -		error = xfs_attr_node_remove_rmt(dac, state);
>> +		error = xfs_attr_node_remove_rmt(attr, state);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -1460,29 +1460,29 @@ xfs_attr_node_remove_step(
>>    */
>>   STATIC int
>>   xfs_attr_node_removename_iter(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	struct xfs_da_state		*state = NULL;
>>   	int				retval, error;
>>   	struct xfs_inode		*dp = args->dp;
>>   
>>   	trace_xfs_attr_node_removename(args);
>>   
>> -	if (!dac->da_state) {
>> -		error = xfs_attr_node_removename_setup(dac);
>> +	if (!attr->xattri_da_state) {
>> +		error = xfs_attr_node_removename_setup(attr);
>>   		if (error)
>>   			goto out;
>>   	}
>> -	state = dac->da_state;
>> +	state = attr->xattri_da_state;
>>   
>> -	switch (dac->dela_state) {
>> +	switch (attr->xattri_dela_state) {
>>   	case XFS_DAS_UNINIT:
>>   		/*
>>   		 * repeatedly remove remote blocks, remove the entry and join.
>>   		 * returns -EAGAIN or 0 for completion of the step.
>>   		 */
>> -		error = xfs_attr_node_remove_step(dac);
>> +		error = xfs_attr_node_remove_step(attr);
>>   		if (error)
>>   			break;
>>   
>> @@ -1498,8 +1498,8 @@ xfs_attr_node_removename_iter(
>>   			if (error)
>>   				return error;
>>   
>> -			dac->dela_state = XFS_DAS_RM_SHRINK;
>> -			trace_xfs_das_state_return(dac->dela_state);
>> +			attr->xattri_dela_state = XFS_DAS_RM_SHRINK;
>> +			trace_xfs_das_state_return(attr->xattri_dela_state);
>>   			return -EAGAIN;
>>   		}
>>   
>> @@ -1519,7 +1519,7 @@ xfs_attr_node_removename_iter(
>>   	}
>>   
>>   	if (error == -EAGAIN) {
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		return error;
>>   	}
>>   out:
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index edd008d..d1a59d0 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -364,7 +364,7 @@ struct xfs_attr_list_context {
>>    */
>>   
>>   /*
>> - * Enum values for xfs_delattr_context.da_state
>> + * Enum values for xfs_attr_item.xattri_da_state
>>    *
>>    * These values are used by delayed attribute operations to keep track  of where
>>    * they were before they returned -EAGAIN.  A return code of -EAGAIN signals the
>> @@ -385,7 +385,7 @@ enum xfs_delattr_state {
>>   };
>>   
>>   /*
>> - * Defines for xfs_delattr_context.flags
>> + * Defines for xfs_attr_item.xattri_flags
>>    */
>>   #define XFS_DAC_LEAF_ADDNAME_INIT	0x01 /* xfs_attr_leaf_addname init*/
>>   #define XFS_DAC_DELAYED_OP_INIT		0x02 /* delayed operations init*/
>> @@ -393,32 +393,25 @@ enum xfs_delattr_state {
>>   /*
>>    * Context used for keeping track of delayed attribute operations
>>    */
>> -struct xfs_delattr_context {
>> -	struct xfs_da_args      *da_args;
>> +struct xfs_attr_item {
>> +	struct xfs_da_args		*xattri_da_args;
>>   
>>   	/*
>>   	 * Used by xfs_attr_set to hold a leaf buffer across a transaction roll
>>   	 */
>> -	struct xfs_buf		*leaf_bp;
>> +	struct xfs_buf			*xattri_leaf_bp;
>>   
>>   	/* Used in xfs_attr_rmtval_set_blk to roll through allocating blocks */
>> -	struct xfs_bmbt_irec	map;
>> -	xfs_dablk_t		lblkno;
>> -	int			blkcnt;
>> +	struct xfs_bmbt_irec		xattri_map;
>> +	xfs_dablk_t			xattri_lblkno;
>> +	int				xattri_blkcnt;
>>   
>>   	/* Used in xfs_attr_node_removename to roll through removing blocks */
>> -	struct xfs_da_state     *da_state;
>> +	struct xfs_da_state		*xattri_da_state;
>>   
>>   	/* Used to keep track of current state of delayed operation */
>> -	unsigned int            flags;
>> -	enum xfs_delattr_state  dela_state;
>> -};
>> -
>> -/*
>> - * List of attrs to commit later.
>> - */
>> -struct xfs_attr_item {
>> -	struct xfs_delattr_context	xattri_dac;
>> +	unsigned int			xattri_flags;
>> +	enum xfs_delattr_state		xattri_dela_state;
>>   
>>   	/*
>>   	 * Indicates if the attr operation is a set or a remove
>> @@ -426,7 +419,10 @@ struct xfs_attr_item {
>>   	 */
>>   	uint32_t			xattri_op_flags;
>>   
>> -	/* used to log this item to an intent */
>> +	/*
>> +	 * used to log this item to an intent containing a list of attrs to
>> +	 * commit later
>> +	 */
>>   	struct list_head		xattri_list;
>>   };
>>   
>> @@ -445,12 +441,10 @@ int xfs_inode_hasattr(struct xfs_inode *ip);
>>   int xfs_attr_get_ilocked(struct xfs_da_args *args);
>>   int xfs_attr_get(struct xfs_da_args *args);
>>   int xfs_attr_set(struct xfs_da_args *args);
>> -int xfs_attr_set_iter(struct xfs_delattr_context *dac);
>> +int xfs_attr_set_iter(struct xfs_attr_item *attr);
>>   int xfs_has_attr(struct xfs_da_args *args);
>> -int xfs_attr_remove_iter(struct xfs_delattr_context *dac);
>> +int xfs_attr_remove_iter(struct xfs_attr_item *attr);
>>   bool xfs_attr_namecheck(const void *name, size_t length);
>> -void xfs_delattr_context_init(struct xfs_delattr_context *dac,
>> -			      struct xfs_da_args *args);
>>   int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>>   int xfs_attr_set_deferred(struct xfs_da_args *args);
>>   int xfs_attr_remove_deferred(struct xfs_da_args *args);
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
>> index a5ff5e0..42cc9cc 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.c
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
>> @@ -634,14 +634,14 @@ xfs_attr_rmtval_set(
>>    */
>>   int
>>   xfs_attr_rmtval_find_space(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> -	struct xfs_bmbt_irec		*map = &dac->map;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>> +	struct xfs_bmbt_irec		*map = &attr->xattri_map;
>>   	int				error;
>>   
>> -	dac->lblkno = 0;
>> -	dac->blkcnt = 0;
>> +	attr->xattri_lblkno = 0;
>> +	attr->xattri_blkcnt = 0;
>>   	args->rmtblkcnt = 0;
>>   	args->rmtblkno = 0;
>>   	memset(map, 0, sizeof(struct xfs_bmbt_irec));
>> @@ -650,8 +650,8 @@ xfs_attr_rmtval_find_space(
>>   	if (error)
>>   		return error;
>>   
>> -	dac->blkcnt = args->rmtblkcnt;
>> -	dac->lblkno = args->rmtblkno;
>> +	attr->xattri_blkcnt = args->rmtblkcnt;
>> +	attr->xattri_lblkno = args->rmtblkno;
>>   
>>   	return 0;
>>   }
>> @@ -664,17 +664,17 @@ xfs_attr_rmtval_find_space(
>>    */
>>   int
>>   xfs_attr_rmtval_set_blk(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	struct xfs_inode		*dp = args->dp;
>> -	struct xfs_bmbt_irec		*map = &dac->map;
>> +	struct xfs_bmbt_irec		*map = &attr->xattri_map;
>>   	int nmap;
>>   	int error;
>>   
>>   	nmap = 1;
>> -	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)dac->lblkno,
>> -				dac->blkcnt, XFS_BMAPI_ATTRFORK, args->total,
>> +	error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)attr->xattri_lblkno,
>> +				attr->xattri_blkcnt, XFS_BMAPI_ATTRFORK, args->total,
>>   				map, &nmap);
>>   	if (error)
>>   		return error;
>> @@ -684,8 +684,8 @@ xfs_attr_rmtval_set_blk(
>>   	       (map->br_startblock != HOLESTARTBLOCK));
>>   
>>   	/* roll attribute extent map forwards */
>> -	dac->lblkno += map->br_blockcount;
>> -	dac->blkcnt -= map->br_blockcount;
>> +	attr->xattri_lblkno += map->br_blockcount;
>> +	attr->xattri_blkcnt -= map->br_blockcount;
>>   
>>   	return 0;
>>   }
>> @@ -738,9 +738,9 @@ xfs_attr_rmtval_invalidate(
>>    */
>>   int
>>   xfs_attr_rmtval_remove(
>> -	struct xfs_delattr_context	*dac)
>> +	struct xfs_attr_item		*attr)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	int				error, done;
>>   
>>   	/*
>> @@ -762,7 +762,7 @@ xfs_attr_rmtval_remove(
>>   	 * by the parent
>>   	 */
>>   	if (!done) {
>> -		trace_xfs_das_state_return(dac->dela_state);
>> +		trace_xfs_das_state_return(attr->xattri_dela_state);
>>   		return -EAGAIN;
>>   	}
>>   
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
>> index 6ae91af..d3aa27d 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.h
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
>> @@ -13,9 +13,9 @@ int xfs_attr_rmtval_set(struct xfs_da_args *args);
>>   int xfs_attr_rmtval_stale(struct xfs_inode *ip, struct xfs_bmbt_irec *map,
>>   		xfs_buf_flags_t incore_flags);
>>   int xfs_attr_rmtval_invalidate(struct xfs_da_args *args);
>> -int xfs_attr_rmtval_remove(struct xfs_delattr_context *dac);
>> +int xfs_attr_rmtval_remove(struct xfs_attr_item *attr);
>>   int xfs_attr_rmt_find_hole(struct xfs_da_args *args);
>>   int xfs_attr_rmtval_set_value(struct xfs_da_args *args);
>> -int xfs_attr_rmtval_set_blk(struct xfs_delattr_context *dac);
>> -int xfs_attr_rmtval_find_space(struct xfs_delattr_context *dac);
>> +int xfs_attr_rmtval_set_blk(struct xfs_attr_item *attr);
>> +int xfs_attr_rmtval_find_space(struct xfs_attr_item *attr);
>>   #endif /* __XFS_ATTR_REMOTE_H__ */
>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>> index e1cfef1..bbca949 100644
>> --- a/fs/xfs/xfs_attr_item.c
>> +++ b/fs/xfs/xfs_attr_item.c
>> @@ -291,11 +291,11 @@ xfs_attrd_item_release(
>>    */
>>   int
>>   xfs_trans_attr(
>> -	struct xfs_delattr_context	*dac,
>> +	struct xfs_attr_item		*attr,
>>   	struct xfs_attrd_log_item	*attrdp,
>>   	uint32_t			op_flags)
>>   {
>> -	struct xfs_da_args		*args = dac->da_args;
>> +	struct xfs_da_args		*args = attr->xattri_da_args;
>>   	int				error;
>>   
>>   	error = xfs_qm_dqattach_locked(args->dp, 0);
>> @@ -310,11 +310,11 @@ xfs_trans_attr(
>>   	switch (op_flags) {
>>   	case XFS_ATTR_OP_FLAGS_SET:
>>   		args->op_flags |= XFS_DA_OP_ADDNAME;
>> -		error = xfs_attr_set_iter(dac);
>> +		error = xfs_attr_set_iter(attr);
>>   		break;
>>   	case XFS_ATTR_OP_FLAGS_REMOVE:
>>   		ASSERT(XFS_IFORK_Q(args->dp));
>> -		error = xfs_attr_remove_iter(dac);
>> +		error = xfs_attr_remove_iter(attr);
>>   		break;
>>   	default:
>>   		error = -EFSCORRUPTED;
>> @@ -358,16 +358,16 @@ xfs_attr_log_item(
>>   	 * structure with fields from this xfs_attr_item
>>   	 */
>>   	attrp = &attrip->attri_format;
>> -	attrp->alfi_ino = attr->xattri_dac.da_args->dp->i_ino;
>> +	attrp->alfi_ino = attr->xattri_da_args->dp->i_ino;
>>   	attrp->alfi_op_flags = attr->xattri_op_flags;
>> -	attrp->alfi_value_len = attr->xattri_dac.da_args->valuelen;
>> -	attrp->alfi_name_len = attr->xattri_dac.da_args->namelen;
>> -	attrp->alfi_attr_flags = attr->xattri_dac.da_args->attr_filter;
>> -
>> -	attrip->attri_name = (void *)attr->xattri_dac.da_args->name;
>> -	attrip->attri_value = attr->xattri_dac.da_args->value;
>> -	attrip->attri_name_len = attr->xattri_dac.da_args->namelen;
>> -	attrip->attri_value_len = attr->xattri_dac.da_args->valuelen;
>> +	attrp->alfi_value_len = attr->xattri_da_args->valuelen;
>> +	attrp->alfi_name_len = attr->xattri_da_args->namelen;
>> +	attrp->alfi_attr_flags = attr->xattri_da_args->attr_filter;
>> +
>> +	attrip->attri_name = (void *)attr->xattri_da_args->name;
>> +	attrip->attri_value = attr->xattri_da_args->value;
>> +	attrip->attri_name_len = attr->xattri_da_args->namelen;
>> +	attrip->attri_value_len = attr->xattri_da_args->valuelen;
>>   }
>>   
>>   /* Get an ATTRI. */
>> @@ -408,10 +408,8 @@ xfs_attr_finish_item(
>>   	struct xfs_attr_item		*attr;
>>   	struct xfs_attrd_log_item	*done_item = NULL;
>>   	int				error;
>> -	struct xfs_delattr_context	*dac;
>>   
>>   	attr = container_of(item, struct xfs_attr_item, xattri_list);
>> -	dac = &attr->xattri_dac;
>>   	if (done)
>>   		done_item = ATTRD_ITEM(done);
>>   
>> @@ -423,19 +421,18 @@ xfs_attr_finish_item(
>>   	 * in a standard delay op, so we need to catch this here and rejoin the
>>   	 * leaf to the new transaction
>>   	 */
>> -	if (attr->xattri_dac.leaf_bp &&
>> -	    attr->xattri_dac.leaf_bp->b_transp != tp) {
>> -		xfs_trans_bjoin(tp, attr->xattri_dac.leaf_bp);
>> -		xfs_trans_bhold(tp, attr->xattri_dac.leaf_bp);
>> +	if (attr->xattri_leaf_bp && attr->xattri_leaf_bp->b_transp != tp) {
>> +		xfs_trans_bjoin(tp, attr->xattri_leaf_bp);
>> +		xfs_trans_bhold(tp, attr->xattri_leaf_bp);
>>   	}
>>   
>>   	/*
>>   	 * Always reset trans after EAGAIN cycle
>>   	 * since the transaction is new
>>   	 */
>> -	dac->da_args->trans = tp;
>> +	attr->xattri_da_args->trans = tp;
>>   
>> -	error = xfs_trans_attr(dac, done_item, attr->xattri_op_flags);
>> +	error = xfs_trans_attr(attr, done_item, attr->xattri_op_flags);
>>   	if (error != -EAGAIN)
>>   		kmem_free(attr);
>>   
>> @@ -570,7 +567,7 @@ xfs_attri_item_recover(
>>   	struct xfs_attrd_log_item	*done_item = NULL;
>>   	struct xfs_attr_item		attr = {
>>   		.xattri_op_flags	= attrip->attri_format.alfi_op_flags,
>> -		.xattri_dac.da_args	= &args,
>> +		.xattri_da_args		= &args,
>>   	};
>>   
>>   	/*
>> @@ -630,8 +627,7 @@ xfs_attri_item_recover(
>>   	xfs_ilock(ip, XFS_ILOCK_EXCL);
>>   	xfs_trans_ijoin(args.trans, ip, 0);
>>   
>> -	error = xfs_trans_attr(&attr.xattri_dac, done_item,
>> -			       attrp->alfi_op_flags);
>> +	error = xfs_trans_attr(&attr, done_item, attrp->alfi_op_flags);
>>   	if (error == -EAGAIN) {
>>   		/*
>>   		 * There's more work to do, so make a new xfs_attr_item and add
>> @@ -648,7 +644,7 @@ xfs_attri_item_recover(
>>   		memcpy(new_args, &args, sizeof(struct xfs_da_args));
>>   		memcpy(new_attr, &attr, sizeof(struct xfs_attr_item));
>>   
>> -		new_attr->xattri_dac.da_args = new_args;
>> +		new_attr->xattri_da_args = new_args;
>>   		memset(&new_attr->xattri_list, 0, sizeof(struct list_head));
>>   
>>   		xfs_defer_add(args.trans, XFS_DEFER_OPS_TYPE_ATTR,
>> -- 
>> 2.7.4
>>

* Re: [PATCH v14 14/15] xfs: Add delattr mount option
  2021-01-05  5:46   ` Darrick J. Wong
@ 2021-01-05 21:49     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2021-01-05 21:49 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs



On 1/4/21 10:46 PM, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 12:29:16AM -0700, Allison Henderson wrote:
>> This patch adds a mount option to enable delayed attributes. Eventually
>> this can be removed when delayed attrs becomes permanent.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.h | 2 +-
>>   fs/xfs/xfs_mount.h       | 1 +
>>   fs/xfs/xfs_super.c       | 6 +++++-
>>   fs/xfs/xfs_xattr.c       | 2 ++
>>   4 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 4838094..edd008d 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -30,7 +30,7 @@ struct xfs_attr_list_context;
>>   
>>   static inline bool xfs_hasdelattr(struct xfs_mount *mp)
> 
> /me had a brain fart just now that ... since struct xfs_delattr_context
> is ultimately going to be absorbed into struct xfs_attr_item, we really
> should have called the control knob part of this 'logattr' instead of
> 'delattr', because that's (IMIO) a better explanation of what the mount
> option actually does for users.
That's fine, honestly I figured I'd just throw some name out there just 
to get it working initially, and if someone wants a different name, 
they'd say so.  It is a temporary option after all.  :-)

> 
> An even better name would have been "logged attributes replayable"
> because then you could use the prefix XFS_LARP for things. :P
Yeah, I think the naming scheme was something we mulled over a while ago,
though we didn't really have a solid opinion on it yet.  But we did feel
that DAS and DAC are sort of close to DA and DAX.

I am ok with LARP.  I'll probably end up mistakenly referring to it as a 
"Log Action Re-Play", but I'm fine with that.  :-)  Just as long as 
everyone else is. Names seem to be something that everyone is really 
opinionated on, and it peppers little changes all over the set, so it 
would be nice to have a semi solid consensus  :-)

Thanks for the reviews!

Allison

> 
> Comments? :)
> 
> --D
> 
> 
>>   {
>> -	return false;
>> +	return mp->m_flags & XFS_MOUNT_DELATTR;
>>   }
>>   
>>   /*
>> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
>> index dfa429b..4794f27 100644
>> --- a/fs/xfs/xfs_mount.h
>> +++ b/fs/xfs/xfs_mount.h
>> @@ -254,6 +254,7 @@ typedef struct xfs_mount {
>>   #define XFS_MOUNT_NOATTR2	(1ULL << 25)	/* disable use of attr2 format */
>>   #define XFS_MOUNT_DAX_ALWAYS	(1ULL << 26)
>>   #define XFS_MOUNT_DAX_NEVER	(1ULL << 27)
>> +#define XFS_MOUNT_DELATTR	(1ULL << 28)	/* enable delayed attributes */
>>   
>>   /*
>>    * Max and min values for mount-option defined I/O
>> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
>> index 813be87..72169ee 100644
>> --- a/fs/xfs/xfs_super.c
>> +++ b/fs/xfs/xfs_super.c
>> @@ -92,7 +92,7 @@ enum {
>>   	Opt_filestreams, Opt_quota, Opt_noquota, Opt_usrquota, Opt_grpquota,
>>   	Opt_prjquota, Opt_uquota, Opt_gquota, Opt_pquota,
>>   	Opt_uqnoenforce, Opt_gqnoenforce, Opt_pqnoenforce, Opt_qnoenforce,
>> -	Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum,
>> +	Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum, Opt_delattr
>>   };
>>   
>>   static const struct fs_parameter_spec xfs_fs_parameters[] = {
>> @@ -137,6 +137,7 @@ static const struct fs_parameter_spec xfs_fs_parameters[] = {
>>   	fsparam_flag("nodiscard",	Opt_nodiscard),
>>   	fsparam_flag("dax",		Opt_dax),
>>   	fsparam_enum("dax",		Opt_dax_enum, dax_param_enums),
>> +	fsparam_flag("delattr",		Opt_delattr),
>>   	{}
>>   };
>>   
>> @@ -1292,6 +1293,9 @@ xfs_fs_parse_param(
>>   		xfs_mount_set_dax_mode(mp, result.uint_32);
>>   		return 0;
>>   #endif
>> +	case Opt_delattr:
>> +		mp->m_flags |= XFS_MOUNT_DELATTR;
>> +		return 0;
>>   	/* Following mount options will be removed in September 2025 */
>>   	case Opt_ikeep:
>>   		xfs_warn(mp, "%s mount option is deprecated.", param->key);
>> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
>> index 9b0c790..8ec61df 100644
>> --- a/fs/xfs/xfs_xattr.c
>> +++ b/fs/xfs/xfs_xattr.c
>> @@ -8,6 +8,8 @@
>>   #include "xfs_shared.h"
>>   #include "xfs_format.h"
>>   #include "xfs_log_format.h"
>> +#include "xfs_trans_resv.h"
>> +#include "xfs_mount.h"
>>   #include "xfs_da_format.h"
>>   #include "xfs_inode.h"
>>   #include "xfs_da_btree.h"
>> -- 
>> 2.7.4
>>

* Re: [PATCH v14 04/15] xfs: Add delay ready attr remove routines
  2021-01-05 18:10                 ` Allison Henderson
@ 2021-01-06 14:25                   ` Brian Foster
  0 siblings, 0 replies; 48+ messages in thread
From: Brian Foster @ 2021-01-06 14:25 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Tue, Jan 05, 2021 at 11:10:27AM -0700, Allison Henderson wrote:
> 
> 
> On 1/4/21 10:52 AM, Brian Foster wrote:
> > On Thu, Dec 24, 2020 at 01:23:24AM -0700, Allison Henderson wrote:
> > > 
> > > 
> > > On 12/23/20 7:16 AM, Brian Foster wrote:
> > > > On Tue, Dec 22, 2020 at 10:20:16PM -0700, Allison Henderson wrote:
> > > > > 
> > > > > 
> > > > > On 12/22/20 11:44 AM, Brian Foster wrote:
> > > > > > On Tue, Dec 22, 2020 at 12:20:20PM -0500, Brian Foster wrote:
> > > > > > > On Tue, Dec 22, 2020 at 12:11:48PM -0500, Brian Foster wrote:
> > > > > > > > On Fri, Dec 18, 2020 at 12:29:06AM -0700, Allison Henderson wrote:
> > > > > > > > > This patch modifies the attr remove routines to be delay ready. This
> > > > > > > > > means they no longer roll or commit transactions, but instead return
> > > > > > > > > -EAGAIN to have the calling routine roll and refresh the transaction. In
> > > > > > > > > this series, xfs_attr_remove_args has become xfs_attr_remove_iter, which
> > > > > > > > > uses a sort of state machine like switch to keep track of where it was
> > > > > > > > > when EAGAIN was returned. xfs_attr_node_removename has also been
> > > > > > > > > modified to use the switch, and a new version of xfs_attr_remove_args
> > > > > > > > > consists of a simple loop to refresh the transaction until the operation
> > > > > > > > > is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the
> > > > > > > > > > transaction wherever the existing code used to.
> > > > > > > > > 
> > > > > > > > > Calls to xfs_attr_rmtval_remove are replaced with the delay ready
> > > > > > > > > version __xfs_attr_rmtval_remove. We will rename
> > > > > > > > > __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
> > > > > > > > > done.
> > > > > > > > > 
> > > > > > > > > xfs_attr_rmtval_remove itself is still in use by the set routines (used
> > > > > > > > > > during a rename).  To preserve existing behavior, we
> > > > > > > > > > modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
> > > > > > > > > > set, similar to how xfs_attr_remove_args does here.  Once we transition
> > > > > > > > > the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
> > > > > > > > > used and will be removed.
> > > > > > > > > 
> > > > > > > > > This patch also adds a new struct xfs_delattr_context, which we will use
> > > > > > > > > to keep track of the current state of an attribute operation. The new
> > > > > > > > > xfs_delattr_state enum is used to track various operations that are in
> > > > > > > > > progress so that we know not to repeat them, and resume where we left
> > > > > > > > > off before EAGAIN was returned to cycle out the transaction. Other
> > > > > > > > > members take the place of local variables that need to retain their
> > > > > > > > > values across multiple function recalls.  See xfs_attr.h for a more
> > > > > > > > > detailed diagram of the states.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > > > > > ---
> > > > > > > > 
> > > > > > > > I started with a couple small comments on this patch but inevitably
> > > > > > > > started thinking more about the factoring again and ended up with a
> > > > > > > > couple patches on top. The first is more of some small tweaks and
> > > > > > > > open-coding that IMO makes this patch a bit easier to follow. The
> > > > > > > > second is more of an RFC so I'll follow up with that in a second email.
> > > > > > > > I'm curious what folks' thoughts might be on either. Also note that I'm
> > > > > > > > primarily focusing on code structure and whatnot here, so these are fast
> > > > > > > > and loose, compile tested only and likely to be broken.
> > > > > > > > 
> > > > > > > 
> > > > > > > ... and here's the second diff (applies on top of the first).
> > > > > > > 
> > > > > > > This one popped up after staring at the previous changes for a bit and
> > > > > > > wondering whether using "done flags" might make the whole thing easier
> > > > > > > to follow than incremental state transitions. I think the attr remove
> > > > > > > path is easy enough to follow with either method, but the attr set path
> > > > > > > is a beast and so this is more with that in mind. Initial thoughts?
> > > > > > > 
> > > > > > 
> > > > > > Eh, the more I stare at the attr set code I'm not sure this by itself is
> > > > > > much of an improvement. It helps in some areas, but there are so many
> > > > > > transaction rolls embedded throughout at different levels that a larger
> > > > > > rework of the code is probably still necessary. Anyways, this was just a
> > > > > > random thought for now..
> > > > > > 
> > > > > > Brian
> > > > > 
> > > > > > No worries, I know the feeling :-)  The set works and all, but I do think
> > > > > > there is a struggle around trying to find a particularly pleasant looking
> > > > > > presentation of it.  Especially when we get into the set path, it's a bit
> > > > > > more complex.  I may pick through the patches you have here and pick up the
> > > > > whitespace cleanups and other style adjustments if people prefer it that
> > > > > way.  The good news is, a lot of the *_args routines are supposed to
> > > > > disappear at the end of the set, so there's not really a need to invest too
> > > > > much in them I suppose. It may help to jump to the "Set up infastructure"
> > > > > > patch too.  I've expanded the diagram to try and help illustrate the code
> > > > > > flow a bit, so that may help with following along.
> > > > > 
> > > > 
> > > > I'm sure.. :P Note that the first patch was more smaller tweaks and
> > > > refactoring with the existing model in mind. For the set path, the
> > > > challenge IMO is to make the code generally more readable. I think the
> > > > remove path accomplishes this for the most part because the states and
> > > > whatnot are fairly low overhead on top of the existing complexity. This
> > > > changes considerably for the set path, not so much due to the mechanism
> > > > but because the baseline code is so fragmented and complex from the
> > > > start. I am slightly concerned that bolting state management onto the
> > > > current code as such might make it harder to grok and clean up after the
> > > > fact, but I could be wrong about that (my hope was certainly for the
> > > > opposite).
> > > > tbh, every time I do another spin of the set, I actually make all my
> > > > modifications on top of the extended set, with parent pointers and all, and
> > > > make sure all the test cases are still good.  I know pptrs are still pretty
> > > > far out from here, but they're actually the best testcase for this, because
> > > > they generate so much more activity.  If all that's still golden, then I'll
> > > > pull them back down into the lower subsets and work out all the conflicts on
> > > > the way back up.  If something went wrong, diffing the branch heads tracks
> > > > it down pretty fast.
> > > 
> > 
> > Indeed, that's a good thing. My comment was more around the readability
> > of the code and subsequent ability to clean it up, reduce the number of
> > required states, etc...
> > 
> > > > 
> > > > Regardless, that had me shifting focus a bit and playing around with the
> > > > current upstream code as opposed to shifting around your code. ISTM that
> > > > there is some commonality across the various set codepaths and perhaps
> > > > there is potential to simplify things notably _before_ applying the
> > > > state management scheme. I've appended a new diff below (based on
> > > > for-next) that starts to demonstrate what I mean. Note again that this
> > > > > is similarly fast and loose, as I've knowingly thrown away some quirks of
> > > > the code (i.e. leaf buffer bhold) for the purpose of quickly trying to
> > > > explore/POC whether the factoring might be sane and plausible.
> > > > 
> > > > In summary, this combines the "try addname" part of each xattr format to
> > > > fall under a single transaction rolling loop such that I think the
> > > > resulting function could become one high level state. I ran out of time
> > > > for working through the rest, but from a read through it seems there's
> > > > at least a chance we could continue with similar refactoring and
> > > > reduction to a fewer number of generic states (vs. more format-specific
> > > > states). For example, the remaining parts of the set operation all seem
> > > > to have something along the lines of the following high level
> > > > components:
> > > > 
> > > > - remote value block allocation (and value set)
> > > > - if rename == true, clear flag and done
> > > > - if rename == false, flip flags
> > > > 	- remove old xattr (i.e., similar to xattr remove)
> > > > 
> > > > ... where much of that code looks remarkably similar across the
> > > > different leaf/node code branches. So I'm curious what you and others
> > > > following along might think about something like this as an intermediate
> > > > step...
> > > 
> > > Yes, I had noticed similarities when we first started, though I got the
> > > impression that people mostly wanted to focus on just hoisting the
> > > transactions upwards.  I did look at them at one point, but seem to recall
> > > > the similarities having just enough dissimilarities such that trying to
> > > > consolidate them tends to introduce about as much plumbing with if/else's.
> > > In any case, I do think the solution here with the format handling is
> > > creative, and may reduce a state or two, but I'd really need to see it
> > > through the test cases to know if it's going to work.  From what you've
> > > hashed out here, I think I get the idea. It's hard for me to comment on
> > > readability because I've been up and down the code so much.  I do think it's
> > > > a little loopy looking, but so is the state machine.  Maybe a good spot for
> > > others to chime in too.
> > > 
> > 
> > Can you elaborate on what you mean by loopy? :P I'm sure you noticed I
> > borrowed the transaction rolling mechanism from your infra patch..
> > 
> Well, that loop that is borrowed is meant to disappear at the end of the set
> though.  This part with *_set_fmt we would have to keep.  I guess that
> really means the *_set_fmt call would probably get consolidated into the
> *_iter routine though.  Let me see if I can get something like this to work
> on top of the set so it's a bit more clear what it would look like.  I think
> this modification would actually look simpler if it came in after the
> state machine.  Otherwise you're trying to introduce the transaction loop
> early.  Really its purpose is just to get the state machine working, and
> then we get rid of it later.
> 

Sort of... the idea is more to reduce code duplication across the
currently separate codepaths to hopefully reduce the number of states
required. Note that the intent isn't to simplify away the state machine
approach entirely, but to simply reduce the number of states so the
resulting complexity of the set path is more in line with the remove
path. Given that, I'm not sure why this would imply we'd need to retain
the transaction loop, for example. I'd expect your subsequent
infrastructure changes and general state management approach to remain
fundamentally the same, only hopefully with fewer branches/states.

Indeed, it may be possible to do this kind of thing before or after the
infrastructure changes. I highly suspect the latter might seem more
simple to you being more familiar with the new code while the former
might seem more simple to somebody like me who is much less so. ;)
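
For illustration only, here is a minimal userspace sketch of what a set path
with fewer, generic states could look like.  The state names follow the ones
floated earlier in this thread (SET_FORMAT, SET_NAME, SET_VALUE, SET_FLAG);
everything else (the demo_set_* identifiers, the rmt_blocks counter, and the
driver loop) is hypothetical and not taken from the patch set.  The real code
would do the format conversion, name insertion, and so on where this sketch
only advances the state.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical high-level states; not the real enum xfs_delattr_state. */
enum demo_set_state {
	DEMO_SET_FORMAT,	/* pick/convert the attr fork format */
	DEMO_SET_NAME,		/* add the name entry */
	DEMO_SET_VALUE,		/* allocate remote blocks, one per call */
	DEMO_SET_FLAG,		/* clear or flip the flag */
	DEMO_SET_DONE,
};

/* Hypothetical context: whatever must survive a transaction roll. */
struct demo_set_ctx {
	enum demo_set_state	state;
	int			rmt_blocks;	/* remote blocks left to allocate */
};

/*
 * Perform one step.  Returns -EAGAIN when the caller should roll the
 * transaction and call again, 0 when the operation has finished.
 */
static int demo_set_iter(struct demo_set_ctx *ctx)
{
	switch (ctx->state) {
	case DEMO_SET_FORMAT:
		ctx->state = DEMO_SET_NAME;
		return -EAGAIN;
	case DEMO_SET_NAME:
		/* Skip the value state when there is nothing remote to write. */
		ctx->state = ctx->rmt_blocks ? DEMO_SET_VALUE : DEMO_SET_FLAG;
		return -EAGAIN;
	case DEMO_SET_VALUE:
		if (--ctx->rmt_blocks > 0)
			return -EAGAIN;		/* stay in this state */
		ctx->state = DEMO_SET_FLAG;
		return -EAGAIN;
	case DEMO_SET_FLAG:
		ctx->state = DEMO_SET_DONE;
		return 0;
	case DEMO_SET_DONE:
		return 0;
	}
	return -EINVAL;
}

/* The caller owns the transaction; "rolling" is only counted here. */
static int demo_set(struct demo_set_ctx *ctx)
{
	int rolls = 0;
	int error;

	while ((error = demo_set_iter(ctx)) == -EAGAIN)
		rolls++;
	assert(error || ctx->state == DEMO_SET_DONE);
	return error ? error : rolls;
}
```

In this sketch a set with two remote blocks rolls four times, and one with
no remote value rolls twice.  Whether the real set path can be reduced to
states this generic is exactly the open question in this thread.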

> > But yeah, I'm partly to blame for the hoisting approach as well. I was
> > thinking/hoping that seeing the various states would facilitate
> > simplification of the code, but my first reaction when looking at the
> > (much more complex) xattr set path is more confusion than clarity. I see
> > the code drop into state management, using that to call into
> > format-specific helpers, then fall into doing some other stuff that
> > might call into some of the same format-specific add helpers, then
> > realize I'll probably have to trace up and down through the whole path
> > to make some sense of the execution flow.
> 
> Yeah, I think this question is very preference oriented.  See, initially, I
> thought the pattern of pairing states to gotos sort of alleviated the
> anxiety of needing to trace up and down the code:
> 
> 
>    /*
>     * We're going away for a bit to cycle the transaction,
>     * but we're gonna come back ....
>     */
>    dela_state = XFS_DAS_UNIQUE_STATE;
>    return -EAGAIN;
> 
> xfs_das_unique_state:
>    /* ...and resume execution here */
> 
> 
> Granted, sometimes we can use the state of the attr to get away from needing
> this, but now you have to re-read the code in the context of whatever form
> we're in to figure out that we land back in the same place. I realize this is
> sort of a unique pattern, so I understand people wanting to explore the idea
> of simplifying it away.  At this point I feel like I can follow it either
> way, so it's really what folks are more comfortable with.
> 

As above, note that I'm definitely not attempting to simplify the
broader pattern away. Just exploring cleanups to the xattr set code to
reduce the complexity of the transition. The reason the patch I posted
doesn't have any state management is IIRC I only went as far as possible
before we'd probably need to define the first state. ;)
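
For anyone following along, the state-plus-goto pattern quoted above can be
sketched outside the kernel roughly as follows. This is only an illustration
under stated assumptions: demo_ctx, demo_set_iter(), demo_run() and the
DEMO_DAS_* states are hypothetical names, not code from the series, and a
counter stands in for the real attr work. It shows each -EAGAIN return being
paired with a saved state and a label where execution resumes on the next call.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states; these are not the XFS_DAS_* values from the series. */
enum demo_state {
	DEMO_DAS_INIT = 0,
	DEMO_DAS_RESUME_VALUE,
	DEMO_DAS_RESUME_FLAG,
};

struct demo_ctx {
	enum demo_state	state;	/* where to resume on the next call */
	int		steps;	/* work units completed so far */
};

/*
 * Do one transaction's worth of work, then return -EAGAIN so the
 * caller can cycle the transaction and call back in.
 */
static int demo_set_iter(struct demo_ctx *ctx)
{
	/* Jump back to wherever the previous call left off. */
	switch (ctx->state) {
	case DEMO_DAS_INIT:
		break;
	case DEMO_DAS_RESUME_VALUE:
		goto resume_value;
	case DEMO_DAS_RESUME_FLAG:
		goto resume_flag;
	}

	ctx->steps++;			/* step 1: stand-in for adding the name */
	ctx->state = DEMO_DAS_RESUME_VALUE;
	return -EAGAIN;

resume_value:
	ctx->steps++;			/* step 2: stand-in for writing the value */
	ctx->state = DEMO_DAS_RESUME_FLAG;
	return -EAGAIN;

resume_flag:
	ctx->steps++;			/* step 3: stand-in for flipping the flag */
	return 0;
}

/* Stand-in for the caller that owns the transaction. */
static int demo_run(struct demo_ctx *ctx, int *rolls)
{
	int error;

	assert(ctx && rolls);
	*rolls = 0;
	/* A real caller would finish deferred work and roll the transaction here. */
	while ((error = demo_set_iter(ctx)) == -EAGAIN)
		(*rolls)++;
	return error;
}
```

Starting from DEMO_DAS_INIT, demo_run() calls the iterator three times and
counts two rolls in between. The point of the sketch is only that the function
itself never touches the transaction; it records where it was and hands
control back.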

> > That is what has me wondering
> > whether this would become more simple with fewer, generic and higher
> > level states like SET_FORMAT (i.e. what I hacked up), SET_NAME,
> > SET_VALUE (rmt block allocs), SET_FLAG (clear or flip), and then finally
> > fall into the remove path in the rename case.
> > 
> > We'd ultimately implement the same type of state machine approach, it
> > would just require more up front cleanup rework than the other way
> > around, and hopefully land fairly simplified from the onset. Of course
> > those states are just off the top of my head so might not be feasible,
> > but I'm also curious if any others following along might have thoughts
> > one way or the other. I'm sure we could implement things in either order
> > when it comes down to it...
> Yeah, let me see if it's feasible, and what it ends up looking like. I'm
> kind of of the opinion that if you have to have a certain degree of
> complexity (i.e. setting states, and resuming with gotos), you may as well
> leverage what it can do.  Once you absorb that pattern, it's not so
> scary the next time you see it.  Simplifying is certainly a good thing, but
> if it breaks the pattern that keeps a more complex concept organized, the
> simplification might not make as much sense to others.  I think it's likely
> a spot for others to chime in; after looking at the same code for a
> while, it's hard to put yourself in the POV of someone else still trying to
> work through it.  :-)
> 

The current organization of the code is what concerns me, more so than the
broader infrastructure or state patterns in general. IOW, I don't
actually see an obvious pattern emerge from reading through
xfs_attr_set_iter(), for example. I see some state code that jumps into
format helpers, followed by shortform code and then leaf/node addname
calls into similar or related calls seen at the top. This diverges from
the previously discussed goal of seeing all of the state management bits
at one level such that the execution flow of the operation is as obvious
as possible. Hence, I'm wondering if the reduced number of states
facilitates that goal, but perhaps I could dig further into it from that
angle as well...
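
To make the "all state management at one level" idea concrete, here is a rough
userspace sketch of the fewer, higher-level states floated earlier in the
thread (SET_FORMAT, SET_NAME, SET_VALUE, SET_FLAG). Everything in it is an
assumption for illustration: sketch_attr, sketch_set_step(), sketch_set() and
the SK_* names are hypothetical, booleans stand in for the real work, and the
rename/remove leg is left out. It is not a proposal for the actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical high-level states, named after the ones suggested in the thread. */
enum sketch_state {
	SK_SET_FORMAT = 0,
	SK_SET_NAME,
	SK_SET_VALUE,
	SK_SET_FLAG,
	SK_DONE,
};

struct sketch_attr {
	enum sketch_state state;
	int	conversions_needed;	/* e.g. shortform->leaf, leaf->node */
	bool	remote_value;		/* value needs out-of-line blocks */
	bool	name_set, value_set, flag_set;
};

/* One flat dispatcher: every state transition is visible in this function. */
static int sketch_set_step(struct sketch_attr *a)
{
	for (;;) {
		switch (a->state) {
		case SK_SET_FORMAT:
			if (a->conversions_needed > 0) {
				a->conversions_needed--;
				return -EAGAIN;	/* converted; retry after a roll */
			}
			a->state = SK_SET_NAME;
			continue;	/* nothing was dirtied, no roll needed */
		case SK_SET_NAME:
			a->name_set = true;
			a->state = a->remote_value ? SK_SET_VALUE : SK_SET_FLAG;
			return -EAGAIN;
		case SK_SET_VALUE:
			a->value_set = true;
			a->state = SK_SET_FLAG;
			return -EAGAIN;
		case SK_SET_FLAG:
			a->flag_set = true;
			a->state = SK_DONE;
			return 0;
		case SK_DONE:
			return 0;
		default:
			return -EINVAL;
		}
	}
}

/* Stand-in for the caller that owns the transaction. */
static int sketch_set(struct sketch_attr *a, int *rolls)
{
	int error;

	assert(a && rolls);
	*rolls = 0;
	/* A real caller would finish deferred work and roll the transaction here. */
	while ((error = sketch_set_step(a)) == -EAGAIN)
		(*rolls)++;
	return error;
}
```

With two format conversions and a remote value, this walks SET_FORMAT twice,
then SET_NAME, SET_VALUE and SET_FLAG, with a roll between each step. Whether
the real set path can be flattened this far is exactly the open question above.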

Brian

> Allison
> 
> > 
> > Brian
> > 
> > > I actually find it easier to work on it from the top of the set rather than
> > > the bottom.  Just so that the end goal of what it will end up looking like
> > > is a little more clear.  Once the goal is clear, then I worry about layering
> > > it in whatever patch it goes in.  Otherwise it's harder to see exactly how
> > > the conflicts shake out.
> > > 
> > > Allison
> > > > 
> > > > Brian
> > > > 
> > > > --- 8< ---
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > > > index fd8e6418a0d3..eff8833d5303 100644
> > > > --- a/fs/xfs/libxfs/xfs_attr.c
> > > > +++ b/fs/xfs/libxfs/xfs_attr.c
> > > > @@ -58,6 +58,8 @@ STATIC int xfs_attr_node_hasname(xfs_da_args_t *args,
> > > >    				 struct xfs_da_state **state);
> > > >    STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
> > > >    STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
> > > > +STATIC int xfs_attr_leaf_try_add(struct xfs_da_args *, struct xfs_buf *);
> > > > +STATIC int xfs_attr_node_addname_work(struct xfs_da_args *);
> > > >    int
> > > >    xfs_inode_hasattr(
> > > > @@ -216,116 +218,93 @@ xfs_attr_is_shortform(
> > > >    		ip->i_afp->if_nextents == 0);
> > > >    }
> > > > -/*
> > > > - * Attempts to set an attr in shortform, or converts short form to leaf form if
> > > > - * there is not enough room.  If the attr is set, the transaction is committed
> > > > - * and set to NULL.
> > > > - */
> > > > -STATIC int
> > > > -xfs_attr_set_shortform(
> > > > +int
> > > > +xfs_attr_set_fmt(
> > > >    	struct xfs_da_args	*args,
> > > > -	struct xfs_buf		**leaf_bp)
> > > > +	bool			*done)
> > > >    {
> > > >    	struct xfs_inode	*dp = args->dp;
> > > > -	int			error, error2 = 0;
> > > > +	struct xfs_buf		*leaf_bp = NULL;
> > > > +	int			error = 0;
> > > > -	/*
> > > > -	 * Try to add the attr to the attribute list in the inode.
> > > > -	 */
> > > > -	error = xfs_attr_try_sf_addname(dp, args);
> > > > -	if (error != -ENOSPC) {
> > > > -		error2 = xfs_trans_commit(args->trans);
> > > > -		args->trans = NULL;
> > > > -		return error ? error : error2;
> > > > +	if (xfs_attr_is_shortform(dp)) {
> > > > +		error = xfs_attr_try_sf_addname(dp, args);
> > > > +		if (!error)
> > > > +			*done = true;
> > > > +		if (error != -ENOSPC)
> > > > +			return error;
> > > > +
> > > > +		error = xfs_attr_shortform_to_leaf(args, &leaf_bp);
> > > > +		if (error)
> > > > +			return error;
> > > > +		return -EAGAIN;
> > > >    	}
> > > > -	/*
> > > > -	 * It won't fit in the shortform, transform to a leaf block.  GROT:
> > > > -	 * another possible req'mt for a double-split btree op.
> > > > -	 */
> > > > -	error = xfs_attr_shortform_to_leaf(args, leaf_bp);
> > > > -	if (error)
> > > > -		return error;
> > > > -	/*
> > > > -	 * Prevent the leaf buffer from being unlocked so that a concurrent AIL
> > > > -	 * push cannot grab the half-baked leaf buffer and run into problems
> > > > -	 * with the write verifier. Once we're done rolling the transaction we
> > > > -	 * can release the hold and add the attr to the leaf.
> > > > -	 */
> > > > -	xfs_trans_bhold(args->trans, *leaf_bp);
> > > > -	error = xfs_defer_finish(&args->trans);
> > > > -	xfs_trans_bhold_release(args->trans, *leaf_bp);
> > > > -	if (error) {
> > > > -		xfs_trans_brelse(args->trans, *leaf_bp);
> > > > -		return error;
> > > > +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> > > > +		struct xfs_buf	*bp = NULL;
> > > > +
> > > > +		error = xfs_attr_leaf_try_add(args, bp);
> > > > +		if (error != -ENOSPC)
> > > > +			return error;
> > > > +
> > > > +		error = xfs_attr3_leaf_to_node(args);
> > > > +		if (error)
> > > > +			return error;
> > > > +		return -EAGAIN;
> > > >    	}
> > > > -	return 0;
> > > > +	return xfs_attr_node_addname(args);
> > > >    }
> > > >    /*
> > > >     * Set the attribute specified in @args.
> > > >     */
> > > >    int
> > > > -xfs_attr_set_args(
> > > > +__xfs_attr_set_args(
> > > >    	struct xfs_da_args	*args)
> > > >    {
> > > >    	struct xfs_inode	*dp = args->dp;
> > > > -	struct xfs_buf          *leaf_bp = NULL;
> > > >    	int			error = 0;
> > > > -	/*
> > > > -	 * If the attribute list is already in leaf format, jump straight to
> > > > -	 * leaf handling.  Otherwise, try to add the attribute to the shortform
> > > > -	 * list; if there's no room then convert the list to leaf format and try
> > > > -	 * again.
> > > > -	 */
> > > > -	if (xfs_attr_is_shortform(dp)) {
> > > > -
> > > > -		/*
> > > > -		 * If the attr was successfully set in shortform, the
> > > > -		 * transaction is committed and set to NULL.  Otherwise, is it
> > > > -		 * converted from shortform to leaf, and the transaction is
> > > > -		 * retained.
> > > > -		 */
> > > > -		error = xfs_attr_set_shortform(args, &leaf_bp);
> > > > -		if (error || !args->trans)
> > > > -			return error;
> > > > -	}
> > > > -
> > > >    	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> > > >    		error = xfs_attr_leaf_addname(args);
> > > > -		if (error != -ENOSPC)
> > > > -			return error;
> > > > -
> > > > -		/*
> > > > -		 * Promote the attribute list to the Btree format.
> > > > -		 */
> > > > -		error = xfs_attr3_leaf_to_node(args);
> > > >    		if (error)
> > > >    			return error;
> > > > +	}
> > > > +
> > > > +	error = xfs_attr_node_addname_work(args);
> > > > +	return error;
> > > > +}
> > > > +
> > > > +int
> > > > +xfs_attr_set_args(
> > > > +	struct xfs_da_args	*args)
> > > > +
> > > > +{
> > > > +	int			error;
> > > > +	bool			done = false;
> > > > +
> > > > +	do {
> > > > +		error = xfs_attr_set_fmt(args, &done);
> > > > +		if (error != -EAGAIN)
> > > > +			break;
> > > > -		/*
> > > > -		 * Finish any deferred work items and roll the transaction once
> > > > -		 * more.  The goal here is to call node_addname with the inode
> > > > -		 * and transaction in the same state (inode locked and joined,
> > > > -		 * transaction clean) no matter how we got to this step.
> > > > -		 */
> > > >    		error = xfs_defer_finish(&args->trans);
> > > >    		if (error)
> > > > -			return error;
> > > > +			break;
> > > > +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> > > > +	} while (!error);
> > > > -		/*
> > > > -		 * Commit the current trans (including the inode) and
> > > > -		 * start a new one.
> > > > -		 */
> > > > -		error = xfs_trans_roll_inode(&args->trans, dp);
> > > > -		if (error)
> > > > -			return error;
> > > > -	}
> > > > +	if (error || done)
> > > > +		return error;
> > > > -	error = xfs_attr_node_addname(args);
> > > > -	return error;
> > > > +	error = xfs_defer_finish(&args->trans);
> > > > +	if (!error)
> > > > +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> > > > +	if (error)
> > > > +		return error;
> > > > +
> > > > +	return __xfs_attr_set_args(args);
> > > >    }
> > > >    /*
> > > > @@ -676,18 +655,6 @@ xfs_attr_leaf_addname(
> > > >    	trace_xfs_attr_leaf_addname(args);
> > > > -	error = xfs_attr_leaf_try_add(args, bp);
> > > > -	if (error)
> > > > -		return error;
> > > > -
> > > > -	/*
> > > > -	 * Commit the transaction that added the attr name so that
> > > > -	 * later routines can manage their own transactions.
> > > > -	 */
> > > > -	error = xfs_trans_roll_inode(&args->trans, dp);
> > > > -	if (error)
> > > > -		return error;
> > > > -
> > > >    	/*
> > > >    	 * If there was an out-of-line value, allocate the blocks we
> > > >    	 * identified for its storage and copy the value.  This is done
> > > > @@ -923,7 +890,7 @@ xfs_attr_node_addname(
> > > >    	 * Fill in bucket of arguments/results/context to carry around.
> > > >    	 */
> > > >    	dp = args->dp;
> > > > -restart:
> > > > +
> > > >    	/*
> > > >    	 * Search to see if name already exists, and get back a pointer
> > > >    	 * to where it should go.
> > > > @@ -967,21 +934,10 @@ xfs_attr_node_addname(
> > > >    			xfs_da_state_free(state);
> > > >    			state = NULL;
> > > >    			error = xfs_attr3_leaf_to_node(args);
> > > > -			if (error)
> > > > -				goto out;
> > > > -			error = xfs_defer_finish(&args->trans);
> > > >    			if (error)
> > > >    				goto out;
> > > > -			/*
> > > > -			 * Commit the node conversion and start the next
> > > > -			 * trans in the chain.
> > > > -			 */
> > > > -			error = xfs_trans_roll_inode(&args->trans, dp);
> > > > -			if (error)
> > > > -				goto out;
> > > > -
> > > > -			goto restart;
> > > > +			return -EAGAIN;
> > > >    		}
> > > >    		/*
> > > > @@ -993,9 +949,6 @@ xfs_attr_node_addname(
> > > >    		error = xfs_da3_split(state);
> > > >    		if (error)
> > > >    			goto out;
> > > > -		error = xfs_defer_finish(&args->trans);
> > > > -		if (error)
> > > > -			goto out;
> > > >    	} else {
> > > >    		/*
> > > >    		 * Addition succeeded, update Btree hashvals.
> > > > @@ -1010,13 +963,23 @@ xfs_attr_node_addname(
> > > >    	xfs_da_state_free(state);
> > > >    	state = NULL;
> > > > -	/*
> > > > -	 * Commit the leaf addition or btree split and start the next
> > > > -	 * trans in the chain.
> > > > -	 */
> > > > -	error = xfs_trans_roll_inode(&args->trans, dp);
> > > > +	return 0;
> > > > +
> > > > +out:
> > > > +	if (state)
> > > > +		xfs_da_state_free(state);
> > > >    	if (error)
> > > > -		goto out;
> > > > +		return error;
> > > > +	return retval;
> > > > +}
> > > > +
> > > > +STATIC int
> > > > +xfs_attr_node_addname_work(
> > > > +	struct xfs_da_args	*args)
> > > > +{
> > > > +	struct xfs_da_state	*state;
> > > > +	struct xfs_da_state_blk	*blk;
> > > > +	int			retval, error;
> > > >    	/*
> > > >    	 * If there was an out-of-line value, allocate the blocks we
> > > > 
> > > 
> > 
> 



end of thread, other threads:[~2021-01-06 14:27 UTC | newest]

Thread overview: 48+ messages
2020-12-18  7:29 [PATCH v14 00/15] xfs: Delayed Attributes Allison Henderson
2020-12-18  7:29 ` [PATCH v14 01/15] xfs: Add helper xfs_attr_node_remove_step Allison Henderson
2020-12-21  6:45   ` Chandan Babu R
2020-12-21 23:48     ` Allison Henderson
2020-12-22 16:50   ` Brian Foster
2020-12-18  7:29 ` [PATCH v14 02/15] xfs: Add xfs_attr_node_remove_cleanup Allison Henderson
2020-12-21  6:45   ` Chandan Babu R
2020-12-21 23:47     ` Allison Henderson
2020-12-22 16:50   ` Brian Foster
2020-12-18  7:29 ` [PATCH v14 03/15] xfs: Hoist transaction handling in xfs_attr_node_remove_step Allison Henderson
2020-12-21  6:45   ` Chandan Babu R
2020-12-21 21:51     ` Allison Henderson
2020-12-18  7:29 ` [PATCH v14 04/15] xfs: Add delay ready attr remove routines Allison Henderson
2020-12-22  7:22   ` Chandan Babu R
2020-12-22 15:41     ` Allison Henderson
2020-12-23  4:05       ` Chandan Babu R
2020-12-22 17:11   ` Brian Foster
2020-12-22 17:20     ` Brian Foster
2020-12-22 18:44       ` Brian Foster
2020-12-23  5:20         ` Allison Henderson
2020-12-23 14:16           ` Brian Foster
2020-12-24  8:23             ` Allison Henderson
2021-01-04 17:52               ` Brian Foster
2021-01-05 18:10                 ` Allison Henderson
2021-01-06 14:25                   ` Brian Foster
2020-12-18  7:29 ` [PATCH v14 05/15] xfs: Add delay ready attr set routines Allison Henderson
2020-12-23  8:00   ` Chandan Babu R
2020-12-23 16:31     ` Allison Henderson
2020-12-18  7:29 ` [PATCH v14 06/15] xfs: Add state machine tracepoints Allison Henderson
2021-01-05  4:50   ` Chandan Babu R
2021-01-05 21:06     ` Allison Henderson
2021-01-05  5:28   ` Darrick J. Wong
2021-01-05 21:07     ` Allison Henderson
2020-12-18  7:29 ` [PATCH v14 07/15] xfs: Rename __xfs_attr_rmtval_remove Allison Henderson
2020-12-18  7:29 ` [PATCH v14 08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans Allison Henderson
2021-01-05  5:38   ` Darrick J. Wong
2021-01-05 20:15     ` Allison Henderson
2020-12-18  7:29 ` [PATCH v14 09/15] xfs: Set up infastructure for deferred attribute operations Allison Henderson
2020-12-18  7:29 ` [PATCH v14 10/15] xfs: Skip flip flags for delayed attrs Allison Henderson
2020-12-18  7:29 ` [PATCH v14 11/15] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred Allison Henderson
2020-12-18  7:29 ` [PATCH v14 12/15] xfs: Remove unused xfs_attr_*_args Allison Henderson
2020-12-18  7:29 ` [PATCH v14 13/15] xfs: Add delayed attributes error tag Allison Henderson
2020-12-18  7:29 ` [PATCH v14 14/15] xfs: Add delattr mount option Allison Henderson
2021-01-05  5:46   ` Darrick J. Wong
2021-01-05 21:49     ` Allison Henderson
2020-12-18  7:29 ` [PATCH v14 15/15] xfs: Merge xfs_delattr_context into xfs_attr_item Allison Henderson
2021-01-05  5:47   ` Darrick J. Wong
2021-01-05 21:07     ` Allison Henderson
